Artificial intelligence, machine learning and deep learning in neuroradiology: Current applications

Review Article - Onkologia i Radioterapia (2023) Volume 17, Issue 2

Danilo Caudo^*, Alessandro Santalco, Simona Cammaroto, Carmelo Anfuso, Alessia Biondo, Rosaria Torre, Caterina Benedetto, Annalisa Militi, Chiara Smorto, Fabio Italiano, Ugo Barbaro and Rosa Morabito

IRCCS Centro Neurolesi Bonino-Pulejo (Messina - ITALY), Italy

^*Corresponding Author:
Danilo Caudo, IRCCS Centro Neurolesi Bonino-Pulejo (Messina - ITALY), Italy, Email: danycaudo@gmail.com

Received: 04-Jan-2023, Manuscript No. OAR-23-86840; Accepted: 25-Jan-2023, Pre QC No. OAR-23-86840 (PQ); Editor assigned: 06-Jan-2023, Pre QC No. OAR-23-86840 (PQ); Reviewed: 20-Jan-2023, QC No. OAR-23-86840 (Q); Revised: 22-Jan-2023, Manuscript No. OAR-23-86840 (R); Published: 01-Feb-2023

Abstract

Artificial intelligence is rapidly expanding in all medical fields, and especially in neuroimaging/neuroradiology (more than 5000 articles indexed on PubMed in 2021); however, few reviews summarize its clinical application in the diagnosis and clinical management of patients with neurological diseases. Globally, neurological and mental disorders affect 1 in 3 people over their lifetime, so this technology could have a strong clinical impact on daily medical work. This review summarizes the technical background of artificial intelligence and describes the main tools dedicated to neuroimaging and neuroradiology, explaining their utility in improving neurological disease diagnosis and clinical management.

Keywords

deep learning, artificial intelligence, machine learning, neuroradiology, neuroimaging

Introduction

The ever-increasing number of diagnostic tests requires rapid reporting without reducing diagnostic accuracy [1], and this pressure could lead to misdiagnosis. In this context, the recent exponential increase in publications related to Artificial Intelligence (AI) and the central focus on AI at recent professional and scientific radiology meetings underscore the importance of AI for improving neurological disease diagnosis and clinical management.

Currently, there are many well-known applications of AI in diagnostic imaging; however, few reviews have summarized its applications in neuroimaging/neuroradiology. Thus, we aim to provide a technical background of AI and an overview of the current literature on the clinical applications of AI in neuroradiology/neuroimaging, highlighting current tools and offering a few predictions regarding the near future.

Technical background of AI

Any computer technique that simulates human intelligence is considered AI. AI encompasses Machine Learning (ML), which in turn includes Deep Learning (DL) (Figure 1). ML designs systems that learn and improve from experience, using statistical methods applied to data, without being explicitly programmed for the task. ML uses observations and data as examples to build models and algorithms, which are then used to make future decisions. In ML, some “ground truth” exists, which is used to train the algorithms. One example is a collection of brain CT scans that a neuroradiologist has classified into different groups (i.e., haemorrhage versus no haemorrhage). The goal is to design software that learns automatically, independently of any human intervention or assistance, to support subsequent decisions (Figure 2). DL, a subset of ML, instead applies artificial neural networks such as Convolutional Neural Networks (CNNs) to accelerate the learning process [2, 3]. CNNs are non-linear statistical modelling tools. They can be used to simulate complex relationships between inputs and outputs using several steps (layers) of nonlinear transformations, which other analytic functions cannot represent [2].
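As a minimal illustration of why stacked nonlinear layers matter, the sketch below hand-sets the weights of a tiny two-layer network so that it computes the exclusive-or (XOR) of two binary inputs, a relationship that no single linear function of the inputs can represent. The weights are chosen by hand for illustration and are not learned; the function names are hypothetical and do not come from any of the cited studies.

```python
def relu(x):
    """Rectified linear unit, a common nonlinear activation."""
    return x if x > 0 else 0.0


def two_layer_xor(x1, x2):
    """Tiny two-layer network with hand-picked (not learned) weights."""
    h1 = relu(x1 + x2)        # hidden unit 1: fires if any input is on
    h2 = relu(x1 + x2 - 1.0)  # hidden unit 2: fires only if both inputs are on
    return h1 - 2.0 * h2      # linear read-out of the hidden layer


# Print the output for each combination of binary inputs.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, two_layer_xor(a, b))
```

In a real DL system the weights are fitted to labelled examples rather than set by hand, and the networks contain many more layers and units.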

Figure 1: AI uses computers to mimic human intelligence

Figure 2: Automated aneurysm detection on time-of-flight MRI. Ueda et al. [44]

CNNs can be trained to classify an image based on its characteristics through the observation of different images. More specifically, DL can identify common features in different images and use them as a classification model. For example, DL can be trained to find common features in images with and without a given pathology in order to discriminate between the two. Consequently, it is possible to determine a specific diagnosis without human intervention, and there is therefore some potential to improve both the time efficiency and the productivity of radiologists. A strength of DL is that its learning is based on growing experience. As a result, DL has enormous potential because it could update its response models by collecting data from large databases such as the Internet or the Picture Archiving and Communication System (PACS). A limitation, however, is that algorithm performance depends largely on both the quantity and the quality of the data on which it is trained [4]. For example, an algorithm for tumour detection trained on a data set in which there is no occipital tumour is likely to have a higher error rate for tumours in that location. For a more complete description of DL, the reader is directed to the paper by Montagnon et al. [5].

Image acquisition and image quality improvement

Deep learning methods can be used to perform image reconstruction and improve image quality. AI can "learn" standard MR imaging reconstruction techniques, such as Cartesian and non-Cartesian acquisition schemes [6]. Additionally, deep learning methods could be applied to improve image quality. If low- and high-resolution images are available, a deep network can be used to improve the resolution [7]. This has already been applied to CT imaging to improve resolution in low-dose CT images [8]. Another approach to improving image quality is to use paired MR images of the same anatomy acquired at different magnetic field strengths [9].

AI is also able to reduce image acquisition times. This is especially useful in the case of DTI sequences, where the need for more angular directions extends the examination beyond what many patients can tolerate. A deep learning approach can reduce imaging duration by a factor of 12 by predicting final parameter maps (fractional anisotropy, mean diffusivity, and so on) from relatively few angular directions [10]. There are also studies in which DL has increased the signal-to-noise ratio in Arterial Spin-Labelling (ASL) sequences to improve image quality [11]. Finally, some applications of AI could improve resolution and signal-to-noise ratio, reducing the dose of contrast needed to provide diagnostic images [2].
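For readers unfamiliar with the DTI parameter maps mentioned above, the sketch below computes mean diffusivity and fractional anisotropy for a single voxel from the three eigenvalues of its diffusion tensor, using the conventional definitions. It is not the deep learning method of [10], which predicts such maps directly from a reduced acquisition; the function names and the example eigenvalues are illustrative assumptions.

```python
import math


def mean_diffusivity(l1, l2, l3):
    """Mean of the three diffusion tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0


def fractional_anisotropy(l1, l2, l3):
    """Conventional FA definition: 0 for isotropic, 1 for purely linear diffusion."""
    numerator = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0:
        return 0.0  # no diffusion signal at all; treat as isotropic
    return math.sqrt(0.5 * numerator / denominator)


# Hypothetical eigenvalues (mm^2/s) for one anisotropic voxel.
eigenvalues = (1.7e-3, 0.3e-3, 0.3e-3)
print("MD:", mean_diffusivity(*eigenvalues))
print("FA:", fractional_anisotropy(*eigenvalues))
```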

Clinical AI applications in neuroradiology/neuroimaging

Recently, 37 AI applications from 27 vendors, offering 111 functionalities, were reviewed in the domain of neuroradiology/neuroimaging [12]. These AI functionalities mostly support radiologists and extend their tasks. Interestingly, each of these AI applications is designed for just one pathology, such as ischemic stroke (35%), intracranial haemorrhage (27%), dementia (19%), multiple sclerosis (11%), or brain tumour (11%), to mention the most common [12]. In our review, we found miscellaneous clinical applications of AI in neuroradiology/neuroimaging, ranging from the detection and classification of anomalies on imaging to the prediction of outcomes, with disease quantification by estimating the volume of anatomical structures, the burden of lesions, and the volume of the tumour. In particular, regarding detection tools, primary emphasis has been placed on identifying urgent findings that enable worklist prioritization for abnormalities such as intracranial haemorrhage [13-19], acute infarction [20-23], large-vessel occlusion [24, 25], aneurysm [26-28], and traumatic brain injury [29-31] on non-contrast head CT.

Other AI detection tools exist for brain degenerative disease, epilepsy, oncology, degenerative spine disease (to assess the size of the spinal canal, facet joint alterations, disc herniations, the size of the conjugation foramina, and, in scoliosis, the Cobb angle), fracture detection (vertebral fractures such as compression fractures), and multiple sclerosis, where they are used to track disease burden over time and predict disease activity. In glioma, some DL algorithms were tested to predict glioma genomics [32-36].

Regarding segmentation tools, we found tools able to segment the vertebral discs, neuroforamina, and vertebral bodies in degenerative spine disease, brain tumour volume in the neuro-oncological field, and white and grey matter in degenerative brain diseases.

In the following paragraphs, we report the main diseases in which AI is useful, together with the most significant related studies.

Intracranial haemorrhage:

Intracranial haemorrhage detection has been widely studied as a potential clinical application of AI [37-39], able to work as an early warning system, raise diagnostic confidence [40], and classify haemorrhage types [41]. In particular, Kuo et al. developed a robust haemorrhage detection model with an area under the receiver operating characteristic curve (ROC-AUC) of 0.991 [14]. They trained a CNN on 4396 CT scans for concurrent classification and segmentation, using training data that were labelled pixel-wise by attending radiologists. Their test set consisted of 200 CT scans obtained at the same institution at a later time. They report a sensitivity of 96% and a specificity of 98%.
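Because sensitivity, specificity and ROC-AUC recur throughout this review, the following sketch shows how these metrics are conventionally computed for a binary classifier. The labels and scores are invented toy values, not data from any cited study, and the ROC-AUC is computed through its pairwise-ranking interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case).

```python
def sensitivity_specificity(labels, predictions):
    """Sensitivity and specificity from binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = haemorrhage, 0 = no haemorrhage (hypothetical values).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(sensitivity_specificity(labels, predictions))
print(roc_auc(labels, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas the ROC-AUC summarizes performance across all thresholds, which is one reason why studies reporting different metrics are hard to compare directly.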

Ker et al. developed a 3-Dimensional (3D) Convolutional Neural Network (CNN) model to detect 4 types of intracranial haemorrhage (subarachnoid haemorrhage, acute subdural haemorrhage, intraparenchymal hematoma, and brain polytrauma haemorrhage) [41]. They used a data set consisting of 399 locally acquired CT scans and experimented with data augmentation methods as well as various threshold levels (i.e., window levels) to achieve good results. They measured performance as a binary comparison between normal and one of the 4 haemorrhage types and achieved ROC-AUC values of 0.919 to 0.952.

The RSNA 2019 Brain CT Hemorrhage Challenge was another milestone, in which a data set of 25 312 brain CT scans was expert-annotated and made available to the public [42]. The scans were sourced from 3 institutions with different scanner hardware and acquisition protocols. The submitted models were evaluated using logarithmic loss, and top models achieved excellent results on this metric (0.04383 for the first-place model). However, it is difficult to directly compare this result to other studies that utilize the ROC-AUC of haemorrhage versus no haemorrhage as their performance metric.

Approved commercial software for haemorrhage detection now exists on the market and has been evaluated in clinical settings. Rao et al. evaluated Aidoc (version 1.3, Tel Aviv) as a double-reading tool for the prospective review of radiology reports [40]. They assessed 5585 non-contrast CT scans of the head at their institutions which had been reported as negative for haemorrhage and found 16 missed haemorrhages (0.2%), all of which were small. The software also flagged 12 false positives. The Aidoc software was also tested as a triage tool by Ginat on 2011 non-contrast CT head scans, where it produced both false-positive and false-negative findings of haemorrhage [13]. The study reports a sensitivity and specificity of 88.7% and 94.2% for haemorrhage detection, respectively. The author, however, described a benefit of false-positive flags for haemorrhage, as these studies sometimes contained other hyperattenuating pathologies. On the other hand, the author reports a drawback in flagging inpatient scans in which a haemorrhage is stable or even improving, which may unnecessarily prioritize nonurgent findings.
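The logarithmic loss used to rank the RSNA challenge submissions penalizes confident wrong probabilities much more heavily than uncertain ones, which is why it cannot be converted into a ROC-AUC. The sketch below implements the plain binary form of the metric as an illustration; the challenge's exact weighting across haemorrhage subtypes is not reproduced here, and the toy probabilities are invented.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Mean binary logarithmic loss (lower is better)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Toy labels: 1 = haemorrhage present, 0 = absent (hypothetical values).
labels = [1, 0, 1, 0]
confident_and_right = [0.95, 0.05, 0.90, 0.10]
hesitant = [0.60, 0.40, 0.55, 0.45]
confident_but_wrong_once = [0.95, 0.05, 0.02, 0.10]

for name, probs in [
    ("confident and right", confident_and_right),
    ("hesitant", hesitant),
    ("confident but wrong once", confident_but_wrong_once),
]:
    print(name, round(log_loss(labels, probs), 4))
```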

Aneurysm detection:

Detecting unruptured intracranial aneurysms has significant clinical importance, considering that they account for approximately 85% of non-traumatic subarachnoid haemorrhages and that their prevalence is estimated at approximately 3% [43]. MRI with Time-Of-Flight Angiography sequences (TOF-MRA) is the modality of choice for aneurysm screening, as it involves neither ionizing radiation nor intravenous contrast agents. Deep learning has been used to detect aneurysms on TOF-MRA. Published methods demonstrate high sensitivity but poor specificity, resulting in multiple false positives per case. Although this necessitates a close review of all aneurysms flagged by the software, the resulting models may nevertheless be useful as screening tools. Ueda et al. [44] tested a DL model on 521 scans from the same institution as well as 67 scans from an external data set. The DL model achieved 91% to 93% sensitivity with a rate of 7 false positives per scan. Despite the high false-positive rate, the authors found that this model helped them detect an additional 4.8% to 13% of aneurysms in the internal and external test data sets (Figure 2). In another study, Faron et al. trained and evaluated a CNN model on a data set of TOF-MRA scans of 85 patients. The CNN achieved 90% sensitivity with a rate of 6.1 false positives per case [45]. Similarly, Nakao et al. trained a CNN on a data set of 450 patients with TOF-MRA scans acquired on 3-Tesla magnets only [46]. They were able to achieve a better result, with 94% sensitivity and 2.9 false positives per case. Yang et al. used a different approach, in which they produced 3D reconstructions of the intracranial vascular tree using TOF-MRA images [47]. They annotated aneurysms on the 3D projections and used them to train several models. In this manner, they achieved good discrimination between healthy vessels and aneurysms.
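The aneurysm studies above report lesion-level sensitivity together with the number of false positives per case, rather than specificity. A minimal sketch of how these two figures can be derived from per-scan counts is given below; the tuple layout and the toy counts are assumptions made for illustration and do not correspond to any cited data set.

```python
def detection_summary(cases):
    """Summarize a lesion detector over a set of scans.

    Each case is a tuple (true_lesions, detected_true_lesions, false_alarms).
    Returns (lesion-level sensitivity, false positives per case).
    """
    total_true = sum(c[0] for c in cases)
    total_detected = sum(c[1] for c in cases)
    total_false_alarms = sum(c[2] for c in cases)
    sensitivity = total_detected / total_true if total_true else float("nan")
    false_positives_per_case = total_false_alarms / len(cases)
    return sensitivity, false_positives_per_case


# Hypothetical counts for four scans.
cases = [
    (1, 1, 6),  # one aneurysm, found, six false alarms
    (2, 2, 8),  # two aneurysms, both found
    (0, 0, 7),  # no aneurysm, seven false alarms
    (1, 0, 7),  # one aneurysm, missed
]
print(detection_summary(cases))
```

A detector of this kind can be highly sensitive and still require the reader to dismiss several flagged candidates on every scan, which matches the trade-off described in the studies above.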

Stroke:

In stroke imaging, 3 components are explored: Large Vessel Occlusion (LVO) detection (Figure 3), automated measurement of core infarct volume and the Alberta Stroke Program Early CT Score (ASPECTS) (Figure 4), and infarct prognostication (Figure 5).

Figure 3: Brainomix e-CTA tool demonstrating identification and localization of an LVO of the right MCA, collateral score and collateral vessel attenuation, and a heat map of the collateral deficit (orange)

Figure 4: A 74-year-old with a right-sided stroke due to MCA occlusion a) Two neuroradiologists assessed ASPECTS of 9 on the initial CT images b) RAPID ASPECTS software evaluation is 4 on the same initial CT images

Figure 5: AI mobile interface showing a left MCA territory infarction with a mismatch on perfusion CT [59]

The timely detection of stroke is critical in the treatment of brain ischemia, and in this context AI has shown the potential to reduce the time to diagnosis. In particular, there are some AI applications able to detect LVO. You et al. developed an LVO detection model using clinical and imaging data (non-contrast CT scans of the head) [48]. The AI detects the hyperdense Middle Cerebral Artery (MCA) sign, a finding suggestive of the presence of an MCA thrombus. The AI, based on the U-Net architecture, was trained and tested on a local data set of 300 patients. It achieved a sensitivity and specificity of 68.4% and 61.4%, respectively. On NCCT, an SVM algorithm detected the MCA dot sign in patients with acute stroke with high sensitivity (97.5%) [49]. A neural network that incorporated various demographic, imaging, and clinical variables in predicting LVO outperformed or equalled most other prehospital prediction scales, with an accuracy of 0.820 [50]. A CNN-based commercial software, Viz-AI-Algorithm v3.04, detected proximal LVO with an accuracy of 86%, a sensitivity of 90.1%, a specificity of 82.5%, an AUC of 86.3% (95% CI, 0.83-0.90; P ≤ .001), and an intraclass correlation coefficient (ICC) of 84.1% (95% CI, 0.81-0.86; P ≤ .001), and Viz-AI-Algorithm v4.1.2 was able to detect LVO with high sensitivity and specificity (82% and 94%, respectively) [51]. Unfortunately, no study has yet shown whether AI methods can accurately identify other potentially treatable lesions such as M2, intracranial ICA, and posterior circulation occlusions.

Establishing infarct volumes is important to triage patients for appropriate therapy. AI has been able to establish core infarct volumes on MRI sequences through automatic lesion segmentation [52-54]. One reported limitation was the reliance on FLAIR and T1 images, which do not fully account for the timing of stroke occurrence. Other limitations were a tendency to overestimate the volume of small infarcts and underestimate that of large infarcts compared with manual segmentation by expert radiologists, and difficulty in distinguishing old from new strokes [54]. Discrepancies in volumes were attributed to nondetectable early ischemic findings, partial volume averaging, and stroke mimics on CT [55].
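Once a lesion has been segmented, its volume follows directly from the number of voxels labelled as infarct and the physical voxel size. The sketch below shows this arithmetic on a tiny invented mask stored as nested lists; the function name, the mask and the voxel dimensions are illustrative assumptions, not the pipeline of any cited tool.

```python
def lesion_volume_ml(mask, voxel_size_mm):
    """Volume of a binary segmentation mask in millilitres.

    mask: nested lists indexed as [slice][row][column] with values 0 or 1.
    voxel_size_mm: (dx, dy, dz) voxel dimensions in millimetres.
    """
    n_voxels = sum(value for image in mask for row in image for value in row)
    dx, dy, dz = voxel_size_mm
    return n_voxels * dx * dy * dz / 1000.0  # 1 mL = 1000 mm^3


# Hypothetical 2-slice, 2x2 mask with three infarct voxels.
mask = [
    [[0, 1],
     [0, 1]],
    [[0, 0],
     [0, 1]],
]
print(lesion_volume_ml(mask, voxel_size_mm=(1.0, 1.0, 5.0)))
```

Because every mislabelled voxel changes the estimate by one voxel volume, thick slices amplify the effect of boundary errors, which is consistent with the partial volume effects noted above.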

ASPECTS is an important early predictor of infarct core for middle cerebral artery (MCA) territory ischemic strokes [56]. It assesses 10 regions within the MCA territory for early signs of ischemia, and the resulting score ranges from 0 to 10, where 10 indicates no early signs of ischemia and 0 indicates ischemic involvement of all 10 regions. The score is currently a key component in evaluating the appropriateness of offering endovascular thrombectomy. Several commercial AI applications perform automated ASPECTS evaluation, and they have been assessed in clinical settings. In particular, Goebel et al. compared Frontier ASPECTS Prototype (Siemens Healthcare GmbH) and e-ASPECTS (Brainomix) to 2 experienced radiologists and found that e-ASPECTS showed a better correlation with expert consensus [57]. Guberina et al. compared 3 neuroradiologists with e-ASPECTS and found that the neuroradiologists had a better correlation with infarct core, as judged on subsequent imaging, than the software [58]. Maegerlein et al. compared RAPID ASPECTS (iSchemaView) to 2 neuroradiologists and found that the software showed a higher correlation with expert consensus than each neuroradiologist individually [59]. An example output of RAPID ASPECTS is shown in Figure 4. Accuracy varies widely and depends on the software and the chosen ground truth. An interesting result suggested by one study, however, was that the RAPID software produced more consistent results than human readers when the image reconstruction algorithm was varied [60]. The interclass correlation coefficient between multiple reconstruction algorithms was 0.92 for RAPID, 0.81 to 0.84 for consultant radiologists, and 0.73 to 0.79 for radiology residents.
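The scoring rule described above (start from 10 and subtract one point for every MCA-territory region showing early ischemic change) can be written in a few lines. The sketch below uses the conventional region abbreviations (caudate, lentiform nucleus, internal capsule, insular ribbon and the cortical regions M1 to M6); the function name is hypothetical, and deciding which regions are affected is the difficult part that the commercial tools automate.

```python
ASPECTS_REGIONS = ("C", "L", "IC", "I", "M1", "M2", "M3", "M4", "M5", "M6")


def aspects_score(affected_regions):
    """Return 10 minus the number of distinct affected ASPECTS regions."""
    affected = set(affected_regions)
    unknown = affected - set(ASPECTS_REGIONS)
    if unknown:
        raise ValueError(f"Unknown ASPECTS region(s): {sorted(unknown)}")
    return len(ASPECTS_REGIONS) - len(affected)


print(aspects_score([]))                 # no early ischemic change
print(aspects_score(["I"]))              # insular ribbon only
print(aspects_score(ASPECTS_REGIONS))    # all ten regions involved
```

Because each region contributes a full point, a disagreement between a reader and the software on only a few regions is enough to produce clearly different scores for the same scan.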

Prognostication in stroke treatment is critical to identify the patients who are most likely to benefit from treatment, considering the related risks. For this reason, AI has been studied as a tool for predicting post-treatment outcomes. In this context, the authors of [23] developed a CNN model to predict post-treatment infarct core based on initial pre-treatment Magnetic Resonance Imaging (MRI). The authors used a locally acquired data set of 222 patients, 187 of whom were treated with tPA. The model was evaluated using a modified version of the ROC-AUC, where the true positive rate was set to the number of voxels correctly identified as positive, the true negative rate was set to the number of voxels correctly identified as negative, and so on. The reported modified ROC-AUC is 0.88. In another study, Nishi et al. developed a U-Net model to predict clinical post-treatment outcomes using pretreatment diffusion-weighted imaging in patients who underwent mechanical thrombectomy [61]. The clinical outcome was defined using the modified Rankin Scale (mRS) at 90 days after the stroke. The outcomes were categorized as “good” (mRS ≤ 2) and “poor” (mRS > 2). After training on a data set of 250 patients, the model was validated on a data set of 74 patients and found to have a ROC-AUC of 0.81.

Multiple sclerosis:

In multiple sclerosis, deep learning has been investigated as a tool to estimate disease burden and predict disease activity through MRI. MRI is used to assess disease burden over time, but this requires comparison with prior scans, which can be burdensome and error-prone when the number of lesions is large. Nair et al. evaluated a DL algorithm for MS lesion identification on a private multicenter data set of 1064 patients diagnosed with the relapsing-remitting variant, containing a total of 2684 MRI scans. In this study, DL performance was worst for small lesions [62]. The algorithm was tested on 10% of the data set, where it achieved a ROC-AUC of 0.80 on lesion detection. In another study, Wang et al. trained a CNN on 64 Magnetic Resonance (MR) scans to detect MS lesions; it achieved a sensitivity of 98.77% and a specificity of 98.76% for lesion detection [63].

Regarding the prediction of disease activity, Yoo et al. developed a CNN that combined imaging from a data set of 140 patients, who had onset of their first demyelinating symptoms within 180 days of their MR scan, with defined clinical measurements [64]. The CNN achieved a promising result, with a ROC-AUC of 0.746 for predicting progression to clinically definite MS.

Another application of AI in MS concerns efforts to reduce gadolinium use where possible, considering emerging evidence that repeated administration of gadolinium-based contrast agents leads to their deposition in the brain [65]. In particular, Narayana et al. [66] used DL to predict lesion enhancement based on lesion appearance on non-contrast sequences (pre-contrast T1-weighted imaging, T2-weighted imaging, and fluid-attenuated inversion recovery) (Figure 6). They used a data set of 1008 patients with 1970 MR scans acquired on magnets from 3 vendors. DL achieved a ROC-AUC of 0.82 on lesion enhancement prediction, suggesting that this approach may help reduce contrast use.

Figure 6: Examples of images input to the network (T2-weighted, fluid-attenuated inversion recovery and pre-contrast T1-weighted images). Post-contrast T1-weighted images demonstrating areas of true-positive (white arrow) and false-negative (black arrow) enhancement are shown for comparison [66]

Fracture detection:

Regarding fracture detection tools, we found that Tomita et al. tested a DL model to detect osteoporotic vertebral fractures in a data set of 1432 CT scans. The outcome was a binary classification of whether or not a fracture was present [67]. Using 80% of the data set for training, a ROC-AUC of 0.909 to 0.918 was achieved with an accuracy of 89.2%. This was found to be equivalent to the performance of radiologists on the same data set. In a similar study by Bar et al., a CNN was trained on a data set of 3701 CT scans of the chest and/or abdomen to detect vertebral compression fractures. The model was able to detect vertebral compression fractures with 89.1% accuracy, 83.9% sensitivity, and 93.8% specificity [68]. Furthermore, the authors of [69] used a data set of 12 742 dual-energy X-ray absorptiometry scans to train a binary classifier for the detection of vertebral compression fractures (Figure 7); 70% of the data set was used for training, which yielded a ROC-AUC of 0.94. The optimal threshold achieved a sensitivity of 87.4% and a specificity of 88.4%.

Figure 7: Images of a 77-year-old female patient evaluated for vertebral fracture. Heat map shows one severe vertebral compression fracture (upper arrow) and one mild fracture (lower arrow). Heat maps are unitless low-resolution images showing relative contributions of general areas in images to the prediction. The heat map has been overlaid on the original vertebral fracture assessment image (left side). Arrows denote the corresponding locations of vertebral fractures on the original images as presented to the convolutional neural network (right side) [69]

Brain tumour:

For brain tumours, there are AI applications for segmentation, which can be used as a stand-alone clinical tool, such as in contouring targets for radiotherapy, or as a preliminary step to extract tumours for downstream ML tasks such as diagnosis, pre-surgical planning, follow-up and tumour grading [16, 70-74]. A limitation of AI segmentation is that usually only a minority of voxels represent tumour while the majority represent healthy tissue. However, in a recently published study, Zhou et al. trained an AI model on a publicly available MRI data set of 542 glioma patients and were able to address this limitation [71]. Their results demonstrate excellent performance, with a Dice score of 0.90 for the whole tumour (entire tumour and white matter involvement) and 0.79 for tumour enhancement.
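The Dice score reported above measures the overlap between a predicted and a reference segmentation, and it is preferred to plain voxel accuracy precisely because tumour voxels are a small minority. The sketch below illustrates this on an invented flattened mask in which 5% of the voxels are tumour: a prediction that labels everything as healthy reaches 95% accuracy but a Dice score of 0. Function names and toy masks are illustrative assumptions.

```python
def dice_score(prediction, reference):
    """Dice coefficient between two binary masks given as flat sequences."""
    intersection = sum(1 for p, r in zip(prediction, reference) if p and r)
    size_sum = sum(1 for p in prediction if p) + sum(1 for r in reference if r)
    if size_sum == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * intersection / size_sum


def voxel_accuracy(prediction, reference):
    """Fraction of voxels labelled identically in both masks."""
    matches = sum(1 for p, r in zip(prediction, reference) if p == r)
    return matches / len(reference)


# Hypothetical flattened masks: 5 tumour voxels out of 100.
reference = [1] * 5 + [0] * 95
all_healthy = [0] * 100              # predicts no tumour anywhere
partial = [1] * 4 + [0] * 96         # finds 4 of the 5 tumour voxels

print(voxel_accuracy(all_healthy, reference), dice_score(all_healthy, reference))
print(voxel_accuracy(partial, reference), dice_score(partial, reference))
```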

Another AI clinical application for brain tumours is predicting glioma genomics, such as Isocitrate Dehydrogenase (IDH) mutations, which are important prognosticators [75, 76]. In 2019, multiple studies investigated the prediction of IDH mutation status from MRI. Zhao et al. published a meta-analysis of 9 studies totalling 996 patients. The largest data set used for training had 225 patients. These studies developed binary classification models and had a ROC-AUC of 0.89 (95% CI: 0.86-0.92). Pooled sensitivities and specificities were 87% (95% CI: 76-93) and 90% (95% CI: 72-97), respectively. Since then, another study was published by Choi et al. [34] using a larger MRI data set of 463 patients. It showed excellent results, with a ROC-AUC, sensitivity, and specificity of 0.95, 92.1%, and 91.5%, respectively. This model used a CNN both for segmentation and as a feature extractor to predict IDH mutation risk. Haubold et al. used 18F-fluoro-ethyl-tyrosine Positron Emission Tomography (PET) combined with MRI to predict multiple tumour genetic markers using a data set of 42 patients before biopsy or treatment [77]. They trained 2 different classical ML models and used biopsy results as the ground truth. They achieved a ROC-AUC of 0.851 for predicting the ATRX mutation, 0.757 for the MGMT mutation, 0.887 for the IDH1 mutation, and 0.978 for the 1p19q mutation.

Degenerative brain disease:

Regarding dementias, numerous AI networks have been trained on large longitudinal datasets such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI), resulting in many diagnostic DL tools for Alzheimer's Disease (AD), such as models using 18F Fluoro-Deoxy-Glucose (FDG) PET and structural MRI of the hippocampus to predict AD onset 1 to 6 years in advance [78]. Furthermore, AI may assist in the diagnosis of dementia types. AI can differentiate AD from Lewy body and Parkinson’s dementia [32]. Similarly, other AI tools can differentiate between Mild Cognitive Impairment (MCI) and AD [79]. In addition to diagnosis, AI can also probe neurobiology. New ML techniques such as Subtype and Stage Inference (SuStaIn) have provided novel data-driven classifications, based on neuroimaging and genotype, of diagnostic subtypes and progressive stages for AD and Fronto Temporal Dementia (FTD). SuStaIn has localized distinct regional hotspots for atrophy in different forms of familial FTD caused by mutations in genes [80].

In other studies, DL has integrated MRI, neurocognitive, and APOE genotype information to predict conversion from MCI to AD [81]. Combining several AI systems (including structural MRI and amyloid PET) may augment the diagnosis and management of the complex natural history of AD. In the future, the integration of AI tools for imaging with AI systems designed to examine serum amyloid markers, mortality prediction from clinicians’ progress notes and assessments of cognition, and postmortem immunohistochemistry images [82-85] may improve many facets of care in neurodegenerative disease. In Huntington's Disease, an autosomal dominant movement disorder, diagnosis and management may be enhanced by combining CAG repeat length data with CNNs developed for caudate volumetry and objective gait assessment [86]. Such multimodal approaches may improve risk stratification, progression monitoring, and clinical management in patients and families.

Epilepsy:

The use of AI in the diagnosis of epilepsy could improve diagnostic capabilities for this condition, as the symptoms are not specific and often overlap with those of other conditions [87, 88]. In particular, the integration of anamnestic, clinical, electroencephalographic and imaging information is fundamental for an accurate diagnosis and subtype differentiation [89]. Neuroimaging plays an important role in diagnosis, follow-up and prognosis. In particular, structural Magnetic Resonance Imaging (sMRI) can help identify cortical abnormalities (e.g. mesial temporal sclerosis, Focal Cortical Dysplasia [FCD], neoplasms, etc.), while functional Magnetic Resonance Imaging (fMRI), positron emission tomography and Magneto-Encephalo-Graphy (MEG) can help localize brain dysfunction.

Park et al. (2020) used an SVM classifier on bilateral hippocampi [90]. The model obtained an area under the receiver operating characteristic curve (AUC) of 0.85 and an accuracy of 85% in differentiating epileptic patients from healthy controls, better than human evaluators. Mesial sclerosis is often subtle or invisible on MRI. Such cases can lead to a misdiagnosis and consequently delay surgical treatment. Therefore, recent machine learning models have been proposed to identify MRI-negative patients and lateralize foci. For example, Mo et al. (2019) used an SVM classifier based on clinically empirical features, achieving 88% accuracy in detecting MRI-negative patients and an AUC of 0.96 in differentiating MRI-negative patients from controls [91]. The most important feature was the degree of blurring of the grey-white matter boundary at the temporal pole. Similarly, Beheshti et al. (2020) used an SVM to diagnose mesial sclerosis in epileptic patients and lateralize foci in a cohort of 42 MRI-negative, PET-positive patients [92]. Focusing on FLAIR, a simple and widely available sequence, the authors extracted signal intensity from a priori Regions Of Interest (ROIs). The model achieved 75% accuracy in differentiating patients with right- and left-sided epilepsy from controls. The best performance was obtained in identifying right-sided epilepsy, with an accuracy of 88% and an AUC of 0.84. The most important ROIs were the amygdala, the inferior, middle and superior temporal gyri, and the temporal pole.

However, analyzing only the temporal lobes may fail to reveal a more global pathology. For this reason, Sahebzamani et al. (2019), using unified segmentation and an SVM classifier, showed that whole-brain features are more diagnostic than hippocampal features alone (94% vs 82% accuracy) [93]. Global contrast and white matter homogeneity were found to be the most important features, along with clustering tendency and grey matter dissimilarity. The best performance was obtained based on the mean sum of the whole brain’s white matter.

Another possible use of AI on sMRI is to lateralize the temporal epileptic focus. In a study with an SVM, the combination of hippocampal, amygdalar, and thalamic volumes was more predictive of the laterality of the epileptic focus than any of them alone. The combined model achieved 100% accuracy in patients with mesial sclerosis (Mahmoudi et al., 2018) [94]. Furthermore, Gleichgerrcht et al. (2021) used SVM and deep learning models to diagnose and lateralize the temporal epileptic focus based on structural and diffusion-weighted MRI (dwMRI) ROI data [95]. The models achieved an accuracy of 68%-75% in diagnosis and 56%-73% in lateralization with diffusion data. Based on the sMRI data, ipsilateral hippocampal volumes were the most important for prediction performance. Based on the dwMRI data, ipsilateral fibre tracts had the highest predictive weight.

Machine learning techniques have also been used to diagnose cortical dysplasia, the most common cause of medically refractory epilepsy in children and the second most common cause in adults (Kabat and Król, 2012) [96]. In one study, Wang et al. (2020) trained a CNN to exploit the differences in texture and symmetry on the boundary between white matter and grey matter (Figure 8) [97]. The model achieved a diagnostic accuracy of 88%. Similarly, Jin et al. (2018) combined surface-based morphometry with a non-linear neural network model [98]. Based on six features of a 3D cortical reconstruction, the model achieved an AUC of 0.75. The intensity contrast between grey matter and white matter, local cortical deformation and cortical thickness were the most important factors for classification. Notably, the model worked well in three independent epilepsy centres. However, the authors reported limitations in data extraction related to variations in image quality between the clinical sites.

Figure 8: An example of FCD detection a) an axial slice with FCD lesion labelled in red b) patch extraction results c) classification results and numbers stand for probabilities of being FCD patches d) detection mapped onto the inflated cortical surface (shown in yellow) [97]

Another possibility is to combine features derived from sMRI with those derived from fMRI for a variety of clinical applications, such as diagnosing epilepsy and predicting the development of epilepsy following traumatic events. In one study, Zhou et al. (2020) found that the combination of fMRI and sMRI features was more useful than either modality alone in identifying epileptic patients [99]. In another study, Rocca et al. (2019) used random forest and SVM models to help predict the development of seizures following traumatic brain injuries [100]. The highest AUC of 0.73 was achieved with a random forest model using functional features. Therefore, additional studies that directly compare the additive utility of fMRI and sMRI, perhaps using framework models, may be useful.

In summary, a variety of machine learning approaches have been used for the automated analysis of sMRI data in epilepsy. Given the limited number of publicly available sMRI datasets, models tend to be trained on small single-centre cohorts. This, together with the lack of external validation, limits any conclusions about widespread clinical utility. Future work with larger, multicentre data sets is needed.

Functional MRI, a method that measures changes in blood flow to assess and map the magnitude and temporospatial characteristics of neural activity, is gaining popularity in the field of epilepsy [101]. A growing body of evidence suggests that epilepsy is likely characterized by complex and dynamic changes in the way neurons communicate (i.e., changes in neural networks and functional connectivity), both locally and globally. Several recent epilepsy studies have applied AI to fMRI. Mazrooyisebdani et al. (2020) used an SVM to diagnose temporal epilepsies based on functional connectivity characteristics derived from graph theory analysis [102]. The model achieved an accuracy of 81%. Similarly, Fallahi et al. (2020) constructed static and dynamic connectivity matrices from fMRI data to derive global graph measures [33]. Then, the most important features were selected using random forest and the classification was performed with an SVM. The use of dynamic features led to better accuracy than the use of static features (92% versus 88%) in the lateralization of temporal epilepsy. In another study, Hekmati et al. (2020) used fMRI data to quantify mutual information between different cortical regions and fed these quantities into a four-layer perceptron classifier [103]. The model achieved 89% accuracy in localizing seizure foci. Finally, Hwang et al. (2019) used LASSO feature selection to extract functional connectivity features, which were then used to train SVM, linear discriminant analysis, and naive Bayes classifiers to diagnose temporal epilepsy [104]. The best accuracy of 85% was achieved with the SVM model. Recent work by Bharat et al. (2019) with machine learning provided further evidence that epilepsy arises from impaired functional neural networks. Using resting-state fMRI (rs-fMRI), the researchers were able to identify connectivity networks specific to temporal epilepsy [105]. The model differentiated temporal epilepsy patients from healthy controls with 98% accuracy and 100% sensitivity. The networks were also found to be highly correlated with disease-specific clinical features and hippocampal atrophy. Although this evidence provides new proof of concept for the existence of specific epilepsy networks, future work is needed, as impaired functional activity can occur secondary to the effects of antiepileptic drugs or a variety of other confounding factors.
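The distinction between static and dynamic functional connectivity mentioned above can be illustrated with a short sketch. It is not the pipeline of any cited study: the regional time series are hypothetical toy values, connectivity is assumed to be the Pearson correlation between regions, the dynamic version uses a simple sliding window, and `mean_degree` stands in for the global graph measures that such studies derive.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def connectivity(series):
    """Static connectivity: one region-by-region correlation matrix."""
    n = len(series)
    return [[pearson(series[i], series[j]) for j in range(n)] for i in range(n)]


def dynamic_connectivity(series, window, step):
    """Dynamic connectivity: one correlation matrix per sliding window."""
    length = len(series[0])
    matrices = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in series]
        matrices.append(connectivity(chunk))
    return matrices


def mean_degree(matrix, threshold):
    """Average number of edges per node after thresholding |correlation|."""
    n = len(matrix)
    edges = sum(1 for i in range(n) for j in range(n)
                if i != j and abs(matrix[i][j]) >= threshold)
    return edges / n


# Hypothetical signals for three regions over eight time points.
region_a = [1, 2, 3, 4, 5, 6, 7, 8]
region_b = [2, 4, 6, 8, 10, 12, 14, 16]   # always coupled to region_a
region_c = [1, 2, 3, 4, 4, 3, 2, 1]       # coupling to region_a flips halfway
series = [region_a, region_b, region_c]

static = connectivity(series)
dynamic = dynamic_connectivity(series, window=4, step=4)
print("static a-c:", static[0][2])
print("windowed a-c:", [m[0][2] for m in dynamic])
print("mean degree:", mean_degree(static, threshold=0.5))
```

In this toy case the static correlation between regions a and c is zero, while the windowed estimates show strong positive coupling followed by strong negative coupling. This is one plausible reason why dynamic features can carry information that static features miss.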

In summary, fMRI-based machine learning can be used to identify complex alterations in functional neural networks in the epileptic brain and further exploit these differences for classification purposes. In many cases of epilepsy, structural and functional anomalies of the network probably coexist. Current machine learning models with fMRI are limited by small sample sizes, probably because there are few publicly available data sets. However, fMRI is increasingly being integrated into routine clinical practice, particularly for lateralization before surgery [106]. Recent studies have shown that fMRI may be specifically useful in pre-surgical lateralization [107]. With technological advances and further methodological refinements, fMRI could become the standard of care in epilepsy and AI will be increasingly used to assist in diagnostic and prognostic tasks.

Another field of AI application is Diffusion Tensor Imaging (DTI), which has advantages in detecting subtle structural abnormalities of epileptogenic foci. Machine learning applied to DTI can be used for classification, improving the diagnosis and treatment of epilepsy, particularly when used for pre-surgical planning and post-surgical outcome prediction [108].

Degenerative Spine Disease:

The number of MRI examinations has increased markedly because of the large number of patients suffering from degenerative spine disease [109]. Consequently, radiologists face a work overload, since they need to evaluate numerous parameters (size of the spinal canal, facet joints, disc herniations, size of the intervertebral foramina, etc.) at all spinal levels in a short time. Accordingly, DL and ML algorithms that can automatically classify spinal pathology may help to reduce patient waiting lists and examination costs. In this context, Jamaludin et al. evaluated an automatic disc disease classification system that yielded an accuracy of 96% compared to radiologist assessment. Notably, the main sources of limitation were either poor scan quality or the presence of transitional lumbosacral anatomy [109]. Furthermore, Chen et al. evaluated a DL tool to measure Cobb angles in spine radiographs for patients with scoliosis. They used a data set of 581 patients and were able to achieve a correlation coefficient of r = 0.903 to 0.945 between the DL-predicted angle and the ground truth [110].
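To make the Cobb angle and the reported correlation coefficient concrete, the sketch below computes both from scratch. It is only an illustration under stated assumptions: each endplate is represented by two hypothetical landmark points, the Cobb angle is taken as the angle between the two endplate lines folded into 0-90 degrees, and the predicted and reference angles are invented toy values, not data from the cited study.

```python
import math


def endplate_angle(p1, p2):
    """Inclination in degrees of the line through two endplate landmarks."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))


def cobb_angle(upper_endplate, lower_endplate):
    """Angle between two endplate lines, independent of landmark order."""
    diff = abs(endplate_angle(*upper_endplate) - endplate_angle(*lower_endplate))
    diff %= 180.0
    return min(diff, 180.0 - diff)


def pearson(x, y):
    """Pearson correlation coefficient r between two lists of measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical landmarks (x, y) in pixels for the two end vertebrae.
upper = ((0.0, 0.0), (1.0, 0.0))   # horizontal upper endplate
lower = ((0.0, 0.0), (1.0, 1.0))   # lower endplate tilted by 45 degrees
print("Cobb angle:", cobb_angle(upper, lower))

# Hypothetical model-predicted versus reference Cobb angles (degrees).
predicted = [10.0, 22.0, 35.0, 48.0]
reference = [12.0, 20.0, 36.0, 50.0]
print("r:", pearson(predicted, reference))
```

Note that a high correlation coefficient shows that predicted and reference angles vary together; it does not by itself rule out a systematic offset between them.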

Regarding spine segmentation, some models have been developed with good results. Huang et al. achieved intersection-over-union scores of 94.7% for vertebrae and 92.6% for disc segmentations on sagittal MR images using a training set of 50 subjects and a test set of 50 subjects [111]. Whitehead et al. trained a cascade of CNNs to segment spine MR scans using a data set of 42 patients for training and 20 patients for testing. They were able to achieve Dice scores of 0.832 for discs and 0.865 for vertebrae [112]. DL has also been used to answer research questions. For example, Gaonkar et al. used DL to look for potential correlations between the cross-sectional area of neural foramina and patient height and age, showing that the area of neural foramina is directly correlated with patient height and inversely correlated with age [35] (Figure 9).
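The two segmentation studies above report different overlap metrics (intersection over union and Dice), which are related but not interchangeable. The sketch below, using hypothetical flattened binary masks, shows both definitions. For a single pair of masks, IoU = Dice / (2 - Dice), so a Dice score of 0.832 would correspond to an IoU of roughly 0.71; for scores averaged over many cases this conversion is only approximate.

```python
def overlap_counts(pred, truth):
    """Return (intersection, predicted size, true size) for binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    return inter, sum(1 for p in pred if p), sum(1 for t in truth if t)


def dice(pred, truth):
    """Dice score: twice the overlap divided by the sum of both mask sizes."""
    inter, a, b = overlap_counts(pred, truth)
    return 2 * inter / (a + b) if (a + b) else 1.0


def iou(pred, truth):
    """Intersection over union (Jaccard index)."""
    inter, a, b = overlap_counts(pred, truth)
    union = a + b - inter
    return inter / union if union else 1.0


# Hypothetical flattened masks: 1 = voxel labelled as vertebra.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]

d, j = dice(pred, truth), iou(pred, truth)
print("Dice:", d)
print("IoU:", j)
print("IoU from Dice:", d / (2 - d))
```

Because Dice is always at least as large as IoU for the same masks, results reported with different metrics should be converted before being compared.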

Figure 9: A random sample of automated neural foraminal segmentations used for generating measurements. Original MRI Images (left) and overlaid computer-generated segmentation(right)

AI tools for ultrasound in neuroimaging

AI applications in neuroimaging and ultrasound (US) are mostly focused on the identification of anatomical structures such as nerves. A large number of algorithms have been shown to be able to segment US images for these aims. For example, Kim et al. developed a neural network that accurately and effectively segments the median nerve. To train the algorithm and evaluate the model, 1,305 images of the median nerve of 123 normal subjects were used [113]. However, the proposed neural network yielded more accurate results on the wrist images than on the forearm images, with precisions of 90.3% and 87.8%, respectively. Different studies showed that AI may help to automatically segment nerves and blood vessels to facilitate ultrasound-guided regional anaesthesia [114–116]. Automated medical image analysis can be trained to recognize the wide variety of appearances of the anatomical structures and could be used to enhance the interpretation of anatomy by facilitating target identification (e.g., peripheral nerves and fascial planes) [117, 118]. For example, a model has been developed for peripheral nerve block in the adductor canal. In this model, the sartorius and adductor longus muscles, as well as the femur, are first identified as landmarks. The optimal block site is chosen as the region where the medial borders of these two muscles align. The femoral artery is labelled as both a landmark and a safety structure. The saphenous nerve is labelled as a target. AI applications assist the operator in identifying the nerve and the correct target site for the block (Figure 10) [118-120].

Figure 10: Sono-anatomy of the adductor canal block a) Illustration showing a cross-section of the mid-thigh b) Enlarged illustration of the structures seen on ultrasound during performance of an adductor canal block c) Ultrasound view during adductor canal block d) Ultrasound view labelled by AnatomyGuide [121]

Conclusion

This review explores important recent advances in ML and DL within neuroradiology. There have been many published studies exploring AI applications in neuroradiology, and the trend is accelerating. AI applications may cover multiple fields of neuroimaging/neuroradiology diagnostics, such as image quality improvement, image interpretation, classification of disease, and communication of salient findings to patients and clinicians. The DL tools show an outstanding ability to execute specific tasks at a level that is often comparable to that of expert radiologists. In this context, AI may indeed have a role in enhancing radiologists’ performance through a symbiotic interaction that is most likely to be mutualistic. However, the existing AI tools in neuroradiology/neuroimaging have been trained for single tasks so far. This means that an algorithm trained to detect stroke would not be able to show similar accuracy in detecting and classifying brain tumours, and vice versa. This is a major limitation: since patients often suffer from multiple pathologies, a complete AI assessment that integrates all the different algorithms would be preferable. As far as we know, ML or DL models capable of simultaneously performing multiple interpretations have not yet been reported. We believe that this technological development may represent the key requirement to shift AI from an experimental tool to an indispensable application in clinical practice. AI algorithms for combined analysis of different pathologies should also ensure an optimized and efficient integration into the daily clinical routine. Furthermore, rigorous validation studies are still needed before these technological developments can enter clinical practice, especially for imaging modalities such as MRI and CT, for which the accuracy of DL models depends heavily on the type of scanner used and the protocol performed. In addition, AI techniques require the most rigorous validation of their reliability, also considering the legal liability that radiologists would bear for their use and results.

Nowadays, only specific DL applications have demonstrated accurate performance and may be integrated into the clinical workflow under the supervision of an expert radiologist. In particular, AI algorithms for intracranial haemorrhage, stroke, and vertebral compression fracture identification may be considered suitable for application in daily clinical routines.

Other tasks, such as glioma genomics identification, stroke prognostication, epilepsy foci identification and predicting clinically definite MS, have shown significant progress in the research domain and may represent upcoming clinical applications in the not-so-distant future.

However, the majority of AI algorithms still show a range of inaccuracies, for example in labelling anatomical structures, especially in the context of atypical or complex anatomy. Moreover, another challenge will be to ensure the presence of highly skilled practitioners, since machine learning systems are not guaranteed to outperform humans and should not be relied upon to replace the knowledge of doctors.

The ongoing development of DL in neuroradiology/neuroimaging will probably have a significant influence on the work of future radiologists and other specialists, who will need specific AI education, beginning during residency training, to understand in depth its mechanisms and potential pitfalls. Furthermore, knowledge of AI could be an opportunity to improve training in radiology and other specialities. AI can assign specific cases to trainees based on their training profile, promote consistency in the trainees' individual experiences, and, in the context of anaesthetic procedures, facilitate an easier understanding of anatomy.

Declarations

Funding

Current Research Funds 2023, Ministry of Health, Italy

Conflicts of interest/competing interests

The authors have no competing interests to declare relevant to this article’s content.

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

The authors have no financial or proprietary interests in any material discussed in this article.

Author Contributions

All authors whose names appear on the submission:

1. Made substantial contributions to the conception and design of the work; and to the acquisition, analysis, and interpretation of data;

2. Drafted the work and revised it critically for important intellectual content;

3. Approved the version to be published; and

4. Agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.