Machine learning models analyze routine clinical variables to determine the risk of delayed cerebral ischemia and other outcomes.
A recent study published in Neurology suggests that machine learning (ML) models used to predict delayed cerebral ischemia (DCI) and functional outcomes significantly outperformed standard models (SMs) in subarachnoid hemorrhage (SAH) care.
In the study, the area under the receiver operating characteristic curve (AUC) was 0.75 (standard deviation [SD], 0.07; 95% CI, 0.64–0.84) for ML model predictions of DCI, 0.85 (SD, 0.05; 95% CI, 0.75–0.92) for discharge outcome, and 0.89 (SD, 0.03; 95% CI, 0.81–0.94) for 3-month outcome. The ML model outperformed the SM in AUC by 0.20 (95% CI, −0.02 to 0.40) for DCI, by 0.07 (SD, 0.03; 95% CI, 0.0018–0.14) for discharge outcomes, and by 0.14 (95% CI, 0.03–0.24) for 3-month outcomes. Additionally, ML models matched physician performance in predicting 3-month outcomes.
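For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen patient who developed the outcome receives a higher model risk score than a randomly chosen patient who did not. A minimal, self-contained sketch of that calculation, using hypothetical labels and scores rather than any data from the study:

```python
# Minimal AUC (area under the ROC curve) computation for binary outcomes.
# Illustrative only; labels and scores below are hypothetical, not study data.

def auc(labels, scores):
    """AUC equals the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical outcome labels (1 = outcome occurred) and risk scores in [0, 1]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.22, 0.80, 0.65, 0.55, 0.70]
print(auc(labels, scores))  # 15 of 16 positive/negative pairs ranked correctly
```

A perfectly discriminating model scores 1.0 on this measure; a model no better than chance scores about 0.5.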
The study's principal author, Jude Savarraj, PhD, a research scientist in the Department of Neurosurgery at The University of Texas Medical School at Houston, and colleagues stated in their paper that “unlike EEG [electroencephalogram] and TCD [transcranial Doppler], the ML approach uses only routine clinical variables (which are already available as part of standard care); it avoids the need for expensive instrumentation (as in the case of EEG and TCD). The output of the ML model (which is simply a probabilistic score between 0–1), is easily translatable in most settings since unlike EEG and TCD expertise of trained technicians is not required.”
Of the 451 patients in the study, 64% (n = 290) were female, and the median age was 54 years (interquartile range [IQR], 45–63). Some participants had histories of hypertension (60%; n = 223), hyperlipidemia (17%; n = 60), or diabetes (12%; n = 44). The median Hunt-Hess (HH) scale score was 3 (IQR, 2–3) and the median modified Fisher scale (mFS) score was 3 (IQR, 3–3). Intraventricular hemorrhage (IVH) was present in 66% (n = 240) of participants on admission, and 21% (n = 88) of participants developed DCI. The median modified Rankin scale (mRS) score was 3 (IQR, 1–4) at discharge and 1 (IQR, 0–4) at 3 months.
Of the ML models, the artificial neural network (ANN) model performed best against the standard model (P = .08). At a determined optimal cutoff threshold, the ML model's assessments (sensitivity, 0.75; specificity, 0.87) were significantly better than the SM's (sensitivity, 0.58; specificity, 0.90; P < .05) when compared with the McNemar test. Because ANN models are difficult to interpret, gradient boost and random forest models were also used for analysis.
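The comparison above amounts to dichotomizing each model's risk score at a cutoff, computing sensitivity and specificity from the resulting confusion counts, and comparing the two models' paired classifications with McNemar's test on the discordant pairs. A minimal sketch of those calculations, with hypothetical labels, scores, and counts rather than the study's data:

```python
# Sensitivity/specificity at a decision threshold, plus the (continuity-
# corrected) McNemar chi-square statistic used to compare two classifiers
# on paired predictions. All numbers here are hypothetical, not study data.

def sens_spec(labels, scores, threshold):
    """Dichotomize scores at `threshold` and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def mcnemar_statistic(b, c):
    """Continuity-corrected McNemar chi-square from the discordant cells:
    b = cases model A got right and model B got wrong, c = the reverse."""
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical example: three true positives, three true negatives
sens, spec = sens_spec([1, 1, 1, 0, 0, 0], [0.9, 0.6, 0.3, 0.4, 0.2, 0.1], 0.5)
print(sens, spec)
print(mcnemar_statistic(10, 3))  # compare against a chi-square (1 df) table
```

The statistic is referred to a chi-square distribution with one degree of freedom to obtain the P value.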
Savarraj and colleagues used the ML models to rank 31 variables from the electronic medical record (EMR) by their relative importance to outcome, and found that the most important variables included sodium, white blood cells, and neutrophils. These variables have previously been shown to be associated with DCI and discharge outcomes.
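One generic way tree-based models such as random forests rank variables is permutation importance: shuffle one variable at a time and measure how much the model's accuracy drops. This is a sketch of that general idea only (the study's exact ranking method may differ), using a hypothetical toy model and synthetic data rather than the 31 EMR variables:

```python
import random

# Permutation-importance sketch for ranking predictor variables.
# The threshold "model" and synthetic data are hypothetical stand-ins,
# not the study's gradient boost / random forest models or EMR data.

random.seed(0)

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features):
    base = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        random.shuffle(column)  # break the link between variable j and outcome
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(base - accuracy(model, shuffled, labels))
    return importances

# Toy data: the label depends only on the first variable, so shuffling it
# should hurt accuracy while shuffling the irrelevant second variable should not.
rows = [[random.random(), random.random()] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

def model(r):
    return 1 if r[0] > 0.5 else 0

importances = permutation_importance(model, rows, labels, 2)
print(importances)  # first value positive, second exactly 0.0
```

Variables whose shuffling causes the largest accuracy drop are ranked as most important.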
ML models performed best when they combined the clinician-determined HH score with standard EMR variables (AUC, 0.85; SD, 0.05; 95% CI, 0.75–0.92) rather than using EMR data alone (AUC, 0.81; SD, 0.05; 95% CI, 0.71–0.89; P < .05).
Savarraj and colleagues also found that, within clinical teams, the attending physician (sensitivity, 0.88; specificity, 0.95) outperformed the nurse (sensitivity, 0.86; specificity, 0.85) and the fellow (sensitivity, 0.81; specificity, 0.75) in predicting outcomes. Compared with the physician's predictions, the ML model performed slightly, though not significantly, worse (sensitivity, 0.91; specificity, 0.64; P > .05).
The authors also noted that “ML can offer unique perspective on the patient’s condition and can serve as a decision support tool in the management of SAH. However, clinical judgement is necessary to interpret the ML results and implement a corresponding plan of action.”