An automated deep neural network, SpikeNet, performed at or above the accuracy, sensitivity, and specificity of fellowship-trained clinical experts in identifying interictal epileptiform discharges.
M. Brandon Westover, MD, PhD, associate professor of neurology, Massachusetts General Hospital
New study findings suggest that an automated deep neural network may be capable of performing at or above the accuracy, sensitivity, and specificity of fellowship-trained clinical experts in identifying interictal epileptiform discharges (IEDs) on electroencephalogram (EEG).1
Study author M. Brandon Westover, MD, PhD, associate professor of neurology, Massachusetts General Hospital, and colleagues conducted a diagnostic study of IED detection, in which they trained a neural network, dubbed SpikeNet, to read EEGs using 9571 scalp recordings. SpikeNet surpassed both an industry-standard commercial IED detector and clinical experts.
“This computer program appeared to be able to classify electroencephalograms and detect individual interictal epileptiform discharges more accurately than human experts and may help with diagnostic testing for epilepsy and warn of clinical decline in critically ill patients, particularly in settings without available electroencephalogram expertise,” Westover and colleagues wrote.
The statistical performance of SpikeNet was assessed by measuring calibration error and area under the receiver operating characteristic curve (AUC), using 10-fold cross-validation. SpikeNet had a calibration error of 0.041 (95% CI, 0.033-0.049; P <.05), compared with 0.066 (95% CI, 0.060-0.078) for the industry-standard detector and a mean of 0.183 (range, 0.081-0.364) for the experts. Expert calibration varied: 2 were over-callers (above diagonal) and 2 were under-callers (below diagonal).
Additionally, SpikeNet's AUC for binary classification was 0.980 (95% CI, 0.977-0.984), compared with 0.882 (95% CI, 0.872-0.893; P <.05) for the industry-standard detector.
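For readers less familiar with these two metrics, the following is a minimal sketch of how AUC and a calibration error can be computed from a detector's output probabilities. It is not the study's code: the function names, the toy data, and the binned "expected calibration error" definition are assumptions chosen for illustration, and the paper may define calibration error differently.

```python
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_error(labels: Sequence[int], scores: Sequence[float],
                      n_bins: int = 10) -> float:
    """Assumed definition: weighted mean gap between the observed positive
    rate and the mean predicted probability over equal-width bins."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((y, s))
    total = len(labels)
    err = 0.0
    for b in bins:
        if not b:
            continue
        observed = sum(y for y, _ in b) / len(b)
        predicted = sum(s for _, s in b) / len(b)
        err += (len(b) / total) * abs(observed - predicted)
    return err


if __name__ == "__main__":
    # Hypothetical toy data: 1 = IED present, 0 = absent.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print("AUC:", auc(toy_labels, toy_scores))
    print("Calibration error:", calibration_error(toy_labels, toy_scores, n_bins=2))
```

On this toy data the AUC is 0.75, because 3 of the 4 positive-negative pairs are ranked correctly. A lower calibration error means the predicted probabilities track the observed IED rate more closely.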
When classifying whole EEGs as either containing IEDs or not, SpikeNet achieved a calibration error of 0.126 (range, 0.109-0.1444) compared with the experts' mean of 0.197 (range, 0.099-0.372), and an AUC of 0.847 (95% CI, 0.830-0.865).
“This may be the first time an algorithm has been shown to exceed expert performance for IED detection in a representative sample of EEGs and may thus be a valuable tool for expedited review of EEGs,” Westover and coinvestigators wrote.
In a companion study2 from the same group, which sought to determine how reliably subspecialty-trained clinical neurophysiologists detect IEDs in routine EEGs, the findings suggest that experts can do so with substantial reliability. The authors also suggest that disagreements about IEDs can be “largely explained by various experts applying different thresholds to a common underlying statistical model.”
These data may add to the significance of SpikeNet’s success in identifying IEDs. Expert interrater reliability (IRR) for 13,262 individually annotated IED candidates was deemed fair, with a mean percent agreement (PA) of 72.4% (95% CI, 67.0-77.8) and a mean beyond-chance agreement (κ) of 48.7% (95% CI, 37.3-60.1). The EEG-wise IRR was substantial, with a mean PA of 80.9% (95% CI, 76.2-85.7) and a mean κ of 69.4% (95% CI, 60.3-78.5).
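To make the difference between percent agreement and κ concrete, here is a minimal sketch for two raters making binary IED calls. The function names and toy ratings are hypothetical, and Cohen's κ for a single pair of raters is used as an assumed stand-in; the study reports means across experts and may have computed them differently.

```python
from typing import Sequence


def percent_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of items on which two raters give the same binary call."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    rate_a = sum(a) / n  # how often rater A calls "IED"
    rate_b = sum(b) / n  # how often rater B calls "IED"
    # Chance agreement if both raters voted independently at their own rates.
    p_e = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical calls on 10 candidate waveforms (1 = IED, 0 = not IED).
    rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
    rater_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
    print("PA:", percent_agreement(rater_a, rater_b))
    print("kappa:", cohens_kappa(rater_a, rater_b))
```

In this toy example the raters agree on 8 of 10 items (PA of 80%), but κ is only about 0.58, because roughly half of that agreement would be expected by chance alone. This is why κ values in the text sit well below the corresponding PA values.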
As noted above, a statistical model built on waveform morphological features explained the median binary scores of all experts with high accuracy, at 80% (range, 73-88), when provided with individualized thresholds.
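The idea of experts applying different thresholds to a common underlying model can be illustrated with a small sketch. It assumes a single shared score per waveform and picks, for each expert, the cutoff that best reproduces that expert's binary calls. The function name, scores, and votes are hypothetical and do not reproduce the study's model.

```python
from typing import Sequence, Tuple


def best_threshold(scores: Sequence[float],
                   votes: Sequence[int]) -> Tuple[float, float]:
    """Return the cutoff on the shared score that best matches one expert's
    binary votes, together with the accuracy achieved at that cutoff."""
    if len(scores) != len(votes) or not scores:
        raise ValueError("scores and votes must be non-empty and of equal length")
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(scores)):  # only observed scores can change the calls
        matches = sum((s >= t) == bool(v) for s, v in zip(scores, votes))
        acc = matches / len(scores)
        if acc > best_acc:  # keep the lowest threshold among ties
            best_t, best_acc = t, acc
    return best_t, best_acc


if __name__ == "__main__":
    # Hypothetical shared model scores for five candidate waveforms.
    shared_scores = [0.1, 0.3, 0.5, 0.7, 0.9]
    over_caller = [0, 1, 1, 1, 1]   # marks IEDs liberally
    under_caller = [0, 0, 0, 1, 1]  # marks IEDs conservatively
    print("over-caller:", best_threshold(shared_scores, over_caller))
    print("under-caller:", best_threshold(shared_scores, under_caller))
```

Here the same scores explain both experts perfectly once each is given their own cutoff (0.3 for the over-caller, 0.7 for the under-caller), even though the two experts disagree on 2 of the 5 waveforms.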
“Future work should address how automated IED detection can be integrated into practice. Initial work will likely focus on systems that amplify clinical neurophysiologists' capabilities,” Westover et al noted with regard to the SpikeNet findings. “Subsequent work should focus on removing the need for clinicians to be involved in low-level, pattern-recognition aspects of EEG interpretation, such as IED detection.”
REFERENCES
1. Jing J, Sun H, Kim JA, et al. Development of Expert-Level Automated Detection of Epileptiform Discharges During Electroencephalogram Interpretation. JAMA Neurol. Published online October 21, 2019. doi: 10.1001/jamaneurol.2019.3485.
2. Jing J, Herlopian A, Karakis I, et al. Interrater Reliability of Experts in Identifying Interictal Epileptiform Discharges in Electroencephalograms. JAMA Neurol. Published online October 21, 2019. doi: 10.1001/jamaneurol.2019.3531.