The web-implemented algorithm misclassified and misdiagnosed seizures in only 16.8% of patients—lower than misdiagnosis rates previously reported in the literature.
A group of investigators has developed a pragmatic, web-based algorithm that accurately classified epileptic seizure types, or combinations of seizure types, in a recent validation study. The hope is that it will improve the diagnosis and management of epilepsy in resource-poor settings and, ultimately, clinical outcomes.1
Senior author Michael Sperling, MD, director of the Jefferson Comprehensive Epilepsy Center, and colleagues constructed the algorithm using a modified Delphi method, classifying patients into 9 seizure type profiles. It was implemented in a web-based application (Epipick.org) and validated in a prospective, multicenter fashion among 262 patients, 217 of whom had epilepsy. Age at seizure onset ranged from 10 to 82 years (median, 21 years).
At the conclusion of the trial, investigators found that the algorithm correctly classified 83.2% (95% CI, 78.7-87.8) of cases. The coefficient of agreement between the index test and the reference standard was .82 (95% CI, .77-.87), which qualifies as almost perfect agreement according to the Landis and Koch criteria. After reanalysis for low-income countries where MRI may not be accessible, the percent agreement was 74.4% (95% CI, 69.1-79.7) and the agreement coefficient was .72 (95% CI, .67-.78), both corresponding to substantial agreement.
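For readers less familiar with these metrics, the following sketch illustrates how percent agreement and a chance-corrected agreement coefficient are computed, and how the Landis and Koch verbal bands are applied. This is illustrative only and uses Cohen's kappa as a generic stand-in; the study reported its own agreement coefficient, and the toy data below are not from the study.

```python
# Illustrative only: percent agreement and a chance-corrected coefficient
# (Cohen's kappa) for paired classifications, plus Landis & Koch bands.
from collections import Counter

def percent_agreement(index, reference):
    """Fraction of cases where the index test matches the reference standard."""
    matches = sum(a == b for a, b in zip(index, reference))
    return matches / len(reference)

def cohens_kappa(index, reference):
    """Chance-corrected agreement: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(reference)
    po = percent_agreement(index, reference)
    idx_counts = Counter(index)
    ref_counts = Counter(reference)
    # Expected agreement by chance, from each rater's marginal frequencies
    pe = sum(idx_counts[c] * ref_counts[c]
             for c in set(index) | set(reference)) / n ** 2
    return (po - pe) / (1 - pe)

def landis_koch(k):
    """Landis & Koch verbal interpretation for kappa-type coefficients."""
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    return next(label for cutoff, label in bands if k <= cutoff)

# Toy data (hypothetical): 9 of 10 classifications agree
reference = ["focal"] * 5 + ["generalized"] * 5
index = ["focal"] * 4 + ["generalized"] * 6

print(percent_agreement(index, reference))            # 0.9
print(round(cohens_kappa(index, reference), 2))       # 0.8
print(landis_koch(cohens_kappa(index, reference)))    # substantial
```

Note that a chance-corrected coefficient is always lower than raw percent agreement, which is why the study reports both figures alongside each other.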
"Because the algorithm is ultimately aimed at facilitating ASM [antiseizure medication] selection, only seizure types or combinations of seizure types that were considered relevant for medical management were included, an approach that differs from phenomenological and syndromic classifications that are clinically important but not particularly relevant for drug selection," Sperling et al wrote. "Although identifying specific seizure combinations is the first step toward syndromic classification in patients, classifying an epilepsy syndrome (with implications regarding etiology, natural history, and prognosis) is more complex and beyond the scope of this algorithm."
The Delphi process began with experts asking a set of questions to identify seizure types. Questions covered seizure semiology as well as the results of neuroimaging investigations. Because electroencephalography (EEG) requires interpretive expertise in each setting and overreading of EEG is a common cause of misdiagnosis in epilepsy, the investigators did not include it in the diagnostic algorithm.
In the subset of patients with epilepsy, the percent agreement was 87.2% (95% CI, 82.7%-91.6%), with an agreement coefficient of .86 (95% CI, .82-.91), corresponding to almost perfect agreement on seizure classification. Similarly, those with focal seizures (n = 157) and generalized seizures (n = 51) had agreement of 93% (95% CI, 87.8%-96.4%) and 80.4% (95% CI, 75.3%-93.5%), respectively.
There were 44 patients who had nonepileptic paroxysmal events, including psychogenic nonepileptic seizures (n = 22), syncope (n = 20), migraine (n = 1), and a drug-induced paroxysmal event (n = 1). In this subgroup, the percent agreement was 63.6% (95% CI, 48.8%-78.4%), with an agreement coefficient of .59 (95% CI, .41-.78), corresponding to high-moderate, beyond-chance agreement. According to the study authors, however, the number of patients in this analysis was too low to draw conclusions.
The algorithm misclassified and misdiagnosed seizures in 16.8% (44 of 262) of the patients, 28 of whom had epilepsy and 16 of whom had nonepileptic paroxysmal events. Notably, in 22 patients (8.4%), the misclassification/misdiagnosis would have negatively affected ASM selection. By comparison, the existing literature suggests that the misdiagnosis rate for those with epilepsy ranges anywhere from 20% to 30%,2,3 to as high as 71%.4
A total of 32 health care professionals from 14 countries evaluated the algorithm in the setting of medically underserved areas and reported that it provided useful output and was feasible for clinical implementation (median Likert score, 6.5).