How Big Data Can Drive Advancements in MS Research

February 21, 2021
Nicola Davies, PhD

NeurologyLive, February 2021, Volume 4, Issue 1

Big data is poised to revolutionize multiple sclerosis knowledge and treatment.

About 2.8 million individuals worldwide, including 1 million in the United States, currently live with multiple sclerosis (MS), the leading cause of nontraumatic neurological disability.1 Investigators have been striving for decades to find more effective ways to diagnose and manage MS, but the “gold standard” of evidence-based care—use of randomized controlled trials—has not been as beneficial for this disease as it has for others. “MS is a condition that has a great degree of variability across people and across time,” said Aaron Boster, MD, president of the Boster Center for Multiple Sclerosis in Columbus, Ohio. As such, he explained to NeurologyLive®, it can be difficult to draw evidence from the relatively small number of patients recruited for clinical trials and standard observational studies, as they do not accurately reflect the status of the disease in the real world.

Could big data provide a solution to this problem?

Big data refers, essentially, to a large volume of data that can be either structured or unstructured. In the medical context, these data are obtained from a variety of sources: medical records, hospital records, or data generated from wearable devices and health care apps. Over the past few years, efforts in the field have focused on creating repositories of information by establishing online registries for MS.

“Such volumes of information are not normally accessible to the average neurophysician, who, over the course of 20 to 30 years, usually ends up seeing only a couple of thousand patients with MS,” Boster said.

When analyzed statistically, big data can be used to spot patterns and generate evidence that is applicable to the entire MS population. “Big datasets essentially help us understand health. The greater the number of data points involved, the greater the learning,” explained Ava Battles, MPsychSc, chief executive officer of Multiple Sclerosis Ireland.

Big Data’s Potential in MS Management

Multiple factors play a role in the severity of MS, a disease that is not completely understood. Its exact cause remains unclear as well, and boundaries remain extremely blurred between the 2 distinct disease forms: relapsing-remitting MS (RRMS) and secondary progressive MS (SPMS). These gaps in understanding can hinder disease management.

MS registries can collect patient details and synthesize them into information that can help improve understanding of the disease process. For example, Boster cites the most accepted definition of SPMS, which was created by identifying 576 versions of the definition and comparing them with a cohort of patients from MSBase, an international database that contains information on patients with MS throughout the world.2 The definition that performed the best in terms of timely diagnosis, disease activity, and overall disease burden was selected from the data. Using the Expanded Disability Status Scale (EDSS), SPMS is defined by: a disability progression of 1 step in patients with EDSS 5.5 or less, and of 0.5 step in patients with EDSS 6 or higher in the absence of a relapse; a minimum EDSS score of 4; a pyramidal functional system (FS) score of 2; and confirmed progression over 3 months, including confirmation within the leading FS.
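The criteria listed above can be read as a simple rule check. The following is a minimal sketch only, not the MSBase investigators' implementation: the `Assessment` fields and function names are hypothetical, and the sketch assumes that the pyramidal FS score of 2 is a minimum and that "over 3 months" means at least 3 months of confirmation.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One follow-up assessment compared with an earlier baseline visit (hypothetical structure)."""
    baseline_edss: float        # EDSS score at the earlier reference visit
    current_edss: float         # EDSS score at this visit
    pyramidal_fs: int           # pyramidal functional system (FS) score at this visit
    during_relapse: bool        # True if the worsening coincides with a relapse
    months_confirmed: float     # how long the worsening has been sustained
    leading_fs_confirmed: bool  # True if progression is also confirmed within the leading FS


def required_step(baseline_edss: float) -> float:
    """Progression threshold: 1 step at EDSS 5.5 or less, 0.5 step at EDSS 6 or higher."""
    return 1.0 if baseline_edss <= 5.5 else 0.5


def meets_spms_definition(a: Assessment) -> bool:
    """Return True only if every criterion in the definition is satisfied."""
    progressed = (a.current_edss - a.baseline_edss) >= required_step(a.baseline_edss)
    return (
        progressed
        and not a.during_relapse
        and a.current_edss >= 4.0
        and a.pyramidal_fs >= 2      # assumption: the score of 2 is treated as a minimum
        and a.months_confirmed >= 3  # assumption: "over 3 months" means at least 3 months
        and a.leading_fs_confirmed
    )
```

For example, a patient moving from EDSS 4.0 to 5.0 outside a relapse, with a pyramidal FS score of 2 and progression confirmed for 3 months within the leading FS, would satisfy this reading of the definition; the same change recorded during a relapse would not.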

Big data may also help in the identification of various factors that can predict disease progression. “Currently, the standard for monitoring patients with MS is through clinical assessments and radiographic monitoring using MRI scans. This needs to be done yearly to look for the formation of new lesions,” Gauruv Bose, MD, clinical research fellow in MS at the Brigham Multiple Sclerosis Center in Boston, Massachusetts, explained to NeurologyLive®.

However, meeting this standard may not be feasible in locations where access to care is limited. MRI scans, in particular, can be expensive and are not always covered by insurance. Another use for these datasets is to identify cost-effective prognostic markers to help predict disease progression in different individuals. For instance, results of a longitudinal observational study published in 2020 explored risk factors for conversion of RRMS to SPMS by utilizing datasets from more than 15,000 patients.3 The investigators found that older age, a longer duration of illness, higher disability scores, and more relapses in the previous year could predict progression to SPMS. Conversely, other factors that were previously thought to be associated with progression, including oligoclonal bands in cerebrospinal fluid and evidence of spinal cord lesions, were found to not be associated with progression.

Another recent observational study attempted to identify early clinical markers for aggressive MS by collating data from MSBase datasets. After screening almost 10,000 patients whose data were recorded over 10 years, the investigators pinpointed specific markers: older age at symptom onset, and higher disability scores and the occurrence of pyramidal signs during the first year of MS. Based on these data, clinicians encountering these features in a patient may choose to be more aggressive in their early intervention.4

Patients with MS often have comorbidities, which may or may not influence MS progression, and big data can help clarify the various effects of these comorbidities. For instance, an observational study identified that patients with psychiatric comorbidities such as mood or anxiety disorders were more likely to have increased neurological disability.5 Another recent cohort study retrieved data from MSBase and explored the association between pregnancy and clinically isolated syndrome (ie, the first episode of neurological symptoms in MS). Preliminary findings from this study suggest that pregnancy can forestall the onset of MS.6

The detection of increased risk of certain comorbidities in patients with MS may offer another role for these large datasets. A matched cohort study from 2 different databases showed that patients diagnosed with MS have a 2-fold increase in rates of venous thromboembolism and peripheral vascular disease when compared with patients without MS. Additionally, the incidence of myocardial infarction in female patients with MS was 2.5 times higher than in those without MS.7

Increasingly, decision-making is a process shared between physician and patient. Patients cannot make informed treatment choices unless they are aware of the risks and prognosis of their condition. Particularly in young patients who are newly diagnosed with MS, the risks may not be immediately apparent. “There is a significant disease lag in MS,” Boster explained. “The brain damage that occurs in a 30-year-old does not cause clinical concern at that time because of neuroplasticity and a large neurological function reserve. Although there might be temporary loss of function, this is usually regained, but the brain damage is permanent. Over the years, increasing bouts of brain damage will eventually lead to accelerated volume loss and progression of disability.” Study results corroborate that MS risk knowledge can be relatively low among patients,8 confirming the importance of educating patients on self-care. Results gleaned from large datasets of patients with MS can be valuable in this endeavor.

One such tool that can be used to educate patients is the MS Severity Rank Calculator, developed by investigators at MSBase by plotting the EDSS scores of a consolidated large volume of patients over time. The calculator can be used to categorize individual patients into a specific decile. Patients can be shown where they rank on the graph, which is a powerful indicator of the severity of their disease status. “Showing patients their own natural history over the course of time is extremely valuable in terms of treatment adherence,” said Boster.
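The idea of placing a patient in a decile can be illustrated with a short sketch. This is not the MS Severity Rank Calculator itself: the function name and the reference scores are hypothetical, and the sketch assumes the comparison group is a list of EDSS scores from patients at a similar point in their disease.

```python
from bisect import bisect_right


def severity_decile(patient_edss: float, reference_edss: list[float]) -> int:
    """Return the decile (1 to 10) of a patient's EDSS score within a reference cohort."""
    if not reference_edss:
        raise ValueError("reference cohort is empty")
    ordered = sorted(reference_edss)
    n = len(ordered)
    # Number of reference patients with a score at or below the patient's score.
    rank = bisect_right(ordered, patient_edss)
    # Ceiling of rank * 10 / n, done in integers to avoid floating-point rounding.
    decile = (rank * 10 + n - 1) // n
    # A score below the entire cohort still belongs to the lowest decile.
    return max(1, decile)


# Hypothetical reference cohort of 10 EDSS scores, for illustration only.
reference = [0.0, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 7.0]
print(severity_decile(2.0, reference))
```

A higher decile here means the patient's disability score is higher than that of more of the comparison group, which is the kind of ranking a clinician could show on a graph.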

Randomized trial results may not always represent the real-world efficacy of a drug, but for MS agents, the differences may be even starker. Patients in randomized trials are usually treatment-naïve and therefore have less potential for interactions that could impact the efficacy of a drug.9 Assessment of the real-world efficacy of a given MS medication can be done on a much larger scale through big data than is possible with randomized trials. For example, research from registry datasets shows that ocrelizumab is more strongly associated with respiratory infections than are other drugs for MS, a finding that was not apparent in clinical trials.10

“As opposed to almost no treatments 20 years ago, we now have 24 FDA-approved therapies to treat MS. But we still don’t know the best way of using them all,” said Boster. This situation is due in large part to the drugs’ diversity. “While some drugs have similar mechanisms of action, many have unique pharmacological properties, which in turn creates differences in efficacy, as well as different [adverse] effect risks, routes of administration, and monitoring requirements,” added Bose.

Consequently, patients aren’t always prescribed the medication that is best for them. “Not all countries allow patients to have the medication their neurologist recommends, without first ‘failing’ another choice, one that is often cheaper and relatively safer,” said Bose. “If clinicians could use evidence derived from big data to characterize and prognosticate their patients’ condition accurately, policymakers could be persuaded to [pay for] the right medication for higher-risk patients when it is first prescribed.”

All of these factors—correctly diagnosing and prognosticating the patient, educating patients about the importance of adhering to treatment, and choosing the right drug for treatment—contribute to making informed health care decisions. “With appropriate data, a clinician can take an individual’s physiology and lifestyle into account, and they can potentially provide a detailed, holistic treatment plan for a person living with MS,” said Battles.

Making informed treatment choices also includes timing treatments correctly. Although disease-modifying therapy is often only initiated after obvious neurological disability, such as walking issues, information from big data has shown that cognitive decline—a relatively less visible parameter—can often signal the beginning of MS.11 This is an emerging advance for care, in that it offers parameters to assess cognitive function earlier in the disease process.

Platforms Dedicated to Collating MS Data

While data are obtainable from various sources, the key sources of these datasets thus far for MS research have been disease registries. Unlike patient and hospital records, which capture data passively, disease registries actively encourage data entry and the sharing of information, and while relatively small, they have been growing. Several European MS registries have more than 500,000 total patients in their databases, potentially covering a large swath of the estimated 700,000 people with MS on the continent. In addition, attempts have been made to harmonize the data between registries for additional insight, with the EUReMS and BMSD projects currently analyzing the results of this harmonization for datasets from more than 60,000 and 100,000 patients, respectively.12 North America has been catching up slowly, and now several registries capture information from patients on this continent as well.

Four of the key big data sources for MS are as follows:

  • MSBase. An international registry, MSBase is open to neurophysicians worldwide. Patient data can be shared on this registry with appropriate consent, and these data can be used by investigators to track and evaluate outcomes. “The MSBase registry has captured visits from sites across many countries, and several excellent analyses have been done using this and similar datasets,” said Bose. MSBase currently has records from approximately 74,000 patients in 38 countries.
  • Atlas of MS. Supported by the Multiple Sclerosis International Federation, this database actively collects epidemiological information on MS worldwide. In addition to prevalence across various patient demographics, the database also offers information on the availability and accessibility of health care resources.
  • The North American Registry for Care and Research in MS. This database links MS centers across the United States and Canada. A collaborative registry, it has repositories of clinical, genetic, and radiographic data of more than 800 patients with MS from 24 institutions.
  • COViMS. This new registry, whose name is an acronym for COVID-19 Infections in MS & Related Diseases, was formed in the wake of the coronavirus disease 2019 (COVID-19) pandemic. It was established by the Consortium of Multiple Sclerosis Centers and the National Multiple Sclerosis Society to help investigate how COVID-19 affects the MS population, and it already contains data from about 900 patients with MS who have had COVID-19.

Challenges in Using Big Data for MS Research

The most common fear shared by both physicians and patients regarding these collections of information is breach of privacy. “Sharing your most private, most intimate information about your health has fear attached to it,” said Boster. “First is the fear that your data might fall into the wrong hands or be used inappropriately, and second, that datasets might actually limit your options for treatment.” Clinicians can mitigate patient fear by being open about how big data is used. Indeed, Battles stressed, “People consenting to their data being shared should be informed of who will have access to the data and what [they] will be used for. They should also know how to have their data removed, should they wish to do so at any time.”

Another key challenge is the inherent bias that these datasets may have. “There may be differences in the examination and rating tools used, although efforts are underway to standardize these,” said Bose. He added that incomplete or missing data can also create a bias. For instance, not all countries have the resources to allow frequent MRI examinations. Certain countries may not contribute to databases at all, creating a geographical bias. The Atlas of MS database records data from 138 countries, but it has excluded at least 80 countries where coordinators could not be identified. The world’s largest MS database, MSBase, has data from only 38 countries.

Despite these challenges, most patients are inclined to share their experiences and health data on global registries. “We see that patients are willing to fight back and do what it takes to contribute toward the understanding of the disease, whether this is participating in clinical trials or sharing their health information,” said Boster.

The Future of Big Data and Advances in MS

The wealth of insights that big data has the potential to offer is exciting, experts agree. According to Bose, one avenue of future research could be developing analytical tools that can be applied to big data to create and refine specific treatment algorithms. Another focus should be on consolidating information across different databases, said Boster. “Increasing efforts at consolidating databases is essential so that we can have larger, robust, homogenous datasets to work from,” he noted.

Boster also expressed hope that consolidated big data can be applied to studying the prodromal stage of MS. “We know from the literature that for almost 5 years leading up to an MS diagnosis, patients have certain abnormal patterns, such as increased emergency [department] visits, pain, or psychiatric complaints. If we are ever to a priori identify the trappings of an MS prodrome, it is going to be through mining big data for clues and tips that lead us there,” he said. Identifying patients in the preclinical stages could certainly revolutionize the management of the disease, as the next logical step would be to identify treatment strategies that can stave off the clinical course altogether.

1. King R. Atlas of MS, 3rd Edition. Multiple Sclerosis International Federation; 2020. Accessed January 9, 2021.
2. Lorscheider J, Buzzard K, Jokubaitis V, et al; MSBase Study Group. Defining secondary progressive multiple sclerosis. Brain. 2016;139(Pt 9):2395-2405. doi:10.1093/brain/aww173
3. Fambiatos A, Jokubaitis V, Horakova D, et al. Risk of secondary progressive multiple sclerosis: a longitudinal study. Mult Scler J. 2020;26(1):79-90. doi:10.1177/1352458519868990
4. Malpas CB, Manouchehrinia A, Sharmin S, et al. Early clinical markers of aggressive multiple sclerosis. Brain. 2020;143(5):1400-1413. doi:10.1093/brain/awaa081
5. McKay KA, Tremlett H, Fisk JD, et al; CIHR Team in the Epidemiology and Impact of Comorbidity on Multiple Sclerosis. Psychiatric comorbidity is associated with disability progression in multiple sclerosis. Neurology. 2018;90(15):e1316-e1323. doi:10.1212/WNL.0000000000005302
6. Nguyen A, Vodehnalova K, Kalincik T, et al. Association of pregnancy with the onset of clinically isolated syndrome. JAMA Neurol. Published online September 14, 2020. doi:10.1001/jamaneurol.2020.3324
7. Persson R, Lee S, Ulcickas Yood M, et al. Incident cardiovascular disease in patients diagnosed with multiple sclerosis: a multi-database study. Mult Scler Relat Disord. 2020;37:101423. doi:10.1016/j.msard.2019.101423
8. Giordano A, Liethmann K, Köpke S, et al; AutoMS Group. Risk knowledge of people with relapsing-remitting multiple sclerosis – results of an international survey. PLoS One. 2018;13(11):e0208004. doi:10.1371/journal.pone.0208004
9. Rojas JI, Pappolla A, Patrucco L, Cristiano E, Sánchez F. Do clinical trials for new disease modifying treatments include real world patients with multiple sclerosis? Mult Scler Relat Disord. Published online January 3, 2020. doi:10.1016/j.msard.2020.101931
10. Smoot KE, Stuchiner T, Lucas L, et al. Utilization, safety, and tolerability of ocrelizumab: year 2 data from the Providence Ocrelizumab Registry. Presented at: 35th Congress of the European Committee for Treatment and Research in Multiple Sclerosis; September 11-13, 2019; Stockholm, Sweden. Accessed January 14, 2021. stockholm/278213/ html?f=listing%3D3%2Abrowseby%3D8%2Asortby%3D2%2Amedia%3D3%2Asearch%3Docrelizumab
11. Cerqueira JJ, Compston DAS, Geraldes R, et al. Time matters in multiple sclerosis: can early treatment and long-term follow-up ensure everyone benefits from the latest advances in multiple sclerosis? J Neurol Neurosurg Psychiatry. 2018;89(8):844-850. doi:10.1136/jnnp-2017-317509
12. Glaser A, Stahmann A, Meissner T, et al. Multiple sclerosis registries in Europe—an updated mapping survey. Mult Scler Relat Disord. 2019;27:171-178. doi:10.1016/j.msard.2018.09.032