Tackling Rare Diseases with an Arsenal of Real-World Data

Technological innovations have fostered the age of big data in healthcare, and the huge populations amassed make it possible to mine for key subpopulations, narrowing down to ever more precise groups of likely responders to therapy, high-risk patients and other high-value subphenotypes.

But what of the populations that are already, by definition, extremely small? The role of big data in genomic and proteomic exploration, in phenotypic correlations and in drug discovery is obvious, but what about on the clinical and healthcare outcomes side?

Historically, rare disease patient registries have been developed by various stakeholders at the local, national or international level to characterize natural history, establish eligible patient pools for clinical trials, and identify and track meaningful clinical endpoints. With more than 7,000 rare diseases identified, many of them chronic and complex, the data collection requirements and duration of follow-up have often made these registries cumbersome and difficult to maintain effectively over the long term.

So how do we use the new tools in our arsenal to tackle the challenges that remain in rare disease?

  • Decreasing time to diagnosis. For many rare conditions, the path to an accurate diagnosis is long and filled with specialist visits, testing, misdiagnoses and, in many cases, delays in proper management. Using the totality of data available for patients to power artificial intelligence (AI) applications allows us to go beyond a priori assumptions, standard decision trees and algorithms and truly aid clinicians.
  • Identifying under-appreciated variation in patient phenotypes, such as recognizing female carriers of hemophilia A via patterns of bleeding disorder manifestations and heterozygous female Fabry patients via potential symptom clusters.
  • Tracking changes in our understanding of the disease over time. When treatments are developed that substantially alter the course of the disease, changes may occur in the previously accepted natural history of the disease.
  • Increasing the efficiency of genomic testing and other detection modalities by generating AI-driven and other advanced analytic models that identify at-risk patient populations for further evaluation.
  • Informing a changing paradigm for risk management. Novel approaches, such as gene therapy, raise questions about long-term outcomes, including the durability of therapeutic effect, tracking “known unknowns” and the desire, in some cases, to follow patients for years or even decades, if not a lifetime. Particularly where the disease is either cured or substantially controlled by treatment, ascertaining long-term patient outcomes can be difficult, so the ability to assess outcomes over a broad catchment area, and across healthcare systems, is critical.

Access to high-quality, timely big data, including clinical data from electronic medical records, reliable death information and patient-generated data, will be critical to tracking and improving rare disease outcomes. As real-world data are harnessed to refine drug development for an expanding list of conditions, it will be exciting to see that we can, in parallel, improve the path to diagnosis and track meaningful improvement in the long-term outcomes that are important to patients and their families.


Kathryn Starzyk is an epidemiologist and currently the Senior Director of Real World Evidence (RWE) at OM1. Previously, she worked in biopharma in rare disease clinical development and risk management and later was Senior Director of Epidemiology and Outcomes Research at Outcome and Quintiles.