A Magnet for the Haystack: Using AI to Find Rare Disease Patients

Correctly identifying a patient with a rare disease can be like looking for a needle in a haystack: by definition, each rare disease affects fewer than 200,000 people out of a US population of more than 300 million. Even worse, many rare diseases hide from diagnosis by presenting with variable, highly nonspecific symptoms – almost like needles painted to look like the hay surrounding them. Identifying these patients often requires a brute-force approach: searching manually for a diagnosis by painstakingly eliminating the alternatives. Reaching a correct rare disease diagnosis can take years from the first symptoms, with countless tests, referrals, and ineffective treatments along the way. Fabry disease, for example, presents heterogeneously and changes over time; patients can experience gastrointestinal, neurological, cardiac, or renal symptoms, among others. In many cases, patients endure progressively worsening symptoms for years on their journey towards diagnosis and treatment.

But what if we could apply a magnet to the haystack, taking advantage of ‘hidden’ properties that – even without being visible – can be used to pull the true needles to the surface? This is exactly the role that artificial intelligence (AI) solutions can play in identifying rare disease patients. Powered by rich, broad, longitudinal data, AI can identify subtle patterns, signals, and themes in patients’ health histories at scale, and use them to flag at-risk patients whom other methods may miss. AI is particularly useful for patient identification in several key areas:

  • Mapping trajectories: In many rare diseases, patients experience similar journeys on their way to eventual diagnosis. They might start out reporting some symptoms to their primary care physician, for example, and receive initial treatment that does not result in sustained improvement. Those steps could be followed by referrals to specialists, specific sequences of lab tests or imaging studies, and additional referrals, (mis)diagnoses, and treatments leading up to the eventual rare disease diagnosis. AI can characterize these trajectories – using all available data in patient histories – and assess when a patient is ‘most of the way’ down a path likely leading to a particular result. Communicated correctly, that information can provide a ‘shortcut’ to that result and condense the patient’s diagnostic journey considerably. 
  • Highlighting heterogeneity: If there are multiple paths to diagnosis of the same condition, and those paths are present in available data, AI can find them. While some rare diseases like cystic fibrosis present very similarly across most affected patients, many other conditions vary considerably in severity and how they impact the patient’s life. X-linked genetic diseases, for example, can present very differently in men than in women – and presentation within groups of female patients can vary as well due to random X-inactivation. These differences make it even more challenging to consistently identify individual patients quickly based on their specific symptoms alone. AI’s scale and sensitivity to pattern complexity can turn this heterogeneity into an asset – by constructing a richer set of disease parameters, AI captures symptom diversity and uses it to label higher-risk patients more effectively.
  • Exploring disease: One of the most exciting benefits of applying AI to rare disease patient identification is that it generates evidence about patient characteristics independently. Without being ‘told’ anything about what is known about a condition (like its symptoms, progression, or telltale signs), AI can often recover much of this information from the raw data itself. This is an amazing effect to observe, and agreement between the AI system’s findings, clinical experience, and scientific expertise lends plausibility and credibility to AI outputs. In these cases, there is often an additional benefit to the AI approach: the associations, signals, and patterns it highlights range from those known to everyone familiar with the disease to hypotheses at the frontier of understanding of the condition. These hypotheses are not definitive conclusions, but they can provide promising starting points for additional investigation.
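
To make the ‘mapping trajectories’ idea above more concrete, here is a deliberately simplified Python sketch. It is an illustration only, not a description of any real system: the event labels, the hand-written PATHWAY, and the 0.8 threshold are all hypothetical assumptions. A real AI approach would learn trajectories from large volumes of patient data rather than rely on a fixed list. The sketch simply measures how far, in order, a patient's history has progressed along a known diagnostic path and flags patients who are ‘most of the way’ there.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    day: date
    kind: str  # hypothetical event label, e.g. "pcp_visit"


# Hypothetical, hand-written pathway for illustration only.
# It is NOT a clinical definition of any disease.
PATHWAY = [
    "pcp_visit",
    "initial_treatment",
    "specialist_referral",
    "lab_panel",
    "second_referral",
]


def pathway_progress(history, pathway=PATHWAY):
    """Return the fraction of pathway steps matched, in order, by a history."""
    step = 0
    for event in sorted(history, key=lambda e: e.day):
        # Advance only when the next expected step appears; other events are ignored.
        if step < len(pathway) and event.kind == pathway[step]:
            step += 1
    return step / len(pathway)


def flag_for_review(histories, threshold=0.8):
    """Return IDs of patients whose progress meets the (assumed) threshold."""
    return sorted(
        pid for pid, history in histories.items()
        if pathway_progress(history) >= threshold
    )


# Toy, made-up histories keyed by hypothetical patient IDs.
histories = {
    "patient_a": [
        Event(date(2021, 1, 5), "pcp_visit"),
        Event(date(2021, 2, 1), "initial_treatment"),
        Event(date(2021, 6, 9), "specialist_referral"),
        Event(date(2021, 7, 2), "imaging"),  # not on the pathway, ignored
        Event(date(2021, 9, 15), "lab_panel"),
    ],
    "patient_b": [
        Event(date(2022, 3, 1), "pcp_visit"),
        Event(date(2022, 3, 20), "initial_treatment"),
    ],
}

flagged = flag_for_review(histories)
```

In this toy data, patient_a has completed four of the five steps in order and is flagged, while patient_b has completed only two and is not. Ordered matching is used here because the sequence of events, not just their presence, is what defines a trajectory; a flag of this kind would be a prompt for clinical review, not a diagnosis.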

AI approaches can capture, characterize, and combine subtle signals in patient histories. These tools can help rapidly process large amounts of health history data and identify small patient subsets at elevated risk of undiagnosed disease, while also contributing to understanding of the rare underlying conditions. Just like a magnet drawing hidden needles to the haystack’s surface, AI can help surface patients suffering from diseases they don’t yet know they have – and get the right treatments to the right patients more quickly than ever before.