Chief Strategist for Global Healthcare, Life Sciences, and Insurance
In a climate of growing consumer distrust of companies that have access to personal information, organizations that generate real world evidence must join financial services institutions and the government in ensuring that the sensitive data in their charge is well governed, selectively accessible, and highly secure.
Real world evidence (RWE) is here to stay, and it brings with it a world of possibility and of vulnerability. As value-based reimbursement slowly displaces fee for service, payers and providers are under mounting pressure to deliver better patient outcomes at lower cost. This, in turn, increases the burden on life sciences companies not only to demonstrate the effectiveness and safety of their products to the FDA through the traditional clinical trial process, but also to show evidence of each product’s overall value to patient health and total cost of care once it is out in the real world. RWE can play a crucial role in fulfilling the requirements of each of these stakeholders, with life sciences companies having an additional interest in leveraging RWE to expedite market access for their products.
Historically, available evidence on pharmaceutical and medical device products has come from tightly controlled clinical trials and has been delivered to the FDA by the product’s sponsor. However, with increasing insight into the limitations of clinical trials in predicting a product’s safety and efficacy in wider, unstudied populations, the FDA has come to recognize that an abundance of product data is routinely collected via doctor visits, insurance claims, device registries, wearable technology, and even social media channels. Although this real world data (RWD) may not have the same data quality and bias controls as data collected in clinical trials, high-quality RWD sources can dramatically enhance understanding of both the benefits and adverse effects of a product over its lifecycle, in everyday use across larger demographics.
“We need to close the evidence gap between the information we use to make FDA’s decisions and the evidence increasingly used by the medical community, by payers, and by others charged with making healthcare decisions”1
– Scott Gottlieb, FDA Commissioner
Further paving the way for real world evidence initiatives, the 21st Century Cures Act authorized $6.3 billion in funding to expedite the process by which new drugs and devices reach approval. The bill incentivizes life sciences companies to invest in data strategies that supplement traditional clinical and pragmatic trials and that could support FDA approval of new indications for previously approved drugs.
AVOIDING RISKS IN REAL WORLD EVIDENCE DEVELOPMENT
But, for all of its clear advantages and widespread support, the process of turning data from varied sources—with varying degrees of quality and lineage—into real world evidence also opens the door to a multitude of data quality, governance, and security issues that must be addressed in order to protect the integrity of the evidence produced.
DATA GOVERNANCE AND LINEAGE
In the wake of the data privacy fiasco in which a political firm, Cambridge Analytica, collected the data of 87 million unsuspecting Facebook users, my colleague David Gorbet authored an article on Entrepreneur.com, Don’t Let a Data Debacle Like Facebook’s Happen at Your Company, outlining key lessons that every organization should learn from a debacle that has wrecked consumer confidence in the social media giant.
Data governance challenges should likewise weigh heavily on the minds of RWE generators responsible for safeguarding mass quantities of potentially sensitive data. To start with, repurposing routine medical data for additional analyses often requires data cleaning and cross-referencing. These techniques can confirm the data’s internal consistency and identify missing values, but they cannot determine its accuracy and authenticity. How do you ensure accountability in your real world data when the data is sourced from many different silos, each with its own notions of data governance? Auditing data from traditional clinical research or pragmatic trials against its source documents (that is, checking external consistency) is an essential additional step in verifying the accuracy and completeness of the data. This type of verification is important for real world data intended for use in regulatory analyses. And speaking of regulations, another significant challenge in working with real world data is posed by regulations regarding patient privacy and confidentiality.
HIPAA requires healthcare entities and their business associates to protect the privacy of patient information. The sharing of healthcare data between departments and organizations is often frustrated by the need to protect this data. To work within these regulations, data scientists are forced to wrangle the data to meet requirements for de-identification. Attempts to “de-identify” real world data in this manner can obfuscate it to the point that it loses its value. And in certain situations, as with the strict data residency and sovereignty laws of some European countries, researchers are altogether prohibited from moving or sharing any patient data.
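The trade-off described above can be illustrated with a minimal Python sketch. It is loosely modeled on the kind of generalization associated with HIPAA's Safe Harbor approach (dropping direct identifiers, keeping only the year of a date, truncating ZIP codes, bucketing ages above 89), but it is an illustration only and not a compliant implementation. The field names and the `deidentify` helper are hypothetical.

```python
from datetime import date

# Hypothetical field names; not taken from any real schema.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone"}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` with direct identifiers dropped and
    quasi-identifiers generalized. Illustrative only."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep only the year of any date-valued field.
    for key, value in list(out.items()):
        if isinstance(value, date):
            out[key] = value.year
    # Truncate the ZIP code to its first three digits.
    if "zip" in out:
        out["zip"] = str(out["zip"])[:3]
    # Group ages above 89 into a single bucket.
    if "age" in out and out["age"] > 89:
        out["age"] = "90+"
    return out


patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "19104",
    "age": 93,
    "admitted": date(2018, 3, 14),
    "discharged": date(2018, 3, 19),
    "diagnosis": "I10",
}
clean = deidentify(patient)
```

After this step, `clean` keeps the diagnosis code but only the year of admission and discharge, so an analysis that needs length of stay can no longer be computed from it. That is the loss of value the paragraph above refers to.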
In many traditional database systems, data must undergo a number of separate transformation processes before it can be loaded into the database. These processes can lead to a data quality disaster when a user sees a value from the database but is not clear where it came from. Was the value part of the original record? Or was it added in transformation Step 2B? It can quickly become impossible to make heads or tails of the lineage, and therefore the reliability, of the data.
Tracing data lineage becomes much easier when data is loaded as is, with all its original metadata and schema, and transformations are performed against a canonical version of the data while the original remains unchanged, rather than in multiple steps before the data is ingested into the database. In this way, data scientists can see both the original data and the transformed data, making the enrichment and harmonization process much more transparent.1
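As a rough illustration of the load-as-is approach, the Python sketch below wraps each record in an envelope that keeps the untouched original next to a canonical copy and logs every transformation step. The `Envelope` class, the helper functions, and the sample fields are all hypothetical; this is a sketch of the idea, not the API of any particular database.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Envelope:
    source: str      # where the record came from
    original: dict   # loaded as is, never modified afterwards
    canonical: dict = field(default_factory=dict)  # harmonized working copy
    lineage: list = field(default_factory=list)    # one entry per applied step


def ingest(source: str, raw: dict) -> Envelope:
    """Load the record unchanged and start the canonical copy from it."""
    return Envelope(source, copy.deepcopy(raw), copy.deepcopy(raw))


def apply_step(env: Envelope, step: str, key: str, fn) -> None:
    """Transform one canonical field and record what changed."""
    before = env.canonical.get(key)
    after = fn(before)
    env.canonical[key] = after
    env.lineage.append(
        {"step": step, "field": key, "before": before, "after": after}
    )


def steps_touching(env: Envelope, key: str) -> list:
    """Names of the steps that set `key`; empty means it is original data."""
    return [entry["step"] for entry in env.lineage if entry["field"] == key]


env = ingest("claims_feed", {"sex": "f", "weight_lb": 150})
apply_step(env, "normalize_sex", "sex", lambda v: v.upper())
apply_step(
    env,
    "add_weight_kg",
    "weight_kg",
    lambda _: round(env.original["weight_lb"] * 0.45359237, 1),
)
```

With this layout, the question "was the value part of the original record, or was it added in a transformation step?" has a direct answer: `steps_touching(env, "weight_kg")` names the step that created the field, while `steps_touching(env, "weight_lb")` returns an empty list because the value was never altered.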
MASTER DATA MANAGEMENT
For RWE, a consistent and accurate view of real world data is critical for hypothesis testing and analysis. However, when data is spread across disparate silos, it is difficult and costly to ascertain what the data means, whether it’s correct, how it relates to other data, and whether or not it’s secure. The main objective of master data management (MDM) is to provide a single, unified source of truth that resolves this ambiguity and prevents users from working with overlapping, and possibly invalid or inconsistent, versions of data. But MDM can prove a challenging undertaking. The projects are commonly large and costly and can take years to execute, if they succeed at all (and most don’t).
Without MDM, data silos and duplicate copies of data make it nearly impossible to secure and track the use of your real world data—increasing the possibility of breaches and exposure of sensitive information. Federation and warehousing with relational technology have proven too costly and brittle to solve these master data problems at scale. To protect their data, organizations need a better approach to MDM.
ENSURING DATA SECURITY
At this very moment, hackers are tirelessly searching and probing for vulnerabilities in the networks of organizations across the healthcare ecosystem. Their objective is a simple one: to acquire the protected health information (PHI) of patients and consumers, which, according to Becker’s, is 10-20 times more valuable than credit card information. HIPAA regulates PHI to protect the privacy of individuals, and its theft can jeopardize consumer privacy, reputation, and identity, as well as patient trust in the organization that failed to keep their private data safe. Even with worldwide security spending expected to reach $170 billion by 2020, cyber attacks, data breaches, and the unsanctioned use of consumer data will continue to occur at an alarming rate.
Hospital networks and biotech, pharmaceutical, and medical device manufacturers are all vulnerable, and the assets at risk include:
- Data generated by clinical and pragmatic trials, whose exposure threatens both patient privacy and the competitive objectives of the company
- Organizational intelligence and trade secrets related to the discovery and production of complex drugs
- Drug pricing and promotional information, which is especially important in the face of budgetary pressures to ensure better patient outcomes while reducing healthcare costs2
As a former Deputy Chief of Economic and Cyber Crime at the Philadelphia District Attorney’s Office, and Special Assistant U.S. Attorney, I’ve been investigating, prosecuting and thinking about data security for 20 years. With the increase in inappropriate data sharing, and cyber attacks growing in sophistication, it’s clear to me that more vigilant methods must be employed to protect and govern the use of real world data assets.
PROTECT YOUR DATA AND YOUR REPUTATION
Traditional databases and traditional data security thinking focused on building walls around the network to protect the data from outsiders. They lacked granular access controls and too often offered only all-or-none data access. This traditional cyber methodology is no longer sufficient to protect against modern threats to data security. Today’s RWE-generating organizations need the ability to clearly define who can see what data at the most granular level. Exposure of PHI, even to valid application users, may violate privacy regulations and put patient and consumer data at unnecessary risk. Controlling data access and security at the database level addresses these risks more effectively and helps prevent exploitation of application or network flaws that may disclose sensitive data.
With security features that allow for policy-based management of how data in the database is protected and exported, an Enterprise NoSQL database platform can enable organizations to safeguard patient data with capabilities such as:
- Advanced encryption that protects data from hackers and insider threats using standards-based cryptography, advanced key management, and granular separation of duties.
- Redaction features that prevent leaks by removing or masking sensitive specifics before data is exposed.
- Element-level security that allows specific information to be hidden from particular users, providing an even more granular level of security enforced in the database.
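To make the last two capabilities concrete, here is a minimal Python sketch of element-level filtering and redaction. It is a conceptual illustration, not the API of MarkLogic or any other product: the policy table, role names, and functions are hypothetical, and a real platform would enforce these rules inside the database rather than in application code.

```python
import re

# Hypothetical policy: which roles may read each protected element.
ELEMENT_POLICY = {
    "ssn": {"compliance"},
    "diagnosis": {"clinician", "compliance"},
}

# Matches strings shaped like a U.S. Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def visible_view(record: dict, roles: set) -> dict:
    """Element-level filtering: omit elements the caller's roles may not read."""
    return {
        key: value
        for key, value in record.items()
        if key not in ELEMENT_POLICY or ELEMENT_POLICY[key] & roles
    }


def redact_for_export(record: dict) -> dict:
    """Redaction: mask SSN-shaped substrings in string values before export."""
    return {
        key: SSN_PATTERN.sub("###-##-####", value) if isinstance(value, str) else value
        for key, value in record.items()
    }


record = {
    "patient_id": "p-001",
    "ssn": "000-12-3456",
    "diagnosis": "I10",
    "note": "SSN 000-12-3456 verified at intake",
}
analyst_view = visible_view(record, {"analyst"})
clinician_view = visible_view(record, {"clinician"})
exported = redact_for_export(clinician_view)
```

In this sketch the analyst sees neither the `ssn` nor the `diagnosis` element, and the clinician sees the diagnosis but not the SSN. The free-text note still contains the SSN in both views, which is why redaction is applied as a separate step before anything is exported.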
In an environment of increasing cyber threat and sliding-scale ethics around the use of personal data by some companies entrusted with it, trust is at an all-time premium for organizations engaged in RWE generation. Developing real world evidence on an agile technology platform grounded in enterprise reliability and government-grade security can reduce time to insight while strengthening data governance and rebuilding waning consumer confidence.
Download this free whitepaper to learn more about The Opportunity of Real World Data, or contact MarkLogic to learn how other life sciences organizations are securely and efficiently accelerating their real world evidence initiatives.