Today’s clinical trials pull from several data sources. Researchers share data across departments and clinical research organizations (CROs), building on existing knowledge to create new treatments and options for patients. Even within a single trial, data arrives on paper, in digital recordings, and from AI-based systems. Somehow, all of this data needs to line up to ensure accuracy and clarity.
Clinical trial technology and services providers are rethinking the data ecosystem to bring clarity to the information on hand. That means building clear data management practices and processes that can handle almost any data source. Here’s how these developers are overcoming complex data challenges.
Tracking Data Lineage
One of the first steps toward improving the data ecosystem within a clinical trial is to track the data lineage. This will help teams better understand the flow of data — and potential areas where it could get muddled.
“The term ‘data lineage’ refers to the life cycle that includes the data’s origin and where it moves over time,” according to Harmony Healthcare IT. “Data lineage can help with efforts to analyze how information is used and to track key bits of information that serve a particular purpose.”
A data set’s lineage grows every time a team pulls the same data to review for its trials, and every time a different party accesses that data and uses it for a new purpose.
“Consider a document like your birth certificate,” says Jason Hall, assistant vice president and data strategy engineer at Republic Bank. “You began life with only one original copy of this important document. Over time, you may have needed it to get a driver’s license or a marriage license…do you know how many folders, file cabinets, databases, printers, and screens have housed or displayed information from your birth certificate?”
Good data management means tracking this data lineage. Where did the data come from? Who has viewed it? Who has potentially used it or drawn conclusions from it?
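To make that concrete, here is a minimal sketch in Python of what a lineage record might capture for a single data set. The `DataSetLineage` structure, its field names, and the sample values are illustrative assumptions, not an industry standard:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageEvent:
    """One touch point in a data set's life cycle."""
    timestamp: datetime
    actor: str   # who collected, viewed, or transformed the data
    action: str  # e.g. "collected", "viewed", "transformed", "exported"
    detail: str  # free-text note: source system, purpose, output produced

@dataclass
class DataSetLineage:
    """Tracks a data set from its origin through every subsequent use."""
    dataset_id: str
    origin: str  # where the data first came from
    events: list[LineageEvent] = field(default_factory=list)

    def record(self, actor: str, action: str, detail: str = "") -> None:
        self.events.append(
            LineageEvent(datetime.now(timezone.utc), actor, action, detail)
        )

# Answer the three questions above for one hypothetical data set.
lineage = DataSetLineage("vitals-batch-042", origin="Site 12 paper case report forms")
lineage.record("data-entry-team", "collected", "double-keyed from paper")
lineage.record("biostatistics", "viewed", "interim safety review")
for e in lineage.events:
    print(e.timestamp.isoformat(), e.actor, e.action, e.detail)
```

Even a record this simple answers the core lineage questions at a glance: where the data originated, who has touched it, and why.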
Developing Chain-of-Custody Maps
Teams that start tracking their data lineage need a clear way to view the path that data has taken. One option for clinical trials is a chain-of-custody map. In the court system, such a map shows who had access to evidence and when. In clinical trials, it records how each data set was collected and processed, and who has accessed it.
“I tell my clients it’s really in their best interest to create this type of chain-of-custody map for your data in a clinical trial because that gives you control over what’s available,” says John Avellanet, FDA compliance consultant. “I strongly suggest you be able to create this before you make any submission, and ideally before even a trial gets started. You should create this as part of your trial planning to show the data is under control.”
By creating the chain-of-custody map before a trial begins, everyone can record information as they collect and view it. The habit becomes part of the process from the start, and your oversight team can troubleshoot problems by going directly to the source.
“To be FDA-compliant, chain of custody documentation in our industry typically includes the date, time, temperature, activity, and who performed the activity, in some systematic fashion, for every event in the path between the manufacturer and the patient,” according to Thermo Fisher Scientific. “Note that the FDA does not specify how this should be done, only that the system used by any given organization must work in a reliable manner.”
This means you can create a chain-of-custody map that matches your internal systems and allows your team to easily see the flow of data over time.
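As one illustration, a custody log covering those fields could be as simple as an append-only file. This Python sketch assumes a CSV layout of our own design; as the Thermo Fisher quote notes, the FDA does not prescribe a particular format, only a reliable one:

```python
import csv
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class CustodyEvent:
    """One event in the path between manufacturer and patient, covering
    the typical fields: date/time, temperature, activity, and performer."""
    timestamp: str        # ISO 8601 date and time, in UTC
    temperature_c: float  # recorded storage or transport temperature
    activity: str         # e.g. "received", "transferred", "dispensed"
    performed_by: str     # who carried out the activity

def log_event(path: str, event: CustodyEvent) -> None:
    """Append one custody event to a simple CSV audit trail."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(asdict(event)))
        if f.tell() == 0:  # brand-new file: write the header row first
            writer.writeheader()
        writer.writerow(asdict(event))

# Hypothetical example: drug product arriving at a depot.
log_event("custody_log.csv", CustodyEvent(
    timestamp=datetime.now(timezone.utc).isoformat(),
    temperature_c=4.1,
    activity="received at depot",
    performed_by="j.smith",
))
```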
Evaluating Hybrid Systems
As data management becomes more complex, teams are increasingly turning to hybrid models to manage their information. This highlights the importance of clear data lineage and tracking.
“Data no longer resides in a single environment; it’s scattered across on-premises and cloud environments, which indicates that businesses are moving into a hybrid world,” writes Isha Saggu at G2. “With the exponential growth in data formats, sources, and deployments across organizations, businesses are constantly looking for ways to best optimize data assets that live within existing on-premises legacy systems.”
Managing these data sources can become a full-time job for some data experts. Data from multiple sources needs to line up for comparison and display easily in readable formats. Some industry leaders, frustrated with hybrid data management, want companies to move toward a universal system.
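To show what “lining up” can look like in practice, here is a small sketch using pandas that reconciles two hypothetical extracts of the same measurements, one transcribed from paper and one exported from an electronic data capture (EDC) system. The column names and values are invented for illustration:

```python
import pandas as pd

# Hypothetical extracts: one transcribed from paper case report forms,
# one pulled from an EDC export.
paper = pd.DataFrame({
    "subject_id": ["1001", "1002", "1003"],
    "visit": ["baseline", "baseline", "baseline"],
    "weight_kg": [70.2, 81.5, 64.0],
})
edc = pd.DataFrame({
    "subject_id": ["1001", "1002", "1003"],
    "visit": ["baseline", "baseline", "baseline"],
    "weight_kg": [70.2, 81.9, 64.0],
})

# Line the two sources up on subject and visit so values can be compared.
merged = paper.merge(edc, on=["subject_id", "visit"],
                     suffixes=("_paper", "_edc"))

# Flag rows where the two systems disagree; these become data queries.
mismatches = merged[merged["weight_kg_paper"] != merged["weight_kg_edc"]]
print(mismatches)  # subject 1002: 81.5 on paper vs. 81.9 in the EDC
```

Every flagged row is manual review work a single-system process would avoid, which is exactly the burden the regulatory guidance below describes.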
“Regulatory guidance notes that hybrid systems will probably result in more review work compared with an electronic process,” writes consultant R.D. McDowall in LCGC North America. “By failing to replace hybrid systems, management must accept accountability for its inaction when an inspector calls and finds a data integrity problem.”
While it’s not practical for teams to completely switch to one data collection method (especially when they use external data sources), researchers should strive to reduce their dependence on hybrid systems that slow the data cleaning and evaluation process.
Creating Data Standards Today for the Future
The data solutions that clinical trial developers use today need to last. They must keep up with industry best practices both in the near term and over the long haul. Teams can’t keep switching processes and retraining staff every few years.
“Today, organisations are harvesting information in many different formats without any standardisation and with few protocols for sharing,” says Steve Arlington, president of the Pistoia Alliance. “Greater use and adoption of data standards will be essential if the industry is to overcome barriers to efficiency in the clinical trial process.”
Companies need to look for systems that can help them process data today and store it for tomorrow. The data needs to remain relevant and on hand for future use.
“With the increase in analytics, industrial organizations cannot fully predict what data they will need to answer the next issue,” writes Steve Pavlosky, director of digital product management at GE Digital.
Such forward-looking storage efforts can shorten the time it takes to complete trials and bring treatments to market.
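One way to store data “for tomorrow” is to save every record with an explicit, versioned schema and provenance metadata, so a future analysis can interpret it without the original system. This Python sketch is illustrative; the `vitals/v1` schema name and the field mappings are assumptions, not an established clinical data standard:

```python
import json
from datetime import datetime, timezone

# A hypothetical internal record, as captured today.
raw_record = {"subjID": "1001", "sysBP": 128, "diaBP": 84}

# Map internal field names onto one agreed vocabulary so the stored
# data stays interpretable long after the source system is retired.
FIELD_MAP = {"subjID": "subject_id",
             "sysBP": "systolic_bp_mmhg",
             "diaBP": "diastolic_bp_mmhg"}

standardized = {FIELD_MAP.get(k, k): v for k, v in raw_record.items()}

# Wrap the record with schema and provenance metadata, so a future
# analysis can tell exactly what it is looking at and where it came from.
envelope = {
    "schema": "vitals/v1",  # named, versioned schema
    "recorded_at": datetime.now(timezone.utc).isoformat(),
    "source_system": "site-edc-export",
    "payload": standardized,
}
print(json.dumps(envelope, indent=2))
```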
However, a good tool alone can’t speed up processes and save time. Even the best software with high-quality data processing can’t fix poor internal systems that bog down employees’ workflows.
“Evaluating EDC solutions for efficiency or time savings remains impractical due to the many external confounders—such as the protocol approval process and amendment resolution—that impact data management cycle times,” write Beth Harper, senior research consultant at the Tufts Center for the Study of Drug Development, et al. Essentially, teams cannot expect their data processing and management to improve if clunky internal systems hold them back.
Pharmaceutical developers absolutely need to consider their data lineage and hybrid processes for data management. They also need to evaluate whether the confusion lies within their internal systems. The fastest data cleaning tools in the world won’t help a poor data management structure.
Multiple data sources can strengthen a body of research. By using existing data sets, researchers can save time and develop treatments faster. Unfortunately, poor data integration can create confusion and eat away at the time that was saved. With clear processes and data lineage, teams can clearly understand the valuable information at their fingertips.