Integrating External Data in Clinical Trials
Are Historical Data Good?
If we don’t learn from historical data, are we doomed to collect them over and over again?
The foundational question is whether data gathered outside a clinical trial should shape that trial's conclusions: should historical data inform the analysis of new experiments? Clinicians rely on robust past studies to make treatment decisions, yet experimental design often shuns this rich knowledge because of its risks. Those risks are real, but do they outweigh the benefits? From a statistical point of view, especially within frequentist frameworks, the reluctance to include external data is often rooted in concerns over inflating the Type I error rate and introducing bias. Any use of historical data in the analysis of a clinical trial can create bias and lead one astray, but it can also have profound benefits. Bayesian approaches naturally accommodate updating beliefs with new data and can model possible differences between historical and new data, offering a pathway to improved decision making.
While uncertainty limits how far external data can be used in clinical trial analyses, proponents argue that judicious use avoids re-collecting data that are already well understood. This approach challenges traditional norms, advocating a shift in which accumulated knowledge actively informs new trials in well-informed and explicit ways, ultimately accelerating discovery and making better use of resources.
External Data: Risk and Opportunity
Concerns about borrowing information from historical trials usually center on differences between past and present. If previous data differ systematically from the current experimental context, borrowing them can skew conclusions, falsely validating an ineffective treatment or falsely rejecting an effective one. Even with these risks understood, there are scenarios where ignoring historical data would be the greater disservice. Rare diseases, ethical constraints, and limited time can all make collecting new data very difficult, and each presents an opportunity for scientifically credible use of external knowledge.
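The risk of falsely validating an ineffective treatment can be illustrated with a small simulation. This is a minimal sketch and not a method taken from this article: it assumes a binary outcome, a hypothetical design (150 treated patients, 50 concurrent controls, 100 historical controls), naive pooling of the two control sources, and a simple one-sided Wald z-test. The function name and every number are illustrative assumptions.

```python
import math
import random


def false_positive_rate(p_current, p_historical, n_trt=150, n_ctl=50,
                        n_hist=100, n_sims=2000, seed=1):
    """Monte Carlo estimate of the one-sided false positive rate when
    historical controls are pooled with concurrent controls as if the
    two sources were exchangeable (an assumption of this sketch)."""
    rng = random.Random(seed)

    def binom(n, p):
        # Number of responders among n patients with response rate p.
        return sum(rng.random() < p for _ in range(n))

    rejections = 0
    n_pool = n_ctl + n_hist
    for _ in range(n_sims):
        # The null is true: treatment rate equals the current control rate.
        x_trt = binom(n_trt, p_current)
        x_ctl = binom(n_ctl, p_current) + binom(n_hist, p_historical)
        p_t, p_c = x_trt / n_trt, x_ctl / n_pool
        se = math.sqrt(p_t * (1 - p_t) / n_trt + p_c * (1 - p_c) / n_pool)
        # One-sided Wald test at a nominal 2.5% level.
        if se > 0 and (p_t - p_c) / se > 1.96:
            rejections += 1
    return rejections / n_sims


# Historical controls match the current population.
print(false_positive_rate(p_current=0.30, p_historical=0.30))
# Historical controls responded less often than current patients do.
print(false_positive_rate(p_current=0.30, p_historical=0.20))
```

Under these assumptions, the second call should report a false positive rate well above the nominal level, because the pooled control rate is dragged below the true current rate. Reversing the drift (a historical rate above the current one) would instead cost power, which is the "falsely rejecting an effective therapy" side of the same problem.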
Frequentist approaches focus on the analysis of a single clinical trial and its operating characteristics. Inferences are based on the likelihood of the results from that one experiment. But medical decision making incorporates all available knowledge, the “totality of evidence.” Should we design trials to use this totality of evidence explicitly, or ask each consumer of the results to do their own synthesis? Bayesian methods can be ideal for synthesizing the current trial data with external data in an explicit and prospective way.
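One way to make such a synthesis explicit is a power prior with a fixed discount, a technique named here for illustration and not prescribed by this article. The sketch below assumes a binary response, a Beta(1, 1) starting prior, and a discount `a0` chosen before the trial; `a0 = 0` ignores the historical data, `a0 = 1` pools them fully. All names and numbers are hypothetical.

```python
def power_prior_posterior(x, n, x0, n0, a0, prior_a=1.0, prior_b=1.0):
    """Beta posterior for a response rate when historical data (x0 of n0)
    are down-weighted by a fixed discount a0 in [0, 1].

    Returns the (a, b) parameters of the Beta posterior.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    # Historical responders and non-responders each count a0 times.
    a = prior_a + a0 * x0 + x
    b = prior_b + a0 * (n0 - x0) + (n - x)
    return a, b


def beta_mean(a, b):
    return a / (a + b)


# Hypothetical data: 12/40 responders now, 20/100 in the historical study.
a, b = power_prior_posterior(x=12, n=40, x0=20, n0=100, a0=0.5)
print(a, b, beta_mean(a, b))  # 23.0 69.0 0.25
```

With `a0 = 0.5` the 100 historical patients contribute the information of 50, and the posterior mean of 0.25 sits between the current estimate (0.30) and the historical one (0.20). Because the discount is fixed in advance, the amount of borrowing is stated prospectively, but it does not react to how well the two sources agree.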
Practical Examples and Shifting Paradigms
Various scenarios highlight the quandary of integrating past data. In oncology trials, for instance, balancing the recruitment of new patients to a standard of care against the use of historical control data presents complex challenges. Some oncology trials are single-arm trials judged against an objective performance criterion; for example, observing an overall response rate greater than 10% could be considered a success. The 10% threshold is itself derived from historical results, so success is defined entirely by external data. Alternatively, a trial could randomize 1:1 and analyze only the patients in the trial, ignoring external data completely. These two designs dominate, using historical data either completely or not at all, but is there room for a middle ground? One example is a design that randomizes three-to-one, experimental to control, and enriches the control arm with historical data; the concurrent controls can help mitigate some of the concerns about differences in the standard of care. This approach allows the comparability of new and historical controls to be tracked, providing a real-time gauge of robustness.
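The idea of tracking comparability between new and historical controls can be sketched with a two-component mixture prior on the control response rate. This is one possible approach and not necessarily the one the author has in mind: it assumes a binary outcome, an informative Beta component built from historical controls, a vague Beta(1, 1) component, and a prior weight `w` on the informative component. All names and numbers below are hypothetical.

```python
import math


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(x, n, a, b):
    # Log beta-binomial marginal likelihood of x responders in n patients.
    # The binomial coefficient is omitted: it is common to both components.
    return log_beta(a + x, b + n - x) - log_beta(a, b)


def mixture_posterior(x, n, hist_a, hist_b, w=0.5, vague_a=1.0, vague_b=1.0):
    """Update a mixture prior w*Beta(hist_a, hist_b) + (1-w)*Beta(vague_a, vague_b)
    with concurrent control data (x responders out of n).

    Returns (posterior weight on the historical component, posterior mean).
    Requires 0 < w < 1.
    """
    log_h = math.log(w) + log_marginal(x, n, hist_a, hist_b)
    log_v = math.log(1.0 - w) + log_marginal(x, n, vague_a, vague_b)
    top = max(log_h, log_v)  # subtract the max for numerical stability
    e_h, e_v = math.exp(log_h - top), math.exp(log_v - top)
    w_post = e_h / (e_h + e_v)
    mean_h = (hist_a + x) / (hist_a + hist_b + n)
    mean_v = (vague_a + x) / (vague_a + vague_b + n)
    return w_post, w_post * mean_h + (1.0 - w_post) * mean_v


# Historical controls: 20/100 responders, encoded as Beta(21, 81).
print(mixture_posterior(x=5, n=25, hist_a=21, hist_b=81))   # agrees with history
print(mixture_posterior(x=15, n=25, hist_a=21, hist_b=81))  # conflicts with history
```

The returned weight acts as the gauge: when the concurrent controls look like the historical ones, the weight on the historical component rises and more information is borrowed; when they conflict, the weight falls and the analysis leans on the randomized controls.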
Basket trials showcase a different aspect of “external data”: inferences are made in each subgroup of patients, with outcomes also available from the other subgroups. When making inferences in Subgroup 1, should the data from Subgroup 2 help? Those data can be considered ‘external’ to Subgroup 1. Again, the standard choices sit at the extremes: pool the subgroups, assuming the effect is identical in all of them (the usual approach), or estimate each one entirely separately. Statisticians increasingly explore borrowing methods that adjust dynamically to the observed similarity in treatment effects across subgroups. These trials reveal a philosophical shift in trial analyses, where data external to one group can help estimate the efficacy for a different group.
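The middle ground between pooling and separate estimation can be sketched with a simple shrinkage estimator. The example below is a stand-in for a full Bayesian hierarchical model and is not taken from this article: it works on the log-odds scale, estimates the between-basket variance with the DerSimonian-Laird method of moments borrowed from meta-analysis, and shrinks each basket toward the overall mean by an amount that depends on that variance. Function names and data are hypothetical, and at least two baskets are assumed.

```python
import math


def shrink_baskets(responders, sizes):
    """Shrink per-basket response rates toward a common mean.

    Returns (tau2, rates): the estimated between-basket variance on the
    log-odds scale and the shrunken response rate for each basket.
    """
    pairs = list(zip(responders, sizes))
    # Log-odds estimates and approximate variances (0.5 avoids zero cells).
    theta = [math.log((x + 0.5) / (n - x + 0.5)) for x, n in pairs]
    var = [1 / (x + 0.5) + 1 / (n - x + 0.5) for x, n in pairs]

    w = [1 / v for v in var]
    pooled = sum(wi * t for wi, t in zip(w, theta)) / sum(w)

    # Method-of-moments estimate of between-basket variance, floored at 0.
    q = sum(wi * (t - pooled) ** 2 for wi, t in zip(w, theta))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(theta) - 1)) / c)

    # Overall mean using weights that include the between-basket variance.
    w_re = [1 / (v + tau2) for v in var]
    mu = sum(wi * t for wi, t in zip(w_re, theta)) / sum(w_re)

    # tau2 = 0 gives full pooling; large tau2 leaves each basket nearly as-is.
    shrunk = [(tau2 * t + v * mu) / (tau2 + v) for t, v in zip(theta, var)]
    rates = [1 / (1 + math.exp(-s)) for s in shrunk]
    return tau2, rates


# Similar baskets: the estimate collapses toward a single pooled rate.
print(shrink_baskets([5, 6, 5, 6], [20, 20, 20, 20]))
# One basket stands out: it is pulled only partway toward the others.
print(shrink_baskets([2, 3, 12, 2], [20, 20, 20, 20]))
```

The amount of borrowing is driven by the data: when the baskets look alike the estimated variance is zero and the result matches pooling, and when one basket clearly differs it keeps most of its own estimate.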
Potential for Efficient Trial Designs
The increased availability of data and outcomes for patients brings huge potential for improving clinical trial designs and analyses. The challenge is to use these data in ways that preserve the scientific brilliance of randomization while allowing trials to reach conclusions more efficiently. The integration of external data must be approached with scientific rigor, balancing advances in methodological development with the skeptical examination necessary for robust results.
Moving forward, trial design will inevitably tilt towards incorporating broader data sources, from historical trial outcomes to extensive real-world evidence. This opens the way to better-informed decisions, with trial designs that are grounded in historical evidence yet remain forward-thinking. This is the horizon for experimental design, marrying data richness with methodical precision.
Statistical advances offer numerous paths to integrate these external datasets effectively, from dynamic borrowing to sophisticated modeling that accounts for heterogeneity between past and present. Enhanced trial designs will depend on striking a balance between maintaining scientific integrity within the trial and embracing the adaptability that current data landscapes afford. The goal remains the same – to treat patients, current and future, as well as possible.