Exploring the Past and Current Landscape of Biomarker-Driven Clinical Trials Through Large Language Models.

Guo, M., Passalacqua, E., Bao, E., Miao, B., Butte, A., & Zack, T. (2026). Exploring the Past and Current Landscape of Biomarker-Driven Clinical Trials Through Large Language Models. JCO Clinical Cancer Informatics, 10, e2500028.

Abstract

PURPOSE: Biomarkers, or specific somatic alterations, are increasingly required for clinical trial eligibility. Finding and enrolling patients with these biomarkers is essential not only for continuous progress in the treatment of disease but also for democratizing clinical trial participation. Here, we use data from the National Cancer Institute Clinical Trials Reporting Program (NCI CTRP), combined with large language model applications, to survey the current landscape of cancer clinical trials.

METHODS: We extracted 20,894 trials listed on Cancer.gov through the application programming interface (API) of the NCI CTRP. We quantified biomarker rates in cancer subtypes, described the geographic distribution of trial sites, and identified failure causes for these trials. Finally, we built an application from this API to match patients with clinical trials.
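
As an illustration of the kind of tally described in the methods, the sketch below counts biomarker rates per cancer subtype and the subtypes in which each biomarker appears. It is a minimal example under stated assumptions, not the authors' pipeline: the trial records, the field names (`id`, `subtype`, `biomarkers`), and the helper functions are hypothetical and do not reflect the NCI CTRP API schema.

```python
from collections import defaultdict

# Hypothetical trial records for illustration only; the field names are
# assumptions and are not taken from the NCI CTRP API.
trials = [
    {"id": "T1", "subtype": "lung", "biomarkers": ["EGFR", "ALK"]},
    {"id": "T2", "subtype": "lung", "biomarkers": []},
    {"id": "T3", "subtype": "breast", "biomarkers": ["ERBB2"]},
    {"id": "T4", "subtype": "colorectal", "biomarkers": ["EGFR"]},
]


def biomarker_rates(records):
    """Per subtype, the fraction of trials listing at least one biomarker."""
    total = defaultdict(int)
    with_biomarker = defaultdict(int)
    for record in records:
        total[record["subtype"]] += 1
        if record["biomarkers"]:
            with_biomarker[record["subtype"]] += 1
    return {subtype: with_biomarker[subtype] / total[subtype] for subtype in total}


def subtypes_per_biomarker(records):
    """Map each biomarker to the set of subtypes whose trials require it."""
    usage = defaultdict(set)
    for record in records:
        for biomarker in record["biomarkers"]:
            usage[biomarker].add(record["subtype"])
    return dict(usage)


rates = biomarker_rates(trials)
usage = subtypes_per_biomarker(trials)
```

In this toy data, a biomarker whose `usage` set has more than one member is one that serves as an eligibility criterion across multiple cancer subtypes.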

RESULTS: We showed that 5,044 of the 20,894 interventional clinical trials contained biomarker eligibility data and that trials tended to cluster around large academic centers and cities. We identified 630 biomarkers in 36 cancer subtypes and showed that most biomarkers are used as eligibility criteria for multiple cancer subtypes. We highlight that difficulties with accrual and sponsorship were the most common reasons for discontinuing clinical trials. Finally, we demonstrate NCI Clinical Trials Navigator, a novel method to automatically match natural language queries with eligible clinical trials.

CONCLUSION: A survey of our clinical genomics showed that many individuals likely have mutations that would make them eligible for biomarker-driven trials. We used the NCI Clinical Trials database to show that the distribution of biomarker trials across the United States limits access for many patients and likely leads to frequent trial termination because of inadequate accrual. Finally, we built an automated, publicly available tool that can improve patient-to-trial biomarker-based matching.

Last updated on 04/01/2026