Abstract
PURPOSE: Routinely collected administrative data provide insights into health care utilization and outcomes but lack detailed clinical information, such as the specific site and intent of radiation therapy (RT). This study aimed to validate claims-based algorithms to accurately identify thoracic RT (TRT) and curative-intent RT in administrative databases.
METHODS: Patients at our institution with lung cancer and any RT Current Procedural Terminology (CPT) code from October 2015 to January 2024 were analyzed. RT claims were organized by treatment episode, and RT details were manually abstracted from the electronic health record to classify episodes as TRT or non-TRT and curative or noncurative. A priori algorithms were defined by the presence of respiratory motion management codes and >14 treatment codes (the latter not required for stereotactic body RT [SBRT] courses), with or without a requirement for exclusive thoracic malignancy diagnosis codes. Positive predictive value (PPV) was computed for each algorithm at the episode level, stratified by modality (three-dimensional conformal RT [3DCRT], intensity-modulated RT [IMRT], and SBRT). Algorithms were considered acceptable if the lower bound of the Clopper-Pearson 95% CI for PPV exceeded 70%.
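The episode-level rule and the acceptance criterion described in METHODS can be illustrated with a minimal sketch. It is not the study's implementation: the `Episode` fields, the function names, and the way the criteria are combined (motion management AND a sufficient treatment count, with the count waived for SBRT) are assumptions made for illustration, and the abstract does not list the underlying CPT or diagnosis codes. The Clopper-Pearson bounds are found by bisection on the exact binomial tail probability, using only the standard library.

```python
from dataclasses import dataclass
from math import exp, lgamma, log


@dataclass
class Episode:
    # Hypothetical, pre-derived fields; the actual code lists are not given in the abstract.
    modality: str                # "3DCRT", "IMRT", or "SBRT"
    has_motion_management: bool  # any respiratory motion management code in the episode
    n_treatment_codes: int       # number of treatment delivery codes in the episode
    only_thoracic_dx: bool       # all malignancy diagnosis codes are thoracic


def flag_trt(ep, require_thoracic_dx=True):
    """One possible reading of the a priori TRT rule (assumed AND combination)."""
    enough_treatments = ep.modality == "SBRT" or ep.n_treatment_codes > 14
    flagged = ep.has_motion_management and enough_treatments
    return flagged and (ep.only_thoracic_dx or not require_thoracic_dx)


def _upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return total


def _lower_bound(k, n, alpha):
    """Exact lower limit: the p at which P(X >= k | p) equals alpha / 2."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # P(X >= k | p) increases with p, so bisection applies
        mid = (lo + hi) / 2
        if _upper_tail(n, mid, k) < alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for k successes in n trials."""
    lower = _lower_bound(k, n, alpha)
    # By symmetry, the upper limit for k successes is 1 minus the lower limit for n - k.
    upper = 1.0 - _lower_bound(n - k, n, alpha)
    return lower, upper


def ppv_is_acceptable(true_pos, false_pos, threshold=0.70):
    """PPV = TP / (TP + FP); acceptable if the lower 95% bound exceeds the threshold."""
    n = true_pos + false_pos
    lower, upper = clopper_pearson(true_pos, n)
    return true_pos / n, (lower, upper), lower > threshold
```

Under these assumptions, `ppv_is_acceptable(tp, fp)` would be called once per algorithm and modality, where `tp` counts flagged episodes confirmed on chart review and `fp` counts flagged episodes that were not. Because the criterion uses the lower confidence bound rather than the point estimate, small strata can fail even with a high observed PPV.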
RESULTS: A total of 3,846 RT episodes were analyzed. The primary a priori TRT algorithm achieved a PPV of 97% (95% CI, 96 to 98) for IMRT, 99% (95% CI, 97 to 99) for SBRT, and 87% (95% CI, 81 to 92) for 3DCRT. Performance declined when the requirement for exclusive thoracic malignancy diagnosis codes was removed. For curative-intent RT, PPVs were 87% for IMRT, 90% for SBRT, and 55% for 3DCRT.
CONCLUSION: Clinically informed algorithms can accurately identify TRT in claims data, achieving high PPVs, particularly for IMRT and SBRT courses. These algorithms can be applied in claims databases to assess RT toxicity and effectiveness. External validation across diverse data sets will be important to confirm generalizability.