Multimodal AI for early prediction of adverse clinical outcomes in acute pancreatitis.

Karkas, Ahmet Yasin, Yavuz B Taktak, Burak Gultekin, Ziliang Hong, Halil Ertugrul Aktas, Deniz Seyithanoglu, Timurhan Cebeci, et al. 2026. “Multimodal AI for Early Prediction of Adverse Clinical Outcomes in Acute Pancreatitis.” Abdominal Radiology (New York).

Publisher's Version

Abstract

BACKGROUND: Conventional clinical scoring systems and contrast-enhanced computed tomography (CECT) interpretation provide limited accuracy in predicting adverse outcomes in early acute pancreatitis (AP). This leads to suboptimal patient management and underscores the need for improved triage methods. To address this, we developed a multimodal artificial intelligence (AI) framework that integrates clinical parameters, radiomics, and deep learning (DL) models to predict adverse clinical outcomes in early AP.

METHODS: In this retrospective tertiary-care, imaging-enriched cohort study, patients with AP who underwent CECT within 72 h of hospital admission were included. Adverse clinical outcomes were defined as mortality, intensive care unit (ICU) admission, or the need for invasive intervention within 30 days. Radiomics (using both pancreatic and peripancreatic features) and DL models were developed using CECT images to predict adverse outcomes. Multimodal models were constructed by integrating imaging and laboratory variables. Model performance was compared with three independent radiologists' prognostic imaging assessments and with established clinical scoring systems (Ranson and Glasgow-Imrie).

RESULTS: A total of 284 patients with AP were included, of whom 140 (49.3%) experienced adverse clinical outcomes. Conventional clinical scores showed limited discrimination, with AUCs of 0.61 for Ranson and 0.67 for Glasgow-Imrie. Imaging-only assessment by three expert radiologists yielded modest predictive performance (average AUC = 0.629; sensitivity = 42.5%; specificity = 83.2%) and moderate interobserver agreement (Fleiss κ = 0.650; ICC = 0.653). Imaging-only radiomics and DL models achieved higher discrimination (AUC = 0.77 and 0.76, respectively). Integrating laboratory parameters into the radiomics model further improved predictive performance (AUC increased from 0.77 to 0.80), whereas the DL and fusion models showed no substantial improvement.
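For readers less familiar with the metrics reported above, the following is a minimal sketch of how AUC, sensitivity, and specificity are computed for a binary outcome. The labels, scores, and the 0.5 threshold are made-up toy values chosen for illustration; they are not data from this study, and the function names are hypothetical rather than part of the authors' pipeline.

```python
def auc(y_true, scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (illustrative values only): 1 = adverse outcome, 0 = none.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.35, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into binary predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy_auc = auc(y_true, scores)
toy_sens, toy_spec = sensitivity_specificity(y_true, y_pred)
print(toy_auc, toy_sens, toy_spec)
```

AUC is threshold-independent, whereas sensitivity and specificity depend on the chosen cutoff, which is why a reader can show high specificity and low sensitivity at a conservative cutoff while the AUC remains modest.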

CONCLUSION: Our multimodal AI framework, which combines quantitative CECT features with clinical data, enhances the ability to predict adverse outcomes in early AP compared with traditional clinical and imaging severity scoring systems. These findings should be interpreted as preliminary, and prospective multicenter validation is required before clinical implementation is considered.

Last updated on 06/02/2026
PubMed
