MedSlice: fine-tuned large language models for secure clinical note sectioning.

Davis, J., Sounack, T., Sciacca, K., Brain, J. M., Durieux, B. N., Agaronnik, N. D., & Lindvall, C. (2026). MedSlice: fine-tuned large language models for secure clinical note sectioning. JAMIA Open, 9(1), ooaf179.

Abstract

OBJECTIVES: Extracting sections from clinical notes is crucial for downstream analysis but is challenging due to variability in formatting and the labor-intensive nature of manual sectioning. This study develops a pipeline for automated note sectioning using open-source large language models (LLMs), focusing on three sections: History of Present Illness, Interval History, and Assessment and Plan.

MATERIALS AND METHODS: We fine-tuned three open-source LLMs to extract sections using a curated dataset of 487 progress notes, comparing results relative to proprietary models (GPT-4o, GPT-4o mini). Internal and external validity were assessed via precision, recall, and F1 score.
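The precision, recall, and F1 metrics mentioned above can be illustrated with a token-level comparison of an extracted section against a gold-standard annotation. This is a minimal sketch for intuition only; the paper's exact evaluation protocol (tokenization, matching rules) may differ.

```python
# Illustrative token-level precision/recall/F1 for one extracted section.
# Assumption: whitespace tokenization and multiset overlap; the authors'
# actual scoring method is not specified in the abstract.
from collections import Counter


def token_prf(predicted: str, gold: str) -> tuple[float, float, float]:
    """Compare an extracted section against the gold section token by token."""
    pred_tokens = Counter(predicted.split())
    gold_tokens = Counter(gold.split())
    # Multiset intersection counts tokens present in both spans.
    overlap = sum((pred_tokens & gold_tokens).values())
    precision = overlap / max(sum(pred_tokens.values()), 1)
    recall = overlap / max(sum(gold_tokens.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Example: model extracts a shorter span than the gold annotation.
p, r, f1 = token_prf("Patient reports chest pain",
                     "Patient reports chest pain and dyspnea")
```

Here the extraction is fully contained in the gold span, so precision is 1.0 while recall is penalized for the two missing tokens.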

RESULTS: Fine-tuned Llama 3.1 8B (F1 = 0.92) outperformed GPT-4o. On the external validity test set, performance remained high (F1 = 0.85).

DISCUSSION: While proprietary LLMs have shown promise, privacy concerns limit their utility in medicine; fine-tuned, open-source LLMs offer advantages in cost, performance, and accessibility.

CONCLUSION: Fine-tuned, open-source LLMs can surpass proprietary models in clinical note sectioning.
