Watt Family Innovation Center, Clemson University, Clemson, SC, United States.
Department of Industrial Engineering, Clemson University, Clemson, SC, United States.
JMIR Med Educ. 2024 Nov 19;10:e51433. doi: 10.2196/51433.
Generative large language models (LLMs) have the potential to revolutionize medical education by generating tailored learning materials, enhancing teaching efficiency, and improving learner engagement. However, the application of LLMs in health care settings, particularly for augmenting small datasets in text classification tasks, remains underexplored, especially for cost- and privacy-conscious applications that do not permit the use of third-party services such as OpenAI's ChatGPT.
This study aims to explore the use of open-source LLMs, such as Large Language Model Meta AI (LLaMA) and Alpaca models, for data augmentation in a specific text classification task related to hospital staff surveys.
The surveys were designed to elicit narratives of everyday adaptation by frontline radiology staff during the initial phase of the COVID-19 pandemic. A 2-step process of data augmentation and text classification was conducted. First, 4 generative LLMs were used to produce synthetic data similar to the survey reports. A separate set of 3 classifier LLMs was then used to assign thematic categories to the augmented text. Performance on the classification task was evaluated across combinations of generative model, sampling temperature, classifier, and number of synthetic cases.
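For illustration, the following is a minimal sketch of the augmentation step, assuming the Hugging Face transformers library and a community LLaMA 7B checkpoint (huggyllama/llama-7b); the prompt wording, checkpoint choice, and decoding settings other than temperature are assumptions rather than the study's exact setup.

```python
# Minimal sketch of the augmentation step (assumed checkpoint, prompt, and
# decoding settings; not the study's exact code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-7b"  # assumption: a community LLaMA 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Hypothetical prompt eliciting survey-style narratives of everyday adaptation.
prompt = (
    "Write a short first-person report from a frontline radiology staff member "
    "describing one everyday adaptation made during the early COVID-19 pandemic:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,          # sampling produces varied synthetic cases
    temperature=0.7,         # the best-performing temperature reported in the results
    max_new_tokens=150,
    num_return_sequences=5,  # several synthetic augments per call
)
# Keep only the newly generated tokens (drop the prompt prefix).
generated = outputs[:, inputs["input_ids"].shape[1]:]
synthetic_reports = tokenizer.batch_decode(generated, skip_special_tokens=True)
```

In practice, the generated narratives would be pooled with the original survey responses to form the augmented training set for the classifiers.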
The best-performing combination of generative LLM, temperature, classifier, and number of synthetic cases was augmentation with LLaMA 7B at temperature 0.7, adding 100 synthetic cases, with Robustly Optimized BERT Pretraining Approach (RoBERTa) as the classifier, achieving an average area under the receiver operating characteristic curve (AUC) of 0.87 (SD 0.02; ie, 1 SD). The results demonstrate that open-source LLMs can enhance text classifier performance on small datasets in health care contexts, offering promising pathways for improving medical education processes and patient care practices.
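A minimal sketch of the classification and evaluation step follows, assuming RoBERTa fine-tuned with the transformers Trainer and AUC computed with scikit-learn; the toy data, binary label scheme, and hyperparameters are placeholders, not the study's configuration.

```python
# Minimal sketch of the classification step (toy data, label scheme, and
# hyperparameters are placeholders, not the study's configuration).
import numpy as np
from datasets import Dataset
from sklearn.metrics import roc_auc_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical stand-ins for the original narratives plus LLM-generated augments
# (training pool) and a held-out test split with binary thematic labels.
train_texts = ["We changed shift handoffs to reduce contact.",
               "We moved morning huddles to video calls."]
train_labels = [1, 1]
test_texts = ["Staffing and workflows stayed the same.",
              "We split technologists into rotating pods."]
test_labels = [0, 1]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

train_ds = Dataset.from_dict({"text": train_texts, "label": train_labels}).map(tokenize, batched=True)
eval_ds = Dataset.from_dict({"text": test_texts, "label": test_labels}).map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, y_true = eval_pred
    probs = np.exp(logits)[:, 1] / np.exp(logits).sum(axis=1)  # softmax positive-class probability
    return {"auc": roc_auc_score(y_true, probs)}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-clf", num_train_epochs=3,
                           per_device_train_batch_size=8, report_to="none"),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # includes eval_auc for the held-out narratives
```

Repeating this evaluation across generative models, temperatures, classifiers, and augmentation sizes yields the kind of comparison summarized above.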
The study demonstrates the value of data augmentation with open-source LLMs, highlights the importance of privacy and ethical considerations when using LLMs, and suggests future directions for research in this field.