Breaking Bones, Breaking Barriers: ChatGPT, DeepSeek, and Gemini in Hand Fracture Management.

Authors

Marcaccini Gianluca, Seth Ishith, Xie Yi, Susini Pietro, Pozzi Mirco, Cuomo Roberto, Rozen Warren M

Affiliations

Plastic Surgery Unit, Department of Medicine, Surgery and Neuroscience, University of Siena, 53100 Siena, Italy.

Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia.

Publication information

J Clin Med. 2025 Mar 14;14(6):1983. doi: 10.3390/jcm14061983.

Abstract

Background: Hand fracture management requires precise diagnostic accuracy and complex decision-making. Advances in artificial intelligence (AI) suggest that large language models (LLMs) may assist or even rival traditional clinical approaches. This study evaluates the effectiveness of ChatGPT-4o, DeepSeek-V3, and Gemini 1.5 in diagnosing and recommending treatment strategies for hand fractures compared to experienced surgeons.

Methods: A retrospective analysis of 58 anonymized hand fracture cases was conducted. Clinical details, including fracture site, displacement, and soft-tissue involvement, were provided to the AI models, which generated management plans. Their recommendations were compared to actual surgeon decisions, assessing accuracy, precision, recall, and F1 score.

Results: ChatGPT-4o demonstrated the highest accuracy (98.28%) and recall (91.74%), effectively identifying most correct interventions but occasionally proposing extraneous options (precision 58.48%). DeepSeek-V3 showed moderate accuracy (63.79%), with balanced precision (61.17%) and recall (57.89%), sometimes omitting correct treatments. Gemini 1.5 performed poorly (accuracy 18.97%), with low precision and recall, indicating substantial limitations in clinical decision support.

Conclusions: AI models can enhance clinical workflows, particularly in radiographic interpretation and triage, but their limitations highlight the irreplaceable role of human expertise in complex hand trauma management. ChatGPT-4o demonstrated promising accuracy but requires refinement. Ethical concerns regarding AI-driven medical decisions, including bias and transparency, must be addressed before widespread clinical implementation.
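The abstract names precision, recall, and F1 score but does not say how they were computed. As an illustration only, the sketch below assumes each case is scored by comparing the set of treatments a model recommends with the set the surgeon actually chose, and that counts are pooled across cases (micro-averaging). The function names, treatment labels, and example cases are hypothetical and are not taken from the study.

```python
from typing import Iterable, Set, Tuple

Case = Tuple[Set[str], Set[str]]  # (model recommendations, surgeon decisions)


def case_counts(recommended: Set[str], actual: Set[str]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one case."""
    tp = len(recommended & actual)   # treatments the model and surgeon agree on
    fp = len(recommended - actual)   # extraneous options proposed by the model
    fn = len(actual - recommended)   # correct treatments the model omitted
    return tp, fp, fn


def micro_scores(cases: Iterable[Case]) -> Tuple[float, float, float]:
    """Pool counts over all cases, then compute precision, recall, and F1."""
    tp = fp = fn = 0
    for recommended, actual in cases:
        c_tp, c_fp, c_fn = case_counts(recommended, actual)
        tp, fp, fn = tp + c_tp, fp + c_fp, fn + c_fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical cases, invented for illustration (not data from the paper).
example_cases = [
    ({"K-wire fixation", "splinting"}, {"K-wire fixation"}),  # one extra option
    ({"splinting"}, {"splinting"}),                           # exact match
    ({"ORIF"}, {"ORIF", "nail bed repair"}),                  # one omission
]

precision, recall, f1 = micro_scores(example_cases)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this definition, a model that lists many options per case tends to gain recall at the expense of precision, which matches the pattern the abstract describes for ChatGPT-4o. The study's own metric definitions may differ from this sketch.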
