A Comparative Analysis of AI Models in Complex Medical Decision-Making Scenarios: Evaluating ChatGPT, Claude AI, Bard, and Perplexity.

Authors

Uppalapati Vamsi Krishna, Nag Deb Sanjay

Affiliation

Department of Anesthesiology, Tata Main Hospital, Jamshedpur, IND.

Publication information

Cureus. 2024 Jan 18;16(1):e52485. doi: 10.7759/cureus.52485. eCollection 2024 Jan.

Abstract

This study rigorously evaluates the performance of four artificial intelligence (AI) language models - ChatGPT, Claude AI, Google Bard, and Perplexity AI - across four key metrics: accuracy, relevance, clarity, and completeness. We used a mix of research methods and gathered opinions on 14 scenarios, which helped ensure that our findings were accurate and dependable. The study showed that Claude AI performed better than the other models because it gave complete responses, with average scores of 3.64 for relevance and 3.43 for completeness. ChatGPT performed consistently well, whereas Google Bard's responses were unclear and varied greatly, making them difficult to understand and inconsistent. These results provide important information about where AI language models perform well or poorly when offering medical suggestions. They can guide better use of these models and inform future improvements to AI-based technology. The study shows that AI abilities match complex medical scenarios.
