Isleem Ula N, Zaidat Bashar, Ren Renee, Geng Eric A, Burapachaisri Aonnicha, Tang Justin E, Kim Jun S, Cho Samuel K
Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
J Orthop. 2023 Nov 5;53:27-33. doi: 10.1016/j.jor.2023.10.026. eCollection 2024 Jul.
Resident training programs in the US use the Orthopaedic In-Training Examination (OITE), developed by the American Academy of Orthopaedic Surgeons (AAOS), to assess their residents' current knowledge and to identify residents at risk of failing the American Board of Orthopaedic Surgery (ABOS) examination. Optimal strategies for OITE preparation are constantly being explored, and Large Language Models (LLMs) may have a role in orthopaedic resident education. ChatGPT, an LLM launched in late 2022, has demonstrated the ability to produce accurate, detailed answers, potentially enabling it to aid in medical education and clinical decision-making. The purpose of this study is to evaluate the performance of ChatGPT on Orthopaedic In-Training Examinations, using Self-Assessment Examination (SAE) questions from the AAOS database and approved literature as a proxy for the Orthopaedic Board Examination.
A total of 301 SAE questions from the AAOS database and associated AAOS literature were entered into ChatGPT's interface in a question-and-multiple-choice format, and the responses were analyzed to determine which answer choice was selected. A new chat was opened for every question so that no context carried over between items. All answers were recorded, categorized, and compared against the answer keys of the OITE and SAE exams, with each response marked as correct or incorrect.
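The study entered each question manually into the ChatGPT web interface; purely as an illustration of the one-question-per-chat protocol described above, the sketch below shows how a similar workflow could be automated with OpenAI's Python client. The model name, prompt wording, and answer-parsing heuristic are assumptions for the sketch, not part of the study's methodology.

```python
# Hypothetical sketch of the one-question-per-chat protocol using the
# OpenAI Python client; the study itself used the ChatGPT web interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_question(stem: str, choices: dict[str, str]) -> str:
    """Submit one multiple-choice question in a fresh conversation and
    return the letter of the answer choice ChatGPT selects."""
    options = "\n".join(f"{letter}. {text}" for letter, text in choices.items())
    prompt = (
        f"{stem}\n\n{options}\n\n"
        "Answer with the single letter of the best choice."
    )
    # Sending a request with no prior messages mirrors opening a new chat
    # for every question, so no context carries over between items.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed; any chat-capable model works here
        messages=[{"role": "user", "content": prompt}],
    )
    reply = response.choices[0].message.content.strip()
    # Take the first choice letter that appears in the reply.
    return next((ch for ch in reply if ch in choices), "?")
```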
Of the 301 questions asked, ChatGPT answered 183 (60.8%) correctly. The subjects with the highest percentage of correct answers were basic science (81%), oncology (72.7%), shoulder and elbow (71.9%), and sports (71.4%). The questions were further subdivided into three groups: management, diagnosis, and knowledge recall. ChatGPT answered 47 of 86 management questions correctly (54.7%), 32 of 45 diagnosis questions (71.1%), and 102 of 168 knowledge recall questions (60.7%).
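As a quick check on the per-category figures, the snippet below recomputes each accuracy from the raw tallies reported above (note that 32/45 rounds to 71.1%).

```python
# Recompute per-category accuracy from the reported (correct, total) counts.
results = {
    "management":       (47, 86),
    "diagnosis":        (32, 45),
    "knowledge recall": (102, 168),
}
for category, (correct, total) in results.items():
    print(f"{category}: {correct}/{total} = {correct / total:.1%}")
# management: 47/86 = 54.7%
# diagnosis: 32/45 = 71.1%
# knowledge recall: 102/168 = 60.7%
```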
ChatGPT has the potential to provide orthopaedic educators and trainees with accurate clinical conclusions for the majority of board-style questions, although its reasoning should be carefully analyzed for accuracy and clinical validity. As such, its usefulness in a clinical educational context is currently limited but rapidly evolving.
ChatGPT can draw on a vast body of medical data and may help provide accurate answers to clinical questions.