Yang Xi, Chen Aokun, PourNejatian Nima, Shin Hoo Chang, Smith Kaleb E, Parisien Christopher, Compas Colin, Martin Cheryl, Costa Anthony B, Flores Mona G, Zhang Ying, Magoc Tanja, Harle Christopher A, Lipori Gloria, Mitchell Duane A, Hogan William R, Shenkman Elizabeth A, Bian Jiang, Wu Yonghui
Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Cancer Informatics and eHealth core, University of Florida Health Cancer Center, Gainesville, FL, USA.
NPJ Digit Med. 2022 Dec 26;5(1):194. doi: 10.1038/s41746-022-00742-2.
There is increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology enabling medical AI systems to utilize clinical narratives. However, few clinical language models exist, and the largest model trained on clinical text is comparatively small at 110 million parameters (versus billions of parameters for general-domain models). It is not clear how large clinical language models with billions of parameters could help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model, GatorTron, using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks: clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data benefit these NLP tasks. GatorTron scales the clinical language model from 110 million to 8.9 billion parameters and improves performance on all five tasks (e.g., accuracy gains of 9.6% on NLI and 9.5% on MQA); such models can be applied in medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.
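The following is a minimal sketch of the semantic textual similarity task described above, using a GatorTron-style encoder through Hugging Face Transformers. The checkpoint name is an assumption (the NGC link above hosts the Megatron-format release, not necessarily a Transformers-format one), and unsupervised mean-pooled cosine similarity stands in for the paper's supervised STS evaluation:

    # Sketch: score two clinical sentences for semantic textual similarity
    # with a GatorTron-style encoder. The checkpoint name below is an
    # assumption, not the NGC release cited in the abstract.
    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_NAME = "UFNLP/gatortron-base"  # assumed Transformers-format mirror

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    def embed(text: str) -> torch.Tensor:
        """Mean-pool the encoder's last hidden states into one vector."""
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
        mask = inputs["attention_mask"].unsqueeze(-1).float()
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

    a = embed("Patient denies chest pain or shortness of breath.")
    b = embed("No chest pain or dyspnea reported by the patient.")
    score = torch.nn.functional.cosine_similarity(a, b).item()
    print(f"cosine similarity: {score:.3f}")

Note that in the paper's actual evaluation, STS is treated as a supervised task on labeled sentence pairs; the cosine score here is only illustrative.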