Extracting social determinants of health from electronic health records using natural language processing: a systematic review.

Author information

Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA.

Information Technologies and Services, Weill Cornell Medicine, New York, New York, USA.

Publication information

J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170.

DOI:10.1093/jamia/ocab170

PMID:34613399

Full-text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC8633615/

Abstract

OBJECTIVE

Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs.

MATERIALS AND METHODS

A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review.

RESULTS

Smoking status (n = 27), substance use (n = 21), homelessness (n = 20), and alcohol use (n = 15) are the most frequently studied SDoH categories. Homelessness (n = 7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n = 13), substance use (n = 9), and alcohol use (n = 9).

CONCLUSION

NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e18/8633615/783cdda17e8c/ocab170f1.jpg

Similar articles

Extracting social determinants of health from electronic health records using natural language processing: a systematic review.

J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170.

Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing.

J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7.

Natural language processing to identify social determinants of health in Alzheimer's disease and related dementia from electronic health records.

Health Serv Res. 2023 Dec;58(6):1292-1302. doi: 10.1111/1475-6773.14210. Epub 2023 Aug 3.

Extracting social determinants of health events with transformer-based multitask, multilabel named entity recognition.

J Am Med Inform Assoc. 2023 Jul 19;30(8):1379-1388. doi: 10.1093/jamia/ocad046.

Extracting adverse drug events from clinical Notes: A systematic review of approaches used.

J Biomed Inform. 2024 Mar;151:104603. doi: 10.1016/j.jbi.2024.104603. Epub 2024 Feb 6.

Social determinants of health in electronic health records and their impact on analysis and risk prediction: A systematic review.

J Am Med Inform Assoc. 2020 Nov 1;27(11):1764-1773. doi: 10.1093/jamia/ocaa143.

Leveraging natural language processing to augment structured social determinants of health data in the electronic health record.

J Am Med Inform Assoc. 2023 Jul 19;30(8):1389-1397. doi: 10.1093/jamia/ocad073.

Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review.

Artif Intell Med. 2023 Dec;146:102701. doi: 10.1016/j.artmed.2023.102701. Epub 2023 Nov 1.

Social Determinants of Health Documentation in Structured and Unstructured Clinical Data of Patients With Diabetes: Comparative Analysis.

JMIR Med Inform. 2023 Aug 22;11:e46159. doi: 10.2196/46159.

Barriers and Facilitators of Obtaining Social Determinants of Health of Patients With Cancer Through the Electronic Health Record Using Natural Language Processing Technology: Qualitative Feasibility Study With Stakeholder Interviews.

JMIR Form Res. 2022 Dec 27;6(12):e43059. doi: 10.2196/43059.

Cited by

Enabling discovery of the social determinants of health: using a specialized lens to see beyond the surface.

J Med Libr Assoc. 2025 Jul 1;113(3):204-222. doi: 10.5195/jmla.2025.2186. Epub 2025 Aug 1.

Performance of 4 Methods to Assess Health-Related Social Needs.

JAMA Netw Open. 2025 Aug 1;8(8):e2527426. doi: 10.1001/jamanetworkopen.2025.27426.

SynthEHR-Eviction: Enhancing Eviction SDoH Detection with LLM-Augmented Synthetic EHR Data.

medRxiv. 2025 Jul 14:2025.07.10.25331237. doi: 10.1101/2025.07.10.25331237.

Unveiling social determinants of health impact on adverse pregnancy outcomes through natural language processing.

Sci Rep. 2025 Aug 9;15(1):29183. doi: 10.1038/s41598-025-13542-x.

Academic case reports lack diversity: Assessing the presence and diversity of sociodemographic and behavioral factors related to Post COVID-19 Condition.

PLoS One. 2025 Jul 2;20(7):e0326668. doi: 10.1371/journal.pone.0326668. eCollection 2025.

Deep learning for occupation recognition and knowledge discovery in rheumatology clinical notes.

Sci Rep. 2025 Jul 1;15(1):20944. doi: 10.1038/s41598-025-05294-5.

Z-Coding for Social Contributors to Health in Colorado Federally Qualified Health Centers.

Nurs Res. 2025;74(4):318-323. doi: 10.1097/NNR.0000000000000817.

Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs.

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:491-500. eCollection 2025.

Studying Veteran food insecurity longitudinally using electronic health record data and natural language processing.

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:124-133. eCollection 2025.

Development of a Surveillance System to Identify Incidence of Evictions Among Patients in Veterans Affairs Medical Centers Across the United States.

J Community Health. 2025 Jun 9. doi: 10.1007/s10900-025-01491-5.
