pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis.

Affiliations

Department of Electrical Engineering, Vanderbilt University, Nashville, TN, USA.

Department of Computer Science, Vanderbilt University, Nashville, TN, USA.

Publication Information

Neuroinformatics. 2022 Apr;20(2):483-505. doi: 10.1007/s12021-021-09553-4. Epub 2022 Jan 3.

Abstract

Along with the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become a prominent, first-line method of analysis for EMR data. Despite this recent growth, there is a lack of approachable software tools for conducting these analyses on large-scale EMR cohorts. In this article, we introduce pyPheWAS, an open-source Python package for conducting PheDAS and related analyses. This toolkit includes 1) data preparation, such as cohort censoring and age-matching; 2) traditional PheDAS analysis of ICD-9 and ICD-10 billing codes; 3) PheDAS analysis applied to a novel EMR phenotype mapping: current procedural terminology (CPT) codes; and 4) novelty analysis of significant disease-phenotype associations found through PheDAS. The pyPheWAS toolkit is approachable and comprehensive, encapsulating data preparation through result visualization within a simple command-line interface. The toolkit is designed for the ever-growing scale of available EMR data, with the ability to analyze cohorts of more than 100,000 patients in less than 2 hours. Through a case study of Down syndrome and other intellectual developmental disabilities, we demonstrate the ability of pyPheWAS to discover both known and potentially novel disease-phenotype associations across different experiment designs and disease groups. The software and user documentation are available as open source at https://github.com/MASILab/pyPheWAS.
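To make the idea of a phenome-disease association scan more concrete, the following is a minimal sketch using only the Python standard library. It is not the pyPheWAS API and does not reproduce the method described in the article: it swaps in a simple 2x2 chi-square test per phenotype code with a Bonferroni threshold, and every name in it (`scan_associations`, `cohort`, `records`, and the placeholder codes) is hypothetical.

```python
import math
from collections import defaultdict


def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def scan_associations(cohort, records, alpha=0.05):
    """Test each phenotype code for association with case status.

    cohort:  dict mapping patient id -> True (case) or False (control).
    records: iterable of (patient id, phenotype code) pairs.
    Illustrative stand-in only; not the pyPheWAS implementation.
    """
    n_case = sum(1 for is_case in cohort.values() if is_case)
    n_ctrl = len(cohort) - n_case

    # Collect the set of patients who carry each code at least once.
    carriers = defaultdict(set)
    for pid, code in records:
        if pid in cohort:
            carriers[code].add(pid)

    # Bonferroni correction: divide alpha by the number of codes tested.
    threshold = alpha / len(carriers) if carriers else alpha

    results = []
    for code, pids in sorted(carriers.items()):
        a = sum(1 for p in pids if cohort[p])  # cases with the code
        b = len(pids) - a                      # controls with the code
        c = n_case - a                         # cases without the code
        d = n_ctrl - b                         # controls without the code
        n = a + b + c + d
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        stat = n * (a * d - b * c) ** 2 / denom if denom else 0.0
        p_value = chi2_pvalue_1df(stat)
        # Add 0.5 to each cell so the odds ratio stays finite with empty cells.
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        results.append({
            "code": code,
            "odds_ratio": odds_ratio,
            "p": p_value,
            "significant": p_value < threshold,
        })
    return results
```

Calling `scan_associations` with a dictionary of case and control labels and a list of (patient, code) pairs returns one result per code, sorted by code. A regression model with covariates such as age and sex would be the more usual choice in practice; the chi-square test is used here only to keep the sketch dependency-free.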
