The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities.

Author information

Department of Biostatistics, University of Michigan, Ann Arbor, Michigan.

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.

Publication information

Stat Med. 2020 Mar 15;39(6):773-800. doi: 10.1002/sim.8445. Epub 2019 Dec 20.

Abstract

Biobanks linked to electronic health records provide rich resources for health-related research. With improvements in administrative and informatics infrastructure, the availability and utility of data from biobanks have dramatically increased. In this paper, we first aim to characterize the current landscape of available biobanks and to describe specific biobanks, including their place of origin, size, and data types. The development and accessibility of large-scale biorepositories provide the opportunity to accelerate agnostic searches, expedite discoveries, and conduct hypothesis-generating studies of disease-treatment, disease-exposure, and disease-gene associations. Rather than designing and implementing a single study focused on a few targeted hypotheses, researchers can potentially use biobanks' existing resources to answer an expanded selection of exploratory questions as quickly as they can analyze them. However, there are many obvious and subtle challenges with the design and analysis of biobank-based studies. Our second aim is to discuss statistical issues related to biobank research such as study design, sampling strategy, phenotype identification, and missing data. We focus our discussion on biobanks that are linked to electronic health records. Some of the analytic issues are illustrated using data from the Michigan Genomics Initiative and UK Biobank, two biobanks with two different recruitment mechanisms. We summarize the current body of literature for addressing these challenges and discuss some standing open problems. This work complements and extends recent reviews about biobank-based research and serves as a resource catalog with analytical and practical guidance for statisticians, epidemiologists, and other medical researchers pursuing research using biobanks.

Similar articles

4
The Michigan Genomics Initiative: A biobank linking genotypes and electronic clinical records in Michigan Medicine patients.
Cell Genom. 2023 Jan 31;3(2):100257. doi: 10.1016/j.xgen.2023.100257. eCollection 2023 Feb 8.
5
Impact of biobanks on research outcomes in rare diseases: a systematic review.
Orphanet J Rare Dis. 2018 Nov 12;13(1):202. doi: 10.1186/s13023-018-0942-z.
8
Biobanks and personalized medicine.
Clin Genet. 2014 Jul;86(1):50-5. doi: 10.1111/cge.12370. Epub 2014 Mar 27.
9
United Kingdom Biobank (UK Biobank): JACC Focus Seminar 6/8.
J Am Coll Cardiol. 2021 Jul 6;78(1):56-65. doi: 10.1016/j.jacc.2021.03.342.
10
Ethical governance in biobanks linked to electronic health records.
Eur Rev Med Pharmacol Sci. 2015 Nov;19(21):4182-6.

Cited by

2
Implications of the choice of method to identify major depressive disorder in large research cohorts.
J Mood Anxiety Disord. 2025 Jun 19;11:100136. doi: 10.1016/j.xjmad.2025.100136. eCollection 2025 Sep.
3
Overcoming Barriers in Cancer Biology Research: Current Limitations and Solutions.
Cancers (Basel). 2025 Jun 23;17(13):2102. doi: 10.3390/cancers17132102.
4
Revisiting representativeness.
Int J Epidemiol. 2025 Jun 11;54(4). doi: 10.1093/ije/dyaf109.
7
Clinical and genetic associations for night eating syndrome in a patient biobank.
J Eat Disord. 2024 Dec 23;12(1):211. doi: 10.1186/s40337-024-01180-z.
8
Benefits and Challenges of Rare Genetic Variation in Alzheimer's Disease.
Curr Genet Med Rep. 2019 Mar;7(1):53-62. doi: 10.1007/s40142-019-0161-5. Epub 2019 Feb 1.
9
Clinical associations with treatment resistance in depression: An electronic health record study.
Psychiatry Res. 2024 Dec;342:116203. doi: 10.1016/j.psychres.2024.116203. Epub 2024 Sep 16.
10
A framework for understanding selection bias in real-world healthcare data.
J R Stat Soc Ser A Stat Soc. 2024 May 2;187(3):606-635. doi: 10.1093/jrsssa/qnae039. eCollection 2024 Aug.
