Evaluating genetic-ancestry inference from single-cell RNA-seq data.

Author information

Yao Jianing, Gazal Steven

Affiliations

Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.

Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.

Publication information

bioRxiv. 2025 Mar 28:2025.03.25.645175. doi: 10.1101/2025.03.25.645175.

Abstract

Characterizing the ancestry of donors in single-cell RNA sequencing (scRNA-seq) studies is critical to ensure the genetic homogeneity of the dataset and reduce biases in analyses, to identify ancestry-specific regulatory mechanisms and understand their downstream role in diseases, and to ensure that existing datasets are representative of human genetic diversity. While scRNA-seq is now widely available, information on the ancestry of the donors is often missing, hindering further analysis. Here we propose a framework to evaluate methods for inferring genetic ancestry from genetic polymorphisms detected in scRNA-seq reads. We demonstrate that widely used tools (e.g., ADMIXTURE) provide accurate inference of genetic ancestry and admixture proportions despite the limited number of genetic polymorphisms identified and imperfect variant calling from scRNA-seq reads. We inferred genetic ancestry for 196 donors from four scRNA-seq datasets from the Human Cell Atlas and highlighted an extremely large proportion of donors of European ancestry. For researchers generating single-cell datasets, we recommend reporting genetic-ancestry inference for all donors and generating datasets that represent diverse ancestries.
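
As a rough illustration of the last step the abstract describes (turning per-donor admixture proportions into an ancestry summary for a dataset), the sketch below assigns each donor the ancestry with the largest proportion and tallies the result. It is not the authors' pipeline: the donor names, the proportions, the five ancestry labels, and the 0.7 assignment threshold are all assumptions made up for this example, and the input only mimics the shape of an ADMIXTURE-style proportion table (one row per donor, rows summing to 1).

```python
from collections import Counter

# Hypothetical ancestry labels and admixture proportions (illustrative only,
# not results from the paper). Each row sums to 1.
ANCESTRIES = ["AFR", "AMR", "EAS", "EUR", "SAS"]
Q = {
    "donor_1": [0.01, 0.02, 0.01, 0.95, 0.01],
    "donor_2": [0.80, 0.05, 0.00, 0.15, 0.00],
    "donor_3": [0.30, 0.35, 0.05, 0.30, 0.00],
}


def assign_ancestry(proportions, labels=ANCESTRIES, threshold=0.7):
    """Return the label with the largest proportion, or 'admixed' when
    that proportion is below the (assumed) threshold."""
    best = max(range(len(proportions)), key=lambda i: proportions[i])
    return labels[best] if proportions[best] >= threshold else "admixed"


def summarize(q_matrix, threshold=0.7):
    """Assign a label to every donor and count donors per label."""
    calls = {
        donor: assign_ancestry(props, threshold=threshold)
        for donor, props in q_matrix.items()
    }
    return calls, Counter(calls.values())


calls, counts = summarize(Q)
for donor, label in calls.items():
    print(donor, label)
print(dict(counts))
```

A per-dataset tally of this kind is one simple way to report how the donors are distributed across ancestries; the threshold for calling a donor "admixed" is a design choice and would need to be justified for real data.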

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e30/11974901/42fb72630a2a/nihpp-2025.03.25.645175v1-f0001.jpg
