Imputing dropouts for single-cell RNA sequencing based on multi-objective optimization.

Authors

Jin Ke, Li Bo, Yan Hong, Zhang Xiao-Fei

Affiliations

Department of Statistics, School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China.

Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan 430079, China.

Publication information

Bioinformatics. 2022 Jun 13;38(12):3222-3230. doi: 10.1093/bioinformatics/btac300.

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) technologies have proved revolutionary because they enable transcriptome profiling at single-cell resolution. Excess zeros caused by various sources of technical noise, called dropouts, can mislead downstream analyses. Accurate imputation methods are therefore crucial for addressing the dropout problem.

RESULTS

In this article, we develop a new dropout imputation method for scRNA-seq data based on multi-objective optimization. Existing methods assume that the underlying data has a preconceived structure and impute the dropouts according to the information learned from that structure. In contrast, we assume that the data combines three types of latent structure: the horizontal structure (genes are similar to each other), the vertical structure (cells are similar to each other) and the low-rank structure. The combination weights and the latent structures are learned using multi-objective optimization. The final result is the weighted average of the observed data and the imputation results learned from the three types of structure. Comprehensive downstream experiments demonstrate the superiority of our method in recovering true gene expression profiles, differential expression analysis, cell clustering and cell trajectory inference.
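The idea of averaging the observed data with three structure-based estimates can be illustrated with a minimal Python sketch. This is not the scMOO algorithm: the helper names (`neighbor_smooth`, `low_rank`, `combine_structures`) are hypothetical, the matrix is assumed to be cells × genes, the horizontal and vertical structures are approximated here by correlation-weighted neighbor averaging, the low-rank structure by a truncated SVD, and the weights are fixed by hand, whereas the paper learns both the weights and the structures by multi-objective optimization.

```python
import numpy as np


def neighbor_smooth(X):
    """Replace each row by a similarity-weighted average of the other rows."""
    with np.errstate(divide="ignore", invalid="ignore"):
        S = np.corrcoef(X)                     # Pearson correlation between rows
    S = np.clip(np.nan_to_num(S), 0.0, None)   # drop NaNs and negative similarities
    np.fill_diagonal(S, 0.0)                   # a row may not predict itself
    totals = S.sum(axis=1, keepdims=True)
    W = np.divide(S, totals, out=np.zeros_like(S), where=totals > 0)
    return W @ X


def low_rank(X, rank):
    """Best rank-`rank` approximation of X, computed by truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def combine_structures(X, weights=(0.25, 0.25, 0.25, 0.25), rank=2):
    """Weighted average of the observed data and three structure-based estimates.

    X is cells x genes (an assumed orientation). The weights are fixed here
    for illustration; the paper learns them by multi-objective optimization.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                            # normalize so the weights sum to 1
    horizontal = neighbor_smooth(X.T).T        # genes estimated from similar genes
    vertical = neighbor_smooth(X)              # cells estimated from similar cells
    lowrank = low_rank(X, rank)                # global low-rank estimate
    return w[0] * X + w[1] * horizontal + w[2] * vertical + w[3] * lowrank


# Toy data: a rank-2 "true" expression matrix with roughly 30% simulated dropouts.
rng = np.random.default_rng(0)
truth = rng.gamma(2.0, 1.0, size=(30, 2)) @ rng.gamma(2.0, 1.0, size=(2, 20))
observed = truth * (rng.random(truth.shape) > 0.3)
imputed = combine_structures(observed)
```

Setting `weights=(1, 0, 0, 0)` returns the observed matrix unchanged, which shows how the weights trade off trust in the raw counts against each assumed structure.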

AVAILABILITY AND IMPLEMENTATION

The R package is available at https://github.com/Zhangxf-ccnu/scMOO and https://zenodo.org/record/5785195. The code to reproduce the downstream analyses in this article can be found at https://github.com/Zhangxf-ccnu/scMOO_experiments_codes and https://zenodo.org/record/5786211. A detailed list of the data sets used in the present study is provided in Supplementary Table S1 in the Supplementary materials.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

