Liu Yang, Li Feng, Shang Junliang, Liu Jinxing, Wang Juan, Ge Daohui
School of Computer Science, Qufu Normal University, Rizhao, 276826, China.
Interdiscip Sci. 2023 Dec;15(4):590-601. doi: 10.1007/s12539-023-00574-y. Epub 2023 Jul 4.
Recently developed single-cell RNA-seq (scRNA-seq) technology has given researchers the chance to investigate disease development at the single-cell level. Clustering is one of the most essential strategies for analyzing scRNA-seq data. Choosing high-quality feature sets can significantly improve the outcomes of single-cell clustering and classification. However, for technical reasons, computationally burdensome and highly expressed genes cannot provide a stable and predictive feature set. In this study, we introduce scFED, a feature-engineered gene selection framework. scFED identifies prospective feature sets to eliminate noise fluctuation, then fuses them with existing knowledge from the tissue-specific cellular taxonomy reference database (CellMatch) to avoid the influence of subjective factors. We then present a reconstruction approach for noise reduction and amplification of crucial information. We apply scFED to four real single-cell datasets and compare it with other techniques. According to the results, scFED improves clustering, reduces the dimensionality of the scRNA-seq data, improves cell type identification when combined with clustering algorithms, and outperforms the other methods. scFED therefore offers certain benefits for gene selection in scRNA-seq data.
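The three stages named in the abstract (candidate feature selection, fusion with prior marker knowledge, and reconstruction) can be illustrated with a minimal sketch. This is not the scFED implementation: the abstract does not state the selection criterion or the reconstruction method, so variance ranking and truncated SVD are used here purely as stand-ins. The function names, the `marker_genes` set (a hypothetical substitute for a CellMatch lookup), and the toy count matrix are all assumptions.

```python
import numpy as np


def select_genes(counts, gene_names, marker_genes, n_top=3):
    """Return sorted column indices of candidate genes.

    Stand-in for feature selection plus fusion: take the n_top genes with the
    highest variance of log1p counts, then add any gene listed in
    marker_genes (hypothetical substitute for database-derived markers).
    """
    log_counts = np.log1p(counts)                 # cells x genes
    variances = log_counts.var(axis=0)
    top_idx = np.argsort(variances)[::-1][:n_top]
    marker_idx = [i for i, g in enumerate(gene_names) if g in marker_genes]
    return sorted(set(top_idx.tolist()) | set(marker_idx))


def low_rank_reconstruct(matrix, rank=2):
    """Stand-in for reconstruction: keep only the leading singular components."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


# Toy data: 20 cells x 10 genes of Poisson counts (illustrative only).
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(20, 10)).astype(float)
gene_names = [f"g{i}" for i in range(10)]

keep = select_genes(counts, gene_names, marker_genes={"g7"}, n_top=3)
recon = low_rank_reconstruct(np.log1p(counts[:, keep]), rank=2)
```

The reduced matrix `recon` would then be passed to a clustering algorithm of choice; which algorithm pairs best with the selected genes is a question the abstract leaves to the full paper.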