Li Zhonghao, Chen Shengsong, Gao Nan, Chen Jie, Qin Ying, Zhang Guoqiang
Department of Neurosurgery, Dongfang Hospital, Beijing University of Chinese Medicine, Beijing, China.
Emergency Department, China-Japan Friendship Hospital, Beijing, China.
Inflamm Res. 2025 Jun 30;74(1):100. doi: 10.1007/s00011-025-02068-7.
This study aims to identify key genes of sepsis and construct a model for sepsis identification through integrated multi-organ single-cell RNA sequencing (scRNA-seq) and machine learning.
Datasets downloaded from the Gene Expression Omnibus (GSE207363, GSE207651, GSE185263, GSE69063 and GSE134347) were used.
ScRNA-seq data from heart (GSE207363) and lung (GSE207651) tissues of septic mice were processed and analyzed using the Seurat package in R. Key genes were defined as those identified in both heart and lung tissues by the overlap of three analyses together with differential expression analyses. We then used support vector machine recursive feature elimination (SVM-RFE) to construct a model for sepsis identification based on these key genes. The GSE185263 dataset was used for training, while GSE69063 and GSE134347 were used for testing. The model's performance in identifying sepsis was evaluated by the area under the receiver operating characteristic curve (AUROC) on the test datasets.
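The SVM-RFE and AUROC steps described above can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which SVM-RFE implementation was used, so scikit-learn's `RFECV` with a linear `SVC` is an assumption, and the data are synthetic (`make_classification`) stand-ins for an expression matrix of ten candidate genes. The function names `fit_svm_rfe` and `auroc` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fit_svm_rfe(X_train, y_train, seed=0):
    """Fit a linear SVM with cross-validated recursive feature elimination.

    Returns the fitted scaler and selector; selector.support_ marks the
    features (genes) kept in the best-scoring subset.
    """
    scaler = StandardScaler().fit(X_train)
    selector = RFECV(
        estimator=SVC(kernel="linear"),  # linear kernel exposes coef_ for ranking
        step=1,                          # drop one feature per elimination round
        cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=seed),
        scoring="roc_auc",
        min_features_to_select=1,
    )
    selector.fit(scaler.transform(X_train), y_train)
    return scaler, selector


def auroc(scaler, selector, X, y):
    """AUROC of the final SVM (trained on the selected features) on held-out data."""
    X_selected = selector.transform(scaler.transform(X))
    scores = selector.estimator_.decision_function(X_selected)
    return roc_auc_score(y, scores)


# Synthetic stand-in data: 300 samples, 10 candidate "genes", 4 informative.
# These values are illustrative assumptions, not taken from the study.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

scaler, selector = fit_svm_rfe(X_train, y_train)
test_auc = auroc(scaler, selector, X_test, y_test)
print("features kept:", int(selector.n_features_))
print("held-out AUROC:", round(test_auc, 3))
```

In the study's design the held-out evaluation uses independent cohorts (GSE69063 and GSE134347) rather than a random split of the training data, which is a stricter test than the single split used in this sketch.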
Thirteen key genes were initially identified; after mapping to their human homologs, ten remained. The optimal SVM-RFE model incorporated eight of these genes (CAMP, CD74, HLA-DQA1, HLA-DQB1, HLA-DMA, HLA-DRB5, and LYZ). In the two test datasets, the model identified sepsis with AUROC values of 0.904 and 0.924, respectively.
We have identified several key genes and developed a machine learning model for sepsis identification. Further studies are needed to validate our findings.