Liu Jinya, Liu Leping, Antwi Paul Akwasi, Luo Yanwei, Liang Fang
Department of Plastic Surgery, The Third Xiangya Hospital of Central South University, Changsha, China.
Department of Blood Transfusion, The Third Xiangya Hospital of Central South University, Changsha, China.
Front Genet. 2022 Jun 1;13:858466. doi: 10.3389/fgene.2022.858466. eCollection 2022.
Ovarian cancer (OC) has a high mortality rate and poses a severe threat to women's health. However, the abnormal gene expression underlying the tumorigenesis of OC is not fully understood. This study aims to identify diagnostic characteristic genes involved in OC using bioinformatics and machine learning. We used five datasets retrieved from the Gene Expression Omnibus (GEO) database, The Cancer Genome Atlas (TCGA) database, and the Genotype-Tissue Expression (GTEx) Project database. GSE12470 and GSE18520 were combined as the training set, and GSE27651 was used as validation set A. We also combined the TCGA and GTEx databases as validation set B. First, in the training set, differentially expressed genes (DEGs) between OC and non-ovarian cancer (nOC) tissues were identified. Next, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Disease Ontology (DO) enrichment analyses and Gene Set Enrichment Analysis (GSEA) were performed for functional enrichment analysis of these DEGs. Then, two machine learning algorithms, Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE), were used to select the diagnostic genes. Subsequently, the obtained diagnostic-related DEGs were validated in the validation sets. We then used the CIBERSORT computational approach to analyze the association between immune cell infiltration and DEGs. Finally, we analyzed the prognostic role of several genes on the KM-plotter website and used the Human Protein Atlas (HPA) online database to analyze the expression of these genes at the protein level. In total, 590 DEGs were identified, including 276 upregulated and 314 downregulated DEGs. The enrichment analysis results indicated that the DEGs were mainly involved in nuclear division, the cell cycle, and the IL-17 signaling pathway. In addition, the DEGs were closely related to immune cell infiltration.
Finally, we found that BUB1, FOLR1, and PSAT1 have prognostic roles, and that the protein-level expression of SFRP1, PSAT1, PDE8B, INAVA, and TMEM139 in OC tissue and nOC tissue was consistent with our analysis. We screened nine diagnostic characteristic genes of OC: SFRP1, PSAT1, BUB1B, FOLR1, ABCB1, PDE8B, INAVA, BUB1, and TMEM139. Combining these genes may be useful for diagnosing OC and evaluating immune cell infiltration.
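The two bookkeeping steps in the pipeline above (splitting DEGs by direction, then combining the outputs of two feature selectors) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not state the cutoffs used, so the |log2 fold change| > 1 and adjusted p < 0.05 thresholds are only common defaults; keeping the genes chosen by both LASSO and SVM-RFE (an intersection) is one plausible way to combine the two selections; and the function names, gene names, and numbers are hypothetical placeholders, not study data.

```python
from typing import Dict, List, Set, Tuple

# Assumed cutoffs; the abstract does not report the thresholds actually used.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def split_degs(stats: Dict[str, Tuple[float, float]]) -> Tuple[List[str], List[str]]:
    """Split genes into up- and downregulated DEGs.

    `stats` maps a gene name to (log2 fold change, adjusted p-value).
    """
    up: List[str] = []
    down: List[str] = []
    for gene, (log2fc, adj_p) in sorted(stats.items()):
        if adj_p >= ADJ_P_CUTOFF:
            continue  # not statistically significant
        if log2fc > LOG2FC_CUTOFF:
            up.append(gene)
        elif log2fc < -LOG2FC_CUTOFF:
            down.append(gene)
    return up, down


def combine_selections(lasso_genes: Set[str], svm_rfe_genes: Set[str]) -> List[str]:
    """Keep genes chosen by both selectors (intersection is an assumption)."""
    return sorted(lasso_genes & svm_rfe_genes)


# Toy input with placeholder gene names; none of these values come from the study.
toy_stats = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.8, 0.004),
    "GENE_C": (0.4, 0.020),  # significant, but below the fold-change cutoff
    "GENE_D": (3.1, 0.300),  # large change, but not significant
    "GENE_E": (-2.2, 0.010),
}

up, down = split_degs(toy_stats)
selected = combine_selections({"GENE_A", "GENE_B", "GENE_E"}, {"GENE_A", "GENE_E"})
print("up:", up)
print("down:", down)
print("selected by both:", selected)
```

In a real analysis the per-gene statistics would come from a differential expression tool and the two gene sets from fitted LASSO and SVM-RFE models; this sketch only shows how their outputs might be filtered and combined.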