Huo Tianyao, Canepa Ronald, Sura Andrei, Modave François, Gong Yan
Department of Health Outcomes & Policy, College of Medicine, University of Florida, Gainesville, Florida, United States of America.
Information Technology and Services, University of Florida, Gainesville, Florida, United States of America.
PLoS One. 2017 Nov 28;12(11):e0188697. doi: 10.1371/journal.pone.0188697. eCollection 2017.
Colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer-related deaths in the United States. The purpose of this study was to evaluate gene expression differences across the stages of CRC. Gene expression data on 433 CRC patient samples were obtained from The Cancer Genome Atlas (TCGA). Gene expression differences were evaluated across CRC stages using linear regression. Genes with p ≤ 0.001 for expression differences were evaluated further in principal component analysis, and genes with p ≤ 0.0001 were evaluated further in gene set enrichment analysis. A total of 377 patients with expression data for 20,532 genes were included in the final analysis. The numbers of patients in stages I through IV were 59, 147, 116 and 55, respectively. The NEK4 gene, which encodes NIMA-related kinase 4, was differentially expressed across the four stages of CRC. Stage I patients had the highest NEK4 expression, while stage IV patients had the lowest (p = 9×10⁻⁶). Ten other genes (RNF34, HIST3H2BB, NUDT6, LRCh4, GLB1L, HIST2H4A, TMEM79, AMIGO2, C20orf135 and SPSB3) had a p value of 0.0001 in the differential expression analysis. Principal component analysis indicated that patients from the four clinical stages did not appear to have distinct gene expression patterns. Network-based and pathway-based gene set enrichment analyses showed that these 11 genes map to multiple pathways, such as meiotic synapsis and packaging of telomere ends. Ten of these 11 genes were linked to Gene Ontology terms such as nucleosome, DNA packaging complex and protein-DNA interactions. The protein complex-based gene set analysis showed that four genes were involved in H2AX complex II. This study identified a small number of genes that might be associated with the clinical stages of CRC. Our analysis was not able to find a molecular basis for the current clinical staging of CRC based on gene expression patterns.
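The per-gene screening step described in the abstract (a linear regression of expression on stage, followed by p-value cutoffs of 0.001 and 0.0001) can be illustrated with a minimal sketch. This is not the authors' code. It assumes stage is coded as a numeric trend from 1 to 4, that no covariates are adjusted for, and that no multiple-testing correction is applied; the abstract specifies none of these. The p-value uses a normal approximation to the t distribution, which is reasonable only for large samples such as the 377 patients here. The function names, the gene labels `GENE_DOWN` and `GENE_FLAT`, and the simulated expression values are all hypothetical; only the stage counts (59, 147, 116, 55) and the two cutoffs come from the abstract.

```python
import math
import random
from statistics import mean


def stage_trend_pvalue(stages, expression):
    """Fit expression = a + b * stage by ordinary least squares.

    Returns (slope, two-sided p-value for slope == 0).
    Assumption: the p-value uses a normal approximation to the
    t distribution, adequate only when the sample is large.
    """
    n = len(stages)
    x_bar, y_bar = mean(stages), mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in stages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(stages, expression))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(stages, expression))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    if se == 0:
        # Perfect fit: no residual variation left to test against.
        return slope, (0.0 if slope != 0 else 1.0)
    z = slope / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate for large |z|.
    return slope, math.erfc(abs(z) / math.sqrt(2))


def screen_genes(stages, expression_by_gene, pca_cutoff=1e-3, gsea_cutoff=1e-4):
    """Apply the two cutoffs named in the abstract to every gene."""
    results = {gene: stage_trend_pvalue(stages, values)
               for gene, values in expression_by_gene.items()}
    for_pca = sorted(g for g, (_, p) in results.items() if p <= pca_cutoff)
    for_gsea = sorted(g for g, (_, p) in results.items() if p <= gsea_cutoff)
    return results, for_pca, for_gsea


# Simulated demonstration: stage counts follow the abstract, values do not.
random.seed(0)
demo_stages = [1] * 59 + [2] * 147 + [3] * 116 + [4] * 55
demo_expression = {
    # Hypothetical gene whose expression falls with stage.
    "GENE_DOWN": [10.0 - 0.5 * s + random.gauss(0, 1) for s in demo_stages],
    # Hypothetical gene with no stage effect.
    "GENE_FLAT": [10.0 + random.gauss(0, 1) for _ in demo_stages],
}
demo_results, demo_pca, demo_gsea = screen_genes(demo_stages, demo_expression)
for gene, (slope, p) in demo_results.items():
    print(f"{gene}: slope={slope:.3f}, p={p:.2e}")
print("Passed to PCA:", demo_pca)
print("Passed to enrichment analysis:", demo_gsea)
```

Because the 0.0001 cutoff is stricter than the 0.001 cutoff, every gene passed to enrichment analysis is also passed to principal component analysis. With roughly 20,000 genes tested, a real analysis would also need to consider how many genes pass these cutoffs by chance alone.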