Verma Pragya, Shakya Madhvi
Department of Mathematics, Bioinformatics and Computer Applications, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, 462003 India.
Cogn Neurodyn. 2022 Apr;16(2):443-453. doi: 10.1007/s11571-021-09724-8. Epub 2021 Sep 22.
Among human brain disorders, Major Depressive Disorder (MDD) is regarded as a potentially lethal disease because it can drive a person to suicidal behavior. Physical detection of MDD is imprecise, but machine learning can improve classification of the disease. The present study used RNA-seq data from three classes to identify differentially expressed genes (DEGs) and then trained a random forest (RF) machine learning model on the key genes. The three classes were 29 CON (sudden-death healthy controls), 21 MDD-S (MDD suicides), and 9 MDD (non-suicide MDD). Principal component analysis (PCA) yielded 99 key genes, which account for 47.1% of the variability in the data. Training the model on these 99 genes indicated improved classification. The RF classification model achieved an accuracy of 61.11% on test data and 97.56% on training data, and the RF method was more accurate than the k-nearest neighbors (KNN) method. The 99 genes were annotated using DAVID and ClueGO. The important pathways and functions observed in the study included glutamatergic synapse, GABA receptor activation, long-term synaptic depression, and morphine addiction. Of the 99 genes, four (DLGAP1, GNG2, GRIA1, and GRIA4) were predominantly involved in the glutamatergic synapse pathway. GABA receptor activation involved two genes, GABBR2 and GNG2. The genes associated with long-term synaptic depression were GRIA1, MAPT, and PTEN, and the morphine addiction pathway comprised three genes: GABBR2, GNG2, and PDE4D. For massive datasets, this approach will act as a gold standard. The CON, MDD, and MDD-S cases are physically distinct. The expression levels of 12 genes were dysregulated. These 12 genes are possible biomarkers for Major Depressive Disorder and open a new path for further investigation in depressed subjects.
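The workflow summarized in the abstract (a 59-sample, 99-gene matrix, a PCA variance summary, and an RF versus KNN comparison) can be sketched roughly as follows. This is only an illustration on synthetic data, not the authors' pipeline: the expression values, the 12 shifted features, the hyperparameters, and the use of scikit-learn are all assumptions. The 41/18 train/test split is also an assumption; it is chosen because 40/41 ≈ 97.56% and 11/18 ≈ 61.11% happen to match the reported accuracies, which the abstract itself does not confirm.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Class sizes taken from the abstract: 29 CON, 21 MDD-S, 9 MDD (59 samples).
class_sizes = {"CON": 29, "MDD-S": 21, "MDD": 9}
labels = np.repeat(list(class_sizes), list(class_sizes.values()))

# Synthetic stand-in for the 99 key-gene expression values (assumption:
# the real RNA-seq data are not reproduced here).
n_genes = 99
X = rng.normal(size=(labels.size, n_genes))
# Arbitrary class-dependent shift on the first 12 features so that the
# toy problem contains some signal; the magnitude is hypothetical.
for k, name in enumerate(class_sizes):
    X[labels == name, :12] += 0.8 * k

# Share of total variance captured by the first two principal components.
pca = PCA(n_components=2).fit(X)
explained = pca.explained_variance_ratio_.sum()
print(f"Variance explained by PC1+PC2: {explained:.1%}")

# Assumed 41/18 stratified split (see the note above).
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=18, stratify=labels, random_state=0
)

# Hyperparameters are illustrative defaults, not values from the paper.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = (train_acc, test_acc)
    print(f"{name}: train accuracy {train_acc:.2%}, test accuracy {test_acc:.2%}")
```

With only 59 samples, a large gap between training and test accuracy (as in the reported 97.56% versus 61.11%) is a common sign of overfitting, so in practice cross-validation would give a more stable estimate than a single split.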