c-Diadem: a constrained dual-input deep learning model to identify novel biomarkers in Alzheimer's disease.

Affiliation

Department of Biomedical Engineering, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates.

Publication information

BMC Med Genomics. 2023 Oct 13;16(Suppl 2):244. doi: 10.1186/s12920-023-01675-9.

Abstract

BACKGROUND

Alzheimer's disease (AD) is an incurable, debilitating neurodegenerative disorder. Current biomarkers for AD diagnosis require expensive neuroimaging or invasive cerebrospinal fluid sampling, thus precluding early detection. Blood-based biomarker discovery in Alzheimer's can facilitate less-invasive, routine diagnostic tests to aid early intervention. Therefore, we propose "c-Diadem" (constrained dual-input Alzheimer's disease model), a novel deep learning classifier which incorporates KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway constraints on the input genotyping data to predict disease, i.e., mild cognitive impairment (MCI)/AD or cognitively normal (CN). SHAP (SHapley Additive exPlanations) was used to explain the model and identify novel, potential blood-based genetic markers of MCI/AD.

METHODS

We developed a novel constrained deep learning neural network which utilizes SNPs (single nucleotide polymorphisms) and microarray data from ADNI (Alzheimer's Disease Neuroimaging Initiative) to predict the disease status of participants, i.e., CN or with disease (MCI/AD), and identify potential blood-based biomarkers for diagnosis and intervention. The dataset contains samples from 626 participants, of which 212 are CN (average age 74.6 ± 5.4 years) and 414 patients have MCI/AD (average age 72.7 ± 7.6 years). KEGG pathway information was used to generate constraints applied to the input tensors, thus enhancing the interpretability of the model. SHAP scores were used to identify genes which could potentially serve as biomarkers for diagnosis and targets for drug development.
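One common way to impose pathway constraints of this kind is a masked dense layer, in which a SNP may only connect to the pathway nodes it is annotated to. The following is a minimal sketch under that assumption, not the authors' implementation: the mask, the weights, the choice to leave the expression branch unconstrained, and every function name (`masked_dense`, `dual_input_forward`) are hypothetical and chosen only for illustration.

```python
from math import exp

# Hypothetical membership mask: rows = input SNPs, columns = pathway nodes.
# MASK[i][j] == 1 means SNP i lies in a gene annotated to pathway j.
MASK = [
    [1, 0],  # snp_0 -> pathway_0
    [1, 0],  # snp_1 -> pathway_0
    [0, 1],  # snp_2 -> pathway_1
]


def masked_dense(x, weights, mask, bias):
    """Dense layer with ReLU whose weights are ignored wherever the mask is 0."""
    out = []
    for j in range(len(bias)):
        total = bias[j]
        for i, x_i in enumerate(x):
            total += x_i * weights[i][j] * mask[i][j]
        out.append(max(0.0, total))  # ReLU activation
    return out


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def dual_input_forward(snps, expression, params):
    """Two branches (constrained SNPs, plain expression) merged into one score."""
    pathway_act = masked_dense(snps, params["w_snp"], MASK, params["b_snp"])
    merged = pathway_act + list(expression)  # concatenate the two branches
    z = params["b_out"] + sum(w * v for w, v in zip(params["w_out"], merged))
    return sigmoid(z)  # read here as a probability of MCI/AD versus CN


# Illustrative parameters only; the 9.0 entries sit where the mask is 0,
# so they cannot influence the output.
params = {
    "w_snp": [[0.5, 9.0], [0.25, 9.0], [9.0, 1.0]],
    "b_snp": [0.0, 0.0],
    "w_out": [1.0, -1.0, 0.5],
    "b_out": 0.0,
}
snps = [2, 1, 0]    # genotypes coded as minor-allele counts (an assumption)
expression = [0.5]  # one made-up, normalized expression value
prob = dual_input_forward(snps, expression, params)
```

Because each pathway node only receives signal from its own SNPs, an attribution assigned to that node can be traced back to a named set of genes, which is the interpretability benefit the paragraph above describes.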

RESULTS

Our model achieved an accuracy of 69% and an AUC of 70% on the test dataset, outperforming previous models. The SHAP scores show that SNPs in PRKCZ, PLCB1 and ITPR2, as well as expression of HLA-DQB1, EIF1AY, HLA-DQA1, and ZFP57, have a comparatively large impact on model predictions.
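SHAP scores are estimates of Shapley values, which average each feature's marginal contribution over all orders in which features can be added to a baseline input. The sketch below computes them exactly by brute force for a three-feature toy function. It is an illustration of the underlying idea only: it does not use the SHAP library, `toy_model` is a made-up function rather than the c-Diadem network, and the feature names and the all-zero baseline are assumptions.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x, relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]     # switch feature i from baseline to its real value
            new = model(current)
            phi[i] += new - prev  # marginal contribution in this ordering
            prev = new
    return [p / len(orders) for p in phi]


def toy_model(features):
    """Hypothetical score: one additive SNP effect plus one interaction term."""
    snp_a, snp_b, expr_c = features
    return 2.0 * snp_a + 1.0 * snp_b * expr_c


x = [1.0, 1.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

# Rank features by absolute attribution, largest first.
ranking = sorted(range(len(x)), key=lambda i: abs(phi[i]), reverse=True)
```

In this toy case the additive feature receives its full effect, while the two interacting features split their joint effect equally, and the attributions sum to the difference between the prediction and the baseline output. Exact enumeration grows factorially with the number of features, so it is not practical for genome-scale inputs, where approximation methods are needed.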

CONCLUSIONS

In addition to predicting MCI/AD, our model has been interrogated for potential genetic biomarkers using SHAP. From our analysis, we have identified blood-based genetic markers related to calcium ion release in affected regions of the brain, as well as depression. The findings from our study provide insights into disease mechanisms and can facilitate innovation in less-invasive, cost-effective diagnostics. To the best of our knowledge, our model is the first to use pathway constraints in a multimodal neural network to identify potential genetic markers for AD.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/875f/10571239/c5392ca87cab/12920_2023_1675_Fig1_HTML.jpg
