Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, USA.
Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.
J Diabetes. 2021 Feb;13(2):143-153. doi: 10.1111/1753-0407.13093. Epub 2020 Aug 16.
The Environmental Determinants of Diabetes in the Young (TEDDY) study has prospectively followed, from birth, children at increased genetic risk of type 1 diabetes. TEDDY has collected heterogeneous data longitudinally to gain insights into the environmental and biological mechanisms driving the progression to persistent islet autoantibodies.
We developed a machine learning model to predict the imminent development of persistent islet autoantibodies from time-varying metabolomics data integrated with time-invariant risk factors (eg, gestational age). Model development started from 221 candidate features (85 genetic, 5 environmental, 131 metabolomic), and an ensemble-based feature evaluation was used to identify a small set of predictive features that can be interrogated to better understand the pathogenesis leading up to persistent islet autoimmunity.
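One way to picture an ensemble-based feature evaluation is the sketch below. It is not the authors' procedure, which the abstract does not specify; it is a generic bootstrap "selection frequency" scheme on synthetic data. The function names (`auc`, `selection_frequency`, `make_toy_data`), the univariate ranking criterion, and all parameter values are illustrative assumptions.

```python
import random


def auc(values, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def selection_frequency(X, y, n_rounds=50, top_k=1, seed=0):
    """Fraction of bootstrap rounds in which each feature ranks in the top_k.

    Hypothetical criterion: a feature's strength is |AUC - 0.5| when the
    feature alone is used as the score.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = [0] * d
    used = 0
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # AUC is undefined without both classes
            continue
        used += 1
        strength = [abs(auc([X[i][j] for i in idx], yb) - 0.5) for j in range(d)]
        ranked = sorted(range(d), key=lambda j: strength[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    if used == 0:
        raise ValueError("no bootstrap round contained both classes")
    return [c / used for c in counts]


def make_toy_data(n=40, seed=1):
    """Synthetic data: feature 0 separates the classes, features 1 and 2 are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[label + 0.5 * rng.random(), rng.random(), rng.random()] for label in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(selection_frequency(X, y))
```

Features that are selected in most resamples are the stable ones. A real analysis would typically replace the univariate criterion with importances from several fitted models.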
The final integrative machine learning model included 42 disparate features and returned a cross-validated area under the receiver operating characteristic curve (AUC) of 0.74 and an AUC of ~0.65 on an independent validation dataset. The model identified a principal set of 20 time-invariant markers, including 18 genetic markers (16 single nucleotide polymorphisms [SNPs] and two HLA-DR genotypes) and two demographic markers (gestational age and exposure to a prebiotic formula). Integration with the metabolome identified 22 supplemental metabolites and lipids, including adipic acid and ceramide d42:0, that predicted development of islet autoantibodies.
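For readers unfamiliar with the metric, the following is a minimal sketch of how a cross-validated AUC can be computed. It makes several assumptions that the abstract does not confirm: a toy linear scorer (`centroid_direction`) stands in for the actual model, folds are unstratified, and out-of-fold scores are pooled before a single AUC is taken. All function names are hypothetical.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC from the Mann-Whitney rank sum, using midranks for tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def centroid_direction(X, y):
    """Toy model: per-feature difference between the class means."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return [mean(r[j] for r in pos) - mean(r[j] for r in neg)
            for j in range(len(X[0]))]


def cross_validated_auc(X, y, k=5, seed=0):
    """Score every sample with a model that never saw it, then pool the scores."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = centroid_direction([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            scores[i] = sum(wj * xj for wj, xj in zip(w, X[i]))
    return auc(scores, y)


if __name__ == "__main__":
    rng = random.Random(2)
    y = [i % 2 for i in range(40)]
    # Synthetic data: one weakly informative feature plus one noise feature.
    X = [[0.8 * label + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    print(round(cross_validated_auc(X, y), 2))
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect ranking of cases above controls. Evaluating on an independent dataset, as the study did, tests the model on samples that played no part in feature selection or fitting.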
The majority (86%) of metabolites that predicted development of islet autoantibodies belonged to three pathways: lipid oxidation, phospholipase A2 signaling, and the pentose phosphate pathway, suggesting that these metabolic processes may play a role in triggering islet autoimmunity.