Department of Pharmacology and Toxicology, University of Louisville, Louisville, USA.
UofL Health-Brown Cancer Center, University of Louisville, Louisville, USA.
Metabolomics. 2022 Jul 20;18(8):57. doi: 10.1007/s11306-022-01918-3.
While prediction of short- versus long-term survival from lung cancer is clinically relevant for patient management and therapy selection, it has proven difficult to identify reliable biomarkers of survival. Metabolomic markers from tumor core biopsies have been shown to reflect cancer metabolic dysregulation and to hold prognostic value.
Implement and validate a novel ensemble machine learning approach to evaluate survival based on metabolomic biomarkers from tumor core biopsies.
Data were obtained from tumor core biopsies evaluated with high-resolution 2DLC-MS/MS. Unlike biofluid samples, analysis of tumor tissue is expected to accurately reflect cancer metabolism and its impact on patient survival. A comprehensive suite of machine learning algorithms was trained as base learners and then combined into a stacked-ensemble meta-learner for predicting "short" versus "long" survival on an external validation cohort. An ensemble method of feature selection was employed to find a reliable set of biomarkers with potential clinical utility.
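The abstract does not say how the ensemble feature selection combined its individual selectors. As one plausible illustration only, the sketch below averages each feature's rank across several selectors and keeps the best-ranked features. The function names, the generic feature names, and the scores are all hypothetical and are not taken from the study.

```python
from statistics import mean


def rank_features(scores):
    """Map feature -> rank (1 = most important) from a feature -> score dict."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def aggregate_rankings(score_tables, top_k):
    """Average each feature's rank across selectors and keep the top_k best."""
    rankings = [rank_features(table) for table in score_tables]
    features = list(rankings[0])
    mean_rank = {f: mean(r[f] for r in rankings) for f in features}
    # Sort by mean rank; break ties alphabetically so the result is reproducible.
    return sorted(features, key=lambda f: (mean_rank[f], f))[:top_k]


# Hypothetical importance scores from three selectors (illustrative values only,
# not measurements from the study).
selector_scores = [
    {"metabolite_a": 0.9, "metabolite_b": 0.7, "metabolite_c": 0.6, "metabolite_d": 0.2},
    {"metabolite_a": 0.5, "metabolite_b": 0.8, "metabolite_c": 0.4, "metabolite_d": 0.3},
    {"metabolite_a": 0.7, "metabolite_b": 0.6, "metabolite_c": 0.1, "metabolite_d": 0.5},
]

selected = aggregate_rankings(selector_scores, top_k=2)
print(selected)
```

Averaging ranks rather than raw scores keeps selectors with different score scales comparable, which is one common reason to aggregate this way; the study may have used a different scheme.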
Overall survival (OS) was predicted in the external validation cohort with an AUROC of 0.881 using a support vector machine meta-learner, while progression-free survival (PFS) was predicted with an AUROC of 0.833 using a boosted logistic regression meta-learner. Both models outperformed a nomogram using covariate data (staging, age, sex, treatment vs. non-treatment) as predictors. Increased relative abundance of guanine, choline, and creatine corresponded with shorter OS, while increased leucine and tryptophan corresponded with shorter PFS. In patients who died, N6,N6,N6-trimethyl-L-lysine, L-pyroglutamic acid, and benzoic acid were increased, while cystine, methionine sulfoxide, and histamine were decreased. In patients with progression, itaconic acid, pyruvate, and malonic acid were increased.
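For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen positive case (here, say, a "short" survivor) receives a higher predicted score than a randomly chosen negative case. The minimal sketch below computes it by direct pairwise comparison. The labels and scores are made up for illustration and are unrelated to the study's data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels; scores: predicted scores, higher = more positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores (illustrative only).
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
print(round(auroc(example_labels, example_scores), 3))
```

In this toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the value is 8/9. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.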
This study demonstrates the feasibility of an ensemble machine learning approach to accurately predict patient survival from tumor core biopsy metabolomic data.