Department of Tuberculosis and Respiratory, Hubei Clinical Research Center for Infectious Diseases, Wuhan Research Center for Communicable Disease Diagnosis and Treatment, Wuhan Jinyintan Hospital, Tongji Medical College of Huazhong University of Science and Technology, Chinese Academy of Medical Sciences, Joint Laboratory of Infectious Diseases and Health, Wuhan Institute of Virology and Wuhan Jinyintan Hospital, Chinese Academy of Sciences, Wuhan, 430023, China.
BMC Pulm Med. 2023 Oct 19;23(1):397. doi: 10.1186/s12890-023-02699-8.
Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive interstitial lung disease. Multiple studies have suggested that the extracellular matrix (ECM) is associated with the development and prognosis of IPF; however, the underlying mechanisms remain incompletely understood.
We obtained the GSE70866 dataset from the GEO database and established an ECM-related prognostic model using the LASSO, random forest, and support vector machine algorithms. We used the ssGSEA algorithm to compare immune cell infiltration levels between the high-risk and low-risk groups, and conducted enrichment analysis to explore pathway differences between the two groups. Finally, the model genes were validated in an external validation set of IPF cases and by single-cell data analysis.
Using machine learning algorithms, we constructed an ECM-related risk model. IPF patients in the high-risk group had worse overall survival than those in the low-risk group. The model's AUC values for predicting 1-, 2-, and 3-year survival were 0.786, 0.767, and 0.768, respectively. The validation cohort confirmed these findings, supporting the model's prognostic value. Enrichment analysis showed that chemokine-related pathways were enriched, and immune cell infiltration differed significantly between the two groups. Finally, the validation results indicated that all model genes were significantly differentially expressed.
We developed and validated an ECM-related risk model, based on CST6, PPBP, CSPG4, SEMA3B, LAMB2, SERPINB4 and CTF1, that accurately predicts the outcome of IPF patients.
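As a rough illustration of how a gene-signature risk model of this kind is typically applied, the sketch below computes a linear risk score from the seven model genes and splits a cohort into high-risk and low-risk groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not report the model coefficients or the cutoff rule, so the coefficients here are hypothetical placeholders and the median split is an assumed (though common) convention. The function names are likewise illustrative.

```python
from statistics import median

# The seven model genes named in the abstract.
MODEL_GENES = ["CST6", "PPBP", "CSPG4", "SEMA3B", "LAMB2", "SERPINB4", "CTF1"]

# HYPOTHETICAL placeholder coefficients, NOT the published values.
HYPOTHETICAL_COEFS = dict(
    zip(MODEL_GENES, [0.30, 0.25, -0.20, -0.15, -0.10, 0.20, 0.35])
)


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Linear risk score: sum of coefficient * expression over the model genes."""
    missing = [gene for gene in coefs if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores):
    """Label each score 'high' if above the cohort median, else 'low'.

    The median cutoff is an assumption; the abstract does not state the rule used.
    """
    cutoff = median(scores)
    labels = ["high" if score > cutoff else "low" for score in scores]
    return labels, cutoff


# Toy cohort of four patients with made-up expression values.
toy_cohort = [
    {gene: value for gene in MODEL_GENES}
    for value in (0.5, 1.0, 1.5, 2.0)
]
toy_scores = [risk_score(patient) for patient in toy_cohort]
toy_labels, toy_cutoff = stratify(toy_scores)
```

With real data, the coefficients would come from the fitted model and the resulting groups would then be compared with survival analysis, as described in the abstract.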