Benkirane Hakim, Vakalopoulou Maria, Planchard David, Adam Julien, Olaussen Ken, Michiels Stefan, Cournède Paul-Henry
Université Paris-Saclay, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), Gif-sur-Yvette, France.
IHU PRISM, National PRecISion Medicine Center in Oncology, Gustave Roussy, Villejuif, France.
PLoS Comput Biol. 2025 Jun 17;21(6):e1013012. doi: 10.1371/journal.pcbi.1013012. eCollection 2025 Jun.
Characterizing cancer presents a delicate challenge, as it involves deciphering complex biological interactions within the tumor's microenvironment. Clinical trials often provide histology images and molecular profiling of tumors, which can help in understanding these interactions. Despite recent advances in representing multimodal data for weakly supervised tasks in the medical domain, achieving a coherent and interpretable fusion of whole slide images and multi-omics data remains a challenge. Each modality operates at a distinct biological level, introducing substantial correlations both between and within data sources. In response to these challenges, we propose a novel deep-learning-based approach designed to represent multi-omics and histopathology data for precision medicine in a readily interpretable manner. In addition to demonstrating superior performance compared to state-of-the-art methods across multiple test cases, our approach handles incomplete and missing data robustly. It extracts various scores characterizing the activity of each modality and their interactions at the pathway and gene levels. The strength of our method lies in its capacity to unravel pathway activation through multimodal relationships and to extend enrichment analysis to spatial data for supervised tasks. We showcase its predictive capacity and interpretation scores by extensively exploring multiple TCGA datasets and validation cohorts. The method opens new perspectives on the complex relationships within multimodal pathological-genomic data across cancer types and is publicly available on GitHub.