M3C: Monte Carlo reference-based consensus clustering.

Author information

Experimental Medicine and Rheumatology, William Harvey Research Institute, Bart's and The London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ, United Kingdom.

Oxford Internet Institute, University of Oxford, 1 St. Giles, OX1 3JS, Oxford, United Kingdom.

Publication information

Sci Rep. 2020 Feb 4;10(1):1816. doi: 10.1038/s41598-020-58766-1.

Abstract

Genome-wide data are used to stratify patients into classes for precision medicine using clustering algorithms. A common problem in this area is selecting the number of clusters (K). The Monti consensus clustering algorithm is a widely used method that uses stability selection to estimate K. However, the method is biased towards higher values of K and yields high numbers of false positives. As a solution, we developed Monte Carlo reference-based consensus clustering (M3C), which is based on this algorithm. M3C simulates null distributions of stability scores for a range of K values, thus enabling a comparison with real data to remove bias and statistically test for the presence of structure. M3C corrects the inherent bias of consensus clustering, as demonstrated on simulated and real expression data from The Cancer Genome Atlas (TCGA). For testing M3C, we developed clusterlab, a new method for simulating multivariate Gaussian clusters.
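The comparison against a simulated null can be illustrated with a minimal sketch. It is not the M3C implementation: it assumes a generic stability score where higher means more stable, uses a standard empirical Monte Carlo p-value, and all numbers are made-up toy values rather than results from the paper. The names `monte_carlo_p_value`, `observed` and `null` are hypothetical.

```python
from typing import Dict, List


def monte_carlo_p_value(observed: float, null_scores: List[float]) -> float:
    """Empirical p-value: share of null scores at least as high as the observed one.

    The +1 in numerator and denominator counts the observed value itself,
    so the p-value can never be exactly zero.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (1 + exceed) / (1 + len(null_scores))


# Toy stability scores on "real" data for each K (illustrative values only).
# The raw score rises with K, mimicking the bias described in the abstract.
observed: Dict[int, float] = {2: 0.70, 3: 0.90, 4: 0.93}

# Toy stability scores from data simulated without cluster structure,
# one list of Monte Carlo draws per K (illustrative values only).
null: Dict[int, List[float]] = {
    2: [0.60, 0.65, 0.72, 0.68],
    3: [0.70, 0.75, 0.80, 0.78],
    4: [0.90, 0.95, 0.92, 0.94],
}

# Test each K against its own null distribution.
p_values = {k: monte_carlo_p_value(observed[k], null[k]) for k in observed}

# Choosing by raw score favours the largest K; choosing by p-value does not.
naive_k = max(observed, key=observed.get)
best_k = min(p_values, key=p_values.get)

print("p-values per K:", p_values)
print("K by raw score:", naive_k, "| K by Monte Carlo p-value:", best_k)
```

In this toy setting the raw score would select K = 4, while the reference-based p-value selects K = 3, because the score at K = 4 is no higher than what structureless data produce. A real analysis would use many more Monte Carlo draws per K than the four shown here.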
