Boutros Paul C, Okey Allan B
Department of Medical Biophysics, University of Toronto, Ontario, Canada M5S 1A8.
Brief Bioinform. 2005 Dec;6(4):331-43. doi: 10.1093/bib/6.4.331.
Clustering has become an integral part of microarray data analysis and interpretation. The algorithmic basis of clustering (the application of unsupervised machine-learning techniques to identify the patterns inherent in a data set) is well established. This review discusses the biological motivations for these techniques and their application to integrating gene expression data with other biological information, such as functional annotation, promoter data and proteomic data.