Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values.

Affiliation

Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata 700 152, India.

Publication information

J Biomed Inform. 2010 Aug;43(4):560-8. doi: 10.1016/j.jbi.2010.02.001. Epub 2010 Feb 6.

Abstract

Distance-based clustering algorithms can group genes that show similar expression values under multiple experimental conditions, but they cannot identify a group of genes that share a similar pattern of variation in their expression values. We previously developed the divisive correlation clustering algorithm (DCCA), which is based on the concept of correlation clustering, to handle this situation. DCCA, however, may also fail in certain cases. To overcome these failures, we propose a new clustering algorithm, the average correlation clustering algorithm (ACCA), which is able to produce better clustering solutions than those produced by some other algorithms. ACCA is able to find groups of genes that have more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to execution time. Like DCCA, it uses the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that every gene in a cluster has its highest average correlation value with the genes in that cluster. We applied ACCA and some well-known conventional methods, including DCCA, to two artificial and nine gene expression datasets and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some other methods in determining a group of genes that have more common transcription factors and a similar pattern of variation in their expression profiles.

Availability of the software: The software has been developed in C and Visual Basic and can be executed on Microsoft Windows platforms. It may be downloaded as a zip file from http://www.isical.ac.in/~rajat and then needs to be installed. Two Word files (included in the zip file) need to be consulted before installation and execution of the software.
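
The two central ideas of the abstract, that distance can separate genes whose expression varies in the same pattern, and that each gene should end up in the cluster with which it has the highest average correlation, can be illustrated with a minimal Python sketch. This is not the published ACCA procedure: the function name `average_correlation_clustering`, the round-robin initialisation, the fixed number of clusters `k`, the stopping rule, and the toy data are all assumptions made for illustration only.

```python
import numpy as np

def average_correlation_clustering(X, k, max_iter=100):
    """Illustrative sketch (not the published ACCA): repeatedly move each gene
    to the cluster with which it has the highest average Pearson correlation."""
    n = X.shape[0]
    # Gene-by-gene Pearson correlation matrix; rows of X are expression profiles.
    # Constant profiles (zero variance) are assumed absent, since they give NaN.
    corr = np.corrcoef(X)
    # Assumption: deterministic round-robin start, gene i goes to cluster i mod k.
    labels = np.arange(n) % k
    for _ in range(max_iter):
        # avg[i, c] = mean correlation of gene i with the current members of c
        # (a gene's correlation with itself is included for its own cluster).
        avg = np.full((n, k), -np.inf)
        for c in range(k):
            members = labels == c
            if members.any():
                avg[:, c] = corr[:, members].mean(axis=1)
        new_labels = avg.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no gene changed cluster
        labels = new_labels
    return labels

# Hypothetical toy data: three "rising" and three "falling" profiles
# measured under four conditions, at very different magnitudes.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [5.0, 6.0, 7.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [40.0, 30.0, 20.0, 10.0],
    [8.0, 7.0, 6.0, 5.0],
])

labels = average_correlation_clustering(X, k=2)

# Euclidean distance puts gene 0 closer to the falling gene 3 than to the
# rising gene 1, even though genes 0 and 1 vary in the same pattern.
dist_same_pattern = np.linalg.norm(X[0] - X[1])
dist_opposite_pattern = np.linalg.norm(X[0] - X[3])
```

On this toy input the correlation-based reassignment groups the rising profiles together and the falling profiles together, while Euclidean distance alone would favour pairing profiles of similar magnitude regardless of their direction of change.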
