TF-finder: a software package for identifying transcription factors involved in biological processes using microarray data and existing knowledge base.

Affiliation

School of Forest Resources and Environmental Science, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA.

Publication information

BMC Bioinformatics. 2010 Aug 12;11:425. doi: 10.1186/1471-2105-11-425.

Abstract

BACKGROUND

Identification of transcription factors (TFs) involved in a biological process is the first step towards a better understanding of the underlying regulatory mechanisms. However, because a gene regulatory network (GRN) involves a large number of genes and complicated interactions, identifying the TFs involved in a biological process remains very challenging. In practice, recognizing the TFs for a given biological process is further complicated by the fact that most eukaryotic genomes encode thousands of TFs, which are organized in gene families of various sizes and in many cases show poor sequence conservation except for small conserved domains. This poses a significant challenge for identifying the exact TFs involved or for ranking the importance of a set of TFs to a process of interest. New methods for recognizing novel TFs are therefore urgently needed. Although a plethora of methods have been developed to infer regulatory genes from microarray data, methods that use an existing knowledge base, in particular validated genes known to be involved in a process, to bait or guide the discovery of novel TFs are still rare. Such methods can replace the sometimes arbitrary selection of candidate genes for experimental validation and significantly advance our knowledge and understanding of how a process is regulated.

RESULTS

We developed an automated software package called TF-finder for recognizing TFs involved in a biological process using microarray data and an existing knowledge base. TF-finder contains two components for TF recognition: adaptive sparse canonical correlation analysis (ASCCA) and an enrichment test. ASCCA uses positive target genes to bait TFs from gene expression data, while the enrichment test examines the presence of positive TFs in the outcomes from ASCCA. Using microarray data from salt and water stress experiments, we showed that TF-finder is very efficient in recognizing many important TFs involved in salt and drought tolerance, as evidenced by the rediscovery of TFs that have been experimentally validated. The efficiency of TF-finder in recognizing novel TFs was further confirmed by a thorough comparison with a method called Intersection of Coexpression (ICE).
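The abstract does not say which statistic the enrichment test uses. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for this kind of question, not necessarily the one implemented in TF-finder), the idea of checking whether known positive TFs are over-represented among the TFs returned by ASCCA could look like the following. The function name, parameter names, and all numbers are hypothetical and for illustration only.

```python
from math import comb


def enrichment_p_value(population_size, positives_in_population,
                       sample_size, positives_in_sample):
    """One-sided hypergeometric tail probability P(X >= positives_in_sample).

    Assumption: the TFs selected by ASCCA are treated as a draw without
    replacement from all candidate TFs. This is an illustrative test,
    not the statistic documented for TF-finder.
    """
    # The number of positives in the sample cannot exceed either bound.
    upper = min(sample_size, positives_in_population)
    negatives_in_population = population_size - positives_in_population
    # Count the ways to draw k positives and (sample_size - k) negatives.
    tail = sum(
        comb(positives_in_population, k)
        * comb(negatives_in_population, sample_size - k)
        for k in range(positives_in_sample, upper + 1)
    )
    return tail / comb(population_size, sample_size)


if __name__ == "__main__":
    # Hypothetical numbers: 10 candidate TFs, 3 known positives,
    # 4 TFs selected, 2 of which are known positives.
    print(enrichment_p_value(10, 3, 4, 2))
```

A small p-value would suggest that the selected TF set contains more known positive TFs than expected by chance, which is the kind of evidence the enrichment step is described as looking for.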

CONCLUSIONS

TF-finder can be successfully used to infer novel TFs involved in a biological process of interest using publicly available gene expression data and known positive genes from existing knowledge bases. The TF-finder package includes an R script for ASCCA, a Perl controller, and several Perl scripts for parsing intermediate outputs. The package is available upon request (hairong@mtu.edu). The R code for standalone ASCCA is also available.
