Department of Medical Oncology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
The Stratingh Institute for Chemistry, University of Groningen, Groningen, The Netherlands.
Nat Commun. 2021 Mar 5;12(1):1464. doi: 10.1038/s41467-021-21671-w.
The interpretation of high-throughput sequencing data is limited by our incomplete functional understanding of coding and non-coding transcripts. Reliably predicting the function of such transcripts can overcome this limitation. Here we report the use of a consensus independent component analysis and guilt-by-association approach to predict over 23,000 functional groups comprising over 55,000 coding and non-coding transcripts using publicly available transcriptomic profiles. We show that, compared with transcriptional components derived using principal component analysis, those derived using independent component analysis enable more confident functionality predictions, improve predictions when new members are added to the gene sets, and are less affected by gene multi-functionality. Predictions generated using human or mouse transcriptomic data are available for exploration in a public web portal.
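The guilt-by-association idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `guilt_by_association_scores`, the leave-one-out correlation score, and the synthetic loadings matrix are all assumptions made for illustration. The sketch assumes that each transcript is described by its weights across a set of transcriptional components (for example, from an independent component analysis), and that a transcript whose weight profile resembles the average profile of a gene set's known members is a candidate member of that set.

```python
import numpy as np


def guilt_by_association_scores(loadings, member_idx):
    """Score every gene for membership in one gene set (illustrative only).

    loadings   : (n_genes, n_components) array of component weights.
    member_idx : indices of genes already annotated to the gene set.
    Returns one correlation-based score per gene.
    """
    loadings = np.asarray(loadings, dtype=float)
    n_genes = loadings.shape[0]
    is_member = np.zeros(n_genes, dtype=bool)
    is_member[member_idx] = True

    # Standardize each component so that weights are comparable across components.
    z = (loadings - loadings.mean(axis=0)) / loadings.std(axis=0)

    scores = np.empty(n_genes)
    for g in range(n_genes):
        # Leave the gene out of its own reference profile to avoid circularity.
        ref = is_member.copy()
        ref[g] = False
        profile = z[ref].mean(axis=0)
        scores[g] = np.corrcoef(z[g], profile)[0, 1]
    return scores


# Synthetic data (hypothetical): 200 genes, 30 components.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(200, 30))
signature = rng.normal(size=30)

# Genes 0-19 share a common signature; only genes 0-14 are "annotated".
loadings[:20] += 2.0 * signature
known_members = np.arange(15)
held_out = np.arange(15, 20)
background = np.arange(20, 200)

scores = guilt_by_association_scores(loadings, known_members)
held_out_mean = scores[held_out].mean()
background_mean = scores[background].mean()
```

In this toy setting the held-out genes that share the signature should score higher on average than the background genes, which is the behaviour a guilt-by-association predictor relies on. The real analysis works with components derived from large public transcriptomic compendia and uses its own scoring and significance procedure, which this sketch does not reproduce.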