Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, No.155, Sec. 2, Linong St., Beitou District, Taipei 11221, Taiwan.
Department of Genomic Medicine and Center for Medical Genetics, Changhua Christian Hospital, No.176, Chong-Hua Rd., Changhua 50046, Taiwan.
Database (Oxford). 2021 Aug 31;2021. doi: 10.1093/database/baab053.
Over the past few years, rapid advances in deep-sequencing technology and computational prediction algorithms have led to the identification of a large number of long non-coding RNAs (lncRNAs) in various types of human cancer. It has therefore become critical to annotate the potential functions of lncRNAs from RNA-sequencing (RNA-seq) data properly and to organize the resulting information and analyses into a system that biological and clinical researchers can readily access. Producing a collective interpretation of lncRNA functions requires integrating different types of data on the functional diversity and regulatory roles of these lncRNAs. In this study, we used transcriptomic sequencing data to systematically identify lncRNAs and their potential functions in 5034 RNA-seq datasets from The Cancer Genome Atlas (TCGA), covering 24 cancers. We then constructed the 'lncExplore' database, which integrates various types of genomic annotation data for collective interpretation. Its distinctive features include (i) novel lncRNAs verified by both coding potential and translation efficiency score, (ii) pan-cancer analysis of significantly aberrant expression across 24 human cancers, (iii) genomic annotation of lncRNAs, such as cis-regulatory information and gene ontology, (iv) examination of regulatory roles as enhancer RNAs and competing endogenous RNAs and (v) identification of potential lncRNA biomarkers for cancers of interest to the user by integrating clinical information and disease specificity score. To our knowledge, lncExplore is the first public lncRNA annotation database to provide cancer-specific expression profiles for both known and novel lncRNAs, enhancer RNA annotation and clinical analysis based on pan-cancer analysis.
lncExplore offers a more complete, efficient and comprehensive route for translating laboratory discoveries into the clinical context and will assist in reinterpreting the biological regulatory functions of lncRNAs in cancer research. Database URL: http://lncexplore.bmi.nycu.edu.tw.