What is the real value of omics data? Enhancing research outcomes and securing long-term data excellence.

Author affiliations

Department of Biochemical Engineering, University College London, Gower Street, London WC1E 6BT, UK.

Oxford Biomedica (UK) Ltd, Windrush Court, Transport Way, Oxford OX4 6LT, UK.

Publication information

Nucleic Acids Res. 2024 Nov 11;52(20):12130-12140. doi: 10.1093/nar/gkae901.

Abstract

A wealth of high-throughput biological data, of which omics constitute a significant fraction, has been made publicly available in repositories over the past decades. These data come in various formats and cover a range of species and research areas providing insights into the complexities of biological systems; the public repositories hosting these data serve as multifaceted resources. The potentially greater value of these data lies in their secondary utilization as the deployment of data science and artificial intelligence in biology advances. Here, we critically evaluate challenges in secondary data use, focusing on omics data of human embryonic kidney cell lines available in public repositories. The emerging issues are obstacles faced by secondary data users across diverse domains as they concern platforms and repositories, which accept deposition of data irrespective of their species type. The evolving landscape of data-driven research in biology prompts re-evaluation of open access data curation and submission procedures to ensure that these challenges do not impede novel research opportunities through data exploitation. This paper aims to draw attention to widespread issues with data reporting and encourages data owners to meticulously curate submissions to maximize not only their immediate research impact but also the long-term legacy of datasets.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0c5/11551742/aeadc5d5b9cb/gkae901figgra1.jpg
