PGxMine: Text mining for curation of PharmGKB.

Affiliation

Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA.

Publication

Pac Symp Biocomput. 2020;25:611-622.

Abstract

Precision medicine tailors treatment to an individual's personal data, including differences in their genome. The Pharmacogenomics Knowledgebase (PharmGKB) provides highly curated information on the effect of genetic variation on drug response and side effects for a wide range of drugs. PharmGKB's scientific curators triage, review, and annotate a large number of papers each year, but the task is challenging. We present PGxMine, a text-mined resource of pharmacogenomic associations from all accessible published literature, built to assist in the curation of PharmGKB. We developed a supervised machine learning pipeline to extract associations between a variant (DNA and protein changes, star alleles, and dbSNP identifiers) and a chemical. PGxMine covers 452 chemicals and 2,426 variants and contains 19,930 mentions of pharmacogenomic associations across 7,170 papers. An evaluation by PharmGKB curators found that 57 of the top 100 associations not found in PharmGKB led to 83 curatable papers, and a further 24 associations would likely lead to curatable papers through citations. The results can be viewed at https://pgxmine.pharmgkb.org/ and the code can be downloaded at https://github.com/jakelever/pgxmine.
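
The abstract names four kinds of variant mention (DNA changes, protein changes, star alleles, and dbSNP identifiers) but does not say how they are recognized in text. As a rough illustration only, the sketch below tags such mentions with simple regular expressions. The patterns, the `VARIANT_PATTERNS` table, and the `find_variant_mentions` function are hypothetical and are not taken from the PGxMine code; they also cover only candidate variant spotting, not the supervised association extraction the paper describes.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
# They are assumptions, not the rules used by PGxMine.
VARIANT_PATTERNS = {
    # dbSNP identifiers such as rs4244285
    "dbsnp_id": re.compile(r"\brs\d+\b"),
    # Star alleles such as CYP2C19*2 or CYP2D6*4
    "star_allele": re.compile(r"\b[A-Z][A-Z0-9]{1,9}\*\d+[A-Z]?\b"),
    # Protein changes in three-letter form such as p.Val158Met
    "protein_change": re.compile(r"\bp\.[A-Z][a-z]{2}\d+[A-Z][a-z]{2}\b"),
    # Simple coding DNA substitutions such as c.681G>A
    "dna_change": re.compile(r"\bc\.\d+[ACGT]>[ACGT]\b"),
}


def find_variant_mentions(sentence):
    """Return (kind, text, start) tuples for each match, ordered by position."""
    mentions = []
    for kind, pattern in VARIANT_PATTERNS.items():
        for match in pattern.finditer(sentence):
            mentions.append((kind, match.group(0), match.start()))
    return sorted(mentions, key=lambda m: m[2])


if __name__ == "__main__":
    example = "Carriers of CYP2C19*2 (rs4244285, c.681G>A) respond poorly to clopidogrel."
    for kind, text, start in find_variant_mentions(example):
        print(kind, text, start)
```

A real system would need far broader coverage (one-letter protein codes, insertions and deletions, gene name normalization), so these patterns should be read as a starting point for spotting candidate variants rather than a complete recognizer.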
