The Lexicon Builder Web service: Building Custom Lexicons from two hundred Biomedical Ontologies.

Author information

Parai Gautam K, Jonquet Clement, Xu Rong, Musen Mark A, Shah Nigam H

Affiliation

Department of Computer Science.

Publication information

AMIA Annu Symp Proc. 2010 Nov 13;2010:587-91.

Abstract

Domain-specific biomedical lexicons are used extensively by researchers for natural language processing tasks. Currently these lexicons are created manually by expert curators, and there is a pressing need for automated methods to compile them. The Lexicon Builder Web service addresses this need and reduces the time and effort involved in lexicon maintenance. The service has three components: Inclusion, which selects one or more ontologies (or their branches) and includes preferred names and synonym terms; Exclusion, which filters terms based on their Medline frequency, syntactic type, UMLS semantic type, and match with stopwords; and Output, which aggregates information and handles compression and output formats. Evaluation demonstrates that the service has high accuracy and runtime performance. It is currently being evaluated for several use cases to establish its utility in biomedical information processing tasks. The Lexicon Builder promotes collaboration, sharing, and standardization of lexicons among researchers by automating the creation, maintenance, and cross-referencing of custom lexicons.
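
The three components named in the abstract can be pictured as a small pipeline. The Python sketch below is only an illustration under stated assumptions: it is not the service's API, and the `Term` record, the function names, the direction of the frequency cut-off (dropping terms above a threshold), the tab-separated output, and all sample values are hypothetical.

```python
import gzip
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical term record; the real service's data model is not given."""
    text: str
    ontology: str
    syntactic_type: str
    semantic_type: str
    medline_frequency: int


def include(terms, ontologies):
    """Inclusion: keep preferred names and synonyms from the chosen ontologies."""
    return [t for t in terms if t.ontology in ontologies]


def exclude(terms, stopwords, max_frequency, syntactic_types, semantic_types):
    """Exclusion: filter on stopwords, Medline frequency, syntactic and semantic type.

    Assumption: terms more frequent than max_frequency are treated as too
    common to be useful; the abstract does not state the actual rule.
    """
    kept = []
    for t in terms:
        if t.text.lower() in stopwords:
            continue
        if t.medline_frequency > max_frequency:
            continue
        if t.syntactic_type not in syntactic_types:
            continue
        if t.semantic_type not in semantic_types:
            continue
        kept.append(t)
    return kept


def output(terms, compress=False):
    """Output: aggregate to sorted, de-duplicated lines, optionally gzip-compressed."""
    lines = sorted({f"{t.text}\t{t.ontology}" for t in terms})
    payload = "\n".join(lines).encode("utf-8")
    return gzip.compress(payload) if compress else payload


# Toy data: ontology labels, types, and frequencies are invented for illustration.
SAMPLE = [
    Term("melanoma", "NCIT", "noun", "Neoplastic Process", 90_000),
    Term("malignant melanoma", "NCIT", "noun_phrase", "Neoplastic Process", 30_000),
    Term("other", "NCIT", "adjective", "Qualitative Concept", 5_000_000),
    Term("heart", "FMA", "noun", "Body Part", 400_000),
]

selected = include(SAMPLE, {"NCIT"})
kept = exclude(
    selected,
    stopwords={"other"},
    max_frequency=1_000_000,
    syntactic_types={"noun", "noun_phrase"},
    semantic_types={"Neoplastic Process"},
)
lexicon = output(kept, compress=True)
```

Keeping the stages as separate functions mirrors the separation described in the abstract, so each filter can be tuned or skipped independently.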
