Parai Gautam K, Jonquet Clement, Xu Rong, Musen Mark A, Shah Nigam H
Department of Computer Science.
AMIA Annu Symp Proc. 2010 Nov 13;2010:587-91.
Domain-specific biomedical lexicons are used extensively by researchers for natural language processing tasks. Currently these lexicons are created manually by expert curators, and there is a pressing need for automated methods to compile them. The Lexicon Builder Web service addresses this need and reduces the time and effort involved in lexicon maintenance. The service has three components: Inclusion, which selects one or several ontologies (or their branches) and includes their preferred names and synonym terms; Exclusion, which filters terms based on each term's Medline frequency, syntactic type, UMLS semantic type, and match with stopwords; and Output, which aggregates information and handles compression and output formats. Evaluation demonstrates that the service has high accuracy and runtime performance. It is currently being evaluated for several use cases to establish its utility in biomedical information processing tasks. The Lexicon Builder promotes collaboration, sharing, and standardization of lexicons among researchers by automating the creation, maintenance, and cross-referencing of custom lexicons.
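The three components described above can be pictured as a small pipeline. The following Python sketch is illustrative only and is not the service's actual implementation or API: the `Term` record, the function names, the stopword list, and the threshold parameter are all hypothetical. It also assumes that the Medline frequency filter drops terms above a cutoff (on the reasoning that very frequent terms tend to be too general), which the abstract does not specify. The syntactic-type filter is omitted because it would require a parser or part-of-speech tagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical record for one ontology term (preferred name or synonym)."""
    text: str
    ontology: str
    semantic_type: str       # e.g. a UMLS semantic type label
    medline_frequency: int   # number of Medline occurrences (assumed precomputed)


# Hypothetical stopword list; a real lexicon would use a curated one.
STOPWORDS = {"the", "of", "other"}


def include(ontologies, selected):
    """Inclusion: gather the terms of every selected ontology."""
    return [term for name in selected for term in ontologies.get(name, [])]


def exclude(terms, max_frequency, blocked_semantic_types, stopwords=STOPWORDS):
    """Exclusion: drop terms that fail any of the configured filters."""
    kept = []
    for term in terms:
        if term.medline_frequency > max_frequency:
            continue  # assumed direction: very frequent terms are too general
        if term.semantic_type in blocked_semantic_types:
            continue
        if term.text.lower() in stopwords:
            continue
        kept.append(term)
    return kept


def output(terms):
    """Output: aggregate by lowercased term text, listing source ontologies."""
    lexicon = {}
    for term in terms:
        lexicon.setdefault(term.text.lower(), set()).add(term.ontology)
    return {text: sorted(sources) for text, sources in sorted(lexicon.items())}


def build_lexicon(ontologies, selected, max_frequency, blocked_semantic_types):
    """Run inclusion, exclusion, and output in sequence."""
    candidates = include(ontologies, selected)
    filtered = exclude(candidates, max_frequency, blocked_semantic_types)
    return output(filtered)
```

Aggregating by term text in the output step is one simple way to cross-reference a term against every ontology that contributes it; the real service may aggregate differently.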