Stacking with Auxiliary Features for Entity Linking in the Medical Domain
Abstract
Linking spans of natural language text to concepts in a structured source is an important task for many problems. It allows intelligent systems to leverage rich knowledge available in those sources (such as concept properties and relations) to enhance the semantics of the mentions of these concepts in text. In the medical domain, it is common to link text spans to medical concepts in large, curated knowledge repositories such as the Unified Medical Language System. Different approaches have different strengths: some are precision-oriented, some recall-oriented; some better at considering context but more prone to hallucination. The variety of techniques suggests that ensembling could outperform component technologies at this task. In this paper, we describe our process for building a Stacking ensemble using additional, auxiliary features for Entity Linking in the medical domain. Our best model beats several baselines and produces state-of-the-art results on several medical datasets.