Introduction to the special issue on managing information extraction
Abstract
The December 2008 special issue of SIGMOD Record focuses on managing information extraction (IE). Its nine papers are grouped into four areas: IE management systems, novel IE technologies, building knowledge bases with IE, and Web-scale, open IE. An IE program extracts structured data from unstructured text. The management-systems papers, among them 'SystemT: A System for Declarative Information Extraction', describe three IE management systems currently under development at IBM Almaden, Wisconsin and Yahoo! Research. The paper 'Webpage Understanding: Beyond Page Level Search' describes a powerful set of learning-based techniques that can be used to extract structured data from Web pages. The paper 'Using Wikipedia to Bootstrap Open Information Extraction' observes that all current open-IE systems adopt a structural targeting approach. The issue also covers Kylin, an open-IE system under development at the University of Washington, which by contrast adopts the traditional approach of relational targeting.