Ontology-Based Information Extraction from Free-Form Text

Ronald Braun
DOI: https://doi.org/10.21236/ada383044
2000-10-06
Abstract: Report developed under SBIR contract. In this Phase I SBIR research, we demonstrated the feasibility of an information extraction (IE) system that can leverage semantic representations to significantly increase end-to-end recall for the IE task while maintaining or improving precision. Our end-to-end Ontology-Based IE (OBIE) system combines machine learning techniques with a novel architecture built around a shared domain ontology. This architecture lets different levels of the IE processing stream interact simultaneously through the shared ontology. By incorporating hierarchical knowledge in their learning algorithms, IE modules can perform their extraction tasks with greater depth and accuracy. Bootstrapping algorithms were extended to automatically learn the ontology of a new domain, to assist in training the IE components, and to reduce the burden of annotation on the end user. Broad-coverage and rare-case extraction rules were augmented by classifiers induced from the trained ontology to shore up the precision typically lost by such rules. Performance metrics allow a preliminary characterization of the recall and precision gains enabled by the proposed architecture. Our Phase I research and development of a proof-of-concept prototype demonstrated the feasibility and utility of OBIE's ontology-based IE capability and provides a solid foundation for our Phase II implementation.