CRYSTAL: Inducing a Conceptual Dictionary

Stephen Soderland, David Fisher, Jonathan Aseltine, Wendy Lehnert
DOI: https://doi.org/10.48550/arXiv.cmp-lg/9505020
1995-05-09
Computation and Language
Abstract: One of the central knowledge sources of an information extraction system is a dictionary of linguistic patterns that can be used to identify the conceptual content of a text. This paper describes CRYSTAL, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus. Each of these concept-node definitions is generalized as far as possible without producing errors, so that a minimum number of dictionary entries cover the positive training instances. Because it tests the accuracy of each proposed definition, CRYSTAL can often surpass human intuitions in creating reliable extraction rules.
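The idea of generalizing each definition as far as possible without producing errors can be illustrated with a small covering loop. The sketch below is not CRYSTAL's actual procedure, which the abstract does not spell out. It assumes a hypothetical encoding in which every training instance is a flat feature-to-value mapping, a definition is a set of required feature values, and "generalizing" means dropping one constraint at a time while the definition still matches no negative instance. All names (`covers`, `generalize`, `induce_dictionary`, and the feature names in the example) are invented for illustration.

```python
from typing import Dict, List

# Hypothetical encoding: an instance maps feature names to values, and a
# definition is the subset of feature values an instance must match.
Instance = Dict[str, str]
Definition = Dict[str, str]


def covers(definition: Definition, instance: Instance) -> bool:
    """A definition covers an instance when every constraint matches."""
    return all(instance.get(k) == v for k, v in definition.items())


def makes_errors(definition: Definition, negatives: List[Instance]) -> bool:
    """Test a proposed definition against the negative training instances."""
    return any(covers(definition, neg) for neg in negatives)


def generalize(seed: Instance, negatives: List[Instance]) -> Definition:
    """Drop constraints one at a time while the definition stays error-free."""
    definition = dict(seed)
    for feature in sorted(seed):
        candidate = {k: v for k, v in definition.items() if k != feature}
        # Keep at least one constraint, and accept the looser definition
        # only if it still matches no negative instance.
        if candidate and not makes_errors(candidate, negatives):
            definition = candidate
    return definition


def induce_dictionary(
    positives: List[Instance], negatives: List[Instance]
) -> List[Definition]:
    """Greedy covering: add definitions until every positive is covered."""
    dictionary: List[Definition] = []
    uncovered = list(positives)
    while uncovered:
        definition = generalize(uncovered[0], negatives)
        dictionary.append(definition)
        # The definition always covers its own seed, so the loop terminates.
        uncovered = [p for p in uncovered if not covers(definition, p)]
    return dictionary


if __name__ == "__main__":
    positives = [
        {"verb": "denies", "subject_class": "patient", "voice": "active"},
        {"verb": "denies", "subject_class": "patient", "voice": "passive"},
    ]
    negatives = [
        {"verb": "denies", "subject_class": "physician", "voice": "active"},
        {"verb": "reports", "subject_class": "patient", "voice": "active"},
    ]
    print(induce_dictionary(positives, negatives))
```

In this toy example the `voice` constraint can be dropped without matching a negative instance, so one generalized definition covers both positive instances. Testing every candidate against the negatives is what keeps the looser definition reliable, which is the property the abstract emphasizes.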