WIKIPEDIA BASED NAME AND RESUME INFORMATION EXTRACTION

Wang Quanjian, Li Fang
DOI: https://doi.org/10.3969/j.issn.1000-386X.2011.07.050
2011-01-01
Abstract: Entity relation extraction has become a hot topic in current information extraction research. In this paper we define resume information extraction as extracting three types of entity relation tuples from webpage data: a person's birth, education, and work experience. Each tuple consists of two entities and one relation, so the tuples can be combined to form an individual's real-world resume information. Building on extraction patterns defined over chunk and named entity recognition tags, and using Wikipedia as a knowledge base, the paper proposes a relation judgment algorithm that filters and classifies the entity relation tuples extracted by the patterns, based on the semantic similarity between the current tuple and the clusters representing each relation. Experiments show that the approach substantially improves both precision and F-measure over the baseline approach, yielding more precise classification of resume information types.
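
The filtering and classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or how the relation clusters are built, so everything here is an assumption for illustration: the names `RelationTuple`, `EXAMPLE_CLUSTERS`, `jaccard`, and `classify_context` are hypothetical, the clusters are hand-written word sets rather than clusters derived from Wikipedia, and Jaccard token overlap stands in for the paper's semantic similarity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class RelationTuple:
    """Two entities joined by one relation (hypothetical representation)."""
    person: str
    relation: str  # "birth", "education", or "work"
    value: str


# Hand-written stand-ins for relation clusters; the paper derives its
# clusters with Wikipedia as the knowledge base, which is not shown here.
EXAMPLE_CLUSTERS: Dict[str, Set[str]] = {
    "birth": {"born", "birth", "birthplace"},
    "education": {"graduated", "studied", "university", "degree"},
    "work": {"worked", "joined", "professor", "company"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity; a placeholder for semantic similarity."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify_context(context: str,
                     clusters: Dict[str, Set[str]],
                     threshold: float = 0.2) -> Optional[str]:
    """Return the best-matching relation type, or None to filter the tuple out."""
    tokens = set(context.lower().split())
    best_relation, best_score = None, 0.0
    for relation, vocabulary in clusters.items():
        score = jaccard(tokens, vocabulary)
        if score > best_score:
            best_relation, best_score = relation, score
    # Candidates that resemble no cluster closely enough are discarded.
    return best_relation if best_score >= threshold else None


context = "She graduated from Peking University"
label = classify_context(context, EXAMPLE_CLUSTERS)
if label is not None:
    print(RelationTuple(person="She", relation=label, value="Peking University"))
```

The threshold plays the role of the filter: a candidate whose context matches no relation cluster is dropped instead of being forced into one of the three types, which is one plausible way precision could improve over a pattern-only baseline.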