Noise removing based on n-gram representations and its applications to cross-lingual projection

Mo YU, Tiejun ZHAO
DOI: https://doi.org/10.3969/j.issn.2095-2163.2016.02.030
2016-01-01
Abstract: Structured prediction (structured learning) problems are widespread in natural language processing. Recent research has found that structured learning problems are affected more seriously by label noise than classification problems are, which suggests the importance of noise-removal algorithms for these problems. Inspired by the development of representation learning methods, the paper proposes a noise-removal algorithm for structured learning based on low-dimensional representations of sub-structures. The algorithm finds the neighbors of each node in a sequence labeling task based on the node's associated n-gram representation, and then performs noise removal on the label of a node according to its consistency with the labels of its neighbors. The paper demonstrates the effectiveness of the proposed algorithm on cross-lingual projection for named entity recognition and POS tagging.
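The neighbor-consistency idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the n-gram representations are built, how neighbors are retrieved, or how consistency is scored. The sketch assumes concatenated window embeddings as the n-gram representation, cosine similarity for neighbor search, and a majority vote with an agreement threshold for relabeling. The names `ngram_vectors`, `denoise_labels`, `k`, and `min_agreement` are hypothetical.

```python
from collections import Counter

import numpy as np


def ngram_vectors(token_vecs, n=3):
    """Assumed n-gram representation: concatenate the embeddings in an
    n-token window around each token, zero-padding at sequence edges."""
    token_vecs = np.asarray(token_vecs, dtype=float)
    length, dim = token_vecs.shape
    half = n // 2
    pad = np.zeros((half, dim))
    padded = np.vstack([pad, token_vecs, pad])
    # Column block i holds the embedding at window offset i for every token.
    return np.hstack([padded[i:i + length] for i in range(n)])


def denoise_labels(reprs, labels, k=3, min_agreement=0.8):
    """Relabel a node when at least `min_agreement` of its k nearest
    neighbors (by cosine similarity) share a label that differs from its own.
    Votes always use the original labels, so the result is order-independent."""
    reprs = np.asarray(reprs, dtype=float)
    norms = np.linalg.norm(reprs, axis=1, keepdims=True)
    unit = reprs / np.maximum(norms, 1e-12)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # a node is not its own neighbor

    cleaned = list(labels)
    for i in range(len(labels)):
        neighbors = np.argsort(-sims[i])[:k]
        votes = Counter(labels[j] for j in neighbors)
        top_label, top_count = votes.most_common(1)[0]
        if top_label != labels[i] and top_count / len(neighbors) >= min_agreement:
            cleaned[i] = top_label
    return cleaned


# Toy data: two well-separated clusters of node representations,
# with one projected label (index 3) that disagrees with its cluster.
reprs = np.array([
    [1.0, 0.00], [1.0, 0.10], [1.0, 0.05], [1.0, 0.02],
    [0.0, 1.00], [0.1, 1.00], [0.05, 1.0], [0.02, 1.0],
])
noisy = ["PER", "PER", "PER", "O", "O", "O", "O", "O"]
cleaned = denoise_labels(reprs, noisy, k=3, min_agreement=0.8)
```

In this toy setup the node at index 3 sits among nodes labeled `PER`, so its `O` label is treated as noise and replaced. In a cross-lingual projection setting, `noisy` would stand for labels projected from the source language, and `reprs` for n-gram representations of target-language tokens.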