Time Expression Recognition Using a Time-related Tagging Scheme

Xiaoshi Zhong, Erik Cambria
2018-01-01
Abstract: We analyze four datasets for the characteristics of time expressions, finding that time expressions have a loose structure and that the words used to express time information can differentiate time expressions from common text. These findings drive us to design a learning method named TOMN to model time expressions. TOMN defines a time-related tagging scheme, the TOMN scheme, with four tags, T, O, M, and N, which indicate the constituents of a time expression: Time token (T), Modifier (M), Numeral (N), and the words Outside time expressions (O). Essentially, our constituent-based TOMN scheme overcomes the problem of inconsistent tag assignment caused by conventional position-based tagging schemes (e.g., the BIO and BILOU schemes). In modeling, TOMN assigns each word a TOMN tag under a conditional random fields framework with minimal features. Experiments show that TOMN is equally or more effective than state-of-the-art methods on various datasets, and much more robust in cross-dataset settings. Moreover, our analysis can help explain many empirical observations in other works about time expression recognition and named entity recognition.
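The contrast between position-based and constituent-based tagging can be illustrated with a small sketch. The code below is not the authors' implementation: the lexicons, the function names `bio_tags` and `tomn_tags`, and the example sentences are hypothetical, and gold time-expression spans are supplied by hand where the paper uses a conditional random field to predict tags. It only shows how the same word can receive different BIO tags depending on its position, while its constituent-based tag stays the same.

```python
# Hypothetical mini-lexicons for illustration only; not taken from the paper.
TIME_TOKENS = {"september", "monday", "year", "week"}
MODIFIERS = {"last", "next", "early"}
NUMERALS = {"one", "two", "10"}


def bio_tags(tokens, spans):
    """Position-based tags: B at the start of a span, I inside it, O elsewhere.

    `spans` is a list of (start, end) index pairs, with `end` exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def tomn_tags(tokens, spans):
    """Constituent-based tags: T, M, or N inside a span by word type, O elsewhere.

    Assumption of this sketch: a word inside a span that matches no lexicon
    is left as O.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        for i in range(start, end):
            word = tokens[i].lower()
            if word in TIME_TOKENS:
                tags[i] = "T"
            elif word in MODIFIERS:
                tags[i] = "M"
            elif word in NUMERALS:
                tags[i] = "N"
    return tags


# Two hand-made examples containing the word "September".
sent_a, spans_a = ["since", "September"], [(1, 2)]
sent_b, spans_b = ["since", "last", "September"], [(1, 3)]

print(bio_tags(sent_a, spans_a), bio_tags(sent_b, spans_b))
print(tomn_tags(sent_a, spans_a), tomn_tags(sent_b, spans_b))
```

In this sketch "September" is tagged B in the first sentence and I in the second under BIO, but T in both under the constituent-based tags, which is the kind of inconsistency the abstract refers to.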