SynTime: Token Types and Heuristic Rules

Xiaoshi Zhong, Erik Cambria
DOI: https://doi.org/10.1007/978-3-030-78961-9_4
2021-01-01
Abstract: Based on the five common characteristics of time expressions, we propose a type-based approach named SynTime for time expression recognition. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group the regular expressions of time-related tokens. On top of these types, we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies time tokens in raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments into time expressions. As a light-weight rule-based tagger, SynTime runs in real time and can be easily extended to text from different domains and of different types simply by adding keywords. Evaluation on benchmark datasets and tweet data shows that SynTime outperforms state-of-the-art methods.
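The three-step recognition pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword lists, the `token_type` and `recognize` names, and the simple left/right expansion and merge rules are all assumptions made for illustration, and SynTime's actual token regular expressions and heuristic rules are considerably richer.

```python
import re

# Hypothetical, heavily simplified keyword patterns for the three token types.
# They only illustrate the idea of grouping token regexes under a type.
TIME_TOKEN = re.compile(
    r"(today|tomorrow|yesterday|day|week|month|year|"
    r"monday|tuesday|wednesday|thursday|friday|saturday|sunday|"
    r"january|february|march|april|may|june|july|august|"
    r"september|october|november|december)s?",
    re.IGNORECASE,
)
MODIFIER = re.compile(r"last|next|this|ago|early|late|past", re.IGNORECASE)
NUMERAL = re.compile(
    r"\d+|one|two|three|four|five|six|seven|eight|nine|ten", re.IGNORECASE
)


def token_type(token):
    """Map a token to 'TIME', 'MOD', 'NUM', or None (illustrative only)."""
    if TIME_TOKEN.fullmatch(token):
        return "TIME"
    if MODIFIER.fullmatch(token):
        return "MOD"
    if NUMERAL.fullmatch(token):
        return "NUM"
    return None


def recognize(text):
    """Return time expressions found in text using the three-step idea."""
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: locate each time token, then expand left and right
    # over adjacent modifiers and numerals to form a time segment.
    segments = []
    for i, ty in enumerate(types):
        if ty != "TIME":
            continue
        lo = i
        while lo > 0 and types[lo - 1] in ("MOD", "NUM"):
            lo -= 1
        hi = i
        while hi + 1 < len(tokens) and types[hi + 1] in ("MOD", "NUM"):
            hi += 1
        segments.append((lo, hi))

    # Step 3: merge overlapping or adjacent segments into one expression.
    merged = []
    for lo, hi in segments:
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))

    return [" ".join(tokens[lo:hi + 1]) for lo, hi in merged]


if __name__ == "__main__":
    print(recognize("We met last week and will meet again on 3 May 2021"))
```

Under these assumptions, adapting the tagger to a new domain amounts to adding keywords to the three patterns, which mirrors the extensibility claim in the abstract; the expansion and merge logic stays unchanged.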