BiTimeBERT: Extending Pre-Trained Language Representations with Bi-Temporal Information

Jiexin Wang, Adam Jatowt, Masatoshi Yoshikawa, Yi Cai
DOI: https://doi.org/10.1145/3539618.3591686
2023-01-01
Abstract: Time is an important aspect of documents and is used in a range of NLP and IR tasks. In this work, we investigate methods for incorporating temporal information during pre-training to further improve performance on time-related tasks. Whereas common pre-trained language models such as BERT use synchronic document collections (e.g., BookCorpus and Wikipedia) as training corpora, we use a long-span temporal collection of news articles to build word representations. We introduce BiTimeBERT, a novel language representation model trained on a temporal collection of news articles via two new pre-training tasks, harnessing two distinct temporal signals to construct time-aware language representations. The experimental results show that BiTimeBERT consistently outperforms BERT and other existing pre-trained models, with substantial gains on downstream NLP tasks and applications for which time is important (e.g., the accuracy improvement over BERT is 155% on the event time estimation task).