The evolution of transformer models from unidirectional to bidirectional in Natural Language Processing

Yihang Sun
DOI: https://doi.org/10.54254/2755-2721/42/20230794
2024-02-23
Abstract: Transformer models have revolutionized Natural Language Processing (NLP), replacing traditional sequential models with architectures built on attention mechanisms. A notable development within this shift is the move from unidirectional to bidirectional modeling. This paper first discusses the limitations of traditional NLP models such as RNNs, LSTMs, and CNNs when handling lengthy text sequences and complex dependencies. It then highlights how transformer models employing self-attention, both unidirectional (e.g., GPT) and bidirectional (e.g., BERT), have significantly improved performance on NLP tasks. It provides a thorough review of the shift from unidirectional to bidirectional transformer models, offering insights into their utilization and development. Finally, the paper concludes with a summary and outlook for the entire study.