Abstract: The neural architectures of language models are becoming increasingly complex, especially that of Transformers, based on the attention mechanism. Although their application to numerous natural language processing tasks has proven to be very fruitful, they continue to be models with little or no interpretability and explainability. One of the tasks for which they are best suited is the encoding of the contextual sense of words using contextualized embeddings. In this paper we propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words by modeling semantic compositionality. Particular attention is given to dependency relations and semantic notions such as selection preferences and paradigmatic classes. A partial implementation of the proposed model is carried out and compared with Transformer-based architectures for a given semantic task, namely the similarity calculation of word senses in context. The results obtained show that it is possible to be competitive with linguistically motivated models instead of using the black boxes underlying complex neural architectures.

What problem does this paper attempt to address?

The paper addresses the shortcomings of current language models, especially those based on the Transformer architecture, in interpretability and semantic compositionality. The authors propose a symbolic method for encoding the meaning of words in context and compare it with attention-based Transformer models. Specifically, the paper targets the following issues:

1. **Model Interpretability**: Although complex neural models like Transformers have achieved great success in natural language processing tasks, they remain "black box" models that lack transparency and interpretability.
2. **Semantic Compositionality**: Current models often struggle to capture the systematic compositional rules of natural language, that is, the rules by which the meaning of an expression is built from the meanings of its components.
3. **Data Efficiency**: Compared to humans, these models require a large amount of training data to generalize correctly.

To address these issues, the authors propose a dependency-based compositional model that uses the selection preferences of words and paradigmatic classes to construct context-sensitive representations of word meaning. This approach is more transparent and interpretable, because it builds the meaning of composite expressions through explicit grammatical rules. The authors compare it with attention-based Transformer models to evaluate its ability to generate context-sensitive word vectors. In experiments with a partial implementation, they report that the proposed model is competitive with these neural architectures on one semantic task, the similarity calculation of word senses in context, while offering more interpretability and structured knowledge than purely neural models.
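The idea of contextualizing a word through the selection preferences of a syntactically related word can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy vectors, the hypothetical dependency triples, the `nsubj` relation label, the function names, and the choice of component-wise multiplication as the combination operation. It shows only the general shape of the technique: a preference vector is derived from observed dependencies, applied to a word vector, and the contextualized senses are then compared by cosine similarity.

```python
from math import sqrt

# Toy vectors over four invented dimensions: [motion, animal, machine, process].
# All values are made up for illustration.
VECTORS = {
    "run":     [0.8, 0.5, 0.5, 0.6],
    "gallop":  [0.9, 0.9, 0.0, 0.1],
    "trot":    [0.7, 0.8, 0.0, 0.1],
    "crash":   [0.2, 0.0, 0.9, 0.7],
    "execute": [0.1, 0.0, 0.8, 0.9],
}

# Hypothetical (head, relation, dependent) triples standing in for a parsed corpus.
TRIPLES = [
    ("gallop", "nsubj", "horse"),
    ("trot", "nsubj", "horse"),
    ("crash", "nsubj", "computer"),
    ("execute", "nsubj", "computer"),
]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def head_preference(dependent, relation):
    """Average vector of the heads observed with `dependent` in `relation`.

    This stands in for the selection preferences that the dependent
    imposes on its head.
    """
    heads = [VECTORS[h] for h, r, d in TRIPLES if r == relation and d == dependent]
    if not heads:
        raise ValueError(f"no triples for {dependent!r} in relation {relation!r}")
    return [sum(column) / len(heads) for column in zip(*heads)]


def contextualize(word, preference):
    """Restrict a word vector to a context by component-wise multiplication."""
    return [a * b for a, b in zip(VECTORS[word], preference)]


# "run" as the head of "horse" versus "run" as the head of "computer".
run_horse = contextualize("run", head_preference("horse", "nsubj"))
run_computer = contextualize("run", head_preference("computer", "nsubj"))

# Compare both contextualized senses with the same reference verb.
sim_horse = cosine(run_horse, VECTORS["gallop"])
sim_computer = cosine(run_computer, VECTORS["gallop"])
print(f"run|horse vs gallop:    {sim_horse:.3f}")
print(f"run|computer vs gallop: {sim_computer:.3f}")
```

With these toy values, the sense of "run" restricted by "horse" ends up closer to "gallop" than the sense restricted by "computer". Each step can be inspected directly, since the preference vector is simply an average over listed dependency triples.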

Contextualized word senses: from attention to compositionality

Do Multi-Sense Embeddings Improve Natural Language Understanding?

Visualizing and Understanding Neural Models in NLP

Contextualized Word Embeddings Encode Aspects of Human-Like Word Sense Knowledge

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

Semantic Representations of Word Senses and Concepts

Semantic Composition in Visually Grounded Language Models

How much do contextualized representations encode long-range context?

Quasi-compositional mapping from form to meaning: a neural network-based approach to capturing neural responses during human language comprehension

Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation

Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

Contextual modulation of language comprehension in a dynamic neural model of lexical meaning

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models

How is a “Kitchen Chair” like a “Farm Horse”? Exploring the Representation of Noun-Noun Compound Semantics in Transformer-based Language Models

Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation

Learning Context-Specific Word/Character Embeddings.

xSense: Learning Sense-Separated Sparse Representations and Textual Definitions for Explainable Word Sense Networks

A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces

PolyLM: Learning about Polysemy through Language Modeling