Generating semantic maps through multidimensional scaling: linguistic applications and theory

Martijn van der Klis, Jos Tellings
DOI: https://doi.org/10.1515/cllt-2021-0018
2021-11-24
Abstract: This paper reports on the state of the art in the application of multidimensional scaling (MDS) techniques to create semantic maps in linguistic research. MDS refers to a statistical technique that represents objects (lexical items, linguistic contexts, languages, etc.) as points in a space so that close similarity between the objects corresponds to close distances between the corresponding points in the representation. We focus on the use of MDS in combination with parallel corpus data as used in research on cross-linguistic variation. We first introduce the mathematical foundations of MDS and then give an exhaustive overview of past research that employs MDS techniques in combination with parallel corpus data. We propose a set of terminology to succinctly describe the key parameters of a particular MDS application. We then show that this computational methodology is theory-neutral, i.e. it can be employed to answer research questions in a variety of linguistic theoretical frameworks. Finally, we show how this leads to two lines of future developments for MDS research in linguistics.
Computation and Language
What problem does this paper attempt to address?
The paper addresses a methodological question: how to create semantic maps in linguistic research using multidimensional scaling (MDS). Specifically, it focuses on combining MDS with parallel corpus data to generate semantic maps, and on the application of this method in research on cross-linguistic variation. The main objectives of the paper are:

1. **Introducing the mathematical basis of MDS**: Explain the basic principles of MDS, including mathematical concepts such as matrix algebra and eigenvalue decomposition, which are needed to understand how MDS works.
2. **Reviewing past research**: Give a comprehensive overview of earlier studies that use MDS in combination with parallel corpus data.
3. **Proposing a terminological system**: Propose a set of terms to concisely describe the key parameters of a specific MDS application, such as input data types, similarity measures, and output representations.
4. **Demonstrating the theoretical neutrality of MDS**: Show that MDS, as a computational method, can be applied within multiple linguistic theoretical frameworks to answer different research questions.
5. **Looking ahead to future developments**: Discuss two main directions for future MDS research in linguistics, as well as possible alternative methods.

Overall, the paper aims to systematically introduce and evaluate the application of MDS in linguistic research, in particular how semantic maps are generated through MDS and what the potential and limitations of this method are for research on cross-linguistic variation.
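
To make the mathematical basis mentioned above more concrete, the following is a minimal sketch of classical (Torgerson) MDS, which is one common MDS variant and not necessarily the one used in the studies the paper reviews. It double-centers the squared distances and takes the leading eigenvectors of the resulting matrix. The function name `classical_mds` and its parameters are hypothetical, the input is assumed to be a symmetric distance matrix, and plain power iteration with deflation stands in for a library eigen-solver so that the example needs only the standard library. Power iteration finds the eigenvalue of largest magnitude, so the sketch assumes distances that are (close to) Euclidean.

```python
import math
import random


def classical_mds(dist, k=2, iters=500, seed=0):
    """Embed n objects in k dimensions from a symmetric n x n distance matrix.

    Hypothetical helper for illustration: classical (Torgerson) MDS with
    power iteration standing in for a library eigen-decomposition.
    """
    n = len(dist)

    # Step 1: squared distances.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]

    # Step 2: double centering, B = -1/2 * J * D^2 * J.
    # Because dist is assumed symmetric, column means equal row means.
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    # Step 3: extract the k leading eigenpairs of B one at a time.
    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Random unit start vector (a constant vector would be mapped to 0 by B).
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]

        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract in this direction
                break
            v = [x / norm for x in w]

        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))

        # Step 4: coordinates are eigenvectors scaled by sqrt(eigenvalue).
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][dim] = scale * v[i]

        # Deflation: remove the found component before the next dimension.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]

    return coords


# Toy input: three objects whose pairwise distances form a 3-4-5 triangle.
distances = [[0, 3, 4],
             [3, 0, 5],
             [4, 5, 0]]
points = classical_mds(distances, k=2)
for label, (x, y) in zip("ABC", points):
    print(label, round(x, 3), round(y, 3))
```

The resulting coordinates are only determined up to rotation and reflection, so what should be compared with the input is the set of pairwise distances between the points, not the coordinates themselves. In a linguistic application the toy matrix would be replaced by dissimilarities computed from, for example, parallel corpus data.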