Lexicale Rijkdom, Tekstmoeilijkheid en Woordenschatgrootte (Lexical Richness, Text Difficulty and Vocabulary Size)

Anne Vermeer
DOI: https://doi.org/10.1075/ttwia.64.10ver
2000-01-01
Abstract: Most measures of lexical richness in spontaneous speech data that are based on the distribution of, or the relation between, types and tokens appear to be neither reliable nor valid. The article describes a semi-automatic computer program, MLR (Measure of Lexical Richness), that measures lexical richness on the basis of the difficulty of the words used, as measured by their frequency (or frequency level) in daily language input. The MLR is meant for the analysis of texts of (students in) primary education, with a vocabulary size of up to about 25,000 different lemmas, and answers the following questions: 1) How difficult are the various words in the text? 2) What is the relative proportion of each degree of difficulty among the words in the text? 3) What percentage of the text is covered for a student with a certain vocabulary size? 4) What is the size of the student's vocabulary, as estimated from the spontaneous speech data?
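As a rough illustration of questions 2 and 3, the sketch below computes the share of tokens per difficulty level and the coverage percentage of a text. It is not the MLR program: the lexicon `TOY_BANDS`, its band numbers, the function names, and the assumption that a student at band k knows every lemma in bands 1 to k are all hypothetical and chosen only for illustration. Question 4 (estimating vocabulary size) is not attempted here.

```python
from collections import Counter

# Hypothetical lexicon: lemma -> difficulty band (1 = most frequent in daily input).
# These values are invented; a real lexicon would be derived from frequency data.
TOY_BANDS = {"de": 1, "hond": 1, "lopen": 1, "in": 1, "tuin": 2, "snuffelen": 4}


def band_proportions(lemmas, bands):
    """Return the share of tokens per band; lemmas missing from the lexicon get band None."""
    if not lemmas:
        return {}
    counts = Counter(bands.get(lemma) for lemma in lemmas)
    return {band: n / len(lemmas) for band, n in counts.items()}


def coverage(lemmas, bands, max_known_band):
    """Return the percentage of tokens whose band is at most max_known_band.

    Assumes (hypothetically) that a student knows all lemmas up to that band
    and none beyond it; lemmas missing from the lexicon count as unknown.
    """
    if not lemmas:
        return 0.0
    known = sum(
        1
        for lemma in lemmas
        if bands.get(lemma) is not None and bands[lemma] <= max_known_band
    )
    return 100.0 * known / len(lemmas)


if __name__ == "__main__":
    text = ["de", "hond", "lopen", "in", "de", "tuin"]  # already lemmatized
    print(band_proportions(text, TOY_BANDS))
    print(coverage(text, TOY_BANDS, max_known_band=1))
```

The input is assumed to be lemmatized already, which is presumably where the manual part of a semi-automatic procedure would come in.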