Language Models Outperform Cloze Predictability in a Cognitive Model of Reading

Adrielli Lopes Rego,Joshua Snell,Martijn Meeter
DOI: https://doi.org/10.1101/2024.04.29.591593
2024-04-30
Abstract:Although word predictability is commonly considered an important factor in reading, sophisticated accounts of predictability in theories of reading are yet lacking. Computational models of reading traditionally use cloze norming as a proxy of word predictability, but what cloze norms precisely capture remains unclear. This study investigates whether large language models (LLMs) can fill this gap. Contextual predictions are implemented via a novel parallel-graded mechanism, where all predicted words at a given position are pre-activated as a function of contextual certainty, which varies dynamically as text processing unfolds. Through reading simulations with OB1-reader, a cognitive model of word recognition and eye-movement control in reading, we compare the model’s fit to eye-movement data when using predictability values derived from a cloze task against those derived from LLMs (GPT2 and LLaMA). Root Mean Square Error between simulated and human eye movements indicates that LLM predictability provides a better fit than Cloze. This is the first study to use LLMs to augment a cognitive model of reading with higher-order language processing while proposing a mechanism on the interplay between word predictability and eye movements.
Neuroscience
What problem does this paper attempt to address?