Abstract: Although word predictability is commonly considered an important factor in reading, sophisticated accounts of predictability in theories of reading are lacking. Computational models of reading traditionally use cloze norming as a proxy for word predictability, but what cloze norms precisely capture remains unclear. This study investigates whether large language models (LLMs) can fill this gap. Contextual predictions are implemented via a novel parallel-graded mechanism, where all predicted words at a given position are pre-activated as a function of contextual certainty, which varies dynamically as text processing unfolds. Through reading simulations with OB1-reader, a cognitive model of word recognition and eye-movement control in reading, we compare the model's fit to eye-movement data when using predictability values derived from a cloze task against those derived from LLMs (GPT-2 and LLaMA). Root Mean Square Error between simulated and human eye movements indicates that LLM predictability provides a better fit than cloze. This is the first study to use LLMs to augment a cognitive model of reading with higher-order language processing while proposing a mechanism for the interplay between word predictability and eye movements.

Reading comprehension is a crucial skill that is highly predictive of later success in education. One aspect of efficient reading is our ability to predict what is coming next in the text based on the current context. Although we know predictions take place during reading, the mechanism through which contextual facilitation affects oculomotor behaviour in reading is not yet well understood. Here, we model this mechanism and test different measures of predictability (computational vs. empirical) by simulating eye movements with a cognitive model of reading. Our results suggest that, when implemented with our novel mechanism, a computational measure of predictability provides better fits to eye movements in reading than a traditional empirical measure. With this model, we scrutinize how predictions about upcoming input affect eye movements in reading, and how computational approaches to measuring predictability may support theory testing. Modelling aspects of reading comprehension and testing them against human behaviour contributes to the effort of advancing theory building in reading research. In the longer term, a better understanding of reading comprehension may help improve reading pedagogies, diagnoses and treatments.
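The abstract's fit measure, Root Mean Square Error between simulated and human eye movements, can be illustrated with a minimal sketch. This is not the OB1-reader code: the function name `rmse`, the choice of per-word gaze durations as the eye-movement measure, and all numbers below are assumptions made up for illustration and are not data from the study.

```python
import math


def rmse(simulated, observed):
    """Root Mean Square Error between two equal-length numeric sequences."""
    if len(simulated) != len(observed):
        raise ValueError("sequences must have the same length")
    if not simulated:
        raise ValueError("sequences must not be empty")
    squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical per-word gaze durations in milliseconds (toy values only).
human = [200.0, 250.0, 300.0]
simulated_a = [210.0, 240.0, 300.0]  # stand-in for one predictability condition
simulated_b = [230.0, 220.0, 330.0]  # stand-in for another condition

# The condition with the lower RMSE fits the human data more closely.
fit_a = rmse(simulated_a, human)
fit_b = rmse(simulated_b, human)
print(f"condition A: {fit_a:.2f} ms, condition B: {fit_b:.2f} ms")
```

In a comparison like the one described, the same computation would be repeated for each predictability source (cloze, GPT-2, LLaMA) and the resulting errors compared; how the study aggregates errors across words, measures and simulation runs is not specified in the abstract.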

Machine-Learned Computational Models Can Enhance the Study of Text and Discourse: A Case Study Using Eye Tracking to Model Reading Comprehension

Read Beyond the Lines: Understanding the Implied Textual Meaning via a Skim and Intensive Reading Model

Language models outperform cloze predictability in a cognitive model of reading

Fine-Grained Prediction of Reading Comprehension from Eye Movements

Integrating LLM, EEG, and Eye-Tracking Biomarker Analysis for Word-Level Neural State Classification in Semantic Inference Reading Comprehension

Feeding What You Need by Understanding What You Learned

Integrating large language models and active inference to understand eye movements in reading and dyslexia

Integrating Large Language Model, EEG, and Eye-Tracking for Word-Level Neural State Classification in Reading Comprehension

From Word Embedding to Reading Embedding Using Large Language Model, EEG and Eye-tracking

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading

Human Behavior Inspired Machine Reading Comprehension

Bridging Information-Seeking Human Gaze and Machine Reading Comprehension

A Framework for Learning Assessment through Multimodal Analysis of Reading Behaviour and Language Comprehension

On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior

Dynamical Cognitive Modeling of Syntactic Processing and Eye Movement Control in Reading

A machine learning approach to reading level assessment

The automated model of comprehension version 4.0 – Validation studies and integration of ChatGPT

Teaching Machines to Extract Main Content for Machine Reading Comprehension

EMTeC: A Corpus of Eye Movements on Machine-Generated Texts