The stability of kindergarten teachers’ effectiveness: A generalizability study comparing the Framework For Teaching and the Classroom Assessment Scoring System

Panayota Mantzicopoulos, Brian F. French, Helen Patrick, J. Samuel Watson, Inok Ahn
DOI: https://doi.org/10.1080/10627197.2017.1408407
2017-12-04
Educational Assessment
Abstract: To meet recent accountability mandates, school districts are implementing assessment frameworks to document teachers’ effectiveness. Observational assessments play a key role in this process, albeit without compelling evidence of their psychometric rigor. Using a sample of kindergarten teachers, we employed Generalizability theory to investigate (across teachers, raters, and lessons) the stability of scores obtained with two different observation measures: The CLASS K-3 and the FFT. We conducted a series of Decision studies to document (for both measures’ constituent domains) the number of lessons per teacher and raters per lesson that would justify the use of observation scores for high-stakes decisions. Acceptable, stable scores for individual-level decisions about teachers may generally require more raters and lessons than are typically used in practice (1–2 raters and fewer than 3 lessons). The considerable variability of observation-based scores raises concerns about either measure’s appropriateness for making individual or group decisions about teachers’ effectiveness.
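To illustrate what a Decision study computes, the sketch below projects a generalizability coefficient for different numbers of lessons and raters. It assumes a fully crossed teacher × lesson × rater design, which may differ from the design actually used in the paper, and the variance components are hypothetical placeholders chosen for illustration, not values reported by the authors. The function name `g_coefficient` and its parameters are likewise illustrative.

```python
def g_coefficient(var_p, var_pl, var_pr, var_plr_e, n_lessons, n_raters):
    """Generalizability coefficient for relative decisions.

    Assumes a fully crossed teacher (p) x lesson (l) x rater (r) design.
    var_p     : teacher variance (the signal of interest)
    var_pl    : teacher-by-lesson interaction variance
    var_pr    : teacher-by-rater interaction variance
    var_plr_e : three-way interaction confounded with residual error
    """
    # Relative error variance shrinks as scores are averaged
    # over more lessons and more raters.
    rel_error = (
        var_pl / n_lessons
        + var_pr / n_raters
        + var_plr_e / (n_lessons * n_raters)
    )
    return var_p / (var_p + rel_error)


# Hypothetical variance components (NOT taken from the paper).
components = dict(var_p=0.30, var_pl=0.20, var_pr=0.05, var_plr_e=0.45)

if __name__ == "__main__":
    for n_l in (1, 2, 4, 8):
        for n_r in (1, 2, 3):
            g = g_coefficient(**components, n_lessons=n_l, n_raters=n_r)
            print(f"lessons={n_l} raters={n_r} g={g:.2f}")
```

Under these assumed components, adding lessons or raters can only raise the coefficient, which is the logic behind asking how many of each a high-stakes decision would require.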