Development and validation of an episodic memory measure in the Mobile Toolbox (MTB): Arranging Pictures
Stephanie Ruth Young, Elizabeth M Dworak, Miriam A Novack, Aaron J Kaat, Hubert Adam, Cindy J Nowinski, Zahra Hosseinian, Jerry Slotkin, Jordan Stoeger, Saki Amagai, Maria Varela Diaz, Anyelo Almonte Correa, Keith Alperin, Larsson Omberg, Michael Kellen, Monica R Camacho, Bernard Landavazo, Rachel L Nosheny, Michael W Weiner, Richard Gershon
DOI: https://doi.org/10.1080/13803395.2024.2353945
Abstract:
Introduction: Arranging Pictures is a new episodic memory test based on the NIH Toolbox (NIHTB) Picture Sequence Memory measure and optimized for self-administration on a personal smartphone within the Mobile Toolbox (MTB). We describe evidence from three distinct validation studies.
Method: In Study 1, 92 participants self-administered Arranging Pictures on study-provided smartphones in the lab, and trained examiners administered external measures of similar and dissimilar constructs, to assess validity under controlled circumstances. In Study 2, 1,021 participants completed the external measures in the lab and self-administered Arranging Pictures remotely on their personal smartphones to assess validity in real-world contexts. In Study 3, 141 participants self-administered Arranging Pictures remotely on personal iOS smartphones twice, two weeks apart, to assess test-retest reliability and practice effects.
Results: Internal consistency was good across samples (ρxx = .80 to .85, p < .001). Test-retest reliability was marginal (ICC = .49, p < .001), and there were significant practice effects after a two-week delay (ΔM = 3.21, 95% CI [2.56, 3.88]). As expected, correlations with convergent measures were significant and moderate to large in magnitude (ρ = .44 to .76, p < .001), while correlations with discriminant measures were small (ρ = .23 to .27, p < .05) or nonsignificant. Scores demonstrated significant negative correlations with age (ρ = -.32 to -.21, p < .001). Mean performance was slightly higher in the iOS group than in the Android group (iOS: M = 18.80, N = 635; Android: M = 17.11, N = 386; t(757.73) = 4.17, p < .001), but device type did not significantly influence the psychometric properties of the measure. Indicators of potential cheating were mixed: average scores were significantly higher in the remote samples (F(2, 850) = 11.415, p < .001), but there were not significantly more perfect scores.
Conclusion: The MTB Arranging Pictures measure demonstrated evidence of reliability and validity when self-administered on a personal device. Future research should examine the potential for cheating in remote settings and the properties of the measure in clinical samples.