Bridging big data in the ENIGMA consortium to combine non-equivalent cognitive measures

Eamonn Kennedy, Shashank Vadlamani, Hannah M Lindsey, Pui-Wa Lei, Mary Jo-Pugh, Paul M Thompson, David F Tate, Frank G Hillary, Emily L Dennis, Elisabeth A Wilde, ENIGMA Clinical Endpoints Working Group, Maheen Adamson, Martin Alda, Silvia Alonso-Lana, Sonia Ambrogi, Tim J Anderson, Celso Arango, Robert F Asarnow, Mihai Avram, Rosa Ayesa-Arriola, Talin Babikian, Nerisa Banaj, Laura J Bird, Stefan Borgwardt, Amy Brodtmann, Katharina Brosch, Karen Caeyenberghs, Vince D Calhoun, Nancy D Chiaravalloti, David X Cifu, Benedicto Crespo-Facorro, John C Dalrymple-Alford, Kristen Dams-O'Connor, Udo Dannlowski, David Darby, Nicholas Davenport, John DeLuca, Covadonga M Diaz-Caneja, Seth G Disner, Ekaterina Dobryakova, Stefan Ehrlich, Carrie Esopenko, Fabio Ferrarelli, Lea E Frank, Carol Franz, Paola Fuentes-Claramonte, Helen Genova, Christopher C Giza, Janik Goltermann, Dominik Grotegerd, Marius Gruber, Alfonso Gutierrez-Zotes, Minji Ha, Jan Haavik, Charles Hinkin, Kristen R Hoskinson, Daniela Hubl, Andrei Irimia, Andreas Jansen, Michael Kaess, Xiaojian Kang, Kimbra Kenney, Barbora Keřková, Mohamed Salah Khlif, Minah Kim, Jochen Kindler, Tilo Kircher, Karolina Knížková, Knut K Kolskår, Denise Krch, William S Kremen, Taylor Kuhn, Veena Kumari, Jun Soo Kwon, Roberto Langella, Sarah Laskowitz, Jungha Lee, Jean Lengenfelder, Spencer W Liebel, Victoria Liou-Johnson, Sara M Lippa, Marianne Løvstad, Astri Lundervold, Cassandra Marotta, Craig A Marquardt, Paulo Mattos, Ahmad Mayeli, Carrie R McDonald, Susanne Meinert, Tracy R Melzer, Jessica Merchán-Naranjo, Chantal Michel, Rajendra A Morey, Benson Mwangi, Daniel J Myall, Igor Nenadić, Mary R Newsome, Abraham Nunes, Terence O'Brien, Viola Oertel, John Ollinger, Alexander Olsen, Victor Ortiz García de la Foz, Mustafa Ozmen, Heath Pardoe, Marise Parent, Fabrizio Piras, Federica Piras, Edith Pomarol-Clotet, Jonathan Repple, Geneviève Richard, Jonathan Rodriguez, Mabel Rodriguez, Kelly Rootes-Murdy, Jared Rowland, Nicholas P Ryan, Raymond Salvador, Anne-Marthe Sanders, Andre Schmidt, Jair C Soares, Gianfranco Spalleta, Filip Španiel, Alena Stasenko, Frederike Stein, Benjamin Straube, April Thames, Florian Thomas-Odenthal, Sophia I Thomopoulos, Erin Tone, Ivan Torres, Maya Troyanskaya, Jessica A Turner, Kristine M Ulrichsen, Guillermo Umpierrez, Elisabet Vilella, Lucy Vivash, William C Walker, Emilio Werden, Lars T Westlye, Krista Wild, Adrian Wroblewski, Mon-Ju Wu, Glenn R Wylie, Lakshmi N Yatham, Giovana B Zunta-Soares
DOI: https://doi.org/10.1038/s41598-024-72968-x
2024-10-16
Abstract: Investigators in neuroscience have turned to Big Data to address replication and reliability issues by increasing sample size. These efforts raise new questions about how to integrate data across distinct sources and instruments. The goal of this study was to link scores across common auditory verbal learning tasks (AVLTs). This international secondary analysis aggregated multisite raw data for AVLTs across 53 studies totaling 10,505 individuals. Using the ComBat-GAM algorithm, we isolated and removed the component of memory scores associated with site effects while preserving instrumental effects. After adjustment, a continuous item response theory model used multiple memory items of varying difficulty to estimate each individual's latent verbal learning ability on a single scale. Equivalent raw scores across AVLTs were then found by linking individuals through the ability scale. Harmonization reduced total cross-site score variance by 37% while preserving meaningful memory effects. Age had the largest impact on scores overall (-11.4%), while the race/ethnicity variable was not significant (p > 0.05). The resulting tools were validated on dually administered tests. The conversion tool is available online so that researchers and clinicians can convert memory scores across instruments. This work demonstrates that global harmonization initiatives can address reproducibility challenges across the behavioral sciences.
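The linking step described in the abstract (finding equivalent raw scores by passing through a shared ability scale) can be illustrated with a toy sketch. This is not the authors' model or their online conversion tool: the logistic score curves, the names `TEST_A` and `TEST_B`, and every parameter value below are hypothetical and chosen only to show the idea of mapping a score to a latent ability and then to the expected score on another instrument.

```python
import math

def expected_score(theta, max_score, discrimination, difficulty):
    """Expected raw score at ability theta under a toy logistic curve."""
    return max_score / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def ability_from_score(score, max_score, discrimination, difficulty):
    """Invert expected_score: the ability at which `score` is the expected score."""
    if not 0 < score < max_score:
        raise ValueError("score must lie strictly between 0 and max_score")
    p = score / max_score
    return difficulty + math.log(p / (1.0 - p)) / discrimination

# Hypothetical instruments; these parameters are illustrative assumptions,
# not estimates from the study.
TEST_A = {"max_score": 15, "discrimination": 1.2, "difficulty": 0.0}
TEST_B = {"max_score": 16, "discrimination": 1.0, "difficulty": 0.3}

def convert(score, source, target):
    """Link a raw score on `source` to the equivalent score on `target`
    by passing through the shared latent ability scale."""
    theta = ability_from_score(score, **source)
    return expected_score(theta, **target)

if __name__ == "__main__":
    for raw in (3, 7.5, 12):
        print(raw, "on A is roughly", round(convert(raw, TEST_A, TEST_B), 2), "on B")
```

Because both curves are monotone in ability, the conversion preserves the ordering of individuals and is reversible: converting from A to B and back returns the original score.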