Shared High Value Research Resources: The CamCAN Human Lifespan Neuroimaging Dataset Processed on the Open Science Grid

Don Krieger, Paul Shepard, Ben Zusman, Anirban Jana, David O. Okonkwo
DOI: https://doi.org/10.48550/arXiv.1710.05246
2017-12-08
Abstract: The CamCAN Lifespan Neuroimaging Dataset, from the Cambridge (UK) Centre for Ageing and Neuroscience, was acquired and processed beginning in December 2016. The referee consensus solver deployed to the Open Science Grid was used for this task. The dataset includes demographic and screening measures, a high-resolution MRI scan of the brain, and whole-head magnetoencephalographic (MEG) recordings during eyes-closed rest (560 sec), a simple task (540 sec), and passive listening/viewing (140 sec). The data were collected from 619 neurologically normal individuals, aged 18-87. Processing of the resting recordings is complete, and the results are available online. These constitute 1.7 TBytes of data including the location within the brain (1 mm resolution), time stamp (1 msec resolution), and 80 msec time course for each of 3.7 billion validated neuroelectric events, i.e. a mean of 6.1 million events for each of the 619 participants. The referee consensus solver provides high-yield (mean 11,000 neuroelectric currents/sec; standard deviation (sd): 3500/sec), high-confidence (p < 10^-12 for each identified current) measures of the neuroelectric currents whose magnetic fields are detected in the MEG recordings. We describe the solver, the implementation of the solver deployed on the Open Science Grid, the workflow management system, the opportunistic use of high performance computing (HPC) resources to add computing capacity to the Open Science Grid reserved for this project, and our initial findings from the recently completed processing of the resting recordings. Processing the resting recordings required 14 million core hours, i.e. 40 core hours per second of data.
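The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of the authors' workflow; it assumes every participant contributed the full 560 sec of resting data, which the abstract does not state explicitly, and all variable names are illustrative.

```python
# Figures quoted in the abstract (rounded values as published).
participants = 619
rest_seconds = 560        # eyes-closed rest recording per participant (assumed complete for all)
total_events = 3.7e9      # validated neuroelectric events
mean_rate = 11_000        # mean neuroelectric currents per second
core_hours = 14e6         # total core hours for the resting recordings

# Total seconds of resting data across the cohort.
total_seconds = participants * rest_seconds

# Mean events per participant, derived two independent ways.
events_per_participant = total_events / participants
events_from_rate = mean_rate * rest_seconds

# Compute cost per second of recorded data.
core_hours_per_second = core_hours / total_seconds

print(f"total resting data: {total_seconds} sec")
print(f"events per participant (from total): {events_per_participant / 1e6:.2f} million")
print(f"events per participant (from rate):  {events_from_rate / 1e6:.2f} million")
print(f"core hours per second of data: {core_hours_per_second:.1f}")
```

Under these assumptions both estimates of events per participant come out close to 6 million, and the compute cost comes out close to 40 core hours per second of data, in line with the rounded values in the abstract.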
Neurons and Cognition, Distributed, Parallel, and Cluster Computing