The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience
Randal Burns, William Gray Roncal, Dean Kleissas, Kunal Lillaney, Priya Manavalan, Eric Perlman, Daniel R. Berger, Davi D. Bock, Kwanghun Chung, Logan Grosenick, Narayanan Kasthuri, Nicholas C. Weiler, Karl Deisseroth, Michael Kazhdan, Jeff Lichtman, R. Clay Reid, Stephen J. Smith, Alexander S. Szalay, Joshua T. Vogelstein, R. Jacob Vogelstein
DOI: https://doi.org/10.48550/arXiv.1306.3543
2013-06-18
Abstract: We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at http://openconnecto.me.
The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
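
To illustrate what partitioning a spatial index across cluster nodes can look like, the following is a minimal sketch, not the system's actual implementation. The abstract does not name the index, so a Morton (Z-order) space-filling curve is assumed here; the function names `morton3d` and `node_for_cuboid`, the `bits` parameter, and the equal-range partitioning rule are all hypothetical choices made for the example.

```python
def morton3d(x: int, y: int, z: int, bits: int = 21) -> int:
    """Interleave the low `bits` bits of x, y, z into one Morton (Z-order) key.

    Assumption: coordinates identify fixed-size blocks of the image volume,
    not individual voxels.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)       # x occupies bit positions 0, 3, 6, ...
        code |= ((y >> i) & 1) << (3 * i + 1)   # y occupies bit positions 1, 4, 7, ...
        code |= ((z >> i) & 1) << (3 * i + 2)   # z occupies bit positions 2, 5, 8, ...
    return code


def node_for_cuboid(x: int, y: int, z: int, num_nodes: int, bits: int = 21) -> int:
    """Map a block to a node by splitting the key space into equal contiguous ranges.

    Hypothetical partitioning rule: node k owns keys in
    [k * keys_per_node, (k + 1) * keys_per_node).
    """
    key = morton3d(x, y, z, bits)
    total_keys = 1 << (3 * bits)
    keys_per_node = -(-total_keys // num_nodes)  # ceiling division
    return key // keys_per_node


if __name__ == "__main__":
    # A 4 x 4 x 4 grid of blocks (bits=2) spread over 4 nodes.
    for block in [(0, 0, 0), (2, 0, 0), (0, 0, 2), (3, 3, 3)]:
        print(block, "->", node_for_cuboid(*block, num_nodes=4, bits=2))
```

Because a space-filling curve tends to keep spatially adjacent blocks close together in key order, contiguous key ranges give each node a compact region of the volume, which is one plausible reason spatial data organization would matter for the throughput the evaluation reports.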
Distributed, Parallel, and Cluster Computing; Computational Engineering, Finance, and Science; Neurons and Cognition