DICOM data storage and retrieval with MongoDB

Wesley Gohn, Harshitha Govindaraju, Patrick Faley, Francesc Massanes, Alexander H. Vija
DOI: https://doi.org/10.1117/12.2613287
2022-04-04
Abstract: We have created a database for the storage of DICOM data using MongoDB. Data is stored in chunks using GridFS and is accessible via queries on any field of the DICOM header, as well as on other pre-defined metadata fields. When data is imported into the database, a custom Python plugin extracts metadata from the DICOM header of each file and uses it to populate the database. In addition to the UID values stored in the DICOM header, an MD5 checksum is computed for each file and referenced in the database to ensure that data is not duplicated. The metadata is tracked using a local instance of MongoDB Charts. Data is replicated across two NVMe servers connected by 100 Gbps Ethernet for fast replication. The two servers are configured as a MongoDB replica set, with an arbiter running on a third server that is responsible for promoting either of the data-bearing nodes to be the primary access point based on usage and availability. The database is accessible in two ways: via a custom PyTorch connector that links it to the 100 Gbps network for access by an HPC cluster for deep learning applications, and via a web interface on the local network used to pull or upload data.
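The checksum-based duplicate check described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' plugin. It assumes a storage object `fs` that exposes GridFS-style `find_one(filter)` and `put(data, **fields)` methods, as PyMongo's `gridfs.GridFS` does. The function names `md5_of` and `import_file` and the field name `md5_checksum` are hypothetical, and the extraction of header metadata is left to the caller.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()


def import_file(fs, data: bytes, metadata: dict):
    """Store `data` unless a file with the same checksum already exists.

    `fs` is assumed to provide GridFS-style `find_one(filter)` and
    `put(data, **fields)` methods. `metadata` holds header fields that
    the caller has already extracted (hypothetical keys).
    Returns a tuple (file_id, inserted).
    """
    checksum = md5_of(data)

    # Look for an existing file with the same checksum (assumed field name).
    existing = fs.find_one({"md5_checksum": checksum})
    if existing is not None:
        return existing._id, False

    # Store the bytes together with the checksum and the metadata fields,
    # so that later queries can filter on any of them.
    file_id = fs.put(data, md5_checksum=checksum, **metadata)
    return file_id, True
```

With PyMongo installed and a server available, one plausible way to obtain `fs` is `gridfs.GridFS(MongoClient()["dicom"])`, where the database name `dicom` is again an assumption. Importing the same bytes twice would then return the original file id with `inserted` set to `False` instead of writing a second copy.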