Bragg Spot Finder (BSF): a new machine‐learning‐aided approach to deal with spot finding for rapidly filtering diffraction pattern images

Jianxiang Dong, Zhaozheng Yin, Dale Kreitler, Herbert J. Bernstein, Jean Jakoncic
DOI: https://doi.org/10.1107/s1600576724002450
IF: 4.868
2024-04-27
Journal of Applied Crystallography
Abstract: Bragg Spot Finder (BSF) is a U‐Net‐based spotfinder with image preprocessing, a U‐Net segmentation backbone, and post‐processing that includes artifact removal and watershed segmentation. BSF is supported by the Bragg Spot Detection (BSD) benchmark image dataset containing more than 300 images with more than 66000 spots. Macromolecular crystallography contributes significantly to understanding diseases and, more importantly, how to treat them by providing atomic resolution 3D structures of proteins. This is achieved by collecting X‐ray diffraction images of protein crystals from important biological pathways. Spotfinders are used to detect the presence of crystals with usable data, and the spots from such crystals are the primary data used to solve the relevant structures. Having fast and accurate spot finding is essential, but recent advances in synchrotron beamlines used to generate X‐ray diffraction images have brought us to the limits of what the best existing spotfinders can do. This bottleneck must be removed so spotfinder software can keep pace with the X‐ray beamline hardware improvements and be able to see the weak or diffuse spots required to solve the most challenging problems encountered when working with diffraction images. In this paper, we first present Bragg Spot Detection (BSD), a large benchmark Bragg spot image dataset that contains 304 images with more than 66000 spots. We then discuss the open source extensible U‐Net‐based spotfinder Bragg Spot Finder (BSF), with image pre‐processing, a U‐Net segmentation backbone, and post‐processing that includes artifact removal and watershed segmentation. Finally, we perform experiments on the BSD benchmark and obtain results that are (in terms of accuracy) comparable to or better than those obtained with two popular spotfinder software packages (Dozor and DIALS), demonstrating that this is an appropriate framework to support future extensions and improvements.
chemistry, multidisciplinary; crystallography
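
As a rough illustration of the post-processing stage named in the abstract (artifact removal and watershed segmentation applied to a segmentation output), the sketch below splits touching spots in a per-pixel probability map and discards very small regions. It is not BSF's implementation. The function names (`find_markers`, `watershed`, `postprocess`), the thresholds, the area-based artifact filter, and the choice to flood the probability map directly are all assumptions made for illustration, and the probability map is synthetic rather than real U-Net output.

```python
import heapq

import numpy as np

# 4-connected neighbourhood used for both peak detection and flooding.
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def find_markers(prob, seed_threshold):
    """Give each strict local maximum at or above seed_threshold its own label."""
    padded = np.pad(prob, 1, constant_values=-np.inf)
    n_rows, n_cols = padded.shape
    is_peak = prob >= seed_threshold
    for dr, dc in NEIGHBOURS:
        shifted = padded[1 + dr : n_rows - 1 + dr, 1 + dc : n_cols - 1 + dc]
        is_peak &= prob > shifted
    markers = np.zeros(prob.shape, dtype=np.int32)
    markers[is_peak] = np.arange(1, is_peak.sum() + 1)
    return markers


def watershed(prob, markers, mask):
    """Marker-controlled flooding: grow labels from high to low probability."""
    n_rows, n_cols = prob.shape
    labels = markers.copy()
    heap = []
    order = 0  # tie-breaker so equal priorities pop in insertion order
    for r, c in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (-float(prob[r, c]), order, int(r), int(c)))
        order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < n_rows and 0 <= cc < n_cols
            if inside and mask[rr, cc] and labels[rr, cc] == 0:
                labels[rr, cc] = labels[r, c]
                heapq.heappush(heap, (-float(prob[rr, cc]), order, rr, cc))
                order += 1
    return labels


def postprocess(prob, threshold=0.5, seed_threshold=0.8, min_area=4):
    """Return an integer label image; 0 is background.

    Assumes seed_threshold >= threshold so every marker lies inside the mask.
    """
    prob = np.asarray(prob, dtype=float)
    mask = prob >= threshold
    labels = watershed(prob, find_markers(prob, seed_threshold), mask)
    # Assumed artifact filter: drop regions smaller than min_area pixels.
    areas = np.bincount(labels.ravel())
    too_small = areas < min_area
    too_small[0] = False  # never treat background as a region to remove
    labels[too_small[labels]] = 0
    return labels


def gaussian_spot(shape, centre, sigma):
    """Synthetic stand-in for a network's response to one Bragg spot."""
    rows, cols = np.indices(shape)
    d2 = (rows - centre[0]) ** 2 + (cols - centre[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))


# Two overlapping spots plus one isolated hot pixel acting as an artifact.
shape = (32, 32)
prob = np.maximum(
    gaussian_spot(shape, (12, 10), 2.5),
    gaussian_spot(shape, (12, 15), 2.5),
)
prob[28, 28] = 0.95

labels = postprocess(prob)
spot_ids = np.unique(labels[labels > 0])
print("spots kept:", len(spot_ids))
```

In this toy case the two overlapping blobs form a single connected region at the 0.5 threshold, so plain connected-component labelling would count them as one spot; flooding from separate seeds is what keeps them apart. A production pipeline would more likely call a library routine such as `skimage.segmentation.watershed` than hand-roll the flood.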