Robust image descriptor for machine learning based data reduction in serial crystallography
Vahid Rahmani, Shah Nawaz, David Pennicard, Heinz Graafsma
DOI: https://doi.org/10.1107/s160057672400147x
IF: 4.868
2024-03-28
Journal of Applied Crystallography
Abstract: This paper proposes a pipeline to categorize serial crystallography data, consisting of a real‐time feature extraction algorithm, an image descriptor and a machine learning classifier. This approach demonstrates superior performance compared with other feature extractors and classifiers. Serial crystallography experiments at synchrotron and X‐ray free‐electron laser (XFEL) sources are producing crystallographic data sets of ever‐increasing volume. While these experiments have large data sets and high‐frame‐rate detectors (around 3520 frames per second), only a small percentage of the data are useful for downstream analysis. Thus, an efficient and real‐time data classification pipeline is essential to differentiate reliably between useful and non‐useful images, typically known as 'hit' and 'miss', respectively, and keep only hit images on disk for further analysis such as peak finding and indexing. While feature‐point extraction is a key component of modern approaches to image classification, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. This paper proposes a pipeline to categorize the data, consisting of a real‐time feature extraction algorithm called modified and parallelized FAST (MP‐FAST), an image descriptor and a machine learning classifier. To parallelize the primary operations of the proposed pipeline, implementations on central processing units, graphics processing units and field‐programmable gate arrays are developed and their performance is compared. Finally, MP‐FAST‐based image classification is evaluated using a multi‐layer perceptron on various data sets, including both synthetic and experimental data. This approach demonstrates superior performance compared with other feature extractors and classifiers.
chemistry, multidisciplinary; crystallography
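
The first two stages of the pipeline described in the abstract (feature-point extraction followed by an image descriptor) can be illustrated with a minimal sketch. It is not the authors' MP-FAST: the abstract does not specify the modifications or the parallelization, so the code below uses the standard FAST segment test (a pixel is a keypoint if at least n contiguous pixels on a 16-pixel circle are all brighter or all darker than the centre by a threshold). The descriptor (keypoint counts on a coarse grid), the function names, the threshold values and the synthetic frames are all assumptions made for illustration. The classifier stage is omitted; in the paper a multi-layer perceptron takes that role.

```python
# Offsets (dx, dy) of the 16-pixel Bresenham circle of radius 3 used by FAST.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def has_contiguous_run(flags, n):
    """Return True if the circular boolean sequence has n consecutive True values."""
    run = 0
    # Walk the sequence twice so that runs wrapping around the end are counted.
    for flag in flags + flags:
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def fast_keypoints(image, threshold=20, n=9):
    """Standard FAST segment test (not MP-FAST) on a 2D list of intensities."""
    height, width = len(image), len(image[0])
    keypoints = []
    # Skip a 3-pixel border so the whole circle stays inside the image.
    for y in range(3, height - 3):
        for x in range(3, width - 3):
            centre = image[y][x]
            ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
            brighter = [v > centre + threshold for v in ring]
            darker = [v < centre - threshold for v in ring]
            if has_contiguous_run(brighter, n) or has_contiguous_run(darker, n):
                keypoints.append((x, y))
    return keypoints


def grid_descriptor(keypoints, width, height, cells=2):
    """Hypothetical descriptor: keypoint counts per grid cell, in row-major order."""
    counts = [0] * (cells * cells)
    for x, y in keypoints:
        col = min(x * cells // width, cells - 1)
        row = min(y * cells // height, cells - 1)
        counts[row * cells + col] += 1
    return counts


def make_frame(size=16, peaks=()):
    """Synthetic frame (assumption): flat background with single-pixel bright peaks."""
    frame = [[10] * size for _ in range(size)]
    for x, y in peaks:
        frame[y][x] = 200
    return frame


if __name__ == "__main__":
    hit = make_frame(peaks=[(4, 4), (11, 12)])   # frame with two bright peaks
    miss = make_frame()                          # frame with background only
    for name, frame in (("hit", hit), ("miss", miss)):
        kps = fast_keypoints(frame)
        print(name, kps, grid_descriptor(kps, 16, 16))
```

In this toy setting a frame with isolated bright peaks yields one keypoint per peak, while a flat background frame yields none, so the fixed-length descriptor vector already separates the two cases. Real detector frames contain noise and background scattering, which is why the paper feeds its descriptor to a trained classifier rather than relying on a simple count.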