TORO Indexer: a PyTorch‐based indexing algorithm for kilohertz serial crystallography
Piero Gasparotto, Luis Barba, Hans-Christian Stadler, Greta Assmann, Henrique Mendonça, Alun W. Ashton, Markus Janousch, Filip Leonarski, Benjamín Béjar
DOI: https://doi.org/10.1107/s1600576724003182
IF: 4.868
2024-06-19
Journal of Applied Crystallography
Abstract: Serial crystallography (SX) requires efficient processing of numerous diffraction patterns. TORO Indexer is a high‐performance indexing solution that operates across various platforms such as GPUs, CPUs and TPUs, offering high processing speed without compromising indexing quality. Its design ensures easy integration into existing software, making it a useful tool for evolving SX techniques with ever‐expanding data volumes. Serial crystallography (SX) involves combining observations from a very large number of diffraction patterns coming from crystals in random orientations. To compile a complete data set, these patterns must be indexed (i.e. their orientation determined), integrated and merged. Introduced here is TORO (Torch‐powered robust optimization) Indexer, a robust and adaptable indexing algorithm developed using the PyTorch framework. TORO is capable of operating on graphics processing units (GPUs), central processing units (CPUs) and other hardware accelerators supported by PyTorch, ensuring compatibility with a wide variety of computational setups. In tests, TORO outpaces existing solutions, indexing thousands of frames per second when running on GPUs, which positions it as an attractive candidate to produce real‐time indexing and user feedback. The algorithm streamlines some of the ideas introduced by previous indexers like DIALS real‐space grid search [Gildea, Waterman, Parkhurst, Axford, Sutton, Stuart, Sauter, Evans & Winter (2014). Acta Cryst. D70, 2652–2666] and XGandalf [Gevorkov, Yefanov, Barty, White, Mariani, Brehm, Tolstikova, Grigat & Chapman (2019). Acta Cryst. A75, 694–704] and refines them using faster and principled robust optimization techniques which result in a concise code base consisting of less than 500 lines. On the basis of evaluations across four proteins, TORO consistently matches, and in certain instances outperforms, established algorithms such as XGandalf and MOSFLM [Powell (1999). Acta Cryst. D55, 1690–1695], occasionally amplifying the quality of the consolidated data while achieving superior indexing speed. The inherent modularity of TORO and the versatility of PyTorch code bases facilitate its deployment into a wide array of architectures, software platforms and bespoke applications, highlighting its prospective significance in SX.
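To illustrate what indexing by a real‐space grid search involves, the following is a minimal sketch and not TORO's implementation. It assumes a made-up orthorhombic cell (10, 15, 20 Å) rotated by 30° about z, noise-free reciprocal-lattice points, and a simple cosine score chosen here for illustration. All function names (`rotate_z`, `make_spots`, `score`, `grid_search`) are hypothetical. The sketch uses only the Python standard library so that it runs anywhere; a PyTorch version would evaluate all candidate directions as one batched tensor operation instead of looping.

```python
import math


def rotate_z(v, angle_deg):
    """Rotate a 3-vector about the z axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def make_spots(cell_lengths, angle_deg, max_index=2):
    """Reciprocal-lattice points of an orthorhombic cell rotated about z.

    Illustrative stand-in for the spot positions measured on one frame.
    """
    a, b, c = cell_lengths
    indices = range(-max_index, max_index + 1)
    spots = []
    for h in indices:
        for k in indices:
            for l in indices:
                if (h, k, l) != (0, 0, 0):
                    spots.append(rotate_z((h / a, k / b, l / c), angle_deg))
    return spots


def score(v, spots):
    """Mean of cos(2*pi*v.s); it reaches 1 when every v.s is an integer."""
    total = 0.0
    for s in spots:
        dot = v[0] * s[0] + v[1] * s[1] + v[2] * s[2]
        total += math.cos(2.0 * math.pi * dot)
    return total / len(spots)


def grid_search(spots, length, step_deg=2):
    """Scan directions on the half-sphere y >= 0 for a vector of given length.

    Returns (best score, polar angle in degrees, azimuth in degrees).
    """
    best = (-2.0, None, None)
    for theta in range(0, 181, step_deg):
        for phi in range(0, 180, step_deg):
            t, p = math.radians(theta), math.radians(phi)
            v = (
                length * math.sin(t) * math.cos(p),
                length * math.sin(t) * math.sin(p),
                length * math.cos(t),
            )
            sc = score(v, spots)
            if sc > best[0]:
                best = (sc, theta, phi)
    return best


# Assumed test case: the a axis (10 Å) points 30° away from x in the xy plane.
spots = make_spots((10.0, 15.0, 20.0), angle_deg=30)
best_score, best_theta, best_phi = grid_search(spots, length=10.0)
print(best_score, best_theta, best_phi)
```

In this idealized case the search recovers the direction of the 10 Å axis (polar angle 90°, azimuth 30°) with a score of essentially 1. Real frames contain noise and spurious spots, which is where the robust optimization described in the abstract comes in.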
crystallography,chemistry, multidisciplinary