ZMPY3D: Accelerating protein structure volume analysis through vectorized 3D Zernike Moments and Python-based GPU Integration

Jhih Siang Lai,Stephen K Burley,Jose M Duarte
DOI: https://doi.org/10.1093/bioadv/vbae111
2024-07-25
Bioinformatics Advances
Abstract:Abstract Motivation Volumetric 3D object analyses are being applied in research fields such as structural bioinformatics, biophysics, and structural biology, with potential integration of Artificial Intelligence/Machine Learning (AI/ML) techniques. One such method, 3D Zernike moments, has proven valuable in analyzing protein structures (e.g., protein fold classification, protein-protein interaction analysis, and molecular dynamics simulations). Their compactness and efficiency make them amenable to large-scale analyses. Established methods for deriving 3D Zernike moments, however, can be inefficient, particularly when higher order terms are required, hindering broader applications. As the volume of experimental and computationally-predicted protein structure information continues to increase, structural biology has become a “big data” science requiring more efficient analysis tools. Results This application note presents a Python-based software package, ZMPY3D, to accelerate computation of 3D Zernike moments by vectorizing the mathematical formulae and using graphical processing units (GPUs). The package offers popular GPU-supported libraries such as CuPy and TensorFlow together with NumPy implementations, aiming to improve computational efficiency, adaptability, and flexibility in future algorithm development. The ZMPY3D package can be installed via PyPI, and the source code is available from GitHub. Volumetric-based protein 3D structural similarity scores and transform matrix of superposition functionalities have both been implemented, creating a powerful computational tool that will allow the research community to amalgamate 3D Zernike moments with existing AI/ML tools, to advance research and education in protein structure bioinformatics. Availability ZMPY3D, implemented in Python, is available on GitHub (https://github.com/tawssie/ZMPY3D) and PyPI, released under the GPL License. Supplementary information Supplementary data are available at Bioinformatics Advances online.
What problem does this paper attempt to address?