A Parallel Implementation for Large-Scale TSR-based 3D Structural Comparisons of Protein and Amino Acid

Feng Chen, Tarikul I. Milon, Poorya Khajouie, Antoinette Myers, Wu Xu
DOI: https://doi.org/10.2174/0115748936306625240724102438
2024-08-01
Current Bioinformatics
Abstract: Background: Proteins play a vital role in sustaining life and must fold into specific 3D structures to carry out their biological functions. Structure comparison techniques benefit from the ever-expanding repositories of the Protein Data Bank, and computational tools for 3D structural comparison of proteins and amino acids play an important role in understanding protein function. The Triangular Spatial Relationship (TSR)-based method was developed for this purpose. Methods: In the TSR-based method, the 3D structure of a protein or amino acid is represented as an integer vector. A parallelization strategy for large-scale TSR-based 3D structural comparisons of proteins and amino acids was designed and implemented on high-performance clusters, using distributed- and shared-memory programming models together with multi-core CPUs and many-core GPU accelerators. The strategy can also be adapted to other applications that use a vector data structure. Results: Because the TSR-based method represents protein and amino acid structures as vectors, the comparison algorithm is well suited to parallelization on large-scale supercomputers. Performance studies on representative datasets were conducted to demonstrate the efficiency of the parallelization strategy, which allows comparisons of large 3D protein or amino acid structure datasets to finish within a reasonable amount of time. Conclusion: Case studies that take advantage of the parallel code demonstrate that applying either mirror-image handling or feature selection in the TSR-based algorithms improves the classification of protein and amino acid 3D structures. The TSR keys have the advantage of enabling structure-based BLAST searches. The parallel code could serve as a reference for similar future studies.
biochemical research methods, mathematical & computational biology
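
The abstract notes that vector representations make all-against-all comparison easy to parallelize, because each pair of structures can be scored independently. The following is a minimal sketch of that decomposition, not the authors' implementation. It rests on several assumptions: each structure is held as a multiset of integer keys, similarity is scored with a generalized Jaccard coefficient (chosen here only for illustration), and a Python thread pool stands in for the MPI/OpenMP/GPU layers the paper describes. Threads are used only to show how the work splits up and give no real speed-up for CPU-bound Python code. The function names and toy key values are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def generalized_jaccard(a, b):
    """Generalized Jaccard similarity of two key-count multisets.

    Assumption: this metric is used here for illustration only.
    """
    keys = a.keys() | b.keys()
    num = sum(min(a[k], b[k]) for k in keys)  # Counter returns 0 for absent keys
    den = sum(max(a[k], b[k]) for k in keys)
    return num / den if den else 1.0


def pairwise_similarity(structures, workers=4):
    """Score every unordered pair of structures; pairs are independent tasks."""
    pairs = list(combinations(sorted(structures), 2))

    def score(pair):
        x, y = pair
        return pair, generalized_jaccard(structures[x], structures[y])

    # The thread pool is a stand-in for distributing pairs across nodes/cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(score, pairs))


# Hypothetical toy data: integer keys, with repeats encoding key frequency.
toy = {
    "A": Counter([101, 101, 205, 340]),
    "B": Counter([101, 205, 205, 340]),
    "C": Counter([777, 888]),
}
sims = pairwise_similarity(toy)
for pair, value in sorted(sims.items()):
    print(pair, round(value, 3))
```

Because no pair depends on any other, the list of pairs can be divided among workers in any order, which is the property that makes this kind of comparison scale to large datasets.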