Water position prediction with SE(3)-Graph Neural Network

Sangwoo Park
DOI: https://doi.org/10.1101/2024.03.25.586555
2024-03-28
Abstract: Most protein molecules exist in a water medium and interact with numerous water molecules. Considering the interactions between protein and water molecules is essential to understanding protein function. Computational studies of protein function account for the effect of water with either implicit or explicit solvation methods. Implicit solvation methods treat water as a continuous solvent and have lower computational costs than explicit methods, which treat water as a collection of individual molecules. However, some water molecules have specific interactions with the protein that are critical to its function, and these interactions require explicit treatment. As a compromise between computational cost and the treatment of specific interactions, hybrid methods model explicitly the water molecules that interact specifically with the protein and model the remaining water implicitly. Such hybrid methods require predicting the positions of the specifically interacting water molecules, and various water position prediction methods have been developed for this purpose. However, existing water position prediction methods still require considerable computational cost. Here, we present a water position prediction method with low computational cost and state-of-the-art prediction performance that uses an SE(3)-equivariant graph neural network. Using a graph neural network allows each atom to be treated as a single data point, which makes the computational cost lower than that of our previous water prediction method, which uses a convolutional neural network and treats each atom as multiple data points. Our new water position prediction method, WatGNN, showed an average computation time of 1.86 seconds while maintaining state-of-the-art prediction performance.
The source code of this water prediction method is freely available at .
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the prediction of water molecule positions around proteins, a key factor in understanding protein function. Studies of protein function usually account for water with implicit or explicit solvent methods. Implicit methods treat water as a continuous solvent, which lowers computational cost but cannot capture specific atomic interactions. Hybrid methods therefore treat explicitly the water molecules that interact specifically with the protein and treat all other water implicitly, which requires predicting where those specific water molecules sit.
The paper proposes a new method, WatGNN, which uses an SE(3)-equivariant graph neural network to reduce computational cost while maintaining state-of-the-art prediction performance. CNN-based prediction has limitations in accuracy and computational efficiency, partly because it represents each atom as multiple data points. WatGNN reduces the computational burden by treating each atom as a single data point, and it is equivariant to rotation and translation. The reported results show that WatGNN predicts water positions more accurately, with an average prediction time of 1.86 seconds, which is 12.8 times faster than the previous CNN-based method. Compared with CNN-based, statistical potential, integral theory, and geometric methods, WatGNN performs better at predicting water positions in high-resolution crystal structures.
The paper describes the WatGNN workflow in detail, including input processing, the graph neural network structure, the feature representation of nodes and edges, the placement of probe nodes, and network training and performance evaluation. It also compares the prediction performance and computation time of WatGNN with those of other prediction methods: GalaxyWater-CNN, GalaxyWater-wKGB, 3D-RISM, and FoldX.
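To make the two ideas above concrete (one graph node per atom, and equivariance to rotation and translation), here is a minimal toy sketch in plain Python. It is not the WatGNN architecture: the coordinates, the 3.5 cutoff, and the functions `radius_graph`, `toy_update`, and `transform` are all hypothetical, chosen only for illustration. The sketch builds a distance-cutoff graph with one node per atom, applies a simple coordinate update whose weights depend only on interatomic distances, and checks that rotating and translating the input produces the same rotated and translated output.

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def radius_graph(coords, cutoff):
    """Directed edges (i, j) between distinct atoms closer than cutoff.
    Each atom is exactly one node, unlike a voxel grid."""
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and dist(coords[i], coords[j]) < cutoff]

def toy_update(coords, cutoff):
    """Toy coordinate update (an assumption, not the WatGNN layer):
    shift each node along vectors to its neighbours, weighted by a
    function of the distance only, which keeps the update equivariant."""
    out = [list(c) for c in coords]
    for i, j in radius_graph(coords, cutoff):
        w = math.exp(-dist(coords[i], coords[j]))
        for k in range(3):
            out[i][k] += w * (coords[j][k] - coords[i][k])
    return out

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform(coords, rot, shift):
    """Apply a rotation matrix, then a translation, to every point."""
    return [[sum(rot[r][k] * p[k] for k in range(3)) + shift[r]
             for r in range(3)] for p in coords]

# Hypothetical atom coordinates; the last atom is farther than the cutoff
# from every other atom and therefore has no edges.
ATOMS = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.5],
         [3.0, 3.0, 3.0], [1.0, 1.0, 2.5], [6.0, 0.0, 0.0]]
CUTOFF = 3.5  # hypothetical value, not taken from the paper

ROT = matmul3(rot_z(0.7), rot_x(-1.1))
SHIFT = [2.0, -3.0, 0.5]
MOVED = transform(ATOMS, ROT, SHIFT)

# Graph connectivity depends only on distances, so it is unchanged.
same_graph = radius_graph(ATOMS, CUTOFF) == radius_graph(MOVED, CUTOFF)

# Equivariance: update(transform(x)) should equal transform(update(x)).
lhs = toy_update(MOVED, CUTOFF)
rhs = transform(toy_update(ATOMS, CUTOFF), ROT, SHIFT)
max_error = max(abs(a - b) for p, q in zip(lhs, rhs) for a, b in zip(p, q))
print("same graph:", same_graph)
print("max equivariance error:", max_error)
```

Because the edge weights are functions of distances alone and the update is a sum of relative position vectors, the property holds for any rotation and translation, not only the ones chosen here. A grid-based CNN input has no such guarantee, since voxel occupancy changes when the structure is rotated.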