GELKcat: An Integration Learning of Substrate Graph with Enzyme Embedding for Kcat prediction.

Bing-Xue Du, Haoyang Yu, Bei Zhu, Yahui Long, Min Wu, Jian-Yu Shi
DOI: https://doi.org/10.1109/BIBM58861.2023.10385630
2023-01-01
Abstract: Computational modeling and identification of the enzyme turnover number k_cat are crucial for synthetic biology and early-stage lead optimization, so accurate assessment of k_cat for enzyme-substrate pairs is essential. Because wet-lab experiments are time-consuming, laborious, and expensive, in silico prediction of k_cat is an attractive alternative. However, few computational methods have been developed for this task or for other enzyme kinetics predictions. To address this gap, we develop GELKcat, a novel end-to-end dual-representation framework that uses graph transformers to encode substrate molecules and CNNs to process word2vec embeddings of enzymes. We further integrate substrate and enzyme features with an adaptive gate network, which assigns optimal weights to capture the most suitable feature combinations. Comparison with several state-of-the-art methods demonstrates the superiority of GELKcat. Ablation studies further highlight the important role of the word2vec embeddings of enzymes. We anticipate that this work can bridge current gaps in enzyme-substrate representation and offer guidance for drug discovery and synthetic biology.
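The abstract describes fusing substrate and enzyme features through an adaptive gate but gives no equations. The following is a minimal sketch of one common gated-fusion form, not the authors' implementation. It assumes both feature vectors share the same dimension and that the gate is a sigmoid over a linear map of their concatenation. The names `gated_fusion`, `weights`, and `bias` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(substrate, enzyme, weights, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumed form (illustrative only, not taken from the paper):
        gate  = sigmoid(weights @ [substrate; enzyme] + bias)
        fused = gate * substrate + (1 - gate) * enzyme
    """
    concat = list(substrate) + list(enzyme)
    fused = []
    for i, (s, e) in enumerate(zip(substrate, enzyme)):
        # One gate value per output dimension, computed from both inputs.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * e)
    return fused


# Toy inputs: with all-zero weights and bias, every gate equals 0.5,
# so the fused vector is the plain average of the two inputs.
substrate_vec = [1.0, 2.0]
enzyme_vec = [3.0, 0.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
zero_bias = [0.0, 0.0]
fused_vec = gated_fusion(substrate_vec, enzyme_vec, zero_weights, zero_bias)
```

In a trained model the weights and bias would be learned, so the gate could lean toward the substrate or the enzyme representation per dimension. Each fused value always lies between the two corresponding input values.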