Unlocking data in Klebsiella lysogens to predict capsular type-specificity of phage depolymerases

Robby Concha Eloko, Rafael Sanjuan, Beatriz Beamud, Pilar Domingo-Calap
DOI: https://doi.org/10.1101/2024.07.24.604748
2024-07-25
Abstract: Viral entry is a critical step in the infection process. Klebsiella spp. and other clinically relevant bacteria often express a complex polysaccharide capsule that acts as a barrier to phage entry. In turn, most Klebsiella phages encode depolymerases for capsule removal. This virus-host arms race has led to extensive genetic diversity in both capsules and depolymerases, complicating our ability to understand their interaction. This study exploits the information encoded in Klebsiella prophages to model the interplay between the bacteria, the prophages, and their depolymerases, using a graph neural network and a sequence clustering-based method. Both approaches showed significant predictive ability for prophage capsular tropism and, importantly, were transferable to lytic phages. In addition to creating a comprehensive database linking depolymerase sequences to their specific targets, this study demonstrates the predictability of phage-host interactions at the subspecies level, providing new insights for improving the therapeutic and industrial applicability of phages.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the following problems.

**Background problems**:
- **Interaction between phages and host bacteria**: Klebsiella spp. usually express a complex polysaccharide capsule that acts as a barrier to phage entry.
- **Phage-encoded depolymerases**: Most Klebsiella phages encode depolymerases that remove the capsule and thereby enable infection.
- **Genetic diversity**: The arms race between viruses and hosts has produced extensive genetic diversity in both capsules and depolymerases, which complicates understanding of their interactions.

**Specific problems**:
- **Predicting the capsular type-specificity of phage depolymerases**: The study uses the information encoded in Klebsiella prophages to build models that predict which capsule types phage depolymerases target.
- **Improving the feasibility of phage therapy and industrial applications**: The study creates a comprehensive database linking depolymerase sequences to their specific targets, providing new insights for improving the applicability of phages in therapy and industry.

**Methods**:
- **Graph Neural Network (GNN)**: A Graph Convolutional Network (GCN) with an attention mechanism models the relationships between prophages, depolymerases, and host bacteria.
- **Sequence clustering-based method**: Depolymerase domain sequences are clustered, and the Random Forest algorithm uses the resulting clusters to predict the capsular type-specificity of phages.

**Results**:
- **Predictive ability**: Both methods showed significant predictive ability for the capsular tropism of prophages.
- **Generalization ability of the model**: The models apply to prophages and also transfer to lytic phages.
- **Database construction**: A comprehensive database of depolymerase sequences and their target capsule types was generated.

**Significance**:
- **Understanding phage-host interactions**: The study demonstrates that phage-host interactions can be predicted at the subspecies level.
- **Application prospects**: The results are expected to support phage therapy and industrial applications by improving the specificity and effectiveness of phages.
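
The idea behind the sequence clustering-based method can be illustrated with a small sketch. It is not the authors' pipeline: the prophage names, cluster IDs, and capsule labels below are hypothetical toy data, and a simple per-cluster vote stands in for the Random Forest classifier so that the example needs only the Python standard library. It shows how prophages could be encoded as presence/absence vectors over depolymerase clusters, and how cluster membership alone can suggest a capsule type.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: prophage -> (capsule type of its host, depolymerase cluster IDs).
# In a real setting the cluster IDs would come from clustering depolymerase domain sequences.
TRAINING_PROPHAGES = {
    "prophage_A": ("KL1", {"c01", "c02"}),
    "prophage_B": ("KL1", {"c01"}),
    "prophage_C": ("KL2", {"c03"}),
    "prophage_D": ("KL2", {"c03", "c04"}),
    "prophage_E": ("KL47", {"c02", "c05"}),
}


def cluster_vocabulary(training):
    """Sorted list of every depolymerase cluster seen in the training set."""
    return sorted({c for _, clusters in training.values() for c in clusters})


def presence_absence(clusters, vocabulary):
    """Binary feature vector: 1 if the prophage carries a cluster, else 0.

    Vectors of this kind are the sort of input a Random Forest could be trained on.
    """
    return [1 if c in clusters else 0 for c in vocabulary]


def build_cluster_votes(training):
    """Count, for each cluster, how often it co-occurs with each capsule type."""
    votes = defaultdict(Counter)
    for capsule, clusters in training.values():
        for c in clusters:
            votes[c][capsule] += 1
    return votes


def predict_capsule(clusters, votes):
    """Simplified stand-in for a classifier: each known cluster casts a
    normalised vote for the capsule types it was observed with."""
    tally = Counter()
    for c in clusters:
        counts = votes.get(c)
        if not counts:
            continue  # cluster never seen in training, so it carries no information
        total = sum(counts.values())
        for capsule, n in counts.items():
            tally[capsule] += n / total
    if not tally:
        return None
    # Highest score wins; ties are broken alphabetically for determinism.
    return min(tally.items(), key=lambda kv: (-kv[1], kv[0]))[0]


vocabulary = cluster_vocabulary(TRAINING_PROPHAGES)
votes = build_cluster_votes(TRAINING_PROPHAGES)
```

A query phage is then described only by the clusters its depolymerases fall into, for example `predict_capsule({"c02", "c05"}, votes)`. A phage whose depolymerases match no known cluster returns `None`, which mirrors a practical limit of any clustering-based approach: it cannot say anything about depolymerases unlike those seen in training.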