TUnA: An uncertainty aware transformer model for sequence-based protein-protein interaction prediction

Young Su Ko, Jonathan Parkinson, Cong Liu, Wei Wang
DOI: https://doi.org/10.1101/2024.02.19.581072
2024-02-21
Abstract: Protein-protein interactions (PPIs) are important for many biological processes, but predicting them from sequence data remains challenging. Existing deep learning models often cannot generalize to proteins not present in the training set, and do not provide uncertainty estimates for their predictions. To address these limitations, we present TUnA, a Transformer-based uncertainty aware model for PPI prediction. TUnA uses ESM-2 embeddings with Transformer encoders and incorporates a Spectral-normalized Neural Gaussian Process. TUnA achieves state-of-the-art performance and, importantly, evaluates uncertainty for unseen sequences. We demonstrate that TUnA’s uncertainty estimates can effectively identify the most reliable predictions, significantly reducing false positives. This capability is crucial in bridging the gap between computational predictions and experimental validation.
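The idea of using uncertainty estimates to keep only the most reliable predictions can be illustrated with a minimal sketch. It is not TUnA's code or API: the `Prediction` record, the `confident_positives` helper, the probability and uncertainty cutoffs, and all numbers are hypothetical and chosen only for illustration. The sketch assumes each prediction carries an interaction probability and an uncertainty score where lower means more reliable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    pair: Tuple[str, str]  # hypothetical protein identifiers
    prob: float            # predicted interaction probability
    uncertainty: float     # assumed score in [0, 1]; lower is more reliable


def confident_positives(
    preds: List[Prediction],
    prob_cutoff: float = 0.5,      # hypothetical decision threshold
    max_uncertainty: float = 0.2,  # hypothetical reliability cutoff
) -> List[Prediction]:
    """Keep predicted interactions whose uncertainty is below the cutoff."""
    return [
        p for p in preds
        if p.prob >= prob_cutoff and p.uncertainty <= max_uncertainty
    ]


# Made-up example values, not results from the paper.
preds = [
    Prediction(("P1", "P2"), prob=0.95, uncertainty=0.05),
    Prediction(("P1", "P3"), prob=0.60, uncertainty=0.45),  # positive, but uncertain
    Prediction(("P2", "P4"), prob=0.10, uncertainty=0.08),  # confident negative
    Prediction(("P3", "P4"), prob=0.85, uncertainty=0.15),
]

kept = confident_positives(preds)
print([p.pair for p in kept])
```

In this toy example the uncertain positive call is dropped before any experimental follow-up, which is the kind of false-positive reduction the abstract describes.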
Bioinformatics