RNA-Protein Interaction Classification via Sequence Embeddings

Dominika Matus, Frederic Runge, Jörg K.H. Franke, Lars Gerne, Michael Uhl, Rolf Backofen, Frank Hutter
DOI: https://doi.org/10.1101/2024.11.08.622607
2024-11-11
Abstract: RNA-protein interactions (RPIs) are ubiquitous in cellular organisms and essential for gene regulation. In particular, protein interactions with non-coding RNAs (ncRNAs) play a critical role in these processes. Experimental analysis of RPIs is time-consuming and expensive, and existing computational methods rely on small and limited datasets. This work introduces RNAInterAct, a comprehensive RPI dataset, alongside RPIembeddor, a novel transformer-based model designed for classifying ncRNA-protein interactions. By leveraging two foundation models for sequence embedding, we incorporate essential structural and functional insights into our task. We demonstrate RPIembeddor's strong performance and generalization capability compared to state-of-the-art methods across different datasets, and we analyze the impact of the proposed embedding strategy on performance in an ablation study.
Bioinformatics