SHAPEwarp-web: sequence-agnostic search for structurally homologous RNA regions across databases of chemical probing data

Niek R Scholten, Dennis Haandrikman, Joshua O Tolhuis, Edoardo Morandi, Danny Incarnato
DOI: https://doi.org/10.1093/nar/gkae348
IF: 14.9
2024-05-08
Nucleic Acids Research
Abstract: RNA molecules perform a variety of functions in cells, many of which rely on their secondary and tertiary structures. Chemical probing methods coupled with high-throughput sequencing have significantly accelerated the mapping of RNA structures, and increasingly large datasets of transcriptome-wide RNA chemical probing data are becoming available. Analogously to what has been done for decades in the protein world, this RNA structural information can be leveraged to aid the discovery of structural similarity to a known RNA (or RNA family), which, in turn, can inform about the function of transcripts. We have previously developed SHAPEwarp, a sequence-agnostic method for the search of structurally homologous RNA segments in a database of reactivity profiles derived from chemical probing experiments. In its original implementation, however, SHAPEwarp required substantial computational resources, even for moderately sized databases, as well as significant Linux command line know-how. To address these limitations, we introduce here SHAPEwarp-web, a user-friendly web interface to rapidly query large databases of RNA chemical probing data for structurally similar RNAs. Aside from featuring a completely rewritten core, which speeds up by orders of magnitude the search inside large databases, the web server hosts several high-quality chemical probing databases across multiple species. SHAPEwarp-web is available from https://shapewarp.incarnatolab.com.
biochemistry & molecular biology
What problem does this paper attempt to address?
The paper aims to make the search for structurally homologous RNA regions faster and easier to use. It introduces SHAPEwarp-web, a web-based tool for quickly querying large RNA chemical probing datasets for structurally similar RNA regions. Compared with the original SHAPEwarp, SHAPEwarp-web has the following features:

1. **Improved computational efficiency**: The core algorithm was rewritten from Perl to Rust, and the new implementation is on average two orders of magnitude faster than the original. For example, a single-threaded search with 50 queries, each 200 nucleotides long, takes about 82 seconds with the new version versus about 2.9 hours with the old one.
2. **User-friendly**: An easy-to-use web interface lets users run complex searches through simple input and parameter settings, without needing expertise in the Linux command line.
3. **Rich database support**: SHAPEwarp-web integrates multiple high-quality chemical probing databases covering RNA structure data from different species, including human, mouse, SARS-CoV-2 and ZIKV, among others.
4. **Advanced features**: Beyond the basic search, advanced parameters let users customize the search conditions, such as the E-value threshold and the evaluation of match folding. The match folding evaluation can use the RNAalifold software to generate consensus secondary structure models.

With these improvements, SHAPEwarp-web helps researchers explore structural features in the transcriptome more quickly and conveniently, and thereby better understand the relationship between RNA structure and function.
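As a quick sanity check on the quoted timings, the short sketch below converts the two figures from the example (about 2.9 hours versus about 82 seconds) into a speedup factor and its base-10 logarithm. The variable names are illustrative and are not taken from the paper or from SHAPEwarp itself; the only inputs are the two numbers reported in the summary.

```python
import math

# Timings quoted in the summary (50 queries of 200 nt, single thread).
old_seconds = 2.9 * 3600   # original implementation: about 2.9 hours
new_seconds = 82.0         # rewritten core: about 82 seconds

# Ratio of the two run times and its order of magnitude.
speedup = old_seconds / new_seconds
orders_of_magnitude = math.log10(speedup)

print(f"speedup: about {speedup:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

For this single benchmark the ratio comes out at roughly 127-fold, which is consistent with the "two orders of magnitude" figure; the average quoted in the summary presumably covers other query and database sizes as well.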