A mapping-free natural language processing-based technique for sequence search in nanopore long-reads

Tomasz Strzoda, Lourdes Cruz-Garcia, Mustafa Najim, Christophe Badie, Joanna Polanska
DOI: https://doi.org/10.1186/s12859-024-05980-7
IF: 3.307
2024-11-19
BMC Bioinformatics
Abstract: In unforeseen situations, such as nuclear power plant or civilian radiation accidents, there is a need for effective and computationally inexpensive methods to determine the expression level of a selected gene panel, allowing for rough dose estimates in thousands of donors. A new-generation in-situ mapper that is fast, consumes little energy, and works at the level of single nanopore output is in demand. We aim to create a sequence identification tool that utilizes natural language processing techniques and ensures a high negative predictive value (NPV) compared to the classical approach.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology