An End-to-end Oxford Nanopore Basecaller Using Convolution-augmented Transformer

Xuan Lv, Zhiguang Chen, Yutong Lu, Yuedong Yang
DOI: https://doi.org/10.1109/BIBM49941.2020.9313290
2020-01-01
Abstract: Oxford Nanopore sequencing is rapidly becoming an active field in genomics, and it is critical to basecall nucleotide sequences accurately from the complex electrical signals. Many efforts have been devoted to developing new basecalling tools over the years. However, basecalled reads still suffer from a high error rate, and basecalling remains slow. Here, we developed an open-source basecalling method, CATCaller, which simultaneously captures global context through attention and models local dependencies through dynamic convolution. The method was shown to consistently outperform the ONT default basecallers Albacore and Guppy, as well as the recently developed attention-based method SACall, in read accuracy. More importantly, our method is fast, using a heterogeneous computing model that integrates both CPUs and GPUs. Compared to SACall, the method is nearly 4 times faster on a single GPU, and it scales well in parallel, with a further 3.3-fold speedup on a four-GPU node.