A Time-Series Signal Classification Algorithm and Its Application to Nanopore Ionic Current Signal Identification

Ni Xue, Xin Kaili, Hu Zhengli, Jiang Cuiling, Wan Yongjing, Ying Yi-Lun, Long Yi-Tao
DOI: https://doi.org/10.6023/a23040113
2023-01-01
Abstract: Nanopore-based single-molecule analysis usually relies on time-domain features, such as time-current scatter plots of the blockade currents, for event recognition. However, because these time-domain features overlap, substances with extremely similar molecular structures are difficult to discriminate accurately using traditional nanopore recognition methods. The differences in deep feature representations need to be fully explored to obtain credible recognition results and thereby improve the recognition accuracy of nanopore ionic current signals. Here, a time-series signal classification algorithm is proposed. First, the original signal is framed with overlapping sliding windows to generate sub-signals, from which shallow feature information is extracted. Then, a time-series signal classification network based on Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network (ECAPA-TDNN) is proposed, forming a multi-branch inter-layer feature fusion model for deep feature extraction. The multi-branch, multi-level attention module of this model (RepVGG-SE-Res2Block, RSR-Block) obtains multi-scale features by constructing a feature pyramid structure within each residual block and uses structural reparameterization to reduce inference time while preserving model performance; Adaptively Spatial Feature Fusion (ASFF) is introduced to fuse the features of different layers in the network. Finally, a credible statistical prediction strategy is used to obtain reliable classification results by counting the classification probabilities of the sub-signals.
The experimental results show that for the peptide sequences N'-DDFFIFFDD-C' (DF_I) and N'-DDFFLFFDD-C' (DF_L), which differ only in a single amino acid, I (isoleucine) versus L (leucine), two residues that are isomers of each other, the algorithm achieves a recognition accuracy of 99.00%, markedly improving the sensing capability of nanopores for single molecules with similar or even identical molecular weights.
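
The first and last stages of the pipeline described in the abstract (overlapping sliding-window framing and a statistical decision over sub-signal probabilities) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `frame_signal` and `aggregate_predictions`, the `frame_len`, `hop`, and `min_confidence` parameters, and the choice of averaging the per-frame probabilities are all assumptions made for illustration, since the abstract does not specify window sizes or the exact statistic. The classification network itself is omitted; the per-frame probabilities are supplied by hand.

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (sub-signals).

    frame_len and hop are hypothetical parameters; consecutive frames
    overlap whenever hop < frame_len.
    """
    signal = np.asarray(signal, dtype=float)
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    starts = hop * np.arange(n_frames)
    return np.stack([signal[s:s + frame_len] for s in starts])


def aggregate_predictions(frame_probs, min_confidence=0.0):
    """Combine per-frame class probabilities into one event-level decision.

    Assumption: the probabilities are averaged over frames, and the event
    is left unlabeled (None) if the winning mean probability falls below
    min_confidence. The paper's actual statistic may differ.
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    mean_probs = frame_probs.mean(axis=0)
    label = int(np.argmax(mean_probs))
    confidence = float(mean_probs[label])
    if confidence < min_confidence:
        return None, confidence
    return label, confidence


# Usage: a 10-sample toy signal, frames of 4 samples with a hop of 2.
frames = frame_signal(np.arange(10), frame_len=4, hop=2)

# Hand-written probabilities for three sub-signals over two classes,
# standing in for the output of a classifier.
toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
label, confidence = aggregate_predictions(toy_probs)
```

Deciding at the event level from many sub-signals means a single misclassified frame does not determine the result, and a confidence threshold lets ambiguous events be rejected rather than forced into a class.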