Abstract: Kernel-based approaches for sequence classification have been successfully applied in a variety of domains, including text categorization, image classification, speech analysis, biological sequence analysis, and time series and music classification, where they show some of the most accurate results. Typical kernel functions for sequences in these domains (e.g., bag-of-words, mismatch, or subsequence kernels) are restricted to {\em discrete univariate} (i.e., one-dimensional) string data, such as sequences of words in text analysis, codeword sequences in image analysis, or nucleotide or amino acid sequences in DNA and protein sequence analysis. However, the original sequence data are often real-valued and multivariate, i.e., not univariate and discrete as required by typical $k$-mer based sequence kernel functions. In this work, we consider the problem of {\em multivariate} sequence classification, such as classification of multivariate music sequences or multidimensional protein sequence representations. To this end, we extend the {\em univariate} kernel functions typically used in sequence analysis and propose an efficient {\em multivariate} similarity kernel method (MVDFQ-SK) based on (1) a direct feature quantization (DFQ) of each sequence dimension in the original {\em real-valued} multivariate sequences and (2) novel multivariate discrete kernel measures applied to these multivariate discrete DFQ sequence representations, which more accurately capture similarity relationships among sequences and improve classification performance. Experiments using the proposed MVDFQ-SK kernel method show excellent classification performance on three challenging music classification tasks as well as on protein sequence classification, with significant 25-40% improvements over univariate kernel methods and existing state-of-the-art sequence classification methods.
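
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the MVDFQ-SK method itself: the abstract does not specify the quantizer or the multivariate kernel measures, so this sketch assumes fixed bin edges shared by all dimensions and substitutes a plain sum of per-dimension $k$-mer spectrum kernels. All function names (`quantize`, `kmer_counts`, `spectrum_kernel`, `multivariate_kernel`) and the `edges` parameter are hypothetical.

```python
from collections import Counter

import numpy as np


def quantize(seq, edges):
    """Map each real value of a (T, D) sequence to a discrete bin index.

    Assumption: the same fixed bin edges are used for every dimension.
    """
    return np.digitize(np.asarray(seq, dtype=float), edges)


def kmer_counts(symbols, k):
    """Count the contiguous k-mers of a univariate discrete sequence."""
    return Counter(tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1))


def spectrum_kernel(a, b, k):
    """Univariate k-mer spectrum kernel: inner product of k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())


def multivariate_kernel(x, y, edges, k=2):
    """Stand-in multivariate kernel: sum of per-dimension spectrum kernels.

    Assumption: this simple sum replaces the paper's multivariate
    discrete kernel measures, which the abstract does not define.
    """
    qx, qy = quantize(x, edges), quantize(y, edges)
    return sum(
        spectrum_kernel(qx[:, d].tolist(), qy[:, d].tolist(), k)
        for d in range(qx.shape[1])
    )
```

For two sequences `x` and `y` with the same number of dimensions, `multivariate_kernel(x, y, edges=[0.5])` returns a symmetric similarity score. A sum of valid kernels is itself a valid kernel, so scores computed this way could be assembled into a Gram matrix for a kernel classifier such as an SVM.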

Efficient multivariate sequence classification

Distance-Based Classifier Via the Kernel Trick

Biological Sequence Kernels with Guaranteed Flexibility

Topic Sequence Kernel

Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods

Kernels for sequentially ordered data

A Review of Kernel Methods Based Approaches to Classification and Clustering of Sequential Patterns

Quantum-Classical Multiple Kernel Learning

Benchmarking quantum machine learning kernel training for classification tasks

Early Classifying Multimodal Sequences

Quantum machine learning for multiclass classification beyond kernel methods

Quantum Multiple Kernel Learning in Financial Classification Tasks

Quantum Time Series Similarity Measures and Quantum Temporal Kernels

Efficient Convex Algorithms for Universal Kernel Learning

Towards Efficient Quantum Anomaly Detection: One-Class SVMs using Variable Subsampling and Randomized Measurements

BioSequence2Vec: Efficient Embedding Generation For Biological Sequences

Quadratic speed-ups in quantum kernelized binary classification

Quantum kernels for classifying dynamical singularities in a multiqubit system

Pairwise classification using quantum support vector machine with Kronecker kernel

Quantum classifiers with a trainable kernel

Kernels for time series with irregularly-spaced multivariate observations