Sparse Signature Coefficient Recovery via Kernels

Daniil Shmelev, Cristopher Salvi
2024-12-12
Abstract: Central to rough path theory is the signature transform of a path, an infinite series of tensors given by the iterated integrals of the underlying path. The signature poses an effective way to capture sequentially ordered information, thanks both to its rich analytic and algebraic properties as well as its universality when used as a basis to approximate functions on path space. Whilst a truncated version of the signature can be efficiently computed using Chen's identity, there is a lack of efficient methods for computing a sparse collection of iterated integrals contained in high levels of the signature. We address this problem by leveraging signature kernels, defined as the inner product of two signatures, and computable efficiently by means of PDE-based methods. By forming a filter in signature space with which to take kernels, one can effectively isolate specific groups of signature coefficients and, in particular, a singular coefficient at any depth of the transform. We show that such a filter can be expressed as a linear combination of suitable signature transforms and demonstrate empirically the effectiveness of our approach. To conclude, we give an example use case for sparse collections of signature coefficients based on the construction of N-step Euler schemes for sparse CDEs.
Numerical Analysis, Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is **how to efficiently compute a sparse set of iterated-integral coefficients of the path signature**. Specifically, the paper proposes a method that uses signature kernels to recover sparse coefficients of the signature.

### Background and Problem Description

The path signature is a powerful tool for capturing sequential information, and it is particularly effective on streaming data such as time series, text, and audio. The signature is the collection of all iterated integrals of a path. It has rich analytic and algebraic properties and can be used as a basis for approximating functions on path space. A truncated signature can be computed efficiently using Chen's identity, but there is currently a lack of efficient methods for computing a sparse set of iterated integrals contained in the high levels of the signature.

### Main Contributions of the Paper

1. **Introduction of the Signature Kernel Method**: The signature kernel is defined as the inner product of two signatures and can be computed efficiently with PDE-based methods. The paper uses these kernels as the basis of a new technique for recovering sparse signature coefficients.
2. **Design of Filters**: By constructing an appropriate filter in signature space and taking kernels against it, specific groups of signature coefficients can be isolated, including a single coefficient at any depth.
3. **Linear Combination Representation**: The paper shows that such a filter can be expressed as a linear combination of suitable signature transforms, and demonstrates the effectiveness of the approach empirically.
4. **Complexity Optimization**: The serial complexity of the new method is O(Ln^2n), but it can be reduced to O(L) through parallelization, which is significantly better than the existing method based on Chen's relation.
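As a rough illustration of the PDE-based kernel computation mentioned in the contributions, the sketch below uses a standard result from the signature kernel literature (it is not stated in this summary): the kernel k(t, s) of two differentiable paths solves the Goursat problem ∂²k/∂t∂s = ⟨ẋ_t, ẏ_s⟩ k with boundary values equal to 1. The function name `signature_kernel_pde` and the first-order explicit finite-difference update are assumptions chosen for brevity, not the solver used by the authors.

```python
import numpy as np


def signature_kernel_pde(x, y):
    """Approximate <S(x), S(y)> for paths sampled as arrays of shape (m, d) and (n, d).

    Hypothetical helper: a first-order explicit scheme for the Goursat PDE
    d^2 k / dt ds = <x'_t, y'_s> k with k = 1 on the two initial boundaries.
    """
    dx = np.diff(x, axis=0)          # increments of x, shape (m - 1, d)
    dy = np.diff(y, axis=0)          # increments of y, shape (n - 1, d)
    inc = dx @ dy.T                  # inc[i, j] = <dx_i, dy_j>
    k = np.ones((len(x), len(y)))    # boundary condition: first row and column are 1
    for i in range(inc.shape[0]):
        for j in range(inc.shape[1]):
            # Discretised mixed derivative set equal to <dx_i, dy_j> * k[i, j]
            k[i + 1, j + 1] = k[i + 1, j] + k[i, j + 1] + (inc[i, j] - 1.0) * k[i, j]
    return k[-1, -1]


# Usage: a straight line from 0 to 1 in one dimension, paired with itself.
# The exact kernel is sum_k 1 / (k!)^2, so the result should approach that
# value as the sampling grid is refined.
t = np.linspace(0.0, 1.0, 257)[:, None]
approx = signature_kernel_pde(t, t)
```

The scheme is only first-order accurate, so paths should be sampled finely (or a higher-order update used) when accuracy matters. The double loop runs along a grid whose anti-diagonals are independent of each other, which is one way a solver of this kind can be parallelised.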
### Application Examples

The paper also gives an example use case for sparse collections of signature coefficients: the construction of N-step Euler schemes for sparse CDEs (controlled differential equations).

### Key Formulas

- The k-th level of the signature is defined as:
\[ S(x)^{(k)}_{[s,t]} = \int_{s < t_1 < \cdots < t_k < t} dx_{t_1} \otimes dx_{t_2} \otimes \cdots \otimes dx_{t_k} \in V^{\otimes k} \]
- The signature kernel is defined as:
\[ k_{x,y}(t, s) = \langle S(x)_{[a,t]}, S(y)_{[c,s]} \rangle_{T((V))} \]
- The truncated signature kernel is defined as:
\[ k^n_{x,y}(t, s) = \sum_{k=0}^{n} \langle S(x)^{(k)}_{[a,t]}, S(y)^{(k)}_{[c,s]} \rangle_{V^{\otimes k}} \]

Together, these definitions underpin the paper's approach to efficiently computing sparse signature coefficients, providing new tools for processing streaming data.
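To make the formulas above concrete, here is a minimal sketch that computes the signature levels of a piecewise-linear path with Chen's identity, evaluates the truncated kernel as a sum of level-wise inner products, and reads off one coefficient by indexing. The function names `truncated_signature` and `truncated_kernel` are hypothetical, and this is the dense baseline computation rather than the paper's kernel-based recovery method.

```python
import numpy as np


def truncated_signature(path, depth):
    """Levels 0..depth of the signature of a piecewise-linear path of shape (m, d)."""
    d = path.shape[1]
    # Signature of a constant path: 1 at level 0, zero tensors at higher levels.
    sig = [np.ones(())] + [np.zeros((d,) * k) for k in range(1, depth + 1)]
    for dx in np.diff(path, axis=0):
        # One linear segment has signature exp(dx): level k is dx^{(x)k} / k!
        seg = [np.ones(())]
        for k in range(1, depth + 1):
            seg.append(np.multiply.outer(seg[-1], dx) / k)
        # Chen's identity: level k of the concatenation is sum_{i+j=k} sig_i (x) seg_j
        sig = [
            sum(np.multiply.outer(sig[i], seg[k - i]) for i in range(k + 1))
            for k in range(depth + 1)
        ]
    return sig


def truncated_kernel(x, y, depth):
    """Sum over levels 0..depth of the inner products of the signature levels."""
    sx = truncated_signature(x, depth)
    sy = truncated_signature(y, depth)
    return float(sum(np.sum(a * b) for a, b in zip(sx, sy)))


# Usage: an L-shaped path in the plane, first along axis 0, then along axis 1.
corner = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
level2 = truncated_signature(corner, 2)[2]
coeff_01 = level2[0, 1]   # iterated integral of dx^0 then dx^1
coeff_10 = level2[1, 0]   # iterated integral of dx^1 then dx^0
```

Level k is stored as a dense array with d^k entries, so the memory and time needed to reach a single deep coefficient this way grow exponentially with depth. That cost is the motivation for recovering sparse coefficients without computing the full truncated signature.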