Abstract: Time series are ubiquitous in numerous science and engineering domains, e.g., signal processing, bioinformatics, and astronomy. Previous work has verified the efficacy of symbolic time series representation in a variety of engineering applications due to its storage efficiency and numerosity reduction. The most recent symbolic aggregate approximation technique, ABBA, has been shown to preserve essential shape information of time series and improve downstream applications, e.g., neural network inference for prediction and anomaly detection in time series. Motivated by the emergence of high-performance hardware that enables efficient computation on low bit-width representations, we present a new quantization-based ABBA symbolic approximation technique, QABBA, which exhibits improved storage efficiency while retaining the original speed and accuracy of symbolic reconstruction. We prove an upper bound for the error arising from quantization and discuss how the number of bits should be chosen to balance this with other errors. An application of QABBA with large language models (LLMs) for time series regression is also presented, and its utility is investigated. By representing time series as a symbolic chain of patterns, QABBA not only avoids training embeddings from scratch, but also achieves a new state of the art on the Monash regression dataset. The symbolic approximation of the time series offers a more efficient way to fine-tune LLMs on time series regression tasks, which span various application domains. We further present a set of extensive experiments performed across various well-established datasets to demonstrate the advantages of the QABBA method for symbolic approximation.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to improve storage efficiency while maintaining the speed and accuracy of the symbolic representation of time series. Specifically, the author proposes a new quantization technique, QABBA (Quantized ABBA), to reduce the storage requirements of time-series data while minimizing the additional error introduced by quantization.

### Problem Background

Time-series data widely exists in many scientific and engineering fields, such as signal processing, bioinformatics, and astronomy. Although traditional symbolic time-series representation methods (such as SAX) are effective, there is still room for improvement in terms of storage efficiency and dimension reduction. The recent ABBA (Adaptive Brownian Bridge-based Aggregation) method has demonstrated its superiority in retaining the shape information of time series, but there is still the possibility of further optimization.

### Research Motivation

With the development of high-performance hardware, low-bit-width representations (such as integer operations) can significantly reduce storage and computational costs without sacrificing speed and precision. Therefore, inspired by the quantization techniques of deep-learning models, the author applies the quantization technique to the ABBA method and proposes QABBA.

### Main Contributions

1. **Propose QABBA**: By replacing the original floating-point representation with low-bit-width integer types, QABBA can significantly improve storage efficiency while maintaining the speed and accuracy of the ABBA method.
2. **Quantization Error Analysis**: The author analyzes the additional approximation error introduced by quantization and proves the upper bound of the quantization error through theoretical derivation, providing theoretical support for the application of the quantization technique.
3. **Applied Research**: QABBA is combined with large language models for time-series regression tasks, demonstrating its potential in various application scenarios.
4. **Experimental Verification**: Through experiments on multiple commonly used datasets, the quantization error and reconstruction quality of QABBA under different bit lengths are verified, proving its superiority.

### Summary of Mathematical Formulas

- **Quantization Mapping**:
  \[ \tilde{x} = Q(x) = \left\lfloor \frac{x - z}{s} \right\rceil \]
  where \(s = \frac{\eta - \zeta}{e_{\eta} - e_{\zeta}}\), and \(\left\lfloor \cdot \right\rceil\) represents rounding to the nearest integer.
- **De-quantization Mapping**:
  \[ y = Q^{-1}(\tilde{x}) = s(\tilde{x} + z) \]
- **Upper Bound of Quantization Error**:
  \[ \|\tilde{C} - C\|_F \leq \frac{\eta - \zeta}{2^{\omega + 1} - 2} \sqrt{2k} \]
- **Upper Bound of SSE after Quantization**:
  \[ dSSE \leq SSE + \frac{2N(\eta - \zeta)^2}{(2^{\omega + 1} - 2)^2} \]

Through these improvements, QABBA not only improves the storage efficiency of the symbolic representation of time series, but also maintains the original speed and accuracy, and is suitable for various downstream tasks, such as prediction and anomaly detection.
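
The quantization mappings and the Frobenius-norm bound summarized above can be checked numerically with a minimal sketch. It is not the paper's implementation and rests on several assumptions:

- \([\zeta, \eta]\) is the range of the floating-point values.
- \([e_{\zeta}, e_{\eta}]\) is the signed \(\omega\)-bit integer range, so \(e_{\eta} - e_{\zeta} = 2^{\omega} - 1\).
- \(C\) is a \(k \times 2\) matrix of cluster centers.
- The zero point is applied as \(Q(x) = \lfloor x/s - z \rceil\) with a real-valued \(z = \zeta/s - e_{\zeta}\), so that \(s(\tilde{x} + z)\) inverts it. The paper's exact convention may differ.

The function names `quantize`, `dequantize`, and `frobenius_bound` are hypothetical.

```python
import numpy as np


def quantize(x, omega, zeta, eta):
    """Map floats in [zeta, eta] to signed omega-bit integer codes (assumes eta > zeta)."""
    e_zeta = -(2 ** (omega - 1))          # smallest integer code
    e_eta = 2 ** (omega - 1) - 1          # largest integer code
    s = (eta - zeta) / (e_eta - e_zeta)   # scale: width of one quantization step
    z = zeta / s - e_zeta                 # zero point, kept real-valued (assumption)
    codes = np.rint(np.asarray(x, dtype=np.float64) / s - z)
    # Clip guards against floating-point overshoot at the range ends;
    # int64 is used for generality, a narrower dtype would be used for storage.
    codes = np.clip(codes, e_zeta, e_eta).astype(np.int64)
    return codes, s, z


def dequantize(codes, s, z):
    """Approximate inverse of quantize: y = s * (codes + z)."""
    return s * (codes + z)


def frobenius_bound(zeta, eta, omega, k):
    """Right-hand side of the quantization error bound for a k-by-2 matrix."""
    return (eta - zeta) / (2 ** (omega + 1) - 2) * np.sqrt(2 * k)


rng = np.random.default_rng(0)
k, omega = 16, 8
C = rng.normal(size=(k, 2))               # random stand-in for k cluster centers
zeta, eta = C.min(), C.max()
codes, s, z = quantize(C, omega, zeta, eta)
C_tilde = dequantize(codes, s, z)
err = np.linalg.norm(C_tilde - C)         # Frobenius norm for a 2-D array
print(err, frobenius_bound(zeta, eta, omega, k))
```

Under these assumptions each entry is off by at most half a quantization step, \(s/2 = (\eta - \zeta)/(2^{\omega + 1} - 2)\), and \(C\) has \(2k\) entries, which is where the \(\sqrt{2k}\) factor in the bound comes from.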

Quantized symbolic time series approximation
