PyDTS: A Python Toolkit for Deep Learning Time Series Modelling

Pascal A. Schirmer, Iosif Mporas
DOI: https://doi.org/10.3390/e26040311
IF: 2.738
2024-04-01
Entropy
Abstract: In this article, the topic of time series modelling is discussed. It highlights the criticality of analysing and forecasting time series data across various sectors, identifying five primary application areas: denoising, forecasting, nonlinear transient modelling, anomaly detection, and degradation modelling. It further outlines the mathematical frameworks employed in a time series modelling task, categorizing them into statistical, linear algebra, and machine- or deep-learning-based approaches, with each category serving distinct dimensions and complexities of time series problems. Additionally, the article reviews the extensive literature on time series modelling, covering statistical processes, state space representations, and machine and deep learning applications in various fields. The unique contribution of this work lies in its presentation of a Python-based toolkit for time series modelling (PyDTS) that integrates popular methodologies and offers practical examples and benchmarking across diverse datasets.
physics, multidisciplinary
What problem does this paper attempt to address?
The paper primarily addresses several key issues in the field of time series modeling and proposes a Python toolkit named PyDTS to support these tasks. Specifically, the paper focuses on the following problems:

1. **Denoising**: Recovering the true values of signals from noisy observations, such as separating the power consumption of individual appliances from the total household power usage.
2. **Forecasting**: Predicting future values based on historical data, such as weather forecasting or power demand prediction.
3. **Nonlinear Transient Modelling**: Addressing nonlinear and potentially underdetermined problems when the input is a time series, such as simulating the transient behavior of thermal, structural, or fluid systems.
4. **Anomaly Detection**: Identifying outliers in large time series datasets, such as detecting faulty samples in production sequences.
5. **Degradation Modelling**: Describing the relationship between output parameters that change slowly over time and input parameters, such as the change in battery capacity over time and usage conditions.

To address the aforementioned issues, the paper proposes a Python toolkit, PyDTS, which integrates various commonly used time series modeling methods and provides practical examples and benchmark results on different datasets. PyDTS aims to lower the barrier to using deep learning-based methods for time series modeling, allowing users to perform model training, evaluation, and other operations through simple function calls without dealing with complex steps like data preprocessing and result visualization. The paper also provides a detailed introduction to mathematical modeling methods in different application scenarios and compares the advantages and disadvantages of statistical modeling, linear algebra modeling, and machine learning/deep learning modeling methods.
Finally, the paper experimentally validates the proposed toolkit, including analysis on multiple real-world datasets, to demonstrate its effectiveness and practicality.
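
To make the forecasting task described above more concrete, the following is a minimal sketch of how a time series is commonly turned into a supervised learning problem with sliding windows, followed by a linear least-squares baseline. It uses only NumPy and does not use or reproduce the PyDTS API; the helper name `make_windows`, the synthetic sine-wave data, the window length, and the train/test split are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def make_windows(series, window, horizon):
    """Split a 1-D series into (past window, future value) pairs.

    Hypothetical helper for illustration only; not part of PyDTS.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested window and horizon")
    # Each row of X holds `window` consecutive past samples.
    X = np.stack([series[i:i + window] for i in range(n)])
    # Each target lies `horizon` steps after the end of its window.
    start = window + horizon - 1
    y = series[start:start + n]
    return X, y


# Synthetic, noise-free sine wave (an assumption; stands in for real data).
t = np.arange(200)
series = np.sin(2 * np.pi * t / 25)

X, y = make_windows(series, window=10, horizon=1)

# Chronological split: fit on the first 150 windows, evaluate on the rest.
split = 150
A_train = np.c_[X[:split], np.ones(split)]            # add a bias column
coef, *_ = np.linalg.lstsq(A_train, y[:split], rcond=None)

A_test = np.c_[X[split:], np.ones(len(X) - split)]
pred = A_test @ coef
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
print(f"one-step-ahead test RMSE: {rmse:.2e}")
```

The same windowing idea carries over to the other tasks by changing what the target is: for denoising, for example, the target would be the clean signal aligned with the noisy input window rather than a future value. A deep learning model would replace the least-squares fit, while the chronological split keeps future samples out of the training set.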