Extended Deep Adaptive Input Normalization for Preprocessing Time Series Data for Neural Networks

Marcus A. K. September, Francesco Sanna Passino, Leonie Goldmann, Anton Hinel
2024-02-29
Abstract: Data preprocessing is a crucial part of any machine learning pipeline, and it can have a significant impact on both performance and training efficiency. This is especially evident when using deep neural networks for time series prediction and classification: real-world time series data often exhibit irregularities such as multi-modality, skewness and outliers, and the model performance can degrade rapidly if these characteristics are not adequately addressed. In this work, we propose the EDAIN (Extended Deep Adaptive Input Normalization) layer, a novel adaptive neural layer that learns how to appropriately normalize irregular time series data for a given task in an end-to-end fashion, instead of using a fixed normalization scheme. This is achieved by optimizing its unknown parameters simultaneously with the deep neural network using back-propagation. Our experiments, conducted using synthetic data, a credit default prediction dataset, and a large-scale limit order book benchmark dataset, demonstrate the superior performance of the EDAIN layer when compared to conventional normalization methods and existing adaptive time series preprocessing layers.
Machine Learning
### What problems does this paper attempt to solve?

This paper addresses key problems in the data preprocessing stage when deep neural networks are used for time series prediction and classification. Specifically:

1. **Irregularities of real-world time series data**:
   - Real-world time series data often exhibit multi-modality, skewness, and outliers.
   - If these irregularities are not adequately handled, model performance can degrade rapidly.
2. **Limitations of traditional normalization methods**:
   - Traditional static normalization methods (such as z-score normalization and min-max scaling) only change the location and scale of the data and cannot effectively handle the characteristics listed above.
   - These fixed normalization schemes may lead to sub-optimal results, especially in the presence of skewed distributions, heavy-tailed distributions, or outliers.
3. **The need for automated and adaptive data preprocessing**:
   - Determining the most suitable preprocessing method usually takes considerable time, because each candidate requires retraining the model and re-evaluating its performance.
   - The authors therefore seek an efficient, automated preprocessing method that optimizes the predictive performance of neural networks, in particular for the normalization of multivariate time series data.

To solve these problems, the authors propose the **EDAIN (Extended Deep Adaptive Input Normalization) layer**, a new adaptive neural layer that learns how to properly normalize irregular time series data in an end-to-end manner instead of using a fixed normalization scheme. Its unknown parameters are optimized jointly with the deep neural network via back-propagation, which allows EDAIN to handle multi-modality, skewness, and outliers more effectively, thereby improving the performance and training efficiency of the model.
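The limitation of static normalization described in point 2 can be illustrated with a small sketch. This is not the EDAIN layer or any code from the paper. It only applies z-score and min-max scaling to a hypothetical one-dimensional feature (the `series` values are invented for illustration) to show that both transforms shift and rescale the data without changing its shape, so a single outlier compresses the typical observations into a very narrow range.

```python
import statistics


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation.

    Assumes the input is not constant, otherwise the standard deviation is zero.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly to the interval [0, 1].

    Assumes the input is not constant, otherwise max equals min.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature (an assumption for illustration):
# five typical readings followed by one extreme outlier.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]

z = z_score(series)
m = min_max(series)

# Range occupied by the five typical points after each transform.
typical_spread_z = max(z[:5]) - min(z[:5])
typical_spread_m = max(m[:5]) - min(m[:5])

print(f"z-score spread of typical points: {typical_spread_z:.4f}")
print(f"min-max spread of typical points: {typical_spread_m:.4f}")
```

In both cases the five typical readings end up almost indistinguishable from one another, which is the kind of behaviour that motivates learning the normalization jointly with the network rather than fixing it in advance.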
### Summary

The main objective of this paper is to propose a novel, efficient, automated data preprocessing method, especially for the normalization of multivariate time series data, to optimize the prediction performance of deep neural networks and address the shortcomings of existing methods in handling complex data characteristics.