Representational primitives using trend based global features for time series classification

C.I. Johnpaul, Munaga V.N.K. Prasad, S. Nickolas, G.R. Gangadharan
DOI: https://doi.org/10.1016/j.eswa.2020.114376
IF: 8.5
2021-04-01
Expert Systems with Applications
Abstract: Feature-based learning of time series sequences involves the systematic steps of preprocessing, representing and analyzing the properties of time series elements. Representational features map time series properties, namely trend, seasonality and stationarity. Usually, segment-wise generation of representational structures does not capture the global features of a time series sequence, which can influence the learning algorithms. Global information about each time series sequence reinforces the segmental properties present in it. Identifying, extracting and processing global features that are common to all time series sequences are challenging tasks in time series feature learning. Hence, we propose a novel set of global features that provides additional representational leverage in feature-based time series learning scenarios. The feature-enriched primitives provide additional information on the global trend pattern in each time series sequence. This enables the learning algorithms to process the time series sequences with awareness of trend information. We formed a minimal set of the most influential trend features that describe the behavior of time series sequences. Thus the dimensionality of the features, which influences the performance of various learning algorithms, is preserved. The experiments on these novel representational structures are performed on the UCR-2018 time series archive, which contains 128 datasets. We also represented the trend sequences in a pictorial form named the positional size diagram (PSD) and aggregated all the instances of the datasets into an auxiliary data representation named the positional dataset (PD). We compared six traditional classification algorithms, namely k-nearest neighbor (k-NN), logistic regression (LR), support vector (SV), decision tree (DC), Gaussian naive Bayes (GNB) and random forest (RF), with trendlets. The additional set of global features enriches the trendlets with supplementary information about the trend of time series sequences. The classification accuracy of the aforementioned algorithms shows a significant improvement with this additional set of global features.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science
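Illustrative sketch: the abstract does not list the proposed global trend features, so the following Python sketch only illustrates the general idea of appending a few whole-series trend descriptors to an existing per-series feature vector. The four features used here (least-squares slope, fraction of upward steps, fraction of downward steps, number of direction changes) and the function names `global_trend_features` and `augment` are assumptions made for illustration, not the authors' feature set or code.

```python
from typing import Dict, List, Sequence


def global_trend_features(series: Sequence[float]) -> Dict[str, float]:
    """Summarize the overall trend of one series (hypothetical features)."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")

    # Least-squares slope of the values against the time index 0..n-1.
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var_t

    # Sign of each consecutive step: +1 up, -1 down, 0 flat.
    steps = [(b > a) - (b < a) for a, b in zip(series, series[1:])]
    # Ignore flat steps, then count switches between up and down moves.
    moves = [s for s in steps if s != 0]
    changes = sum(1 for a, b in zip(moves, moves[1:]) if a != b)

    return {
        "slope": slope,
        "frac_up": steps.count(1) / len(steps),
        "frac_down": steps.count(-1) / len(steps),
        "direction_changes": float(changes),
    }


def augment(local_features: List[float], series: Sequence[float]) -> List[float]:
    """Append the global trend features (in sorted key order) to a feature vector."""
    g = global_trend_features(series)
    return list(local_features) + [g[k] for k in sorted(g)]
```

The augmented vectors could then be passed to any of the classifiers named in the abstract. Keeping the global block to a handful of values mirrors the abstract's point that the added features should not inflate dimensionality.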