Multiple Time Series Modeling of Autoregressive Distributed Lags with Forward Variable Selection for Prediction

Achmad Efendi, Yusi Tyroni Mursityo, Ninik Wahju Hidajati, Nur Andajani, Zuraidah Zuraidah, Samingun Handoyo
DOI: https://doi.org/10.37394/23207.2024.21.84
2024-04-19
WSEAS TRANSACTIONS ON BUSINESS AND ECONOMICS
Abstract: Conventional time series methods tend to emphasize the modeling process and statistical tests when searching for the best model. Machine learning methods, on the other hand, choose the model with the best performance on the testing data. This research proposes a mixed approach to developing the ARDL (Autoregressive Distributed Lags) model for predicting the price of cayenne pepper. Multiple time series data are formed into a matrix of input-output pairs with lag numbers of 3, 5, and 7. The dataset is normalized with the min-max and Z-score transformations. For each combination of lag number and dataset, the ARDL predictor variables are selected using the forward selection method with a majority vote of four criteria, namely Cp (Mallows' Cp), AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and adjusted R². Each ARDL model is evaluated on the testing data with the performance metrics RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and R². In all scenarios, both AIC and adjusted R² are always part of the majority vote that determines the optimal predictor variables of the ARDL models. The selected predictor variables differ across lag numbers, but for a given lag number they are the same across the dataset scenarios. Yesterday's price of cayenne pepper is the predictor variable with the largest contribution in all 9 ARDL models obtained. The ARDL lag 3 model with the original dataset performs best on the RMSE and MAE metrics, while the ARDL lag 3 model with the Z-score dataset performs best on the R² metric.