Predicting Next Application Most Likely Used with Word Embedding and Time-Series Data Encoding

G. Gweon, Taieui Song
DOI: https://doi.org/10.1109/BigComp57234.2023.00051
2023-02-01
Abstract: Predicting the Next Application Most Likely Used, referred to as NAMLU, is the problem of predicting which application a user will use next. In this study, we propose the Time Series Deep learning (TED) model, which has two components that yield better performance on the NAMLU task than NAP (Natural App Processing), the existing state-of-the-art model. The TED model makes the following architectural changes to address two challenges: (1) to reduce the cost of building data representations, we use word embedding and time-series data encoding; (2) to increase the model’s inference speed, we use a CNN-based deep learning architecture. Two experiments were conducted to identify the best data embedding and encoding representation and to measure the TED model's performance. The experiments showed that using Doc2Vec for embedding and Gramian Angular Field (GAF) for encoding yields the best performance, at 84.5% (Recall@5). This result is about 5 percentage points higher than the SOTA model, NAP. In addition, the experiments confirmed that using the transfer learning model improves prediction performance by 4% compared to the user-independent model.
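To make the Gramian Angular Field encoding named in the abstract more concrete, here is a minimal sketch of the standard GAF construction in Python with NumPy. It is not the authors' implementation. The function name `gramian_angular_field`, the `method` parameter, and the sample `usage_hours` series are assumptions chosen for illustration; the abstract does not say which GAF variant or input features the TED model uses.

```python
import numpy as np


def gramian_angular_field(series, method="summation"):
    """Encode a 1-D time series as a 2-D Gramian Angular Field image.

    Hypothetical helper: follows the standard GAF definition, not
    necessarily the exact preprocessing used in the paper.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        scaled = np.zeros_like(x)
    else:
        # Min-max rescale into [-1, 1] so that arccos is defined.
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    # Guard against floating-point drift just outside [-1, 1].
    scaled = np.clip(scaled, -1.0, 1.0)
    # Polar encoding: each value becomes an angle.
    phi = np.arccos(scaled)
    if method == "summation":
        # GASF: cos(phi_i + phi_j) for every pair of time steps.
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        # GADF: sin(phi_i - phi_j) for every pair of time steps.
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


# Hypothetical input: hour of day at which eight consecutive app launches occurred.
usage_hours = [0, 3, 7, 8, 12, 13, 19, 22]
image = gramian_angular_field(usage_hours)
print(image.shape)
```

The resulting square matrix can be treated as a single-channel image, which is what makes a CNN-based architecture a natural fit for the encoded series.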
Computer Science