CoSEM: Contextual and Semantic Embedding for App Usage Prediction

Yonchanok Khaokaew, Mohammad Saiedur Rahaman, Ryen W. White, Flora D. Salim
DOI: https://doi.org/10.1145/3459637.3482076
2021-08-26
Abstract: App usage prediction is important for smartphone system optimization to enhance user experience. Existing modeling approaches utilize historical app usage logs along with a wide range of semantic information to predict app usage; however, they are only effective in certain scenarios and cannot be generalized across different situations. This paper addresses this problem by developing the Contextual and Semantic Embedding model for App Usage Prediction (CoSEM), which integrates 1) semantic information embedding and 2) contextual information embedding based on individuals' historical app usage. Extensive experiments show that combining semantic information with historical app usage information enables our model to outperform the baselines on three real-world datasets, achieving MRR scores over 0.55, 0.57, and 0.86 and Hit rate scores of more than 0.71, 0.75, and 0.95, respectively.
Machine Learning
What problem does this paper attempt to address?
This paper addresses smartphone app usage prediction, and in particular the difficulty of generalizing across different kinds of semantic information (such as tasks, search queries, locations, and time). Existing methods combine historical app usage logs with extensive semantic information; they work well in some scenarios but do not generalize to others. The paper highlights several key issues:

1. **Limitations of existing models**: Current methods usually require extensive feature extraction or context-triggered feature generation, which is difficult without appropriate domain knowledge.
2. **Dataset dependence**: Many proposed models apply only to specific types of datasets (i.e., specific context scenarios). For example, the model proposed by Yu et al. can only be used for Points-of-Interest datasets.
3. **Differences in semantic information**: Because the semantic information varies from study to study, it is difficult to generalize existing methods to all scenarios without hurting prediction performance.

To solve these problems, the paper proposes a new general method, the **Contextual and Semantic Embedding Model (CoSEM)**. It aims to improve app usage prediction by combining semantic information embedding with context information embedding, while maintaining good generalization across different kinds of semantic information.

### Formula representation

The formulas in the paper give the mathematical description of the model. The main definitions are:

- **Semantic information representation**: \[ S_u^T=\{s_u^1, s_u^2,\ldots, s_u^m\} \] where \(s\) is a semantic information block (for example, words in a search query, tasks, locations, time), and \(m\) is the total number of semantic information blocks.
- **Context information representation**: \[ A_u^{t < T}=\{a_u^1, a_u^2,\ldots, a_u^n\} \] where \(a_u\) is an application, and \(n\) is the number of applications used by user \(u\) before time window \(T\).
- **Formal definition of the prediction problem**: \[ z(S_u^T, A_u^{t < T})\to P_u^T \] where \(P_u^T\) is the predicted set of applications.
- **Summary vector of semantic information embedding**: \[ SV_s = \frac{\sum_{l = 1}^m V_s^l}{m} \] that is, the mean of the \(m\) semantic embedding vectors \(V_s^l\).
- **Summary vector of context information embedding**: \[ SV_a=\frac{\sum_{l = 1}^n V_a^l}{n} \] that is, the mean of the \(n\) app embedding vectors \(V_a^l\).

Together, these definitions describe how the paper combines semantic and context information to improve the accuracy and generalization of app usage prediction.
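
The two summary vectors defined above are element-wise means over a set of embedding vectors, so they can be illustrated with a few lines of plain Python. This is a minimal sketch, not the authors' implementation: the function name `summary_vector`, the two-dimensional toy embeddings, and the variable names are all assumptions made for illustration, and in the actual model the vectors \(V_s^l\) and \(V_a^l\) would come from learned embedding layers.

```python
from typing import List, Sequence

Vector = List[float]


def summary_vector(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Return the element-wise mean of a non-empty list of equal-length vectors.

    Mirrors SV = (sum over l of V^l) / count from the formulas above.
    """
    if not embeddings:
        raise ValueError("need at least one embedding vector")
    dim = len(embeddings[0])
    if any(len(v) != dim for v in embeddings):
        raise ValueError("all embedding vectors must have the same dimension")
    count = len(embeddings)
    return [sum(v[i] for v in embeddings) / count for i in range(dim)]


# Toy embeddings (illustrative values only, not taken from the paper).
semantic_vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # m = 3 semantic blocks
app_vectors = [[0.5, 1.5], [1.5, 0.5]]                   # n = 2 previously used apps

sv_s = summary_vector(semantic_vectors)  # summary of the semantic embedding
sv_a = summary_vector(app_vectors)       # summary of the context embedding
```

Averaging yields a fixed-size vector no matter how many semantic blocks or past apps a user has, so inputs of different lengths can be fed to the same downstream predictor.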