Combining Crowd Contributions with Machine Learning to Detect Malicious Mobile Apps

Dahai Yao, Hailong Sun, Xudong Liu
DOI: https://doi.org/10.1145/2875913.2875941
2015-01-01
Abstract: Android is undoubtedly becoming the most popular smartphone platform. Unfortunately, this popularity has also made Android devices a target for malware. Most existing malicious mobile apps perform stealthy operations such as collecting private user data, sending premium SMS messages, and making unauthorized HTTP connections without legal notice to the affected user. However, transmission of sensitive data alone does not indicate malicious behavior, because some benign applications also need sensitive data to improve the user experience. Existing malware detection approaches focus on static or dynamic analysis and do not use contributions from crowd users. In this paper, we propose a novel technique that combines crowd contributions with machine learning to detect malicious mobile apps. We model each privacy transmission as either user-determined or undetermined, with the help of real user decisions gathered through crowdsourcing. We apply static analysis to extract basic application information such as permissions and suspicious API calls. We then use a dynamic instrumentation technique to trace real API calls at runtime and collect crowd users' decisions on each prompted sensitive data transmission. Finally, we employ several learning-based algorithms, such as SVM, Bayesian Network, Decision Tree, and KNN, to detect malicious apps. Experiments with 100 real application samples show that our system detects 85% to 97% of the malware with a low false positive rate.
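The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vocabulary, the "deny rate" summary of crowd decisions, the toy training samples, and the function names `featurize` and `knn_predict` are all assumptions made for illustration. KNN is used here only because it is one of the classifiers the abstract names and is easy to write without external libraries.

```python
from collections import Counter
from math import dist

# Hypothetical feature vocabulary; the paper does not list its exact features.
PERMISSIONS = ["SEND_SMS", "READ_CONTACTS", "INTERNET"]
SUSPICIOUS_APIS = ["sendTextMessage", "getDeviceId"]


def featurize(permissions, api_calls, crowd_decisions):
    """Combine static, dynamic, and crowd signals into one numeric vector.

    permissions:     set of permission names found by static analysis
    api_calls:       set of suspicious API calls observed at runtime
    crowd_decisions: list of booleans, True when a user denied a prompted
                     sensitive data transmission (an assumed encoding)
    """
    perm_bits = [1.0 if p in permissions else 0.0 for p in PERMISSIONS]
    api_bits = [1.0 if a in api_calls else 0.0 for a in SUSPICIOUS_APIS]
    deny_rate = sum(crowd_decisions) / len(crowd_decisions) if crowd_decisions else 0.0
    return perm_bits + api_bits + [deny_rate]


def knn_predict(train, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled samples, invented for this sketch (not data from the paper).
TRAIN = [
    (featurize({"SEND_SMS", "INTERNET"}, {"sendTextMessage"}, [True, True, True]), "malicious"),
    (featurize({"READ_CONTACTS", "INTERNET"}, {"getDeviceId"}, [True, True, False]), "malicious"),
    (featurize({"INTERNET"}, set(), [False, False]), "benign"),
    (featurize({"READ_CONTACTS", "INTERNET"}, {"getDeviceId"}, [False, False, False]), "benign"),
    (featurize({"INTERNET"}, {"getDeviceId"}, [False, True, False, False]), "benign"),
]

if __name__ == "__main__":
    unknown_app = featurize(
        {"SEND_SMS", "INTERNET"}, {"sendTextMessage", "getDeviceId"}, [True, True]
    )
    print(knn_predict(TRAIN, unknown_app))
```

In this toy data, the second and fourth samples share the same permissions and API calls and differ only in how users responded to the prompts, which mirrors the abstract's point that sensitive data transmission by itself does not separate malicious apps from benign ones.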