Automated Detection and Classification of Third-Party Libraries in Large Scale Android Apps

Hao-Yu WANG, Yao GUO, Zi-Ang MA, Xiang-Qun CHEN
DOI: https://doi.org/10.13328/j.cnki.jos.005221
2017-01-01
Journal of Software
Abstract: Third-party libraries are widely used in mobile applications such as Android apps. Much research on app analysis or access control needs to detect or classify third-party libraries first in order to produce accurate results. Most previous studies use a whitelist to identify third-party libraries and categorize them manually. However, it is impossible to build a complete whitelist of third-party libraries and classify them because: (1) there are too many of them; and (2) common techniques such as library obfuscation and library masquerading cannot be handled with a whitelist. In this paper, an automated approach is proposed to detect and classify frequently used third-party libraries in Android apps. A multi-level clustering-based method is presented to identify third-party libraries, and a machine-learning-based technique is applied to classify the libraries. Experiments on more than 130,000 apps show that 4,916 third-party libraries can be detected without prior knowledge. In 10-fold cross-validation on sampled libraries, the classification result is 84.28%. With the trained classifier, the proposed approach is able to classify more than 75% of the 4,916 libraries into six categories with an accuracy of 75%.
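The intuition behind whitelist-free detection can be illustrated with a small sketch: a package that recurs across many unrelated apps is a plausible third-party library candidate. The code below is not the paper's multi-level clustering method. It is a simplified frequency count over package-name prefixes, and the names `package_prefixes`, `candidate_libraries`, and `min_apps` are hypothetical. Because it relies on package names alone, it would not handle the obfuscation and masquerading cases the abstract mentions.

```python
from collections import Counter


def package_prefixes(package: str) -> list[str]:
    """Return every dotted prefix of a package name, shortest first."""
    parts = package.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def candidate_libraries(apps: dict[str, set[str]], min_apps: int) -> dict[str, int]:
    """Count how many apps contain each package prefix and keep the frequent ones.

    `apps` maps an app identifier to the set of package names found in it.
    `min_apps` is an assumed threshold, not a value taken from the paper.
    Single-segment prefixes such as "com" are dropped as uninformative.
    """
    counts: Counter[str] = Counter()
    for packages in apps.values():
        seen: set[str] = set()
        for pkg in packages:
            seen.update(package_prefixes(pkg))
        # Each prefix is counted at most once per app.
        counts.update(seen)
    return {p: n for p, n in counts.items() if n >= min_apps and "." in p}


if __name__ == "__main__":
    # Toy data invented for illustration only.
    demo_apps = {
        "app_a": {"com.google.ads.util", "com.foo.app"},
        "app_b": {"com.google.ads", "org.bar.main"},
        "app_c": {"com.google.ads.internal", "com.baz"},
    }
    print(candidate_libraries(demo_apps, min_apps=3))
```

Counting at every prefix depth is a loose stand-in for the "multi-level" idea: it lets a library root such as a shared vendor prefix surface even when apps embed different subpackages of it.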