A novel pattern recognition system for detecting Android malware by analyzing suspicious boot sequences

Jorge Maestre Vidal, Marco Antonio Sotelo Monge, Luis Javier García Villalba
DOI: https://doi.org/10.1016/j.knosys.2018.03.018
2024-02-06
Abstract: This paper introduces a malware detection system for smartphones based on studying the dynamic behavior of suspicious applications. The main goal is to prevent the installation of the malicious software on the victim systems. The approach focuses on identifying malware addressed against the Android platform. For that purpose, only the system calls performed during the boot process of the recently installed applications are studied. Thereby the amount of information to be considered is reduced, since only activities related with their initialization are taken into account. The proposal defines a pattern recognition system with three processing layers: monitoring, analysis and decision-making. First, in order to extract the sequences of system calls, the potentially compromised applications are executed on a safe and isolated environment. Then the analysis step generates the metrics required for decision-making. This level combines sequence alignment algorithms with bagging, which allow scoring the similarity between the extracted sequences considering their regions of greatest resemblance. At the decision-making stage, the Wilcoxon signed-rank test is implemented, which determines if the new software is labeled as legitimate or malicious. The proposal has been tested in different experiments that include an in-depth study of a particular use case, and the evaluation of its effectiveness when analyzing samples of well-known public datasets. Promising experimental results have been shown, hence demonstrating that the approach is a good complement to the strategies of the bibliography.
Cryptography and Security
What problem does this paper attempt to address?
The paper aims to prevent malware from being installed and executed on Android devices. It does so by analyzing the sequence of system calls that a suspicious application issues at launch. Specifically, it proposes a new pattern recognition system with the following aims:

1. **Reducing the amount of information**: Only the system calls issued during application startup are considered, which reduces the amount of information that must be monitored.
2. **Improving detection accuracy**: Sequence alignment methods from bioinformatics are used to score the similarity between applications and so identify malware more accurately.
3. **Reducing the false positive rate**: Legitimate applications should not be wrongly flagged as malware, which limits the impact on user experience.
4. **Adapting to the resource limitations of mobile devices**: Because mobile devices have limited computing resources, the complex analysis tasks are delegated to external infrastructure.

### Main Objectives

The main objective of the paper is to develop an effective intrusion detection system that prevents the installation of malware on user devices. To achieve this goal, the system focuses on the following:

- **Dynamic behavior analysis**: Identify potential malware by studying the application's dynamic behavior at launch, in particular its sequence of system calls.
- **Secure isolated environment**: Execute potentially malicious applications in a sandbox so that the real system is not affected.
- **Sequence alignment algorithm**: Use sequence alignment methods from bioinformatics, such as the Longest Common Subsequence (LCS) algorithm, to compare applications and score their similarity.
- **Decision-making mechanism**: Use the Wilcoxon signed-rank test to determine whether new software is malware and classify it accordingly.

### Method Overview

The system includes three processing layers:

1. **Monitoring layer**: Extract the sequence of system calls issued during the startup of potentially malicious applications.
2. **Analysis layer**: Use the sequence alignment algorithm to generate the metrics needed for decision-making, with special attention to the regions of greatest similarity.
3. **Decision layer**: Apply the Wilcoxon signed-rank test to label new software as legitimate or malicious.

With this method, the paper demonstrates that the system is effective at detecting malware and is a useful complement to existing defense strategies.
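The analysis and decision layers described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that similarity is the LCS length normalized by the longer sequence, that the sample's similarity to each malware reference is paired by index with its similarity to a legitimate reference, and that a normal approximation to the signed-rank statistic is acceptable. It leaves out bagging and the sandboxed monitoring step entirely. All function names, the pairing scheme, and the `alpha` threshold are hypothetical.

```python
from math import erfc, sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of two call sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """LCS length normalized to [0, 1]; this normalization is an assumption."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def signed_rank_sums(xs, ys):
    """Return (W+, W-, n) for paired samples, dropping zero differences."""
    diffs = sorted((x - y for x, y in zip(xs, ys) if x != y), key=abs)
    n = len(diffs)
    w_plus = w_minus = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        rank = (i + 1 + j) / 2  # tied magnitudes share the mean of ranks i+1..j
        for d in diffs[i:j]:
            if d > 0:
                w_plus += rank
            else:
                w_minus += rank
        i = j
    return w_plus, w_minus, n


def p_value_greater(xs, ys):
    """One-sided p-value (normal approximation) for xs tending to exceed ys."""
    w_plus, _, n = signed_rank_sums(xs, ys)
    if n == 0:
        return 1.0
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return 0.5 * erfc((w_plus - mean) / (sd * sqrt(2)))


def label(sample, legit_refs, malware_refs, alpha=0.05):
    """Label a boot sequence; pairing references by index is hypothetical."""
    to_malware = [similarity(sample, ref) for ref in malware_refs]
    to_legit = [similarity(sample, ref) for ref in legit_refs]
    p = p_value_greater(to_malware, to_legit)
    return "malicious" if p < alpha else "legitimate"
```

Calling `label(sample, legit_refs, malware_refs)` with lists of system-call names returns `"malicious"` only when the sample is significantly closer to the malware references than to the legitimate ones. The normal approximation is coarse for small reference sets, so a production system would more likely use an exact test from a statistics library.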