Abstract: The number and variety of malware samples are increasing rapidly, and traditional malware family classification techniques find it increasingly difficult to cope with them. With the rise of deep learning, various malware family classification methods based on deep learning have been proposed, and these methods have achieved excellent results. One problem with most deep-learning-based models is that they require the input data to have a fixed data relationship. However, no prior knowledge shows that such fixed data relationships exist. Another problem is that malware characteristics are at present often expressed as binary sequences, API call sequences, opcode sequences, and so on. These are low-level features that are not easy for people to understand. To solve these issues, we propose a method based on the point cloud model to detect malware families. In the point cloud model, each malware behavior is mapped to a point in a high-dimensional space, and the model learns the relationships among these behaviors. The method therefore avoids predetermining the relationships among the data, which is more reasonable for malware detection. In addition, we use the behavior report to describe malware behavior features, which can be easily understood by people. We apply this method to classify malware families. The experimental results show that the average precision and recall for family classification reach 96.67% and 96.58%, respectively, surpassing traditional deep learning models such as LSTM, CNN, and LSTM with an attention mechanism.

A Malware Family Classification Method Based on the Point Cloud Model DGCNN
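
To make concrete the idea of treating behaviors as points whose relationships are learned rather than fixed in advance, the following is a minimal sketch of one EdgeConv-style step of the kind used in DGCNN: a k-nearest-neighbor graph is built over the points, edge features of the form [x_i, x_j - x_i] are assembled, and the result is max-pooled over neighbors. It is an illustration under stated assumptions, not the paper's implementation: the function names, the 6-by-4 random "behavior embeddings", the choice k = 2, and the random weight matrix standing in for a learned shared layer are all hypothetical.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point, excluding itself."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)      # pairwise squared distances, (n, n)
    np.fill_diagonal(dist, np.inf)         # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]

def edge_conv(points, weights, k):
    """One EdgeConv-style step: kNN graph, edge features, shared layer, max-pool."""
    idx = knn_indices(points, k)                          # (n, k)
    neighbours = points[idx]                              # (n, k, d)
    centres = np.repeat(points[:, None, :], k, axis=1)    # (n, k, d)
    edges = np.concatenate([centres, neighbours - centres], axis=-1)  # (n, k, 2d)
    hidden = np.maximum(edges @ weights, 0.0)             # shared linear layer + ReLU
    return hidden.max(axis=1)                             # aggregate over neighbours

rng = np.random.default_rng(0)
behaviours = rng.normal(size=(6, 4))   # assumption: 6 behaviours embedded in 4 dims
weights = rng.normal(size=(8, 16))     # assumption: random stand-in for learned weights
features = edge_conv(behaviours, weights, k=2)   # one 16-dim feature per behaviour
```

Because the neighbors are chosen by distance and then max-pooled, the output for each point does not depend on the order in which the behaviors are listed, which is the property that removes the need for a predetermined data relationship. In DGCNN the graph is recomputed from the features at each layer; this sketch shows a single layer only.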