Abstract: Malware is one of the most common forms of cyber-attack, and it is becoming more prevalent on networks every day. In contrast to benign traffic, which typically exhibits symmetrical patterns, malware communication often shows asymmetrical behaviours, making detection a complex challenge. Fortunately, malicious activity can be distinguished and identified using a variety of artificial intelligence methods. However, insufficient attention has been paid to the problem of handling large, high-dimensional data. This paper proposes a novel deep learning-based approach to identifying malicious Uniform Resource Locators (URLs), specifically designed to handle the challenges posed by large-scale and complex data. Initially, input data is sourced from a Kaggle dataset that includes diverse, large-scale URL samples. The URLs are then transformed into vector representations by a Vector Embedding Module, which employs a character-level word embedding technique to capture intricate patterns within the URLs. To further refine the data, the Chaotic Kookaburra Efficient-Bo Network (CKEBO-Net) is applied to extract the most significant features from these vectors, reducing both dimensionality and computational burden. Subsequently, the Cascaded Capsule Twin Attentional Dilated Convolutional Network (C²TA_DiCN) model is introduced to classify and identify malicious URLs with high precision. This model combines the strengths of capsule networks and attention mechanisms, enhancing its ability to capture subtle dependencies within the data. Furthermore, the Lyrebird Meta-heuristic Optimization (LMO) algorithm is used to fine-tune the model parameters so that training is efficient and robust. The proposed approach is implemented in Python and evaluated on the Kaggle dataset. Simulation results demonstrate that the proposed method significantly outperforms existing models, achieving a malicious URL detection accuracy of 99.7%.
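
As a rough illustration of the character-level representation mentioned in the abstract, the sketch below maps each URL character to an integer index and pads the result to a fixed length, which is the kind of input an embedding layer typically consumes. The abstract does not specify the Vector Embedding Module's vocabulary, sequence length, or padding scheme, so the names `VOCAB`, `PAD`, `UNK`, `encode_url`, and the default `max_len` are all assumptions made for this example.

```python
import string
from typing import List

# Assumed vocabulary: printable ASCII characters (not taken from the paper).
# Index 0 is reserved for padding and index 1 for characters outside the vocabulary.
PAD, UNK = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode_url(url: str, max_len: int = 200) -> List[int]:
    """Convert a URL into a fixed-length list of character indices."""
    # Truncate long URLs, then look up each character (unknown ones map to UNK).
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]
    # Right-pad short URLs so every sequence has the same length.
    return ids + [PAD] * (max_len - len(ids))


if __name__ == "__main__":
    encoded = encode_url("http://example.com/login", max_len=32)
    print(len(encoded))
```

The fixed-length integer sequences produced this way could then be fed to a trainable embedding layer; how the paper's module actually does this is not stated in the abstract.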
