Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications

Abdullah Al Ishtiaq, Sarkar Snigdha Sarathi Das, Syed Md Mukit Rashid, Ali Ranjbar, Kai Tu, Tianwei Wu, Zhezheng Song, Weixuan Wang, Mujtahid Akon, Rui Zhang, Syed Rafiul Hussain
2023-10-12
Abstract: In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical formulas to generate transitions and create the formal model as finite state machines. To demonstrate the effectiveness of Hermes, we evaluate it on 4G NAS, 5G NAS, and 5G RRC specifications and obtain an overall accuracy of 81-87%, which is a substantial improvement over the state-of-the-art. Our security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previous attacks in 4G and 5G specifications, and 7 deviations in commercial 4G basebands.
Cryptography and Security, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the problem of automatically extracting formal models from cellular network specifications written in natural language, so that the protocols can be analyzed for security flaws more effectively. The authors propose Hermes, an end-to-end framework that generates formal representations, specifically finite state machines (FSMs), from natural language cellular protocol specifications. The pipeline has three steps:

1. **Developing a neural constituency parser (NEUTREX)**: NEUTREX processes transition-relevant text and extracts transition components (i.e., states, conditions, and actions).
2. **Designing a domain-specific language (DSL)**: The DSL translates these transition components into logical formulas by leveraging dependency parse trees.
3. **Compiling logical formulas into an FSM**: The logical formulas are compiled to generate transitions, which together form the formal model as a finite state machine.

The paper evaluates Hermes on the 4G NAS, 5G NAS, and 5G RRC specifications and reports an overall accuracy of 81-87%, which the authors describe as a substantial improvement over the state of the art. Security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previously known attacks in 4G and 5G specifications, as well as 7 deviations in commercial 4G basebands.

In short, the paper aims to make security analysis of cellular network protocols more efficient by automating the construction of formal models, which reduces the effort and error rate of building them by hand and helps identify potential security vulnerabilities.
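To make the final step concrete, here is a minimal Python sketch of how extracted transition components (a start state, a set of conditions, a list of actions, and an end state) could be assembled into a finite state machine. This is not Hermes' actual data model or compiler. The `Transition` and `FSM` classes, the `step` method, and the condition strings are hypothetical names chosen for illustration, and the two example transitions are hand-written from the well-known 4G NAS attach procedure rather than taken from Hermes' output.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    """One FSM edge built from extracted components (hypothetical structure)."""
    start_state: str
    conditions: frozenset  # atomic conditions that must all hold
    actions: tuple         # actions performed when the transition fires
    end_state: str


@dataclass
class FSM:
    initial_state: str
    transitions: list = field(default_factory=list)

    def add(self, start, conditions, actions, end):
        """Register a transition from its four extracted components."""
        self.transitions.append(
            Transition(start, frozenset(conditions), tuple(actions), end)
        )

    def step(self, state, facts):
        """Return (next_state, actions) for the first transition from `state`
        whose conditions are all contained in `facts`."""
        for t in self.transitions:
            if t.start_state == state and t.conditions <= set(facts):
                return t.end_state, list(t.actions)
        return state, []  # no transition matches: remain in the same state


# Illustrative, hand-written transitions (assumed, not Hermes output).
fsm = FSM(initial_state="EMM-DEREGISTERED")
fsm.add("EMM-DEREGISTERED", {"attach_requested"},
        ["send ATTACH REQUEST"], "EMM-REGISTERED-INITIATED")
fsm.add("EMM-REGISTERED-INITIATED", {"recv ATTACH ACCEPT"},
        ["send ATTACH COMPLETE"], "EMM-REGISTERED")

state, actions = fsm.step(fsm.initial_state, {"attach_requested"})
print(state, actions)
```

Modeling conditions as a set of atomic facts keeps the sketch short. The logical formulas described in the paper are richer than a simple conjunction, so a real compiler would need a proper formula representation in place of `frozenset`.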