An In-depth Study of Digits in Passwords for Chinese Websites.

Rui Xu, Xiaojun Chen, Xingxing Wang, Jinqiao Shi
DOI: https://doi.org/10.1109/dsc.2018.00094
2018-01-01
Abstract: Current research mainly employs natural language processing techniques to study semantic patterns in passwords, but it does not deeply explore how digits are constructed in order to generate guesses that are more likely to hit. In this paper, we use a memory chunking method to split non-semantic digits in passwords, and define combination rules to split semantic digit patterns into small chunks. We use patterns customized for Chinese websites to extract the structures and segments of letters and symbols, and we model them with Probabilistic Context-Free Grammars to generate password guesses. In addition, we introduce a notion called Password Guesses Probability Sparsity to describe the negative impact of sophisticated patterns, and develop measures that yield substantial improvements. Experimental results on large-scale datasets comprising over 160 million entries show that our method guesses long numerical passwords and universal passwords much more effectively than previous methods. We achieve a 107.18% relative improvement over John the Ripper for guessing long numerical passwords, and a 14.46% relative gain over previous methods for guessing universal passwords.
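To make the Probabilistic Context-Free Grammar step concrete, the following is a minimal sketch of a generic letter/digit/symbol PCFG guesser. It is not the authors' implementation: the function names (`segment`, `train`, `generate_guesses`) and the toy training list are assumptions made for illustration, digit runs are kept whole rather than split by the paper's memory chunking or combination rules, and the patterns customized for Chinese websites are not reproduced.

```python
import string
import re
from collections import Counter, defaultdict
from itertools import product


def segment(password):
    """Split a password into maximal runs tagged L (letters), D (digits), S (other)."""
    runs = re.findall(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", password)
    tagged = []
    for run in runs:
        if run[0] in string.digits:
            kind = "D"
        elif run[0] in string.ascii_letters:
            kind = "L"
        else:
            kind = "S"
        # The tag encodes both the class and the run length, e.g. "D4".
        tagged.append((f"{kind}{len(run)}", run))
    return tagged


def train(passwords):
    """Count base structures (tag sequences) and the terminals seen for each tag."""
    structures = Counter()
    terminals = defaultdict(Counter)
    for pw in passwords:
        segs = segment(pw)
        structures[tuple(tag for tag, _ in segs)] += 1
        for tag, text in segs:
            terminals[tag][text] += 1
    return structures, terminals


def generate_guesses(structures, terminals):
    """Enumerate every derivation and sort by probability, highest first.

    Exhaustive enumeration is only viable for toy grammars; a real guesser
    would generate candidates lazily in probability order.
    """
    total = sum(structures.values())
    scored = []
    for struct, count in structures.items():
        options = []
        for tag in struct:
            tag_total = sum(terminals[tag].values())
            options.append([(t, c / tag_total) for t, c in terminals[tag].items()])
        for combo in product(*options):
            prob = count / total
            for _, p in combo:
                prob *= p
            scored.append(("".join(t for t, _ in combo), prob))
    # Ties are broken alphabetically so the ordering is deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical toy training set, not data from the paper.
training = ["woaini1314", "woaini1314", "abc5201"]
ranked = generate_guesses(*train(training))
for guess, prob in ranked:
    print(f"{guess}\t{prob:.3f}")
```

In this toy grammar, a guess such as `woaini5201` never appears in the training list but is still produced, because the structure learned from one password is filled with a digit segment learned from another. Splitting digit runs into smaller chunks, as the abstract describes, would increase the number of such recombinations.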