Analysis of Words in Passwords from Three Different Countries

Xiaochun Gan, Dong Li, Hu Chen
DOI: https://doi.org/10.1109/itaic54216.2022.9836812
2022-01-01
Abstract: Previous studies have shown that the passwords users set in different countries are related to their language and culture. This paper introduces a corpus-based method for analyzing the features of words in passwords. The first step is to construct specific corpora: collecting raw corpora, converting non-English words in the raw corpora into ASCII strings, assigning a priority to each corpus, and finally combining the corpora according to that priority. Based on these corpora, the frequency of different sense-groups in passwords can be evaluated by the coverage of a corpus or a sense-group. Using the above methods, corpora for General Purpose, English, Russian, Japanese, and Vietnamese are collected, and the features of words in passwords from Russia, Japan, and Vietnam are analyzed. Uncovering these features is of great significance for improving the strength of user passwords.
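The abstract does not give implementation details, so the following Python is only a minimal sketch of how priority-based corpus merging and a coverage measure might look. The function names, the greedy longest-match scan, the minimum word length, and the definition of coverage as "fraction of passwords containing at least one word from a corpus" are all assumptions made for illustration, not the authors' method. The sample words and passwords are invented, and non-English words are assumed to have already been converted to ASCII.

```python
def build_combined_corpus(corpora_by_priority):
    """Merge corpora; each word keeps the label of the
    highest-priority corpus that contains it (assumed rule)."""
    combined = {}
    for label, words in corpora_by_priority:  # highest priority first
        for word in words:
            combined.setdefault(word.lower(), label)
    return combined


def find_words(password, combined, min_len=3):
    """Greedy longest-match scan over the password.
    Returns a list of (word, corpus_label) pairs."""
    text = password.lower()
    found = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate substring first.
        for j in range(len(text), i + min_len - 1, -1):
            if text[i:j] in combined:
                match = text[i:j]
                break
        if match:
            found.append((match, combined[match]))
            i += len(match)
        else:
            i += 1
    return found


def coverage(passwords, combined):
    """Fraction of passwords containing at least one word
    from each corpus label (hypothetical coverage definition)."""
    counts = {}
    for pw in passwords:
        labels = {label for _, label in find_words(pw, combined)}
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
    total = len(passwords)
    return {label: n / total for label, n in counts.items()} if total else {}


# Invented sample data, listed from highest to lowest priority.
corpora = [
    ("general", ["password", "qwerty"]),
    ("english", ["love", "password", "dragon"]),
    ("russian", ["privet", "lyubov"]),
]
passwords = ["password123", "privet2020", "iloveyou", "x9$kq"]

combined = build_combined_corpus(corpora)
result = coverage(passwords, combined)
print(result)
```

Because "password" appears in both the general and English lists, the priority order decides which corpus it counts toward, which is the role the abstract assigns to corpus priority.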