Using Generative Adversarial Networks to Break and Protect Text Captchas

Guixin Ye, Zhanyong Tang, Dingyi Fang, Zhanxing Zhu, Yansong Feng, Pengfei Xu, Xiaojiang Chen, Jungong Han, Zheng Wang
DOI: https://doi.org/10.1145/3378446
IF: 2.717
2020-01-01
ACM Transactions on Privacy and Security
Abstract: Text-based CAPTCHAs remain a popular scheme for distinguishing between a legitimate human user and an automated program. This article presents a novel generic text captcha solver based on the generative adversarial network. Unlike prior text captcha solvers, which require a labor-intensive and time-consuming construction process, our scheme needs significantly fewer real captchas yet solves captchas more accurately. Our approach first learns a synthesizer that automatically generates synthetic captchas, which are used to construct a base solver. It then fine-tunes the base solver using a small number of labeled real captchas. As a result, our attack requires only a small set of manually labeled captchas, which reduces the cost of launching an attack on a captcha scheme. We evaluate our scheme by applying it to 33 captcha schemes, of which 11 are currently used by 32 of the top-50 popular websites. Experimental results demonstrate that our scheme significantly outperforms four prior captcha solvers and can solve captcha schemes where others fail. As a countermeasure, we propose adding imperceptible perturbations to a captcha image. We demonstrate that our countermeasure can greatly reduce the success rate of the attack.
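The countermeasure mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say how the perturbations are computed, so everything here is an assumption: the solver is a toy linear scorer, the perturbation is a fast-gradient-sign style step, and the names `fgsm_perturb`, `weights`, and `epsilon` are hypothetical. It only shows that a small, bounded change to each pixel can lower a solver's score for the correct label.

```python
import numpy as np

def fgsm_perturb(image, weights, epsilon=2.0 / 255.0):
    """Shift each pixel by at most epsilon in the direction that lowers
    a toy linear solver's score for the correct label.

    Assumption: the solver score is s(x) = sum(weights * x). This is a
    stand-in for illustration, not the solver described in the paper.
    """
    # For a linear score, the gradient with respect to the image is `weights`.
    gradient = weights
    perturbed = image - epsilon * np.sign(gradient)
    # Keep pixel values in the valid [0, 1] range.
    return np.clip(perturbed, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8))  # stand-in for a grayscale captcha
weights = rng.normal(size=(8, 8))           # hypothetical solver weights

protected = fgsm_perturb(image, weights)
score_before = float(np.sum(weights * image))
score_after = float(np.sum(weights * protected))
max_change = float(np.max(np.abs(protected - image)))
```

Here `max_change` stays within `epsilon`, so no pixel moves by more than about 0.8% of the intensity range, while `score_after` is lower than `score_before`. A real solver is non-linear, so the gradient would have to be obtained by backpropagation rather than read off the weights.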