Abstract: Although text-based captchas, which are used to differentiate between human users and bots, have faced many attack methods, they remain a widely used security mechanism and are still employed by some websites. Some deep learning-based text captcha solvers have shown excellent results, but the labor-intensive and time-consuming labeling process severely limits their viability. Previous works attempted to create easy-to-use solvers using a limited collection of labeled data. However, they are hampered by inefficient preprocessing procedures and an inability to recognize captchas with complicated security features.

In this paper, we propose GeeSolver, a generic, efficient, and effortless solver for breaking text-based captchas based on self-supervised learning. Our insight is that numerous difficult-to-attack captcha schemes that "damage" the standard font of characters are similar to image masks. We can therefore leverage masked autoencoders (MAE) to help the captcha solver learn the latent representation from the "unmasked" part of the captcha images. Specifically, our model consists of a ViT encoder as a latent representation extractor and a well-designed decoder for captcha recognition. We apply the MAE paradigm to train our encoder, which enables the encoder to extract a latent representation from local information (i.e., the unmasked part) that can infer the corresponding character. Further, we freeze the parameters of the encoder and leverage a few labeled captchas and many unlabeled captchas to train our captcha decoder with semi-supervised learning.

Our experiments with real-world captcha schemes demonstrate that GeeSolver outperforms the state-of-the-art methods by a large margin using a few labeled captchas. We also show that GeeSolver is highly efficient, as it can solve a captcha within 25 ms using a desktop CPU and 9 ms using a desktop GPU. Besides, thanks to latent representation extraction, we successfully break the hard-to-attack captcha schemes, proving the generality of our solver. We hope that our work will help security experts to revisit the design and availability of text-based captchas. The code is available at https://github.com/NSSL-SJTU/GeeSolver.
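
The masking idea in the abstract can be illustrated with a minimal sketch of MAE-style random patch masking. It is not taken from the GeeSolver code: the function names `patchify` and `random_mask`, the 32x96 image size, the 16-pixel patch size, and the 0.75 mask ratio are all assumptions chosen for illustration. The sketch only shows how an image is split into patches and how the encoder's input would be restricted to the visible (unmasked) subset.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W) grayscale image into flattened, non-overlapping patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch) -> (N, patch*patch)
    blocks = image.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    return blocks.reshape(rows * cols, patch * patch)


def random_mask(patches, mask_ratio, rng):
    """Hide a random fraction of patches; return the visible ones and both index sets."""
    n = patches.shape[0]
    n_keep = int(round(n * (1.0 - mask_ratio)))
    order = rng.permutation(n)
    keep_idx = np.sort(order[:n_keep])   # patches the encoder would see
    mask_idx = np.sort(order[n_keep:])   # patches to be reconstructed
    return patches[keep_idx], keep_idx, mask_idx


# Hypothetical settings: a random array stands in for a captcha image.
rng = np.random.default_rng(0)
image = rng.random((32, 96))
patches = patchify(image, patch=16)      # 2 x 6 grid of 16x16 patches
visible, keep_idx, mask_idx = random_mask(patches, mask_ratio=0.75, rng=rng)
print(patches.shape, visible.shape, len(mask_idx))
```

In an MAE setup, only `visible` (plus positional information derived from `keep_idx`) would be passed to the encoder, while a reconstruction loss would be computed on the patches indexed by `mask_idx`. How GeeSolver configures these steps is described in the paper and repository, not here.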

Make Complex CAPTCHAs Simple: A Fast Text Captcha Solver Based on a Small Number of Samples

Yet Another Text Captcha Solver

GeeSolver: A Generic, Efficient, and Effortless Solver with Self-Supervised Learning for Breaking Text Captchas.

Using Generative Adversarial Networks to Break and Protect Text Captchas

A Semi-supervised Deep Learning-Based Solver for Breaking Text-Based CAPTCHAs

TICS: Text–image-Based Semantic CAPTCHA Synthesis Via Multi-Condition Adversarial Learning

Robust Text CAPTCHAs Using Adversarial Examples

Text Captcha is Dead? A Large Scale Deployment and Empirical Study.

3E-Solver: an Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas

An optimized system to solve text-based CAPTCHA

An End-to-End Attack on Text-based CAPTCHAs Based on Cycle-Consistent Generative Adversarial Network

Applying Visual Cryptography to Enhance Text Captchas

A machine learning attack against variable-length Chinese character CAPTCHAs

Framework for Evaluation of Text Captchas

The Robustness of a New 3D CAPTCHA

Adversarial CAPTCHAs

Neural CAPTCHA Networks.

Robust CAPTCHAs Towards Malicious OCR

Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment

Towards Understanding the Security of Modern Image Captchas and Underground Captcha-Solving Services

Breaking reCAPTCHAv2