An Evaluation of OCR Systems Against Adversarial Machine Learning

Dan Sporici, Mihai Chiroiu, Dan Ciocîrlan
DOI: https://doi.org/10.1007/978-3-030-12942-2_11
2019-01-01
Abstract: Optical Character Recognition (OCR), while representing significant progress in the field of computer vision, can also contribute to malicious acts that involve automation. For example, people who copy entire books use OCR to avoid typing the text by hand whenever a plain-text version is not available; various bots use the same OCR process to bypass CAPTCHA filters and gain access to certain functionality. In this paper, we propose an approach for automatically converting text into characters that OCR systems cannot recognize. The approach uses adversarial machine learning techniques, crafting inputs in an evolutionary manner, to adapt documents through a relatively small number of changes that should, in turn, make the text unrecognizable. We show that our mechanism can preserve the readability of the text while achieving strong results against OCR services.
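The idea of crafting inputs in an evolutionary manner with few changes can be illustrated with a toy sketch. It is not the authors' implementation: the 5x5 glyph templates, the nearest-template `recognize` function standing in for an OCR service, and the names `margin` and `evolve` are all assumptions made for illustration. The loop is a simple (1+λ) evolution strategy in which each child flips one pixel and is kept only if it strictly reduces the recognizer's margin for the true character.

```python
import random

# Hypothetical 5x5 glyph templates; a stand-in for a real OCR model.
TEMPLATES = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def to_bits(rows):
    """Flatten a glyph drawn with '#' and '.' into a list of 0/1 pixels."""
    return [1 if ch == "#" else 0 for row in rows for ch in row]


BITS = {label: to_bits(rows) for label, rows in TEMPLATES.items()}


def hamming(a, b):
    """Number of pixels that differ between two images."""
    return sum(x != y for x, y in zip(a, b))


def recognize(image):
    """Toy recognizer: label of the nearest template (ties go to the earlier entry)."""
    return min(BITS, key=lambda label: hamming(image, BITS[label]))


def margin(image, true_label):
    """Distance to the closest wrong template minus distance to the true one.

    A positive value means the recognizer still prefers the true label.
    """
    true_dist = hamming(image, BITS[true_label])
    other_dist = min(
        hamming(image, bits) for label, bits in BITS.items() if label != true_label
    )
    return other_dist - true_dist


def evolve(true_label, children=20, max_generations=1000, seed=0):
    """(1+lambda) search: keep a one-pixel mutation only if it lowers the margin."""
    rng = random.Random(seed)
    parent = list(BITS[true_label])
    for _ in range(max_generations):
        if recognize(parent) != true_label:
            break  # the toy recognizer no longer reads the true character
        best, best_margin = parent, margin(parent, true_label)
        for _ in range(children):
            child = list(parent)
            child[rng.randrange(len(child))] ^= 1  # flip one random pixel
            child_margin = margin(child, true_label)
            if child_margin < best_margin:
                best, best_margin = child, child_margin
        parent = best
    return parent


adversarial = evolve("I")
changed = hamming(adversarial, BITS["I"])
print(recognize(adversarial), changed)
```

Because only strictly improving mutations are kept, the number of changed pixels stays small, which loosely mirrors the goal of preserving readability. A real attack would query an actual OCR engine in place of `recognize` and would need a perceptual measure of readability rather than a raw pixel count.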