CONSENT: Context Sensitive Transformer for Bold Words Classification

Ionut-Catalin Sandu, Daniel Voinea, Alin-Ionut Popa
DOI: https://doi.org/10.48550/arXiv.2205.07683
2022-05-16
Computer Vision and Pattern Recognition
Abstract: We present CONSENT, a simple yet effective CONtext SENsitive Transformer framework for context-dependent object classification within a fully trainable end-to-end deep learning pipeline. We exemplify the proposed framework on the task of bold word detection, where it achieves state-of-the-art results. Given an image containing text in unknown fonts (e.g. Arial, Calibri, Helvetica) and an unknown language, captured under varying illumination, angle distortion and scale, we extract all the words and learn a context-dependent binary classification (i.e. bold versus non-bold) using an end-to-end transformer-based neural network ensemble. To show the extensibility of our framework, we demonstrate results competitive with the state of the art on the game of rock-paper-scissors by training the model to determine the winner given a sequence of two pictures depicting hand poses.
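To illustrate what "context-dependent" means in the rock-paper-scissors setting, the following minimal sketch computes the target label for a pair of hand poses. It does not reproduce the CONSENT model; it only shows the labelling rule such a model would have to learn. The names `Pose`, `BEATS` and `winner_label`, and the label encoding (0 for a draw, 1 or 2 for the winning position), are assumptions made for this example and do not come from the paper.

```python
from enum import Enum


class Pose(Enum):
    """Hypothetical encoding of the three hand poses."""
    ROCK = "rock"
    PAPER = "paper"
    SCISSORS = "scissors"


# Standard rules of the game: each key defeats the pose it maps to.
BEATS = {
    Pose.ROCK: Pose.SCISSORS,
    Pose.PAPER: Pose.ROCK,
    Pose.SCISSORS: Pose.PAPER,
}


def winner_label(first: Pose, second: Pose) -> int:
    """Return 0 for a draw, 1 if the first pose wins, 2 if the second wins.

    The label encoding is an assumption for illustration only.
    """
    if first is second:
        return 0
    return 1 if BEATS[first] is second else 2


if __name__ == "__main__":
    # The same pose (ROCK) receives a different outcome depending on
    # the other element of the sequence, i.e. on its context.
    for other in Pose:
        print(Pose.ROCK.value, "vs", other.value, "->", winner_label(Pose.ROCK, other))
```

The point of the sketch is that no single picture determines the label: rock wins, loses or draws depending on the other pose in the sequence, which is why a classifier that looks at each element in isolation cannot solve the task.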