Image Pre-processing on NumtaDB for Bengali Handwritten Digit Recognition

Ovi Paul
DOI: https://doi.org/10.48550/arXiv.2008.07853
2020-08-18
Computer Vision and Pattern Recognition
Abstract: NumtaDB is by far the largest dataset of handwritten Bengali digits. It is a diverse dataset containing more than 85,000 images, but this diversity also makes it very difficult to work with. The goal of this paper is to establish a benchmark for pre-processed images that gives good accuracy with any machine learning model. The motivation is that no pre-processed data is available for Bengali digit recognition, unlike MNIST for English digits.