Image Classification and Text Extraction using Machine Learning

R. Deepa, Kiran N Lalwani
DOI: https://doi.org/10.1109/iceca.2019.8821936
2019-06-01
Abstract: Machine Learning is a branch of Artificial Intelligence in which a system learns from prior knowledge and experience, without explicit programming or human assistance. It is used to make predictions or decisions for a given task based on the training set provided. In the proposed system, image classification is implemented using a Convolutional Neural Network (CNN). Text is then extracted from the classified image using Tesseract, which implements a Long Short-Term Memory (LSTM) based recognition engine. LSTM networks are a type of Recurrent Neural Network (RNN). The CNN performs better on very large datasets, which help it overcome the problem of overfitting. In addition, single-line text extraction is replaced by multi-line text extraction. The accuracy of the system can be improved by incorporating a larger dataset and increasing the number of training epochs. A trial-and-error methodology is used to determine the number of convolution and pooling layers and the number of nodes in each layer. Finally, CNNs require relatively little preprocessing compared to other image classification algorithms.
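The abstract refers to stacking convolution and pooling layers. As a rough illustration only (this is not the authors' implementation), the following pure-Python sketch shows what a single convolution, ReLU, and max-pooling stage computes on a toy image. The function names, the 6x6 image, and the hand-picked edge kernel are all assumptions made for the example; in a trained CNN the kernel weights are learned from data.

```python
def conv2d(image, kernel):
    """Stride-1, no-padding 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise ReLU: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 6x6 "image" (assumed data): dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hypothetical hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = relu(conv2d(image, kernel))  # 6x6 input, 3x3 kernel -> 4x4 feature map
pooled = max_pool(features, size=2)     # 4x4 feature map, 2x2 pooling -> 2x2 output

print(features)
print(pooled)
```

Each convolution and pooling stage shrinks the feature map, which is one reason the number of such layers has to be tuned against the input image size.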