MapReduce-Based Deep Learning With Handwritten Digit Recognition Case Study

Nada Basit, Yutong Zhang, Hao Wu, Haoran Liu, Jieming Bin, Yijun He, Abdeltawab Hendawi
DOI: https://doi.org/10.1109/bigdata.2016.7840783
2016-01-01
Abstract: Faced with the continuously increasing scale of data and rising expectations on response time, complex deep learning technologies, though highly accurate, present two non-trivial challenges: a large amount of training data makes it impossible to build a model in a short time, and intolerable time cost prohibits acceptable real-time responses. In this research we focus on improving the accuracy and efficiency of handwritten digit recognition. We chose this problem because it is regarded as the prototype of many complex recognition and classification problems, so success in classifying the handwritten digit dataset can be extended to other, more advanced areas. A Convolutional Neural Network (CNN) is implemented to perform the recognition. We further improved the accuracy by adding elastic distortion to the input data, which helps the model better select features. In addition, we implement distributed computing to reduce the time cost: the training process is divided, and a final model is formed by combining the individually trained models. The results show two facts: elastic distortion improved the accuracy of the CNN model by 7-10%, and the distributed computing method reduced training time by about 50%.
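The divide-and-combine training idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the trained models are combined, so plain parameter averaging is used here, a tiny linear model stands in for the CNN, and the names train_on_shard, combine, and mapreduce_train are hypothetical.

```python
from functools import reduce

def train_on_shard(shard, epochs=200, lr=0.1):
    """Map step (hypothetical): fit y ~ w*x + b on one data shard with SGD.

    A toy linear model stands in for the CNN of the paper.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in shard:
            err = (w * x + b) - y   # prediction error on one sample
            w -= lr * err * x       # gradient step for the weight
            b -= lr * err           # gradient step for the bias
    return (w, b, 1)                # parameters plus a model count

def combine(a, b):
    """Reduce step (assumed): sum parameters and model counts."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def mapreduce_train(data, num_shards):
    """Split the data, train one model per shard, then average the parameters."""
    shards = [data[i::num_shards] for i in range(num_shards)]
    partials = map(train_on_shard, shards)       # independent, so parallelizable
    w_sum, b_sum, n = reduce(combine, partials)
    return w_sum / n, b_sum / n

# Synthetic noise-free data drawn from y = 2x + 1.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(20)]
w, b = mapreduce_train(data, num_shards=4)
print(w, b)
```

Because each shard is trained independently, the map step could run on separate workers. Averaging parameters is only one possible way to combine models; for a non-convex model such as a CNN it is not guaranteed to preserve accuracy, and the paper may use a different combination rule.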