Automatic Data Generation for Deep Learning Model Training of Image Classification Used for Augmented Reality on Pre-school Books

Huy Le, Minh Nguyen, Quan Nguyen, Hoa Nguyen, Wei Qi Yan
DOI: https://doi.org/10.1109/mapr49794.2020.9237760
2020-01-01
Abstract: Nowadays, Augmented Reality (AR) has rightfully become one of the leading technologies. Many different AR markers exist, with different encryption and decryption methods, which give users an excellent capability to overlay computer-generated virtual information onto real-world objects (e.g. textbook pictures or diagrams). However, users must choose a marker provider that matches their needs and create suitable markers according to that provider's requirements. This raises a question: "Is it worth reprinting entire existing books in order to add AR functions?" In this paper, we describe a new architecture for setting up and presenting AR experiences that combines the benefits of deep learning (DL), the power of smart devices, and the flexibility of the client-server architecture of the Internet. To set up, photos of pages in a textbook (not necessarily all pages) are uploaded to our server. For each page, the server automatically generates different 3D views (thousands, under different lighting conditions and perspectives) to form a sufficiently large dataset. A chosen convolutional neural network, such as AlexNet, GoogLeNet, VGG, or ResNet, is then trained on this dataset. The resulting model is stored and can be loaded back to the client, where it runs as a classifier in a web browser using TensorFlow.js to recognise pages of the book. TensorFlow.js can run on smart devices with built-in cameras; the recognised page determines which 3D graphics are displayed on top of the page. This novel AR marker generation method not only preserves the original images of the books but is also believed to achieve higher detection accuracy. Thus, it is a promising, low-cost AR approach for many areas, including education and training.
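The abstract does not say how the server synthesises its views, so the following is only a minimal sketch of one plausible approach: jitter the four page corners to imitate a change of camera perspective, warp the page with the resulting homography, and scale pixel intensities to imitate a change of lighting. Every name here (`homography`, `synth_view`, `generate_dataset`) and every parameter (`max_shift`, `gain_range`) is hypothetical, and the page is assumed to be a grayscale image stored as nested lists of 0-255 integers.

```python
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 projective transform that maps four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, x, y):
    """Apply the homography h to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def synth_view(page, rng, max_shift=0.1, gain_range=(0.6, 1.4)):
    """Return one randomly warped and relit copy of a grayscale page."""
    height, width = len(page), len(page[0])
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    # Assumed viewpoint model: each corner moves by up to max_shift of the page size.
    moved = [(x + rng.uniform(-max_shift, max_shift) * width,
              y + rng.uniform(-max_shift, max_shift) * height)
             for x, y in corners]
    # Inverse mapping: for every output pixel, look up the source pixel.
    inverse = homography(moved, corners)
    # Assumed lighting model: a single global brightness gain.
    gain = rng.uniform(*gain_range)
    out = [[0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            x, y = project(inverse, u, v)
            xi, yi = round(x), round(y)  # nearest-neighbour sampling
            if 0 <= xi < width and 0 <= yi < height:
                out[v][u] = min(255, int(page[yi][xi] * gain))
    return out


def generate_dataset(pages, views_per_page, seed=0):
    """Build (image, label) pairs; the label is the index of the source page."""
    rng = random.Random(seed)
    return [(synth_view(page, rng), label)
            for label, page in enumerate(pages)
            for _ in range(views_per_page)]
```

Calling `generate_dataset(pages, views_per_page=1000)` would yield a thousand labelled views per uploaded page, which could then be fed to whichever network is chosen for training. A production pipeline would more likely use an image library for the warp and add colour, blur, and background variation; the pure-Python version is kept only so the sketch stays self-contained.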