Abstract: A Convolutional Neural Network (CNN) is a special kind of feed-forward Artificial Neural Network that is generally used for fast and accurate image recognition. This capability is in high demand in embedded systems for various applications. Embedded systems call for a portable, low-power, and area-efficient hardware accelerator, and the large amount of processing needed by a CNN demands dedicated, custom-built hardware. The convolution part of a CNN is a highly parallelizable Digital Signal Processing (DSP) algorithm, which makes it well suited to Field Programmable Gate Array (FPGA) implementation, since FPGAs excel at exploiting parallelism. In this paper, we present the implementation of a CNN on an FPGA. We derived a smaller version of LeNet-5 with a parameter reduction of about 95% and an accuracy of 95.33% for digit recognition. The complete modified architecture is implemented in the Verilog Hardware Description Language with the aim of improving timing performance in the inference phase. The proposed work is compared with a software implementation using Keras on an 8th-generation i5 processor. The results clearly demonstrate acceleration. The hardware architecture is designed to fit optimally on a Kintex-7 xck325ttfg900-2l FPGA. The results can be readily extrapolated to an improved architecture with four times more parallelism on FPGAs that have more DSP slices (e.g. the Virtex-7 series).
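To illustrate why convolution maps well onto parallel hardware, here is a minimal Python sketch (not the authors' Verilog design). Each output pixel is an independent sum of multiply-accumulate (MAC) operations, so the outputs could in principle be computed concurrently. The function names `conv2d_valid` and `mac_count` are hypothetical. The layer dimensions in the usage note are those of the original LeNet-5 first layer, not of the reduced network described in the abstract, whose dimensions are not given here.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2D convolution in the CNN sense (cross-correlation, no padding).

    image and kernel are lists of lists of numbers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            # This accumulation depends only on the input window and the
            # kernel, never on other outputs, so every (r, c) is independent.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def mac_count(in_h, in_w, k, n_filters, in_channels=1):
    """Number of multiply-accumulates for one valid convolution layer."""
    out_h, out_w = in_h - k + 1, in_w - k + 1
    return out_h * out_w * k * k * in_channels * n_filters


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(conv2d_valid(image, kernel))
    # Original LeNet-5 first layer: 32x32 input, six 5x5 filters.
    print(mac_count(32, 32, 5, 6))
```

For the original LeNet-5 first layer, `mac_count(32, 32, 5, 6)` gives 117,600 MACs for a single layer. This hints at why spreading the work across many DSP slices can pay off.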

Implementation of High Performance Hardware Architecture of Face Recognition Algorithm Based on Local Binary Pattern on FPGA

Design of Structured Light Depth Detection System Based on FPGA

Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs

A Fast Face Recognition System Based on Deep Learning

FPGA Implementation of Non-Parametric Stereo Matching Algorithm

FPGA Implementation of Feature Detection Algorithm Based on High Level Synthesis

High-performance Subpixel Edge Location Based on FPGA for Horizon Sensors

A Hardware Implementation Of Bag Of Words And Simhash For Image Recognition

Program Design of the Hardware of Human Face Recognition System

Hardware Implementation of Convolutional Neural Network for Face Feature Extraction

Face-recognition hardware implementation based on SOPC

Real-time Hardware Face Detection Based on Adaboost Algorithm

Memory-Tree Based Design of Optical Character Recognition in FPGA

A low cost architecture for high performance face detection

A High Performance Parallel Computing Architecture For Robust Image Features

VGG16 Hardware Design and Implementation for CNN in Image Recognition

Hardware Architecture for Fast General Object Detection using Aggregated Channel Features

Design of FPGA-based Handwriting Image Recognition System

A FPGA-based Accelerator of Convolutional Neural Network for Face Feature Extraction

Real-time face detection and lip feature extraction using field-programmable gate arrays

HARDWARE ACCELERATOR: IMPLEMENTATION OF CNN ON FPGA FOR DIGIT RECOGNITION