Tags: Binarized Neural Networks, FPGA Acceleration, Quantized Neural Networks
Abstract:
The gap between complex deep learning models and the limited computing capability of an individual robot is a driving force behind the hardware acceleration of image/video processing networks [1,2]. This paper therefore presents two case studies on accelerating deep neural networks with a field-programmable gate array (FPGA). The hardware platform consists of a Xilinx PYNQ-Z2 board as the computing device and a USB camera as the image/video sensor. The main applications, detection and recognition of traffic signs and road objects in images/video, together with a comparison of execution time between the FPGA and a software implementation, are performed in this work. Compared with the software computation, the FPGA implementation achieves a significant speedup for road-sign recognition (shown in Fig. 1) using Binarized Neural Networks and a more than 50× speedup for object detection (shown in Fig. 2) using Quantized Neural Networks. This paper presents preliminary research into hardware acceleration of deep neural networks and demonstrates great potential to improve the classification rate with FPGAs, particularly for time-constrained systems such as self-driving vehicles. The final goal of this project is to provide parameterizable FPGA designs for complex neural networks, configurable in precision and in the number of layers/neurons to match different design specifications.
Evaluating FPGA Acceleration on Binarized Neural Networks and Quantized Neural Networks
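The speedups reported above rest on a property of Binarized Neural Networks that is worth making concrete: with weights and activations constrained to {-1, +1}, a dot product reduces to XNOR followed by a popcount, which maps to cheap LUT logic on an FPGA instead of DSP multipliers. The following is a minimal Python sketch of that identity, not part of the paper's implementation; the function names and the 4-element example are illustrative assumptions.

```python
# Sketch (not from the paper): binarized dot product via XNOR + popcount.
# With values in {-1, +1}, two elements contribute +1 to the dot product
# when their signs match and -1 otherwise, so result = 2*matches - n.

def binarize(x):
    """Map real values to {-1, +1} by sign (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in x]

def pack(vals):
    """Pack a {-1, +1} vector into an integer: bit i = 1 iff vals[i] == +1."""
    bits = 0
    for i, v in enumerate(vals):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, w_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.
    XNOR marks positions with matching signs; popcount counts them."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # mask to n bits
    matches = bin(xnor).count("1")              # popcount
    return 2 * matches - n

# Hypothetical 4-element activation and weight vectors.
a = binarize([0.3, -1.2, 0.7, -0.1])   # [+1, -1, +1, -1]
w = binarize([0.5, 0.9, -0.4, -0.2])   # [+1, +1, -1, -1]
ref = sum(x * y for x, y in zip(a, w))  # plain dot product for comparison
assert binary_dot(pack(a), pack(w), 4) == ref
```

On hardware, the XNOR and popcount operate on wide bit-vectors in a single cycle, which is the source of the large gap versus floating-point software inference.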