Jan-Matthis Niermann
Resource-efficient Training of Convolutional Neural Networks on FPGAs
Abstract
Recent work has focused on utilizing FPGAs to accelerate Deep Neural Networks
(DNNs). While most of these efforts target inference, only a few also implement
training, and those that do are usually not open source. STANN does implement
training, but only for dense layers, which makes training Convolutional Neural
Networks (CNNs) impossible. Therefore, an implementation of convolutional layers
is needed as an enhancement to the STANN library. This work aims to lay a
foundation for a synthesizable design in high-level synthesis (HLS). It must
provide the functions required for backpropagation as well as for the updates of
filters and biases. The proposed implementation is able to perform the training
of a convolutional layer. The evaluation shows that its results are comparable
to those of a PyTorch reference implementation, with an average per-element
deviation of less than 0.00005 %.