
Nov 12, 2019 - Retrain-Less Weight Quantization for Multiplier-Less Convolutional Neural Networks

Description:

This article presents an approximate signed digit representation (ASD) that quantizes the weights of convolutional neural networks (CNNs) to make CNNs multiplier-less without performing any retraining. Unlike existing methods that require retraining for weight quantization, the proposed method directly converts the full-precision weights of CNN models into low-precision ones, attaining accuracy comparable to that of full-precision models on image classification tasks without retraining. It therefore saves retraining time as well as the associated computational cost. Since the proposed method simplifies the weights to have at most two non-zero digits, each multiplication can be realized with only add and shift operations, speeding up inference and reducing energy consumption and hardware complexity. Experiments conducted on well-known CNN architectures, such as AlexNet, VGG-16, ResNet-18, and SqueezeNet, show that the proposed method reduces model size by 73% at the cost of a small increase in error rate, ranging from 0.09% to 1.5% on the ImageNet dataset. Compared to a previous architecture built with multipliers, the proposed multiplier-less convolution architecture reduces the critical-path delay by 52% and cuts hardware complexity and power consumption by more than 50%.
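For intuition, here is a minimal Python sketch of the "at most two non-zero digits" idea: a weight is approximated as a sum of up to two signed power-of-two terms, so multiplying by it needs only shifts (power-of-two scalings) and adds. The greedy nearest-power-of-two rule and all names below are illustrative assumptions, not the paper's exact ASD conversion.

    import numpy as np

    def quantize_two_term(w: float) -> list[tuple[int, int]]:
        """Approximate w as s1*2^e1 + s2*2^e2 (at most two signed
        power-of-two terms). Greedy rule is a sketch, not the
        authors' ASD algorithm."""
        terms = []
        residual = w
        for _ in range(2):
            if residual == 0.0:
                break
            sign = 1 if residual > 0 else -1
            # Nearest power of two to the remaining magnitude.
            exp = int(np.round(np.log2(abs(residual))))
            terms.append((sign, exp))
            residual -= sign * (2.0 ** exp)
        return terms

    def multiplierless_product(x: float, terms) -> float:
        """x * w_q using only adds and power-of-two scalings
        (bit shifts in fixed-point hardware)."""
        return sum(s * np.ldexp(x, e) for s, e in terms)

    w = 0.8125                   # example full-precision weight
    terms = quantize_two_term(w) # [(1, 0), (-1, -2)] -> w_q = 0.75
    print(terms, multiplierless_product(2.0, terms), 2.0 * w)

In hardware, each (sign, exponent) pair maps to one shifter and one adder input, which is why capping the representation at two non-zero digits bounds the per-multiplication cost.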

Comments: This paper interprets the value of quantization from a new angle: quantization exists to lower the computation cost of DNNs. Although its compression rate is not high (8-9 bits), the method requires no retraining, which removes the retraining cost while still achieving some model acceleration.

Added to the timeline:

Mar 29, 2020

Date:

Nov 12, 2019
Now: ~4 years and 6 months ago