Scalar Quantization as Sparse Least Square Optimization

Chen Wang, Xiaomei Yang, Shaomin Fei, Kai Zhou, Xiaofeng Gong, Miao Du, Ruisen Luo
DOI: https://doi.org/10.1109/tpami.2019.2952096
IF: 23.6
2021-05-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: Quantization aims to form new vectors or matrices with shared values close to the original. In recent years, scalar quantization has grown rapidly in popularity because it proves highly useful for reducing the resource cost of neural networks. Popular clustering-based techniques suffer substantially from dependency on the seed, empty or out-of-range clusters, and high time complexity. To overcome these problems, this paper examines scalar quantization from a new perspective, namely sparse least square optimization. Specifically, several quantization algorithms based on $l_1$ least square are proposed and implemented. In addition, similar schemes with $l_1 + l_2$ and $l_0$ regularization are proposed. Furthermore, to compute quantization results with a given number of values/clusters, this paper proposes an iterative method and a clustering-based method, both built on sparse least square optimization. The proposed algorithms are tested under three data scenarios, and their computational performance, including information loss, time consumption, and the distribution of the values of sparse vectors, is compared. The paper offers a new perspective to probe the area of quantization, and the proposed algorithms are superior especially under bit-width reduction scenarios, where the required post-quantization resolution (the number of values) is not significantly lower than the original scalar.
computer science, artificial intelligence; engineering, electrical & electronic
What problem does this paper attempt to address?
This paper addresses several key problems in scalar quantization from a new perspective: sparse least-squares optimization. Specifically, the paper aims to:

1. **Overcome the limitations of traditional clustering methods**: Traditional scalar quantization methods (such as those based on K-means clustering) have significant problems, including empty clusters or out-of-range values, dependence on initial values, and high time complexity. These problems are particularly prominent when dealing with large-scale data.
2. **Propose new methods based on sparse least-squares optimization**: The paper proposes and implements several quantization algorithms based on \( \ell_1 \) least squares. Similar schemes with \( \ell_1+\ell_2 \) and \( \ell_0 \) regularization are also proposed. These methods reduce the number of quantized values by introducing sparsity while preserving as much of the information in the original data as possible.
3. **Design an iterative method and a clustering-based method**: To compute quantization results for a given number of quantization values, the paper proposes two methods: an iterative method, and a least-squares optimization method combined with K-means clustering. Both are based on sparse least squares.
4. **Verify the effectiveness of the new methods**: The paper tests the proposed algorithms in three data scenarios and compares their computational performance, including information loss, time consumption, and the distribution of sparse vector values. The experimental results show that the proposed methods perform well for bit-width reduction, especially when the required post-quantization resolution (i.e., the number of values) is not significantly lower than the original.
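The \( \ell_1 \) least-squares idea and the iterative method from items 2 and 3 can be illustrated with a minimal pure-Python sketch. It minimizes \( \| \hat{w}-V\alpha \|_2^2+\lambda \| \alpha \|_1 \) and raises \( \lambda \) step by step until the reconstruction \( V\alpha \) holds no more than a requested number of distinct values. Several pieces are assumptions for illustration and are not confirmed by this summary: the lower-triangular all-ones choice of \( V \), the coordinate-descent solver, the stopping rule, the default step sizes, and all function names.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_cd(V, w_hat, lam, sweeps=2000, tol=1e-12):
    """Coordinate descent for min_a ||w_hat - V a||_2^2 + lam * ||a||_1."""
    rows, cols = len(V), len(V[0])
    alpha = [0.0] * cols
    resid = list(w_hat)  # residual w_hat - V alpha, with alpha = 0
    col_sq = [sum(V[i][j] ** 2 for i in range(rows)) for j in range(cols)]
    for _ in range(sweeps):
        largest_step = 0.0
        for j in range(cols):
            if col_sq[j] == 0.0:
                continue
            # Column j against the residual with alpha[j]'s own part added back.
            rho = sum(V[i][j] * resid[i] for i in range(rows)) + col_sq[j] * alpha[j]
            # Threshold is lam / 2 because the squared loss has no 1/2 factor.
            new = soft_threshold(rho, lam / 2.0) / col_sq[j]
            step = new - alpha[j]
            if step != 0.0:
                for i in range(rows):
                    resid[i] -= V[i][j] * step
                alpha[j] = new
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            break
    return alpha


def lower_triangular_ones(n):
    """Assumed (hypothetical) V: (V a)[i] = a[0] + ... + a[i]."""
    return [[1.0 if j <= i else 0.0 for j in range(n)] for i in range(n)]


def quantize(w, target, lam0=0.0, dlam=0.25, max_iter=200):
    """Raise lam by dlam per iteration until at most `target` values remain."""
    w_hat = sorted(set(w))  # unique values of the original vector
    V = lower_triangular_ones(len(w_hat))
    for t in range(1, max_iter + 1):
        lam = lam0 + (t - 1) * dlam  # linear schedule in the iteration index
        alpha = lasso_cd(V, w_hat, lam)
        levels = [round(sum(V[i][j] * alpha[j] for j in range(len(alpha))), 12)
                  for i in range(len(w_hat))]
        if len(set(levels)) <= target:  # assumed stopping rule
            return dict(zip(w_hat, levels)), lam
    raise RuntimeError("no lam in the schedule reached the target count")


weights = [0.10, 0.11, 0.12, 0.50, 0.52, 0.90, 0.91, 0.50]
mapping, lam_used = quantize(weights, target=5)
quantized = [mapping[v] for v in weights]
```

Under the assumed \( V \), a zero entry in \( \alpha \) makes two adjacent entries of \( V\alpha \) share a value, which is how sparsity reduces the number of quantized values in this sketch. The solver is written for clarity rather than speed; a library lasso routine would be the natural replacement for larger inputs.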
### Formula Summary

- **Objective function**:
  \[ \min_{\alpha} \| \hat{w}-V\alpha \|_2^2+\lambda \| \alpha \|_1 \]
  where \( \hat{w} \) contains the unique values of the original vector, \( V \) is the basis transformation matrix, \( \alpha \) is the coefficient vector to be optimized, and \( \lambda \) is the regularization parameter.
- **Improved \( \ell_1 + \ell_2 \) objective function**:
  \[ \min_{\alpha} \| \hat{w}-V\alpha \|_2^2+\lambda_1 \| \alpha \|_1-\lambda_2 \| \alpha \|_2^2 \]
- **\( \ell_0 \)-constrained objective function**:
  \[ \min_{\alpha} \| \hat{w}-\delta V^*\alpha \|_2^2 \quad \text{subject to} \quad \| \alpha \|_0\leq l \]
  where \( l \) is a manually set upper limit on the number of distinct values after quantization.
- **Update rule for the iterative method**:
  \[ \lambda_t^1=\lambda_0^1+(t - 1)\Delta\lambda \]
  where \( \lambda_0^1 \) is the initial regularization parameter and \( \Delta\lambda \) is the increment per iteration.

### Conclusion

By introducing sparse least-squares optimization, the paper provides a new perspective on the scalar quantization problem. The proposed methods perform well in reducing computation time and improving quantization accuracy, especially in application scenarios where the bit width needs to be reduced. These methods apply not only to neural network compression but also to general-purpose quantization tasks.