Abstract: With the increasing inference cost of machine learning models, there is a growing interest in models with fast and efficient inference. Recently, an approach for learning logic gate networks directly via a differentiable relaxation was proposed. Logic gate networks are faster than conventional neural network approaches because their inference only requires logic gate operators such as NAND, OR, and XOR, which are the underlying building blocks of current hardware and can be efficiently executed. We build on this idea, extending it by deep logic gate tree convolutions, logical OR pooling, and residual initializations. This allows scaling logic gate networks up by over one order of magnitude and utilizing the paradigm of convolution. On CIFAR-10, we achieve an accuracy of 86.29% using only 61 million logic gates, which improves over the SOTA while being 29x smaller.
What problem does this paper attempt to address?
The paper aims to make machine learning inference faster and more efficient, in particular when it is executed on hardware. The authors do this by learning Logic Gate Networks (LGNs) directly, so that the trained model already consists of executable logic gate operations. Traditional approaches such as binary-weight neural networks (BNNs) instead train an abstract network whose operations, such as matrix multiplication, must afterwards be translated into logic gates, which adds computational overhead.
### Main Problems and Solutions
1. **Improve Inference Efficiency**:
- Traditional deep learning models require a large amount of computational resources and energy during inference.
- The paper proposes to optimize the inference process by directly learning Logic Gate Networks (LGNs), because logic gate operations (such as NAND, OR, XOR) can be executed directly and efficiently on hardware.
2. **Expand the Expressive Power of Logic Gate Networks**:
- The original differentiable LGNs connect gates randomly and therefore cannot capture the spatial relationships in images, which limits their performance (for example, only 62% accuracy on the CIFAR-10 dataset).
- The paper introduces deep logic gate tree convolutions, logical OR pooling, and residual initializations to enhance the expressive power of LGNs and their ability to handle complex tasks.
3. **Reduce Computational Resource Requirements**:
- With these improvements, the paper reports an accuracy of 86.29% on the CIFAR-10 dataset using only 61 million logic gates, improving over the existing state-of-the-art (SOTA) method while being 29x smaller.
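The logical OR pooling mentioned in item 2 can be illustrated with a minimal sketch. It assumes binary activations stored as a boolean array of shape `(channels, height, width)` and non-overlapping square windows; the function name `or_pool2d` and its `size` parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def or_pool2d(A, size=2):
    """OR-pool a boolean array A of shape (C, H, W) over size x size windows.

    Assumes H and W are divisible by `size` (an assumption of this sketch).
    """
    C, H, W = A.shape
    # Split each spatial axis into (number of windows, window size).
    blocks = A.reshape(C, H // size, size, W // size, size)
    # A window outputs True if any activation inside it is True.
    return blocks.any(axis=(2, 4))

A = np.zeros((1, 4, 4), dtype=bool)
A[0, 0, 1] = True
A[0, 3, 3] = True
pooled = or_pool2d(A)
```

On binary activations, OR over a window gives the same result as taking the maximum, so this plays the role that max pooling plays in a conventional convolutional network.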
### Formula Representation
The key formulas from the paper are:
- **Relaxed Representation of Differentiable Logic Gates**:
\[
f_z(a_1, a_2)=\mathbb{E}_{i \sim S(z), A_1 \sim B(a_1), A_2 \sim B(a_2)}[g_i(A_1, A_2)]=\sum_{i = 0}^{15}\frac{\exp(z_i)}{\sum_j\exp(z_j)}\cdot g_i(a_1, a_2)
\]
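where \(z\in\mathbb{R}^{16}\) is the trainable parameter vector, \(g_i\) is the \(i\)-th of the 16 possible two-input logic gate operations, \(S(z)\) is the categorical distribution given by the softmax of \(z\), and \(B(a)\) is a Bernoulli distribution with parameter \(a\). On the right-hand side, \(g_i\) is applied to the real-valued inputs through its probabilistic relaxation.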
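The relaxed gate above can be evaluated numerically with a short sketch. It computes each \(g_i\) as the expected gate output under independent Bernoulli inputs, which is what the expectation in the formula states. The gate ordering is an assumption of this sketch: the four bits of \(i\), most significant first, are read as the gate's outputs on the inputs (0,0), (0,1), (1,0), (1,1). The function names are hypothetical.

```python
import numpy as np

def gate_truth_table(i):
    """Outputs of gate i on inputs (0,0), (0,1), (1,0), (1,1).

    Assumed encoding: the 4 bits of i, most significant bit first.
    """
    return [(i >> (3 - k)) & 1 for k in range(4)]

def relaxed_gate(i, a1, a2):
    """Expected output of gate i for A1 ~ Bernoulli(a1), A2 ~ Bernoulli(a2)."""
    t00, t01, t10, t11 = gate_truth_table(i)
    return (t00 * (1 - a1) * (1 - a2) + t01 * (1 - a1) * a2
            + t10 * a1 * (1 - a2) + t11 * a1 * a2)

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

def f_z(z, a1, a2):
    """Softmax-weighted mixture of the 16 relaxed gates."""
    p = softmax(np.asarray(z, dtype=float))
    return float(sum(p[i] * relaxed_gate(i, a1, a2) for i in range(16)))

# With all logits equal, every gate is equally likely.
uniform_out = f_z(np.zeros(16), 0.5, 0.5)

# A logit vector concentrated on gate 14 (NAND under the assumed encoding).
z_nand = np.full(16, -50.0)
z_nand[14] = 50.0
```

Once the softmax has concentrated on a single index, the mixture behaves like that one hard gate, which is why a trained network can be discretized into plain logic gates for inference.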
- **Output Calculation of Convolutional Logic Gate Networks**:
\[
\begin{aligned}
A'[k, i, j] = f_k^3\Big(\,
& f_k^1\big(A[C_M[k, 1],\, C_H[k, 1]+i,\, C_W[k, 1]+j],\; A[C_M[k, 2],\, C_H[k, 2]+i,\, C_W[k, 2]+j]\big), \\
& f_k^2\big(A[C_M[k, 3],\, C_H[k, 3]+i,\, C_W[k, 3]+j],\; A[C_M[k, 4],\, C_H[k, 4]+i,\, C_W[k, 4]+j]\big)\Big)
\end{aligned}
\]
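The convolution formula can be read as a depth-2 binary tree of gates slid over the input. The sketch below shows the inference-time (hard gate) case under several assumptions that are not stated in this summary: \(C_M\) selects an input channel, \(C_H\) and \(C_W\) are row and column offsets inside a `kh` x `kw` receptive field, indices are 0-based, and each gate is identified by a 4-bit truth-table index (most significant bit is the output on (0,0)). All function and parameter names are hypothetical.

```python
import numpy as np

def hard_gate(i, x, y):
    """Apply two-input gate i elementwise to boolean arrays x and y."""
    idx = 2 * x.astype(int) + y.astype(int)  # which truth-table row applies
    return (np.right_shift(i, 3 - idx) & 1).astype(bool)

def logic_tree_conv(A, CM, CH, CW, gates, kh, kw):
    """A: bool (M, H, W). CM, CH, CW: int (K, 4). gates: int (K, 3).

    Returns a bool array of shape (K, H - kh + 1, W - kw + 1).
    """
    M, H, W = A.shape
    K = CM.shape[0]
    out_h, out_w = H - kh + 1, W - kw + 1
    out = np.zeros((K, out_h, out_w), dtype=bool)
    for k in range(K):
        # Leaf n is channel CM[k, n], shifted by (CH[k, n], CW[k, n]),
        # evaluated at every output position (i, j) at once.
        leaves = [
            A[CM[k, n], CH[k, n]:CH[k, n] + out_h, CW[k, n]:CW[k, n] + out_w]
            for n in range(4)
        ]
        left = hard_gate(gates[k, 0], leaves[0], leaves[1])   # f_k^1
        right = hard_gate(gates[k, 1], leaves[2], leaves[3])  # f_k^2
        out[k] = hard_gate(gates[k, 2], left, right)          # f_k^3
    return out

rng = np.random.default_rng(0)
A = rng.random((2, 4, 4)) < 0.5
CM = np.array([[0, 0, 1, 1]])
CH = np.array([[0, 1, 0, 1]])
CW = np.array([[0, 0, 1, 1]])
gates = np.array([[1, 7, 6]])  # AND, OR, XOR under the assumed encoding
out = logic_tree_conv(A, CM, CH, CW, gates, kh=2, kw=2)
```

Because the same three gates and the same four connections are reused at every position \((i, j)\), the kernel shares its parameters across the image in the same way a conventional convolution kernel does.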
Through these improvements, the paper both raises the accuracy of logic gate networks and significantly reduces the number of gates required, pointing to a new direction for efficient machine learning inference.