Abstract: Neural networks have been applied to a wide variety of tasks with impressive results. Applying neural networks to scientific problems is an important research direction that is attracting increasing attention. In scientific applications, networks are generally of moderate size, mainly to keep inference fast. Comparison with traditional algorithms is also inevitable in this setting, and because these applications often require rapid computation, reducing network size becomes increasingly important. Existing work attributes the power of neural networks primarily to their non-linearity. Theoretical work has found that under strong non-linearity, neurons in the same layer tend to behave similarly, a phenomenon known as condensation. Condensation offers an opportunity to reduce a neural network to a smaller subnetwork with similar performance. In this article, we propose a condensation reduction algorithm to verify the feasibility of this idea on practical problems. Our reduction method currently applies to both fully connected and convolutional networks, with positive results in both cases. In complex combustion acceleration tasks, we reduced the neural network to 41.7% of its original size while maintaining prediction accuracy. In the CIFAR10 image classification task, we reduced the network to 11.5% of its original size while still maintaining satisfactory validation accuracy. Our method can be applied to most trained neural networks, reducing computational cost and improving inference speed.
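To make the idea of condensation-based reduction concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a bias-free two-layer ReLU network f(x) = a · relu(W x), and it treats hidden neurons whose input weights point in (nearly) the same direction as condensed. Because ReLU is positively homogeneous, such neurons can be merged into a single neuron by summing their output weights scaled by their input-weight norms. The function name `merge_condensed_neurons`, the greedy grouping, and the `cos_threshold` parameter are illustrative assumptions.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def merge_condensed_neurons(W, a, cos_threshold=0.999):
    """Merge hidden neurons of f(x) = a @ relu(W @ x) that share a direction.

    W: (m, d) input weights, a: (m,) output weights.
    Assumptions (not from the paper): no biases, ReLU activation,
    greedy grouping by cosine similarity of input-weight vectors.
    """
    d = W.shape[1]
    norms = np.linalg.norm(W, axis=1)
    keep = norms > 0  # a zero-weight neuron outputs relu(0) = 0, so drop it
    W, a, norms = W[keep], a[keep], norms[keep]
    U = W / norms[:, None]  # unit direction of each neuron

    reps, groups = [], []  # representative directions and member indices
    for i in range(len(U)):
        for g, rep in enumerate(reps):
            if U[i] @ rep >= cos_threshold:  # same orientation only
                groups[g].append(i)
                break
        else:
            reps.append(U[i])
            groups.append([i])

    # relu(c * u @ x) = c * relu(u @ x) for c > 0, so a group of neurons
    # collapses to one neuron with output weight sum(a_i * ||w_i||).
    W_small = np.array(reps).reshape(-1, d)
    a_small = np.array([np.sum(a[g] * norms[g]) for g in groups])
    return W_small, a_small


# Usage: six neurons that condense onto two directions reduce to two neurons.
rng = np.random.default_rng(0)
u1, u2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
W = np.stack([2 * u1, 0.5 * u1, 3 * u1, u2, 4 * u2, 0.1 * u2])
a = rng.normal(size=6)
W_small, a_small = merge_condensed_neurons(W, a)
X = rng.normal(size=(50, 3))
full_out = relu(X @ W.T) @ a
small_out = relu(X @ W_small.T) @ a_small
```

In this sketch the merge is exact only when the grouped directions coincide. With a looser threshold it becomes an approximation, and a trained network would typically need some fine-tuning afterwards. Neurons pointing in opposite directions are deliberately not merged, since ReLU is not an odd function.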

Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning

Memorization Capacity of Neural Networks with Conditional Computation

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

AccEPT: an Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training

Conditional computation in neural networks: principles and research trends

An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws

Hyper-Compression: Model Compression via Hyperfunction

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks

Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD

Training compact neural networks via

More Compute Is What You Need

On the Complexity of Neural Computation in Superposition

Sparse Probabilistic Circuits via Pruning and Growing

Encoding innate ability through a genomic bottleneck

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing

Understanding the Double Descent Phenomenon in Deep Learning

A Dynamical Model of Neural Scaling Laws

The Exponential Capacity of Dense Associative Memories

Increasing Model Capacity for Free: A Simple Strategy for Parameter Efficient Fine-tuning

Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation