Abstract: Despite the recent success of artificial neural networks on a variety of tasks, we have little knowledge or control over the exact solutions these models implement. Instilling inductive biases -- preferences for some solutions over others -- into these models is one promising path toward understanding and controlling their behavior. Much work has been done to study the inherent inductive biases of models and instill different inductive biases through hand-designed architectures or carefully curated training regimens. In this work, we explore a more mechanistic approach: Subtask Induction. Our method discovers a functional subnetwork that implements a particular subtask within a trained model and uses it to instill inductive biases towards solutions utilizing that subtask. Subtask Induction is flexible and efficient, and we demonstrate its effectiveness with two experiments. First, we show that Subtask Induction significantly reduces the amount of training data required for a model to adopt a specific, generalizable solution to a modular arithmetic task. Second, we demonstrate that Subtask Induction successfully induces a human-like shape bias while increasing data efficiency for convolutional and transformer-based image classification models.

### What problem does this paper attempt to address?

This paper addresses how to instill inductive biases in artificial neural networks. Although neural networks have succeeded on many tasks, we know very little about the specific solutions they implement, and their behavior is difficult to control. Instilling inductive biases makes a model favor certain solutions over others, which improves its interpretability and controllability.

Specifically, the paper proposes **Subtask Induction**, which works in two steps:

1. **Discover a functional subnetwork**: In a trained neural network, find a functional subnetwork that implements a specific subtask.
2. **Transfer the subnetwork**: Copy this subnetwork into a newly randomly initialized neural network and train only the remaining randomly initialized weights, keeping the subnetwork weights fixed.

This instills a soft inductive bias in the new model, making it more inclined to solve the task using that subtask.

### Main contributions of the paper

1. **Propose Subtask Induction**: A new method that uses interpretability techniques to instill inductive biases.
2. **Verify effectiveness**: Demonstrate the effectiveness of Subtask Induction on a modular arithmetic task, significantly reducing the amount of required training data and improving generalization.
3. **Generate and publish the Mean-pooled ImageNet dataset**: A new dataset, produced by processing ImageNet images so that global shape is preserved and local texture is removed, used to evaluate a model's shape bias.
4. **Apply to image classification**: Apply Subtask Induction to ResNet18 and ViT models, successfully instilling a human-like shape bias and improving classification based on shape information.
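The transfer-and-freeze step of Subtask Induction can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the helper names `transfer_subnetwork` and `masked_sgd_step` are hypothetical, a single weight matrix stands in for a whole model, the mask is hand-picked rather than found by a subnetwork discovery method, and the gradient is a random placeholder. It is not the authors' implementation.

```python
import numpy as np

def transfer_subnetwork(trained_w, mask, rng):
    """Copy the masked (subnetwork) weights into a freshly initialized matrix."""
    fresh_w = rng.normal(scale=0.1, size=trained_w.shape)  # new random init
    return np.where(mask, trained_w, fresh_w)

def masked_sgd_step(w, grad, mask, lr=0.1):
    """One SGD step that updates only weights outside the subnetwork."""
    return w - lr * grad * (~mask)  # masked entries receive a zero update

rng = np.random.default_rng(0)
trained_w = rng.normal(size=(4, 3))   # stand-in for a trained layer
mask = np.zeros((4, 3), dtype=bool)   # hypothetical subnetwork mask
mask[:2, :] = True                    # pretend the first two rows form the subnetwork

w = transfer_subnetwork(trained_w, mask, rng)
grad = rng.normal(size=w.shape)       # placeholder for a real loss gradient
w_next = masked_sgd_step(w, grad, mask)
```

After the step, the entries selected by `mask` still equal the trained weights, while the remaining entries have moved. Those remaining weights are the only part of the new model that training changes, so the new model has to build its solution around the transferred subnetwork.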
### Application examples of subtask induction

- **Arithmetic tasks**: Transferring subnetworks that implement specific arithmetic subtasks significantly reduces the amount of training data required and improves the model's learning efficiency on new tasks.
- **Image classification**: Transferring subnetworks that implement shape-recognition subtasks makes the model rely more on shape than on texture, improving its classification accuracy on Mean-pooled ImageNet.

Through these experiments, the paper demonstrates the effectiveness and flexibility of Subtask Induction and suggests a direction for further research on systematically instilling inductive biases.
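The summary does not say exactly how Mean-pooled ImageNet is built. One plausible reading of "shape-preserving, texture-removing" is to replace every small patch of an image with its mean color. The sketch below shows that idea under stated assumptions: the function name `mean_pool_image`, the patch size, and the requirement that image dimensions divide evenly by the patch size are all hypothetical choices, not details from the paper.

```python
import numpy as np

def mean_pool_image(img, patch=8):
    """Replace each patch x patch block of an (H, W, C) image with its mean color.

    Assumes H and W are divisible by `patch`.
    """
    h, w, c = img.shape
    blocks = img.reshape(h // patch, patch, w // patch, patch, c)
    means = blocks.mean(axis=(1, 3), keepdims=True)  # one color per block
    return np.broadcast_to(means, blocks.shape).reshape(h, w, c)

# Tiny demonstration on a 4x4 single-channel "image" with 2x2 patches.
img = np.arange(16, dtype=float).reshape(4, 4, 1)
pooled = mean_pool_image(img, patch=2)
```

Each block in `pooled` has a single uniform value, so fine-grained texture inside a patch is gone while the coarse layout of the image survives.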

Instilling Inductive Biases with Subnetworks
