What problem does this paper attempt to address?

The paper addresses the following problem: how to build an effective ensemble from a single pre-trained model. The existing approach fine-tunes the same pre-trained model several times, but the resulting models lack diversity, so ensembling them yields limited gains. To solve this, the paper proposes a method called Multi-Ticket Ensemble: by fine-tuning different subnetworks of the pre-trained model, it increases the diversity among the member models and thereby improves ensemble performance.

### Background and Challenges of the Main Problem

1. **Effectiveness of Ensemble Learning**:
   - Ensemble learning improves prediction performance by combining the outputs of multiple models.
   - An ideal ensemble needs sufficient diversity among its member models so that they provide complementary predictions on different data samples.
2. **Limitations of a Single Pre-trained Model**:
   - Because large-scale pre-training is extremely expensive, most researchers can only use a single pre-trained model released by a resource-rich organization.
   - Multiple models fine-tuned from the same pre-trained model tend to be highly similar and lack diversity, so the benefit of ensembling them is limited.
3. **Sources of Diversity**:
   - The paper points out that the usual sources of randomness in fine-tuning (such as random initialization of task-specific layers, dataset shuffling, and dropout) cannot significantly increase the diversity of models.
   - New methods are therefore needed to increase the diversity between models while maintaining the accuracy of each member model.

### Proposed Solution

The paper proposes the Multi-Ticket Ensemble method. Its core idea is to find "winning tickets" in the pre-trained model through iterative magnitude pruning, that is, sparse subnetworks that still maintain high accuracy after fine-tuning.
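
To make the idea of iterative magnitude pruning concrete, here is a minimal NumPy sketch on a single weight matrix. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `per_round` schedule, and the choice to restart each round from the pre-trained values are all hypothetical, and `toy_fine_tune` is only a stand-in for real task fine-tuning.

```python
import numpy as np

def iterative_magnitude_pruning(pretrained, fine_tune, target_sparsity=0.3, per_round=0.1):
    """Return a boolean mask (True = kept weight) found by repeated prune/fine-tune rounds.

    Assumption: each round restarts from the pre-trained values under the current
    mask; the summary does not say whether the paper rewinds weights this way.
    """
    mask = np.ones(pretrained.shape, dtype=bool)
    n_target = int(round(target_sparsity * pretrained.size))  # total weights to remove
    n_pruned = 0
    while n_pruned < n_target:
        # Fine-tune the current subnetwork (pruned entries are held at zero).
        tuned = fine_tune(pretrained * mask, mask)
        active = np.flatnonzero(mask)
        # Drop a fraction of the remaining weights, without overshooting the target.
        n_drop = min(max(1, int(per_round * active.size)), n_target - n_pruned)
        order = np.argsort(np.abs(tuned.ravel()[active]), kind="stable")
        flat = mask.ravel().copy()
        flat[active[order[:n_drop]]] = False  # remove the smallest-magnitude weights
        mask = flat.reshape(pretrained.shape)
        n_pruned += n_drop
    return mask

def toy_fine_tune(weights, mask, rng):
    """Hypothetical stand-in for fine-tuning: a small random update on kept weights."""
    return (weights + 0.01 * rng.normal(size=weights.shape)) * mask

rng = np.random.default_rng(0)
pretrained = rng.normal(size=(10, 10))  # stand-in for one pre-trained weight matrix
mask = iterative_magnitude_pruning(
    pretrained, lambda w, m: toy_fine_tune(w, m, rng), target_sparsity=0.3
)
sparsity = 1.0 - mask.mean()  # fraction of weights removed
```

In this sketch a subnetwork is just the pair of pre-trained weights and a mask. Different fine-tuning runs can end with different masks, which is the sense in which different subnetworks can be drawn from one pre-trained model.
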
The specific steps are as follows:

1. **Iterative Magnitude Pruning**:
   - Starting from the pre-trained model, gradually prune the weights with the smallest absolute values to form different subnetworks.
   - After each pruning step, fine-tune the remaining subnetwork again, repeating until a predetermined pruning ratio (for example, 30%) is reached.
2. **Diversity and Accuracy of Subnetworks**:
   - Different subnetworks draw on different subspaces of the pre-trained knowledge during fine-tuning, which gives them different perspectives and increases diversity.
   - At the same time, according to the lottery ticket hypothesis, these subnetworks can still maintain relatively high accuracy after fine-tuning.

### Experimental Results

The paper shows experimentally that the subnetworks produced by the Multi-Ticket Ensemble method are more diverse than traditional dense networks, and that their ensemble outperforms traditional ensemble methods on some tasks.

### Formula Representation

- Let \( f(x; \theta) \) denote the output of the model for input \( x \) under parameters \( \theta \).
- The output of the ensemble model is:
  \[
  f_M(x) = \frac{1}{|M|} \sum_{\theta \in M} f(x; \theta)
  \]
  where \( M = \{\theta_1, \theta_2, \ldots, \theta_{|M|}\} \) is the set of member-model parameters.

With this method, the paper addresses the problem of building an effective ensemble from a single pre-trained model, improving both the diversity of the member models and the overall performance of the ensemble.
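
The ensemble formula above is a plain average of the member outputs, which can be sketched in a few lines of NumPy. The function name `ensemble_predict` and the example numbers are hypothetical, and the sketch assumes that each \( f(x; \theta) \) is a vector of class probabilities; the summary does not say whether the paper averages probabilities or logits.

```python
import numpy as np

def ensemble_predict(member_outputs):
    """Compute f_M(x) = (1/|M|) * sum of f(x; theta) over the member models."""
    stacked = np.stack(member_outputs, axis=0)  # shape: (|M|, batch, classes)
    return stacked.mean(axis=0)                 # average over the member axis

# Hypothetical class-probability outputs of three members for two inputs.
outputs = [
    np.array([[0.7, 0.3], [0.4, 0.6]]),
    np.array([[0.6, 0.4], [0.6, 0.4]]),  # this member disagrees on the second input
    np.array([[0.8, 0.2], [0.2, 0.8]]),
]
avg = ensemble_predict(outputs)
labels = avg.argmax(axis=1)  # predicted class per input
```

On the second input one member predicts a different class from the other two, and the averaged output follows the majority. Averaging only helps in cases like this, where members make different errors, which is why diversity among members matters.
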

Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

Finding the Dominant Winning Ticket in Pre-Trained Language Models

Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training

The Diversified Ensemble Neural Network

Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning

Developing parsimonious ensembles using ensemble diversity within a reinforcement learning framework

Auto-Ensemble: An Adaptive Learning Rate Scheduling based Deep Learning Model Ensembling

Prune and Tune Ensembles: Low-Cost Ensemble Learning with Sparse Independent Subnetworks

Winning Lottery Tickets in Deep Generative Models

Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

Efficient Diversity-Driven Ensemble for Deep Neural Networks

Dual Lottery Ticket Hypothesis

Playing Lottery Tickets with Vision and Language

LOFT: Finding Lottery Tickets through Filter-wise Training

Universality of Winning Tickets: A Renormalization Group Perspective

Learning Diverse Models for End-to-End Ensemble Tracking

Sequential Bayesian Neural Subnetwork Ensembles

A Survey of Lottery Ticket Hypothesis

BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning