Abstract: In recent years, there has been increasing interest in adopting published neural retrieval models learned from corpora for text retrieval. Although these models achieve excellent retrieval performance, in terms of popular accuracy metrics, on the datasets on which they were trained, their performance on new text data might degrade. To obtain the desired retrieval performance on both the data used in training and the latest data collected after training, the simple approach of learning a new model from both datasets is not always feasible, since the annotated dataset used in training is often not published along with the learned model. Knowledge amalgamation (KA) is an emerging technique for dealing with this inaccessibility of the data used in previous training. KA learns a new model (called a student model) from new data by reusing (called amalgamating) a number of trained models (called teacher models) instead of accessing the teachers' original training data. However, in order to efficiently learn an accurate student model, the classical KA approach requires manual selection of an appropriate subset of teacher models for amalgamation. This manual procedure prevents classical KA from scaling to retrieval tasks for which a large number of candidate teacher models are ready to be reused. This paper presents Arm, an intelligent system for efficiently learning a neural retrieval model with the desired accuracy on incoming data by automatically amalgamating a subset of teacher models (called a teacher model combination, or simply a combination) selected from a large number of teacher models. To filter out combinations that fail to produce accurate student models, Arm employs Bayesian optimization to derive an accuracy prediction model based on sampled amalgamation tasks. Then, Arm uses the derived prediction model to exclude unqualified combinations without training the remaining combinations.
To speed up training, Arm introduces a cost model that picks, among all qualified teacher model combinations, the one with the minimal training cost to produce the final student model. This paper demonstrates the main workflow of Arm and presents the produced student models to users.
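The two-stage selection described in the abstract (filter combinations by predicted accuracy, then pick the cheapest qualified one) can be sketched roughly as follows. This is a minimal illustration and not Arm's implementation: the function names, the `predict_accuracy` and `training_cost` callables, and all toy numbers are assumptions introduced here. In particular, the accuracy predictor is a plain stand-in function, whereas the abstract states that Arm derives its prediction model with Bayesian optimization over sampled amalgamation tasks.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List, Optional

Combination = FrozenSet[str]


def enumerate_combinations(teachers: Iterable[str], max_size: int) -> List[Combination]:
    """List every non-empty teacher subset with at most max_size members."""
    names = sorted(teachers)
    return [
        frozenset(subset)
        for size in range(1, max_size + 1)
        for subset in combinations(names, size)
    ]


def select_combination(
    candidates: Iterable[Combination],
    predict_accuracy: Callable[[Combination], float],
    training_cost: Callable[[Combination], float],
    desired_accuracy: float,
) -> Optional[Combination]:
    """Keep combinations predicted to reach the desired accuracy,
    then return the one with the lowest estimated training cost."""
    qualified = [c for c in candidates if predict_accuracy(c) >= desired_accuracy]
    if not qualified:
        return None  # no combination is predicted to be accurate enough
    # Ties on cost are broken by teacher names so the result is deterministic.
    return min(qualified, key=lambda c: (training_cost(c), sorted(c)))


# Hypothetical per-teacher numbers, for illustration only.
TOY_GAIN = {"t1": 0.50, "t2": 0.30, "t3": 0.25}
TOY_COST = {"t1": 5.0, "t2": 2.0, "t3": 1.0}


def toy_accuracy(combo: Combination) -> float:
    """Stand-in for a learned accuracy prediction model."""
    return min(1.0, sum(TOY_GAIN[t] for t in combo))


def toy_cost(combo: Combination) -> float:
    """Stand-in for a cost model: training cost grows with each teacher."""
    return sum(TOY_COST[t] for t in combo)


candidates = enumerate_combinations(TOY_GAIN, max_size=3)
best = select_combination(candidates, toy_accuracy, toy_cost, desired_accuracy=0.75)
print(sorted(best) if best is not None else None)
```

With these toy numbers, three combinations pass the accuracy filter, and the cheapest of them is the pair of `t1` and `t3`. Note that the sketch enumerates every subset, which grows exponentially with the number of teachers; it is meant only to make the filter-then-minimize logic concrete.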

Federated selective aggregation for on-device knowledge amalgamation

Federated Selective Aggregation for Knowledge Amalgamation

Arm: Efficient Learning of Neural Retrieval Models with Desired Accuracy by Automatic Knowledge Amalgamation

MCKD: Mutually Collaborative Knowledge Distillation for Federated Domain Adaptation and Generalization

Customizing Student Networks From Heterogeneous Teachers Via Adaptive Knowledge Amalgamation

Hierarchical Knowledge Amalgamation with Dual Discriminative Feature Alignment

FedSiam-DA: Dual-aggregated Federated Learning Via Siamese Network for Non-Iid Data

Collaborative knowledge amalgamation: Preserving discriminability and transferability in unsupervised learning

Amalgamating Knowledge towards Comprehensive Classification

Towards addressing aggregation deviation for model training in resource-scarce edge environment

Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion

Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT

Collaborative Semantic Aggregation and Calibration for Federated Domain Generalization

Agglomerative Federated Learning: Empowering Larger Model Training via End-Edge-Cloud Collaboration

Federated Sensing: Edge-Cloud Elastic Collaborative Learning for Intelligent Sensing

FedPA: An adaptively partial model aggregation strategy in Federated Learning

Is Aggregation the Only Choice? Federated Learning via Layer-wise Model Recombination

Federated Knowledge Amalgamation with Unbiased Semantic Attributes under Cloud–edge Collaboration for Heterogeneous Fault Diagnosis

Fedadkd: heterogeneous federated learning via adaptive knowledge distillation

FedALA: Adaptive Local Aggregation for Personalized Federated Learning

FedCross: Towards Accurate Federated Learning via Multi-Model Cross-Aggregation