Abstract: Modern state-of-the-art (SoTA) deep learning (DL) models for natural language processing (NLP) have hundreds of millions of parameters, making them extremely complex. Training these models requires large datasets; pretraining has reduced this requirement, but human-labelled datasets are still necessary for fine-tuning. Few-shot learning (FSL) techniques, such as meta-learning, try to train models from smaller datasets to mitigate this cost. However, the tasks used to evaluate these meta-learners frequently diverge from the real-world problems they are meant to solve. This work aims to apply meta-learning to a problem that is more pertinent to the real world: class incremental learning (IL). In this scenario, the model must learn to classify newly introduced classes after its training is complete. A distinctive quality of meta-learners is that they can generalise from a small number of samples to classes that have never been seen before, which makes them especially useful for class IL. This work describes a method for emulating class IL using proxy new classes, which allows a meta-learner to complete the task without retraining. To generate predictions, a transformer-based aggregation function that modifies the information from examples across all classes is proposed for the meta-learner. The model's principal contributions are that it considers the entire support and query sets jointly and that it prioritises attention to crucial samples, such as the query, to increase their influence during inference. The results show that the model outperforms the prevailing baselines. Notably, most meta-learners generalise well to class IL even without specific training for this task.
This paper establishes a high-performing baseline for subsequent transformer-based aggregation techniques, thereby emphasising the practical significance of meta-learners in class IL.
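The idea of emulating class IL with proxy new classes can be illustrated with a minimal sketch. It is not the model described above: a nearest-prototype classifier stands in for the transformer-based aggregation function, raw feature vectors stand in for a trained encoder, and every name (`toy_data`, `split_classes`, `make_episode`, `class_il_accuracy`) is hypothetical. Some classes are held out as proxy new classes, and at evaluation time the classifier sees a few labelled support examples from both base and proxy classes, with no retraining.

```python
import random
from collections import defaultdict


def toy_data(n_classes, per_class, dim=2, seed=0):
    """Synthetic, well-separated clusters (an assumption for illustration only)."""
    rng = random.Random(seed)
    return {
        c: [[10.0 * c + rng.uniform(-1, 1) for _ in range(dim)]
            for _ in range(per_class)]
        for c in range(n_classes)
    }


def split_classes(labels, n_proxy, seed=0):
    """Hold out n_proxy classes as stand-ins for classes that arrive after training."""
    classes = sorted(set(labels))
    proxy = set(random.Random(seed).sample(classes, n_proxy))
    base = [c for c in classes if c not in proxy]
    return base, sorted(proxy)


def make_episode(data, classes, k_shot, n_query, rng):
    """Sample k_shot support and n_query query examples for each class."""
    support, query = [], []
    for c in classes:
        items = rng.sample(data[c], k_shot + n_query)
        support += [(x, c) for x in items[:k_shot]]
        query += [(x, c) for x in items[k_shot:]]
    return support, query


def prototypes(support):
    """Mean feature vector per class, computed from the support set only."""
    sums, counts = {}, defaultdict(int)
    for x, y in support:
        sums[y] = [a + b for a, b in zip(sums[y], x)] if y in sums else list(x)
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(protos, x):
    """Assign x to the class whose prototype is closest in squared Euclidean distance."""
    return min(protos, key=lambda y: sum((p - q) ** 2 for p, q in zip(protos[y], x)))


def class_il_accuracy(data, base, proxy, k_shot, n_query, seed=0):
    """Evaluate on base and proxy classes together, without any retraining."""
    rng = random.Random(seed)
    support, query = make_episode(data, base + proxy, k_shot, n_query, rng)
    protos = prototypes(support)
    correct = sum(predict(protos, x) == y for x, y in query)
    return correct / len(query)
```

In a realistic setting, an encoder would be meta-trained on episodes drawn from the base classes only, and the proxy classes would be used to measure how well it handles classes introduced afterwards.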

Meta-learning for real-world class incremental learning: a transformer-based approach
