Abstract: Automated machine learning streamlines the task of finding effective machine learning pipelines by automating model training, evaluation, and selection. Traditional evaluation strategies, like cross-validation (CV), generate one value that averages the accuracy of a pipeline's predictions. This single value, however, may not fully describe the generalizability of the pipeline. Here, we present Lexicase-based Validation (lexidate), a method that uses multiple, independent prediction values for selection. Lexidate splits training data into a learning set and a selection set. Pipelines are trained on the learning set and make predictions on the selection set. The predictions are graded for correctness and used by lexicase selection to identify parent pipelines. Compared to 10-fold CV, lexicase reduces the training time. We test the effectiveness of three lexidate configurations within the Tree-based Pipeline Optimization Tool 2 (TPOT2) package on six OpenML classification tasks. In one configuration, we detected no difference in the accuracy of the final model returned from TPOT2 on most tasks compared to 10-fold CV. All configurations studied here returned similar or less complex final pipelines compared to 10-fold CV.
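
The selection step described in the abstract can be sketched as follows. This is a minimal illustration of standard lexicase selection applied to graded predictions, not the TPOT2 implementation. The helper names `grade` and `lexicase_select`, the toy labels, and the three pipeline names are all hypothetical and chosen only for the example.

```python
import random


def grade(predictions, labels):
    """Grade each prediction: 0 if correct, 1 if wrong (lower is better)."""
    return [0 if p == y else 1 for p, y in zip(predictions, labels)]


def lexicase_select(error_vectors, rng):
    """Return the index of one parent chosen by lexicase selection.

    error_vectors[i][j] is the graded error of candidate i on
    selection-set sample j.
    """
    candidates = list(range(len(error_vectors)))
    cases = list(range(len(error_vectors[0])))
    rng.shuffle(cases)  # every selection event uses a fresh random case order
    for case in cases:
        best = min(error_vectors[i][case] for i in candidates)
        # Keep only the candidates that are best on this single case.
        candidates = [i for i in candidates if error_vectors[i][case] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)  # remaining ties are broken at random


# Toy selection set with four samples (assumed data, for illustration only).
labels = [1, 0, 1, 1]
predictions = {
    "pipeline_a": [1, 0, 1, 0],  # wrong only on the last sample
    "pipeline_b": [1, 1, 0, 1],  # the only pipeline correct on the last sample
    "pipeline_c": [0, 1, 0, 0],  # wrong on every sample
}
names = list(predictions)
errors = [grade(predictions[name], labels) for name in names]

rng = random.Random(0)
parents = [names[lexicase_select(errors, rng)] for _ in range(20)]
print(sorted(set(parents)))
```

Because each selection event filters on one sample at a time, `pipeline_b` can still be chosen as a parent even though its average accuracy is lower than that of `pipeline_a`: it is the only pipeline that is correct on the last sample. A single averaged score would hide that difference.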

What problem does this paper attempt to address?

This paper addresses the limitations of traditional model evaluation and selection methods, such as cross-validation (CV), in automated machine learning (AutoML). Although 10-fold CV is effective, it has several drawbacks:

1. **Limitations of a single performance metric**: Traditional methods usually produce one averaged value to represent a model's prediction accuracy, which may not fully describe the model's generalization ability.
2. **Low computational efficiency**: 10-fold CV trains and validates each model multiple times, which increases computation time and resource consumption.
3. **Overfitting risk**: Fixed data partitioning may lead to model overfitting, especially on small data sets.

To address these problems, the authors introduce a new method based on lexicase selection, **Lexicase-based Validation (Lexidate)**. Its main features are:

- **Multidimensional evaluation**: Selection uses multiple independent prediction values instead of a single average value.
- **Improved computational efficiency**: Computational cost falls because each model is trained fewer times.
- **More flexible selection mechanism**: Lexicase selection can exert selection pressure on the more difficult individual cases without sacrificing overall performance.

To verify the effectiveness of Lexidate, the authors compared it with 10-fold CV and tested three Lexidate configurations (90/10, 70/30, and 50/50 data splits) on six OpenML classification tasks. The results show that on some tasks Lexidate achieves accuracy similar to that of 10-fold CV, while returning final pipelines of similar or lower complexity and reducing training time.
In summary, this paper aims to propose a new model evaluation and selection method, Lexidate, to overcome the limitations of traditional methods in AutoML, especially in terms of computational efficiency and model generalization ability.
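
The data partitioning and the cost argument above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: it assumes the first number in each configuration is the learning-set share, it uses a plain random shuffle (the source does not say whether the split is stratified), and the names `learning_selection_split` and `fits_per_pipeline` are hypothetical.

```python
import random


def learning_selection_split(n_samples, learning_percent, rng):
    """Shuffle sample indices and split them into learning and selection sets."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = n_samples * learning_percent // 100  # integer math avoids float rounding
    return indices[:cut], indices[cut:]


def fits_per_pipeline(strategy, k=10):
    """Count how often one pipeline is trained during a single evaluation."""
    # k-fold CV refits the pipeline once per fold; a single learning/selection
    # split needs only one fit.
    return k if strategy == "kfold" else 1


rng = random.Random(42)
splits = {}
for learning_percent in (90, 70, 50):
    learning, selection = learning_selection_split(100, learning_percent, rng)
    splits[learning_percent] = (learning, selection)
    print(learning_percent, len(learning), len(selection))

print(fits_per_pipeline("kfold"), fits_per_pipeline("split"))
```

Under these assumptions, a smaller learning share leaves more selection-set samples for lexicase selection to filter on, at the price of less data for training each pipeline.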

Lexidate: Model Evaluation and Selection with Lexicase

Lexicase Selection at Scale

Scaling tree-based automated machine learning to biomedical big data with a dataset selector

AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline Composition and Optimisation

Model selection for metabolomics: predicting diagnosis of coronary artery disease using automated machine learning

Optimizing Neural Networks with Gradient Lexicase Selection

Identifying and Harnessing the Building Blocks of Machine Learning Pipelines for Sensible Initialization of a Data Science Automation Tool

Tree‐Based Pipeline Optimization‐Based Automated‐Machine Learning Model for Performance Prediction of Materials and Structures: Case Studies and UI Design

Scaling tree-based automated machine learning to biomedical big data with a feature set selector

MBL-CPDP: A Multi-objective Bilevel Method for Cross-Project Defect Prediction via Automated Machine Learning

On Speeding Up Language Model Evaluation

Data Efficient Evaluation of Large Language Models and Text-to-Image Models via Adaptive Sampling

TALEC: Teach Your LLM to Evaluate in Specific Domain with In-house Criteria by Criteria Division and Zero-shot Plus Few-shot

AVATAR -- Machine Learning Pipeline Evaluation Using Surrogate Model

DALex: Lexicase-like Selection via Diverse Aggregation

An Exploration of Exploration: Measuring the ability of lexicase selection to find obscure pathways to optimality

Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark

Green Runner: A tool for efficient deep learning component selection

Lexicase Selection Parameter Analysis: Varying Population Size and Test Case Redundancy with Diagnostic Metrics

What is the best model? Application-driven Evaluation for Large Language Models

cedar: Optimized and Unified Machine Learning Input Data Pipelines