An Adaptive Error-Correcting Output Codes Algorithm Based on Gene Expression Programming and Similarity Measurement Matrix

Shutong Xie, Zongbao He, Lifang Pan, Kunhong Liu, Shubin Su
DOI: https://doi.org/10.1016/j.patcog.2023.109957
IF: 8
2024-01-01
Pattern Recognition
Abstract: Multi-class classification is one of the most common tasks in machine learning. As a typical solution based on a partitioning strategy, Error-Correcting Output Codes (ECOC) transform a multi-class classification problem into multiple binary classification problems. The key to ECOC is constructing an effective codematrix that represents a set of class decomposition schemes. Consequently, the design of a fast and effective ECOC codematrix generation method is of great research significance and value for solving multi-class classification problems. In ECOC algorithms, the design of the codematrix is treated as a combinatorial problem over code columns, a setting in which evolutionary algorithms show a great advantage. Based on this consideration, Gene Expression Programming (GEP) is applied to search for a high-performance codematrix, because its expressive tree structure represents codematrices well for subsequent optimization operations. This paper proposes an adaptive ECOC algorithm based on GEP and a similarity measurement matrix, named GEP-ECOC. In our GEP, each individual represents a set of columns that form a random ECOC codematrix, which is optimized during the evolutionary process. The crossover and mutation operations are modified to include a legality check, ensuring that each generated codematrix satisfies the ECOC constraints. The GEP-based codematrix generation algorithm can quickly produce a codematrix with better performance, which ensures the efficiency of the algorithm to a certain extent. In addition, an adaptive algorithm based on a similarity measurement matrix is proposed to add new columns to the current codematrix, aiming to better handle hard classes.
GEP-ECOC is compared with other algorithms on various data sets, and the experimental results confirm that it balances efficiency and performance while achieving higher performance than the compared algorithms.
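To make the ECOC terms in the abstract concrete, the following is a minimal Python sketch of a codematrix, a legality check, and Hamming-distance decoding. It is not the GEP-ECOC implementation: the matrix values, the function names `is_legal` and `hamming_decode`, and the specific constraints checked (distinct rows, both labels present in every column, no duplicate or complementary columns) are assumptions based on commonly used ECOC conditions, since the abstract does not list the exact constraints. The GEP search and the similarity-matrix column extension are not shown.

```python
from typing import List, Sequence

# Hypothetical 4-class codematrix (illustrative values only).
# Rows are class codewords; each column defines one binary problem (+1 vs -1).
CODEMATRIX: List[List[int]] = [
    [+1, +1, +1, +1, +1, +1, +1],
    [+1, -1, -1, +1, -1, +1, -1],
    [-1, +1, -1, +1, -1, -1, +1],
    [-1, -1, +1, -1, -1, +1, +1],
]


def is_legal(matrix: Sequence[Sequence[int]]) -> bool:
    """Check assumed ECOC constraints on a codematrix."""
    rows = [tuple(r) for r in matrix]
    # Every class needs its own codeword, otherwise two classes are inseparable.
    if len(set(rows)) != len(rows):
        return False
    seen = set()
    for col in zip(*matrix):
        # A column must contain both labels to define a binary problem.
        if +1 not in col or -1 not in col:
            return False
        # A duplicate or complementary column trains an equivalent classifier.
        complement = tuple(-v for v in col)
        if col in seen or complement in seen:
            return False
        seen.add(col)
    return True


def hamming_decode(outputs: Sequence[int], matrix: Sequence[Sequence[int]]) -> int:
    """Return the index of the class whose codeword is closest to the outputs."""
    distances = [sum(o != c for o, c in zip(outputs, row)) for row in matrix]
    return distances.index(min(distances))


if __name__ == "__main__":
    print(is_legal(CODEMATRIX))
    # Codeword of class 2 with its last binary output flipped by mistake.
    noisy = [-1, +1, -1, +1, -1, -1, -1]
    print(hamming_decode(noisy, CODEMATRIX))
```

In this sketch every pair of codewords differs in four positions, so a single wrong binary classifier output can still be decoded to the right class. A legality check of this kind is what the modified crossover and mutation operators would need to run on each candidate codematrix.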