RUMC: A Rule-based Classifier Inspired by Evolutionary Methods

Melvin Mokhtari
2024-12-11
Abstract: As the field of data analysis grows rapidly due to the large amounts of data being generated, effective data classification has become increasingly important. This paper introduces the RUle Mutation Classifier (RUMC), which represents a significant improvement over the Rule Aggregation ClassifiER (RACER). RUMC uses innovative rule mutation techniques based on evolutionary methods to improve classification accuracy. In tests with forty datasets from OpenML and the UCI Machine Learning Repository, RUMC consistently outperformed twenty other well-known classifiers, demonstrating its ability to uncover valuable insights from complex data.
Machine Learning, Neural and Evolutionary Computing
### What problems does this paper attempt to solve?

This paper targets the limitations of existing classification algorithms on high-dimensional, low-sample-size data. Its starting point is the Rule Aggregation ClassifiER (RACER), whose initial rules are too specific and therefore generalize poorly to new data. Specifically:

1. **Rules are too specific**: The initial rules generated by RACER are usually too specific, so they perform poorly on new data, especially for data sets with small sample sizes.
2. **Difficulties in processing high-dimensional data**: When the number of features far exceeds the number of samples, many algorithms (including RACER) struggle to estimate parameters accurately, and performance degrades.
3. **Improving classification accuracy**: Existing classification algorithms often fall short of the desired accuracy on complex data, especially across diverse data environments.

To address these problems, the author proposes the RUle Mutation Classifier (RUMC), a rule-based classifier inspired by evolutionary methods. RUMC iteratively refines the initial rule set through rule mutation, which improves classification accuracy and lets it outperform other well-known classifiers on a variety of data sets.

### Main improvement points of RUMC

- **Rule mutation**: RUMC applies the mutation mechanism of evolutionary algorithms to the initial rules in order to produce rules that generalize better.
- **Wide applicability**: RUMC performs well on high-dimensional, low-sample-size data and also maintains high classification accuracy on large-scale data sets.
- **Efficient rule combination**: By applying a logical "OR" operation to rules, RUMC creates more widely applicable rules, which improves classification performance.
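
The rule combination and rule mutation ideas above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that rules and instances are bit masks with one bit per (feature, value) pair, that a rule covers an instance when every bit set in the instance is also set in the rule, and that a mutation flips one random bit and is kept only if training accuracy does not drop. The function names (`covers`, `combine`, `mutate`, `accuracy`, `refine`) and the toy data are hypothetical.

```python
import random

# Assumed encoding (not taken from the paper): two features with two values
# each, one bit per (feature, value) pair.
#   bit 0: A=a0   bit 1: A=a1   bit 2: B=b0   bit 3: B=b1


def covers(rule: int, instance: int) -> bool:
    """True when every bit set in the instance is also set in the rule."""
    return rule & instance == instance


def combine(rule_a: int, rule_b: int) -> int:
    """Logical OR of two rules: the result allows every value either rule allows."""
    return rule_a | rule_b


def mutate(rule: int, n_bits: int, rng: random.Random) -> int:
    """Hypothetical mutation operator: flip one randomly chosen bit."""
    return rule ^ (1 << rng.randrange(n_bits))


def accuracy(rule: int, label: str, data: list) -> float:
    """Share of (instance, class) pairs where coverage agrees with class == label."""
    hits = sum(covers(rule, x) == (y == label) for x, y in data)
    return hits / len(data)


def refine(rule: int, label: str, data: list, n_bits: int,
           steps: int = 50, seed: int = 0) -> int:
    """Keep a mutated rule only if its training accuracy does not decrease."""
    rng = random.Random(seed)
    best, best_acc = rule, accuracy(rule, label, data)
    for _ in range(steps):
        candidate = mutate(best, n_bits, rng)
        candidate_acc = accuracy(candidate, label, data)
        if candidate_acc >= best_acc:
            best, best_acc = candidate, candidate_acc
    return best


# Two very specific rules, each matching exactly one training instance.
rule_a = 0b0101            # A=a0 and B=b0
rule_b = 0b0110            # A=a1 and B=b0
general = combine(rule_a, rule_b)   # 0b0111: any A, B=b0

print(covers(general, 0b0101))  # True
print(covers(general, 0b0110))  # True
print(covers(general, 0b1001))  # False (B=b1 is not allowed)

toy_data = [(0b0101, "pos"), (0b0110, "pos"), (0b1001, "neg"), (0b1010, "neg")]
refined = refine(rule_a, "pos", toy_data, n_bits=4)
```

Accepting a mutation only when accuracy does not drop is one simple selection rule among many. The selection criterion RUMC actually uses is not described in this summary, so treat that part as a placeholder.
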
Through these improvements, RUMC consistently outperformed 20 other classifiers in tests on 40 public data sets, demonstrating its potential in the field of data classification.

### Summary

The core problem of this paper is improving the classification accuracy of rule-based classifiers in complex data environments, particularly on high-dimensional, low-sample-size data. RUMC addresses the limitations of existing algorithms by introducing rule mutation, an idea borrowed from evolutionary algorithms, and thereby achieves higher classification accuracy and better generalization.