Abstract: The majority of theoretical analyses of evolutionary algorithms in the discrete domain focus on binary optimization algorithms, even though black-box optimization on the categorical domain has many practical applications. In this paper, we consider a probabilistic model-based algorithm that uses the family of categorical distributions as its underlying distribution and sets the sample size to two. We term this specific algorithm the categorical compact genetic algorithm (ccGA). The ccGA can be considered an extension of the compact genetic algorithm (cGA), which is an efficient binary optimization algorithm. We theoretically analyze the dependency of the runtime on the number of possible categories $K$, the number of dimensions $D$, and the learning rate $\eta$. We investigate the tail bound of the runtime on two typical linear functions on the categorical domain: categorical OneMax (COM) and KVal. We derive that the runtimes on COM and KVal are $O(\sqrt{D} \ln (DK) / \eta)$ and $\Theta(D \ln K/ \eta)$ with high probability, respectively. Our analysis is a generalization of that of the cGA on the binary domain.
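To make the algorithm described in the abstract concrete, the following is a minimal sketch of a ccGA-style loop on categorical OneMax. Several details are assumptions, not taken from the paper: category 0 is treated as the optimal category in every dimension, the update moves probability mass $\eta$ from the loser's category to the winner's category in each dimension where the two samples differ (the natural one-hot analogue of the cGA update), the step is truncated so that each row stays on the probability simplex (the paper may handle boundaries differently, for example with margins), and the runtime is counted as the first iteration in which one of the two samples attains the optimal value. The names `com`, `sample`, `ccga`, and `f_opt` are hypothetical.

```python
import random


def com(x):
    """Categorical OneMax: count coordinates equal to category 0 (assumed optimum)."""
    return sum(1 for v in x if v == 0)


def sample(theta, rng):
    """Draw one point; coordinate d takes category k with probability theta[d][k]."""
    return [rng.choices(range(len(row)), weights=row)[0] for row in theta]


def ccga(f, D, K, eta, f_opt, max_iters=100_000, seed=0):
    """Sketch of a ccGA-style loop (assumed update rule, see lead-in).

    Returns (t, theta): t is the first iteration at which a sample reaches
    f_opt, or None if max_iters is exhausted.
    """
    rng = random.Random(seed)
    # Start from the uniform categorical distribution in every dimension.
    theta = [[1.0 / K] * K for _ in range(D)]
    for t in range(1, max_iters + 1):
        a, b = sample(theta, rng), sample(theta, rng)  # sample size is two
        fa, fb = f(a), f(b)
        if max(fa, fb) == f_opt:
            return t, theta
        if fa == fb:
            continue  # no winner, so no update in this sketch
        winner, loser = (a, b) if fa > fb else (b, a)
        for d in range(D):
            w, l = winner[d], loser[d]
            if w != l:
                # Truncate the step so probabilities stay in [0, 1] (assumption).
                step = min(eta, theta[d][l], 1.0 - theta[d][w])
                theta[d][w] += step
                theta[d][l] -= step
    return None, theta


if __name__ == "__main__":
    t, theta = ccga(com, D=5, K=3, eta=0.05, f_opt=5, seed=1)
    print("first hitting iteration:", t)
```

Because this sketch has no lower bound on the probabilities, a category whose probability reaches zero is never sampled again, so the loop is capped by `max_iters` rather than assumed to terminate.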

What problem does this paper attempt to address?

This paper addresses black-box optimization in the discrete domain, specifically the categorical domain. The authors study the performance of probabilistic model-based evolutionary algorithms, such as the compact genetic algorithm (cGA), when the variables are categorical. Most existing theoretical analyses focus on binary optimization algorithms, even though black-box optimization on the categorical domain has many practical applications. The paper therefore considers an extension of the cGA, the categorical compact genetic algorithm (ccGA), and analyzes it theoretically.

#### Main research contents:

1. **Algorithm introduction**:
   - The ccGA is an extension of the cGA for optimization problems over categorical variables.
   - The ccGA uses the categorical distribution as its underlying distribution and sets the sample size to 2.
2. **Runtime analysis**:
   - The authors analyze the runtime of the ccGA on two typical linear functions: categorical OneMax (COM) and KVal.
   - For COM and KVal, the runtimes are derived as $O\left(\sqrt{D} \ln(DK) / \eta\right)$ and $\Theta(D \ln K / \eta)$ respectively, and these results hold with high probability.
3. **Influence of the learning rate**:
   - The paper studies how the setting of the learning rate $\eta$ affects the performance of the ccGA, in particular how to select $\eta$ to achieve an efficient search.
4. **Application of the drift theorem**:
   - Improved drift theorems are proposed to estimate the tail bounds of the first-hitting time more precisely. These theorems account for conditional drift and skipping processes, which makes the analysis more rigorous.
5. **Experimental verification**:
   - The results of the theoretical analysis are checked by numerical simulation.

#### Formula summary:

- Runtime on categorical OneMax (COM):
  \[ O\left(\sqrt{D} \ln(DK) / \eta\right) \]
- Runtime on KVal:
  \[ \Theta(D \ln K / \eta) \]

where:

- $D$ is the number of dimensions,
- $K$ is the number of categories per dimension,
- $\eta$ is the learning rate.

### Conclusion

Through its analysis of the ccGA, this paper helps fill a gap in the theory of optimization algorithms on the categorical domain: its runtime bounds make explicit how $K$, $D$, and $\eta$ affect the runtime, which offers guidance for practical applications.

Tail Bounds on the Runtime of Categorical Compact Genetic Algorithm
