Preference-driven Pareto Front Exploitation for Bloat Control in Genetic Programming.

Jiayu Liang,Yuxin Liu,Yu Xue
DOI: https://doi.org/10.1016/j.asoc.2020.106254
IF: 8.7
2020-01-01
Applied Soft Computing
Abstract:As one of evolutionary algorithms (EAs), genetic programming (GP) has been applied in a wide range of areas, e.g. bioinformatics and robotics. Different from other EAs, GP can represent problems with variable length (e.g. trees), which makes it more flexible in evolving solutions, yet leads to a serious problem, bloat. It can cause evolving redundant parts and slowing down search. Multi-objective techniques are popularly used for reducing bloat in GP (termed as MOGP). Specifically, MOGP methods evolve trade-off solutions of all objectives, which constitute the so-called Pareto front. Then users select solutions on the front based on their preference for specific tasks. However, existing MOGP methods rarely consider users’ preference during evolution, which wastes computation power and time to search for useless solutions and cannot generate fine-grained interested regions on the Pareto front. Therefore, this paper investigates introducing users’ preference to guide multi-objective techniques to focus on the interested regions on Pareto front during evolution. Specifically, Pareto dominance is an important notion in multi-objective techniques for comparing two solutions. We design two preference-driven Pareto dominance mechanisms, scPd (static constraint Pareto dominance) and dcPd (dynamic constraint Pareto dominance), which are introduced in a base multi-objective technique and then are incorporated with GP respectively to form two new bloat control MOGP methods, i.e. scPd_MOGP and dcPd_MOGP. They are tested on benchmark symbolic regression tasks comparing with GP, two existing bloat control methods (i.e. a parsimony GP method (pGP) and a standard multi-objective GP method (sMOGP)), and four popularly-used symbolic regression methods. Results show that the proposed methods can reduce bloat in GP and outperform pGP in bloat control, and comparison with sMOGP shows that they can search front regions based on users’ preference where the solutions have better functionality, yet relatively larger sizes. In addition, compared with four popularly-used symbolic regression methods, scPd_MOGP is generally better; while dcPd_MOGP achieves varied results, yet it performs better or similar to the reference methods on the majority of the given test functions. Moreover, comparison between the two proposed methods suggests that the constraint in the Pareto dominance of scPd_MOGP is more relaxed than that of dcPd_MOGP.
What problem does this paper attempt to address?