Page 119 - 49A Field Guide to Genetic Programming
P. 119
11.3 Bloat 105
It is well known, for example, that depth thresholds lead to the popu-
lation filling up with very bushy programs where most branches reach the
depth limit (being effectively full trees). On the contrary, size limits produce
populations of stringy programs which tend to all approach the size limit.
See (Crane and McPhee, 2005; McPhee, Jarvis, and Crane, 2004) for more
on the impact of size and depth limits, and the differences between them.
The problem can be fixed by not returning parents if the offspring violates
a constraint. This can be realised with two different strategies. Firstly, one
can just return the oversize offspring, but give it a fitness of 0, so that
selection will get rid of it at the next generation. Secondly, one can simply
declare the genetic operation failed, and try again. This can be done in two
alternative ways: a) the same parent or parents are used again, but new
mutation or crossover points are randomly chosen (which can be done up
to a certain number of times before giving up on those parents), or b) new
parents are selected and the genetic operation is attempted again.
If a limit is used, programs must not be so tightly constrained that they
cannot express any solution to the problem. As a rule of thumb, one should
try to estimate the size of the minimum possible solution (using the terminals
and functions given to GP) and add some percentage (e.g., 50-200%) as a
safety margin. In general, however, it may be hard to heuristically come up
with good limits, so some trial and error may be required. Alternatively,
one can use one of the many techniques that have been proposed to adjust
size limits during runs. These can be both at the level of individuals and the
population. See for example the work by Silva and Almeida (2003); Silva
and Costa (2004, 2005a,b); Silva, Silva, and Costa (2005).
Anti-bloat Genetic Operators
One can control bloat by using genetic operators which directly or indirectly
have an anti-bloat effect.
Among the most recent bloat-control methods are size fair crossover
and size fair mutation (Crawford-Marks and Spector, 2002; Langdon, 2000).
These work by constraining the choices made during the execution of a
genetic operation so as to actively prevent growth. In size-fair crossover, for
example, the crossover point in the first parent is selected randomly, as in
standard crossover. Then the size of the subtree to be excised is calculated.
This is used to constrain the choice of the second crossover point so as
to guarantee that the subtree chosen from the second parent will not be
“unfairly” big.
Older methods include several mutation operators that may help control
the average tree size in the population while still introducing new genetic
material. Kinnear (1993) proposes a mutation operator which prevents the
offspring’s depth being more than 15% larger than its parent. Langdon
(1998) proposes two mutation operators in which the new random subtree is