Page 119 - 49A Field Guide to Genetic Programming

P. 119

11.3 Bloat 105

It is well known, for example, that depth thresholds lead to the popu-
lation ﬁlling up with very bushy programs where most branches reach the
depth limit (being eﬀectively full trees). On the contrary, size limits produce
populations of stringy programs which tend to all approach the size limit.
See (Crane and McPhee, 2005; McPhee, Jarvis, and Crane, 2004) for more
on the impact of size and depth limits, and the diﬀerences between them.
The problem can be ﬁxed by not returning parents if the oﬀspring violates
a constraint. This can be realised with two diﬀerent strategies. Firstly, one
can just return the oversize oﬀspring, but give it a ﬁtness of 0, so that
selection will get rid of it at the next generation. Secondly, one can simply
declare the genetic operation failed, and try again. This can be done in two
alternative ways: a) the same parent or parents are used again, but new
mutation or crossover points are randomly chosen (which can be done up
to a certain number of times before giving up on those parents), or b) new
parents are selected and the genetic operation is attempted again.
If a limit is used, programs must not be so tightly constrained that they
cannot express any solution to the problem. As a rule of thumb, one should
try to estimate the size of the minimum possible solution (using the terminals
and functions given to GP) and add some percentage (e.g., 50-200%) as a
safety margin. In general, however, it may be hard to heuristically come up
with good limits, so some trial and error may be required. Alternatively,
one can use one of the many techniques that have been proposed to adjust
size limits during runs. These can be both at the level of individuals and the
population. See for example the work by Silva and Almeida (2003); Silva
and Costa (2004, 2005a,b); Silva, Silva, and Costa (2005).

Anti-bloat Genetic Operators
One can control bloat by using genetic operators which directly or indirectly
have an anti-bloat eﬀect.
Among the most recent bloat-control methods are size fair crossover
and size fair mutation (Crawford-Marks and Spector, 2002; Langdon, 2000).
These work by constraining the choices made during the execution of a
genetic operation so as to actively prevent growth. In size-fair crossover, for
example, the crossover point in the ﬁrst parent is selected randomly, as in
standard crossover. Then the size of the subtree to be excised is calculated.
This is used to constrain the choice of the second crossover point so as
to guarantee that the subtree chosen from the second parent will not be
“unfairly” big.
Older methods include several mutation operators that may help control
the average tree size in the population while still introducing new genetic
material. Kinnear (1993) proposes a mutation operator which prevents the
oﬀspring’s depth being more than 15% larger than its parent. Langdon
(1998) proposes two mutation operators in which the new random subtree is

114 115 116 117 118 119 120 121 122 123 124