Page 120 - 49A Field Guide to Genetic Programming

P. 120

106 11 GP Theory and its Applications

on average the same size as the code it replaces. In Hoist mutation (Kinnear,
1994a) the new subtree is selected from the subtree being removed from the
parent, guaranteeing that the new program will be smaller than its parent.
Shrink mutation (Angeline, 1996) is a special case of subtree mutation where
the randomly chosen subtree is replaced by a randomly chosen terminal.
McPhee and Poli (2002) provides theoretical analysis and empirical evidence
that combinations of subtree crossover and subtree mutation operators can
control bloat in linear GP systems.
Other methods which control bloat by exploiting the bias of the operators
were discussed in Section 9.4.

Anti-Bloat Selection

As clariﬁed by the size evolution equation discussed in the previous section,
in systems with symmetric operators, bloat can only happen if there are
some longer-than-average programs that are ﬁtter than average or some
shorter-than-average programs that are less ﬁt than average, or both. So,
it stands to reason that in order to control bloat one needs to somehow
modulate the selection probabilities of programs based on their size.
As we have discussed in Section 9.2.1, recent methods also include the
use of multi-objective optimisation to control bloat. This typically involves
the use of a modiﬁed selection based on the Pareto criterion.
A recent technique, the Tarpeian method (Poli, 2003), controls bloat
by acting directly on the selection probabilities in Equation (11.2). This is
done by setting the ﬁtness of randomly chosen longer-than-average programs
to 0. This prevents them being parents. By changing how frequently this
is done the anti-bloat intensity of Tarpeian control can be modulated. An
advantage of the method is that the programs whose ﬁtness is zeroed are
never executed, thereby speeding up runs.
The well-known parsimony pressure method (Koza, 1992; Zhang and
M¨uhlenbein, 1993, 1995; Zhang et al., 1997) changes the selection probabili-
ties by subtracting a value based on the size of each program from its ﬁtness.
Bigger programs have more subtracted and, so, have lower ﬁtness and tend
to have fewer children. That is, the new ﬁtness function is f(x) − c × `(x),
where `(x) is the size of program x, f(x) is its original ﬁtness and c is a con-
stant known as the parsimony coeﬃcient. 2 Zhang and M¨uhlenbein (1995)
showed some beneﬁts of adaptively adjusting the coeﬃcient c at each gen-
eration but most implementations actually keep the parsimony coeﬃcient
constant.

2
While the new ﬁtness is used to guide evolution, one still needs to use the original
ﬁtness function to recognise solutions and stop runs.

115 116 117 118 119 120 121 122 123 124 125