Page 117 - 49A Field Guide to Genetic Programming
P. 117
11.3 Bloat 103
Size Evolution Equation
In (Poli, 2001b; Poli and McPhee, 2003b), a size evolution equation for ge-
netic programming was developed, which provided an exact formalisation of
the dynamics of average program size. The original equation was derived
from the exact schema theory for GP, and expressed mean program size as
a function of the size and selection probabilities of particular schemata rep-
resenting program shapes. The equation has recently been simplified (Poli
and McPhee, 2008b) giving:
X
E[µ(t + 1)] = ` × p(`, t), (11.1)
`
where µ(t + 1) is the mean size of the programs in the population at gen-
eration t + 1, E is the expectation operator, ` is a program size, and p(`, t)
is the probability of selecting programs of size ` from the population in
generation t.
This equation can be rewritten in terms of the expected change in average
program size as:
X
E[µ(t + 1) − µ(t)] = ` × (p(`, t) − Φ(`, t)), (11.2)
`
where Φ(`, t) is the proportion of programs of size ` in the current genera-
tion. Both equations apply to a GP system with selection and any form of
symmetric subtree crossover. 1
Note that Equations (11.1) and (11.2) do not directly explain bloat. They
are, however, important because they constrain what can and cannot hap-
pen size-wise in GP populations. Any explanation for bloat (including the
theories summarised above) has to agree with Equations (11.1) and (11.2).
In particular, Equation (11.1) predicts that, for symmetric subtree-
swapping crossover operators, the mean program size evolves as if selection
only was acting on the population. This means that if there is a change in
mean size (bloat, for example) it must be the result of some form of positive
or negative selective pressure on some or all of the length classes `. Equa-
tion (11.2) shows that there can be bloat only if the selection probability
p(`, t) is different from the proportion Φ(`, t) for at least some `. In par-
ticular, for bloat to happen there will have to be some small `’s for which
p(`, t) < Φ(`, t) and also some bigger `’s for which p(`, t) > Φ(`, t) (at least
on average).
1
In a symmetric operator the probability of selecting particular crossover points in the
parents does not depend on the order in which the parents are drawn from the population.