Page 151 - 49A Field Guide to Genetic Programming
Population variety If the variety (the number of distinct individuals in
the population) falls below 90% of the population size, this may
indicate a problem. However, high variety does not guarantee that all
is well. GP populations often contain introns (Section 11.3), so
programs that are not identical may still behave identically (Gustafson,
2004; McPhee and Hopper, 1999). Such individuals are syntactically
distinct and so count towards variety even though they add no
behavioural diversity. Measuring phenotypic variation (i.e., diversity
of behaviour) may therefore also be useful (McPhee, Ohs, and Hutchison,
2008).
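The two measures discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: programs are represented as nested tuples such as ("add", "x", 0), the tiny evaluate interpreter and the function names variety and behavioural_variety are hypothetical, and behaviour is summarised simply as the tuple of outputs on a fixed set of test inputs.

```python
from typing import Callable, Hashable, Sequence


def variety(population: Sequence[Hashable]) -> float:
    """Fraction of the population made up of syntactically distinct individuals."""
    if not population:
        raise ValueError("population must not be empty")
    return len(set(population)) / len(population)


def behavioural_variety(population: Sequence, run: Callable, test_inputs: Sequence) -> float:
    """Fraction of distinct behaviours, where a behaviour is the tuple of
    outputs a program produces on test_inputs (an assumed, simple signature)."""
    if not population:
        raise ValueError("population must not be empty")
    signatures = {tuple(run(p, x) for x in test_inputs) for p in population}
    return len(signatures) / len(population)


def evaluate(program, x):
    """Hypothetical interpreter: a program is "x", a number, or (op, left, right)."""
    if program == "x":
        return x
    if isinstance(program, (int, float)):
        return program
    op, left, right = program
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "add" else a * b


# Three of these four programs compute f(x) = x; two of them carry introns.
population = ["x", ("add", "x", 0), ("mul", "x", 1), ("mul", "x", "x")]

print(variety(population))                                     # 1.0
print(behavioural_variety(population, evaluate, [-1, 0, 1, 2]))  # 0.5
```

Here syntactic variety is at its maximum while only half of the population behaves distinctly, which is the situation the text warns about.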
Insufficient diversity may cause significant problems. Panmictic steady-
state populations with tournament selection, reproduction and crossover, for
example, are prone to premature convergence. 6 If you find this to be an
issue, take measures to encourage population diversity, such as:
• Not using the reproduction operator.
• Adding one or more mutation operators.
• Using a weaker selection mechanism, e.g., using smaller tournament
sizes.
• Using uniform random selection (instead of the standard negative
tournaments) to decide which individuals to remove from the population. 7
• Using a generational population model instead of a steady-state model.
• Splitting large populations into semi-isolated demes (Section 10.5). 8
• Using fitness sharing to encourage the formation of many fitness niches.
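As a rough illustration of two of the measures listed above (a small tournament size and uniform random removal), here is a minimal sketch of one steady-state step. The function names, the make_child callback, and the convention that higher fitness is better are all assumptions made for this example, not part of any particular GP system. The removal step spares the current best individual, along the lines suggested in footnote 7.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def tournament_select(population: List[T], fitness: Callable[[T], float],
                      size: int, rng: random.Random) -> T:
    """Return the fittest of `size` individuals drawn uniformly with replacement.
    Smaller sizes give weaker selection pressure; size 1 is purely random."""
    contestants = [rng.choice(population) for _ in range(size)]
    return max(contestants, key=fitness)


def uniform_removal_index(population: List[T], fitness: Callable[[T], float],
                          rng: random.Random, protect_best: bool = True) -> int:
    """Choose the slot to overwrite uniformly at random (no negative tournament),
    optionally excluding the current best so that elitism is preserved."""
    candidates = list(range(len(population)))
    if protect_best and len(candidates) > 1:
        best = max(candidates, key=lambda i: fitness(population[i]))
        candidates.remove(best)
    return rng.choice(candidates)


def steady_state_step(population: List[T], fitness: Callable[[T], float],
                      make_child: Callable[[T, T, random.Random], T],
                      rng: random.Random, tournament_size: int = 2) -> None:
    """Select two parents, create one child, and overwrite a random individual."""
    mum = tournament_select(population, fitness, tournament_size, rng)
    dad = tournament_select(population, fitness, tournament_size, rng)
    child = make_child(mum, dad, rng)
    population[uniform_removal_index(population, fitness, rng)] = child
```

A caller would supply its own crossover or mutation through make_child, for example steady_state_step(pop, fitness, my_crossover, random.Random(0)), where my_crossover is whatever variation operator the system uses.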
13.8 Embrace Approximation
There is a widespread belief that computer programs are fragile and that
any change to any bit in them will cause them to stop working. This is
fostered by the common knowledge that a small typing mistake by a human
programmer can sometimes introduce a troublesome bug into a program.
6 In a panmictic population no mating restrictions are imposed as to which individual
mates with which.
7 Doing this means that the selection scheme is no longer elitist, and it may be
worthwhile to protect the best individual(s) to preserve the elitism.
8 What is meant by a “large population” has changed over time. In the early days
of GP, populations of 1,000 or more could be considered large. However, CPU speeds
and computer memory have increased exponentially over time. So, at the time of writing
it is not unusual to see populations of hundreds of thousands or millions of individuals
being used in the solution of hard problems. Research indicates that there are benefits in
splitting populations into demes even for much smaller populations. See Section 10.5.