Page 40 - 49A Field Guide to Genetic Programming

P. 40

26 3 Getting Ready to Run Genetic Programming

In some problems we are interested in the output produced by a program,
namely the value returned when we evaluate the tree starting at the root
node. In other problems we are interested in the actions performed by a
program composed of functions with side eﬀects. In either case the ﬁtness
of a program typically depends on the results produced by its execution on
many diﬀerent inputs or under a variety of diﬀerent conditions. For example
the program might be tested on all possible combinations of inputs x1, x2,
..., xN. Alternatively, a robot control program might be tested with the
robot in a number of starting locations. These diﬀerent test cases typically
contribute to the ﬁtness value of a program incrementally, and for this reason
are called ﬁtness cases.
Another common feature of GP ﬁtness measures is that, for many prac-
tical problems, they are multi-objective, i.e., they combine two or more dif-
ferent elements that are often in competition with one another. The area of
multi-objective optimisation is a complex and active area of research in GP
and machine learning in general. See Chapter 9 and also (Deb, 2001).

3.4 Step 4: GP Parameters

The fourth preparatory step speciﬁes the control parameters for the run.
The most important control parameter is the population size. Other control
parameters include the probabilities of performing the genetic operations, the
maximum size for programs and other details of the run.
It is impossible to make general recommendations for setting optimal
parameter values, as these depend too much on the details of the application.
However, genetic programming is in practice robust, and it is likely that
many diﬀerent parameter values will work. As a consequence, one need not
typically spend a long time tuning GP for it to work adequately.
It is common to create the initial population randomly using ramped
half-and-half (Section 2.2) with a depth range of 2–6. The initial tree sizes
will depend upon the number of the functions, the number of terminals
and the arities of the functions. However, evolution will quickly move the
population away from its initial distribution.
Traditionally, 90% of children are created by subtree crossover. How-
ever, the use of a 50-50 mixture of crossover and a variety of mutations (cf.
Chapter 5) also appears to work well.
In many cases, the main limitation on the population size is the time
taken to evaluate the ﬁtnesses, not the space required to store the individ-
uals. As a rule one prefers to have the largest population size that your
system can handle gracefully; normally, the population size should be at
4
least 500, and people often use much larger populations. Often, to a ﬁrst

4 There are, however, GP systems that frequently use much smaller populations. These

35 36 37 38 39 40 41 42 43 44 45