Page 54 - 49A Field Guide to Genetic Programming
P. 54
40 5 Alternative Initialisations and Operators in Tree-based GP
5.1.1 Uniform Initialisation
The shape of the initial trees can be lost within a few generations (more on
this below). However, a good start given by the initial population can still be
crucial to the success of a GP run. In general, there are an infinite number of
possible computer programs. This means that it is impossible to search them
uniformly. Therefore, any method used to create the initial population will
have a bias. For example, ramped half-and-half tends to create bushy trees.
Such trees have a higher proportion of solutions to symmetric problems,
such as parity. Conversely, the smallest solution to the Sante Fe ant trail-
following problem is more randomly shaped (Langdon and Poli, 1998a). This
is partly why ramped half-and-half is very poor at finding programs which
can navigate the Sante Fe trail. Another reason is that many of the programs
generated by ramped half-and-half (with standard parameters) are simply
too small. Chellapilla (1997a) claims good results when the size of the initial
trees was more tightly controlled.
Iba (1996a) and Bohm and Geyer-Schulz (1996) report methods to pre-
cisely sample trees uniformly based on Alonso’s bijective algorithm (Alonso
and Schott, 1995). Although this algorithm has been criticised (Luke,
2000) for being computationally expensive, it can be readily used in prac-
tice. Langdon (2000) introduced the ramped uniform initialisation which
extends Alonso’s bijective algorithm by allowing the user to specify a range
of initial tree sizes. It then generates equal numbers of random trees
for each length in the chosen range. (C++ code can be obtained from
ftp://cs.ucl.ac.uk/genetic/gp-code/rand tree.cc.)
With these more “uniform” initialisations, most trees are asymmetric
with some leaves very close to the root of the tree. This is quite different
from the trees generated by ramped half-and-half which are on average some
distance from the root. Uniform sampling may be better in problems where
the desired solutions are asymmetric with some leaves being much more
important than others. For example, in data mining it is common to look
for solutions with a few dominant variables (which may be close to the root
node) whilst other variables are of little or no interest and may be some
distance from the root (or indeed not present in the tree). On the other
hand, problems like multiplexer or parity require all the inputs to be used
and are of similar importance. Bushier trees may be better at solving such
problems.
5.1.2 Initialisation may Affect Bloat
Crossover has a strong preference for creating a very non-uniform distri-
butions of tree sizes (Poli, Langdon, and Dignum, 2007). Crossover gener-
ates very short programs much more often than longer ones. Selection can
only partially combat this tendency. Typically, crossover will totally rear-