Page 85 - 49A Field Guide to Genetic Programming
P. 85
8.2 Pure EDA GP 71
bility. Also, simple calculations show that 31.25% of individuals generated
by this distribution will be at Hamming distance 1 from either (0, 0, 0, 0, 0)
or (1, 1, 1, 1, 1). 1 So, both optimal regions are sampled reasonably often.
However, it is clear that the majority (62.5%) of samples will be allocated
to less promising regions, where the Hamming distance will be 2 or 3 from
both (0, 0, 0, 0, 0) and (1, 1, 1, 1, 1). This is a significant concern, which is
why recently EDAs have often been used in combination with local search
(e.g., see (Zhang, Sun, and Tsang, 2005)).
There have been several applications of probabilistic model-based evolu-
tion (EDA-style) in the areas of tree-based and linear GP. We review them
in the rest of this chapter.
8.2 Pure EDA GP
The first EDA-type GP system was effectively an extension of PBIL to trees
called probabilistic incremental program evolution (PIPE) (Salustowicz and
Schmidhuber, 1997; Sa lustowicz, Wiering, and Schmidhuber, 1998; Salus-
towicz and Schmidhuber, 1999). In PIPE, the population is replaced by a
hierarchy of probability tables organised into a tree (such as the one in Fig-
ure 8.1). Each table represents the probability that a particular primitive
will be chosen at that specific location in a newly generated program tree.
At each generation a population of programs is created based on the current
tree of probability tables. The generation of a program begins by choosing a
root node based on the probabilities in the root table, and then continuing
down the hierarchy of probability tables until all branches of the tree are
complete (i.e., a terminal has been chosen on each branch). The fitness of
these new programs is computed, and the probability hierarchy is updated
on the basis of these fitnesses, so as to make the generation of above-average
fitness programs more likely in the next generation.
A positive feature of PIPE is that the probability of choosing a particular
primitive can vary with its depth (and, more generally, position) in the tree.
This makes it possible, for example, for terminals to become increasingly
probable as a node’s depth increases. A limitation of PIPE, however, is that
2
the primitives forming a tree are chosen independently from each other, so it
is impossible for PIPE to capture dependencies among primitives. Another
limitation is that the maximum size of the generated trees is constrained
by the size of the tree of probability tables. Ondas, Pelikan, and Sastry
(2005) compared the performance of PIPE and standard tree-based GP on
1 The Hamming distance between two strings (whether binary or not) is the number
of positions where the two strings differ.
2 There is a weak form of dependency, in that there can be a primitive in a particular
position only if the primitive just above it is a function. The choice of this parent primitive
does not, however, influence the choice of the child primitive.