Page 69 - 49A Field Guide to Genetic Programming
P. 69
6.2 Constraining Structures 55
This is because programs need to be executed in order to evaluate their
fitness, and this flat program representation often requires further trans-
formations before execution. It is, therefore, common to directly convert a
derivation tree into a standard GP tree.
Grammar-based GP approaches can be extended by incorporating con-
cepts from computational linguistics. For example, McKay and colleagues
used tree adjoining grammars (TAGs) (Joshi and Schabes, 1997) to de-
sign new genetic representations and operators that respect grammatical
constraints while allowing new types of structural modifications (Hoai and
McKay, 2004; Hoai et al., 2003; Hoai, McKay, Essam, and Hao, 2005).
Another major grammar-based approach is grammatical evolution (GE)
(O’Neill and Ryan, 2003; Ryan, Collins, and O’Neill, 1998). GE does not
use trees, instead it represents individuals as variable-length sequences of
integers (cf. Equation 6.4) which are interpreted in the context of a user
supplied grammar.
For each rule in the grammar, the set of alternatives on the right hand
side are numbered from 0 upwards. In the example grammar in Equa-
tion (6.3) above, the first rule only has one option on the right hand side; so
this would be numbered 0. The second rule has two options, which would
be numbered 0 and 1. The third rule has four options which would be num-
bered 0 to 3. Finally the fourth rule has three options numbered 0 to 2. To
create a program from a GE individual one uses the values in the individual
to “choose” which alternative to take in the production rules. For example,
suppose a GE individual is represented by the sequence
39, 7, 2, 83, 66, 92, 57, 80, 47, 94 (6.4)
then we start with 39 and the first syntax rule, tree. However tree has no
alternatives, so we move to 7 and rule E. Now E has two alternatives and
7 is used (via modulus) to chose between them. More of the translation
process is given in Figure 6.3.
In this example we did not need to use all the numbers in the sequence
to generate a complete program. Indeed the last integer, 94, was not used.
In general, “extra” genetic material is simply ignored. More problematic is
when a sequence is “too short” in the sense that the end of the sequence
is reached before the translation process is complete. There are a variety
of options in this case, including failure (assigning this individual the worst
possible fitness) and wrapping (continuing the translation process, moving
back to the front of the numeric sequence). Grammatical evolution has been
very successful and is widely used.
6.2.4 Constraints and Bias
One of the common arguments in favour of constraint systems like types
and grammars is that they limit the search space by restricting the kind