Page 23 - 49A Field Guide to Genetic Programming
P. 23

Chapter 2




            Representation,


            Initialisation and


            Operators in Tree-based


            GP





            This chapter introduces the basic tools and terminology used in genetic
            programming. In particular, it looks at how trial solutions are represented in
            most GP systems (Section 2.1), how one might construct the initial random
            population (Section 2.2), and how selection (Section 2.3) as well as crossover
            and mutation (Section 2.4) are used to construct new programs.



            2.1    Representation

            In GP, programs are usually expressed as syntax trees rather than as lines of
            code. For example Figure 2.1 shows the tree representation of the program
            max(x+x,x+3*y). The variables and constants in the program (x, y and 3)
            are leaves of the tree. In GP they are called terminals, whilst the arithmetic
            operations (+, * and max) are internal nodes called functions. The sets of
            allowed functions and terminals together form the primitive set of a GP
            system.
               In more advanced forms of GP, programs can be composed of multiple
            components (e.g., subroutines). In this case the representation used in GP
            is a set of trees (one for each component) grouped together under a special
            root node that acts as glue, as illustrated in Figure 2.2. We will call these
            (sub)trees branches. The number and type of the branches in a program,

                                            9
   18   19   20   21   22   23   24   25   26   27   28