Page 117 - 49A Field Guide to Genetic Programming
P. 117

11.3 Bloat                                                    103


            Size Evolution Equation
            In (Poli, 2001b; Poli and McPhee, 2003b), a size evolution equation for ge-
            netic programming was developed, which provided an exact formalisation of
            the dynamics of average program size. The original equation was derived
            from the exact schema theory for GP, and expressed mean program size as
            a function of the size and selection probabilities of particular schemata rep-
            resenting program shapes. The equation has recently been simplified (Poli
            and McPhee, 2008b) giving:

                                             X
                                E[µ(t + 1)] =   ` × p(`, t),            (11.1)
                                              `
            where µ(t + 1) is the mean size of the programs in the population at gen-
            eration t + 1, E is the expectation operator, ` is a program size, and p(`, t)
            is the probability of selecting programs of size ` from the population in
            generation t.
               This equation can be rewritten in terms of the expected change in average
            program size as:

                                           X
                         E[µ(t + 1) − µ(t)] =  ` × (p(`, t) − Φ(`, t)),  (11.2)
                                            `
            where Φ(`, t) is the proportion of programs of size ` in the current genera-
            tion. Both equations apply to a GP system with selection and any form of
            symmetric subtree crossover. 1
               Note that Equations (11.1) and (11.2) do not directly explain bloat. They
            are, however, important because they constrain what can and cannot hap-
            pen size-wise in GP populations. Any explanation for bloat (including the
            theories summarised above) has to agree with Equations (11.1) and (11.2).
               In particular, Equation (11.1) predicts that, for symmetric subtree-
            swapping crossover operators, the mean program size evolves as if selection
            only was acting on the population. This means that if there is a change in
            mean size (bloat, for example) it must be the result of some form of positive
            or negative selective pressure on some or all of the length classes `. Equa-
            tion (11.2) shows that there can be bloat only if the selection probability
            p(`, t) is different from the proportion Φ(`, t) for at least some `. In par-
            ticular, for bloat to happen there will have to be some small `’s for which
            p(`, t) < Φ(`, t) and also some bigger `’s for which p(`, t) > Φ(`, t) (at least
            on average).




               1
               In a symmetric operator the probability of selecting particular crossover points in the
            parents does not depend on the order in which the parents are drawn from the population.
   112   113   114   115   116   117   118   119   120   121   122