Page 85 - 49A Field Guide to Genetic Programming
P. 85

8.2 Pure EDA GP                                                71


            bility. Also, simple calculations show that 31.25% of individuals generated
            by this distribution will be at Hamming distance 1 from either (0, 0, 0, 0, 0)
            or (1, 1, 1, 1, 1). 1  So, both optimal regions are sampled reasonably often.
            However, it is clear that the majority (62.5%) of samples will be allocated
            to less promising regions, where the Hamming distance will be 2 or 3 from
            both (0, 0, 0, 0, 0) and (1, 1, 1, 1, 1). This is a significant concern, which is
            why recently EDAs have often been used in combination with local search
            (e.g., see (Zhang, Sun, and Tsang, 2005)).
               There have been several applications of probabilistic model-based evolu-
            tion (EDA-style) in the areas of tree-based and linear GP. We review them
            in the rest of this chapter.

            8.2    Pure EDA GP


            The first EDA-type GP system was effectively an extension of PBIL to trees
            called probabilistic incremental program evolution (PIPE) (Salustowicz and
            Schmidhuber, 1997; Sa lustowicz, Wiering, and Schmidhuber, 1998; Salus-
            towicz and Schmidhuber, 1999). In PIPE, the population is replaced by a
            hierarchy of probability tables organised into a tree (such as the one in Fig-
            ure 8.1). Each table represents the probability that a particular primitive
            will be chosen at that specific location in a newly generated program tree.
            At each generation a population of programs is created based on the current
            tree of probability tables. The generation of a program begins by choosing a
            root node based on the probabilities in the root table, and then continuing
            down the hierarchy of probability tables until all branches of the tree are
            complete (i.e., a terminal has been chosen on each branch). The fitness of
            these new programs is computed, and the probability hierarchy is updated
            on the basis of these fitnesses, so as to make the generation of above-average
            fitness programs more likely in the next generation.
               A positive feature of PIPE is that the probability of choosing a particular
            primitive can vary with its depth (and, more generally, position) in the tree.
            This makes it possible, for example, for terminals to become increasingly
            probable as a node’s depth increases. A limitation of PIPE, however, is that
                                                                        2
            the primitives forming a tree are chosen independently from each other, so it
            is impossible for PIPE to capture dependencies among primitives. Another
            limitation is that the maximum size of the generated trees is constrained
            by the size of the tree of probability tables. Ondas, Pelikan, and Sastry
            (2005) compared the performance of PIPE and standard tree-based GP on

               1 The Hamming distance between two strings (whether binary or not) is the number
            of positions where the two strings differ.
               2 There is a weak form of dependency, in that there can be a primitive in a particular
            position only if the primitive just above it is a function. The choice of this parent primitive
            does not, however, influence the choice of the child primitive.
   80   81   82   83   84   85   86   87   88   89   90