Page 101 - 49A Field Guide to Genetic Programming
P. 101

10.2 Reducing Cost of Fitness with Caches                      87


            only must the answer be stored, but the interpreter needs to know that the
            subtree’s inputs are the same too. The common practices of GP come to our
            aid here. Usually every tree in the population is run on exactly the same
            inputs for each of the fitness cases. Thus, for a cache to work, the interpreter
            does not need to know a tree’s inputs in detail, it need only know which of
            the fixed set of test cases was used.
               A simple means of implementing this type of cache is to store a vector of
            values returned by each subtree for each of the test cases. Whenever a sub-
            tree is created (i.e., in the initial generation, by crossover or by mutations)
            the interpreter is run and the cache of values for its root node is set. Note
            this is recursive, so caches can also be calculated for subtrees within it at
            the same time. Now, when the interpreter is run and comes to a subtree’s
            root node, it will simply retrieve the value it calculated earlier, using the
            test case’s number as an index into the cache vector.
               If a subtree is created by mutation, then its cache of values will be
            initially empty and will have to be calculated. However, this costs no more
            than it would without caches.
               When code is inserted into an existing tree, be it by mutation or
            crossover, the chance that the new code behaves identically to the old code
            is normally very small. This means that the caches of every node between
            the new code and the root node may be invalid. The simplest solution is
            to re-evaluate them all. This may sound expensive, but the caches in all
            the other parts of the individual remain valid and can be used when the
            cache above them is re-evaluated. Thus, in effect, if the crossed over code is
            inserted at depth d, only d nodes need to be evaluated.
               The whole question of monitoring how effective individual caches are,
            what their hit-rates are, etc. has been little explored. In practice, impressive
            savings have been achieved by simple implementations, with little monitor-
            ing and rudimentary garbage collection. Recent analysis (Ciesielski and Li,
            2004; Dignum and Poli, 2007; Langdon and Poli, 2002; Poli et al., 2007)
            has shown that GP trees tend not to have symmetric shapes, and many
            leaves are very close to the root. This provides a theoretical explanation for
            why considerable computational saving can be made by using fitness caches.
            While it is possible to use hashing schemes to efficiently find common code,
            in practice assuming that common code only arises because it was inherited
            from the same location (e.g., by crossing over) is sufficient.
               As well as the original Directed acyclic graph (DAG) implementation
            (Handley, 1994) other work includes (Ciesielski and Li, 2004; Keijzer, 1996;
            McPhee, Hopper, and Reierson, 1998; Yangiya, 1995). While so far we have
            only considered programs where no side effects take place, there are cases
            where caching can be extended outside this domain. For example, Langdon
            (1998) used fitness caches in evolved trees with side effects by exploiting
            syntax rules about where in the code the side-effects could lie.
   96   97   98   99   100   101   102   103   104   105   106