            to reach a minimum fitness value in each stage in order for them to be
            allowed to progress to the next stage and acquire further fitness. Often
            different stages represent different requirements and constraints imposed on
            solutions.
               Recently, a sophisticated technique called backward chaining GP has
            been proposed (Poli, 2005; Poli and Langdon, 2005a,b, 2006a). In GP and
            other evolutionary algorithms which use tournament selection with small
            tournament sizes, backward chaining can radically reduce the number of
            fitness evaluations. Tournament selection randomly draws programs from
            the population to construct tournaments, the winners of which are then
            selected. Although this process is repeated many times in each generation,
            when the tournaments are small there is a significant probability that an
            individual in the current generation is never chosen to become a member
            of any tournament. Backward chaining GP exploits this by reordering the
            way operations are performed: it not only avoids fitness calculations for
            individuals that are never included in a tournament, but can also achieve
            higher fitness sooner.
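
               To get a feel for the size of this effect, suppose each generation runs
            N tournaments (one per new individual), each drawing t programs uniformly
            at random, with replacement, from a population of size N. A given individual
            then escapes every draw with probability (1 - 1/N)^(tN) ≈ e^(-t), so with
            binary tournaments (t = 2) roughly 13.5% of the population is never sampled
            at all, and its fitness need never be computed. The short simulation below
            is a sketch of this calculation (the function name, population size and
            trial count are merely illustrative):

               import math, random

               def fraction_never_sampled(pop_size, tournament_size, trials=100):
                   # Estimate the fraction of individuals never drawn into any
                   # tournament in one generation, assuming pop_size tournaments
                   # that each draw tournament_size programs with replacement.
                   unused = 0.0
                   for _ in range(trials):
                       drawn = set()
                       for _ in range(pop_size * tournament_size):
                           drawn.add(random.randrange(pop_size))
                       unused += (pop_size - len(drawn)) / pop_size
                   return unused / trials

               for t in (2, 3, 4):
                   est = fraction_never_sampled(1000, t)
                   print(f"t={t}: simulated {est:.3f}, predicted {math.exp(-t):.3f}")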


            10.2     Reducing Cost of Fitness with Caches

            In computer hardware it is common to use data caches which automatically
            hold copies of data locally in order to avoid the delays associated with
            fetching them from disk or over a network every time they are needed. This
            can work well when a small amount of data is needed many times over a
            short interval.
               Caches can also be used to store results of calculations, thereby
            avoiding the re-calculation of data (Handley, 1994). GP populations have
            enormous amounts of common code (Langdon, 1998; Langdon and Banzhaf, 2005;
            Langdon and Poli, 2008). This is, after all, how genetic search works: it
            promotes the genetic material of fit individuals. So, typically in each
            generation we see many copies of successful code.
               In many (but by no means all) GP systems, subtrees have no side-effects.
            This means results pass through a program’s root node in a well organised
            and easy to understand fashion. Thus, if we remember a subtree’s inputs
            and output when it was run before, we can avoid re-executing code whenever
            we are required to run the subtree again. Note that this is true irrespective
            of whether we need to run the same subtree inside a different individual or
            at a different time (i.e., a later generation). Thus, if we stored the output
            with the root node, we would only need to run the subtree once for any
            given set of inputs. Whenever the interpreter comes to evaluate the subtree,
            it needs only to check if the subtree’s root contains a cache of the values the
            interpreter calculated last time, thus saving considerable computation time.
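
               The sketch below shows one way to organise such a cache in a tree
            interpreter for a side-effect-free language. The class and field names
            are our own invention: each node keeps a dictionary mapping the inputs
            it has seen to the output it computed, so re-running a subtree on
            previously seen inputs reduces to a lookup at its root.

               import operator

               class Node:
                   # GP tree node; op is a callable, a variable name or a constant.
                   def __init__(self, op, children=()):
                       self.op = op
                       self.children = list(children)
                       self.cache = {}   # input tuple -> previously computed output

                   def eval(self, inputs):
                       key = tuple(sorted(inputs.items()))
                       if key in self.cache:        # hit: skip the whole subtree
                           return self.cache[key]
                       if callable(self.op):        # function node
                           value = self.op(*(c.eval(inputs) for c in self.children))
                       elif self.op in inputs:      # variable terminal
                           value = inputs[self.op]
                       else:                        # constant terminal
                           value = self.op
                       self.cache[key] = value
                       return value

               # x * (x + y): the second evaluation on the same inputs is a hit
               tree = Node(operator.mul, [Node('x'),
                                          Node(operator.add, [Node('x'), Node('y')])])
               print(tree.eval({'x': 3, 'y': 4}))   # computes 21 and caches it
               print(tree.eval({'x': 3, 'y': 4}))   # answered from the root's cache

               If identical subtrees are shared rather than copied, as in the directed
            acyclic graph representation used by Handley (1994), the same cache entries
            also serve other individuals and later generations, exactly as described
            above.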
               In order to achieve this, however, we need to overcome a problem: not