Page 99 - 49A Field Guide to Genetic Programming
P. 99

10.1 Reducing Fitness Evaluations/Increasing their Effectiveness  85


            used over time, the evolving population saw more of the training data and so
            was less liable to over fit a fraction of them. Thirdly, by randomly changing
            the fitness function, it became more difficult for evolution to produce an
            overspecialised individual which took over the population at the expense of
            solutions which were viable on other parts of the training data. Dynamic
            subset selection (DSS) appears to have been the most successful of Gather-
            cole’s suggested algorithms. It has been incorporated into Discipulus (see
            page 63), and was recently used in a large data mining application (Curry,
            Lichodzijewski, and Heywood, 2007).
               Where each fitness evaluation may take a long time, it may be attrac-
            tive to interrupt a long-running program in order to let others run. In GP
            systems which allow recursion or contain iterative elements (Brave, 1996;
            Langdon, 1998; Wilson and Heywood, 2007; Wong and Leung, 1996) it is
            common to enforce a time limit, a limit on the number of instructions ex-
            ecuted, or a bound on the number of times a loop is executed. Maxwell
            (1994) proposed a solution to the question of what fitness to give to a pro-
            gram that has been interrupted. He allowed each program in the population
            a quantum of CPU time. When the program used up its quantum it was
            check-pointed. 4  In Maxwell’s system, programs gained fitness as they ran,
            i.e., each time a program correctly processed a fitness case, its fitness was
            incremented. Tournament selection was then performed. If all members of
            the tournament had used the same number of CPU quanta, then the fitter
            program was the winner. If, however, one program had used less CPU than
            the others (and had a lower fitness) then it was restarted and run until it
            had used as much CPU as the others. Then fitnesses were compared in the
            normal way.
               Teller (1994) had a similar but slightly simpler approach: every indi-
            vidual in the population was run for the same amount of time. When the
            allotted time elapsed a program was aborted and an answer extracted from
            it, regardless of whether it had terminated or not. Teller called this an “any
            time” approach. This suits graph systems like Teller’s PADO (Section 7.2.2)
            or linear GP (Chapter 7.1) where it is easy to designate a register as the
            output register. The answer can then be extracted from this register or from
            an indexed memory cell at any point (including whilst the programming is
            running). Other any time approaches include (Spector and Alpern, 1995)
            and (Langdon and Poli, 2008).
               A simple technique to speed up the evaluation of complex fitness func-
            tions is to organise the fitness function into stages of progressively increasing
            computational cost. Individuals are evaluated stage by stage. Each stage
            contributes to the overall fitness of a program. However, individuals need

               4 When a program is check-pointed, sufficient information (principally the program
            counter and stack) is saved so that it can later be restarted from where it was stopped.
            Many multi-tasking operating systems do something similar.
   94   95   96   97   98   99   100   101   102   103   104