Page 99 - 49A Field Guide to Genetic Programming

P. 99

10.1 Reducing Fitness Evaluations/Increasing their Eﬀectiveness 85

used over time, the evolving population saw more of the training data and so
was less liable to over ﬁt a fraction of them. Thirdly, by randomly changing
the ﬁtness function, it became more diﬃcult for evolution to produce an
overspecialised individual which took over the population at the expense of
solutions which were viable on other parts of the training data. Dynamic
subset selection (DSS) appears to have been the most successful of Gather-
cole’s suggested algorithms. It has been incorporated into Discipulus (see
page 63), and was recently used in a large data mining application (Curry,
Lichodzijewski, and Heywood, 2007).
Where each ﬁtness evaluation may take a long time, it may be attrac-
tive to interrupt a long-running program in order to let others run. In GP
systems which allow recursion or contain iterative elements (Brave, 1996;
Langdon, 1998; Wilson and Heywood, 2007; Wong and Leung, 1996) it is
common to enforce a time limit, a limit on the number of instructions ex-
ecuted, or a bound on the number of times a loop is executed. Maxwell
(1994) proposed a solution to the question of what ﬁtness to give to a pro-
gram that has been interrupted. He allowed each program in the population
a quantum of CPU time. When the program used up its quantum it was
check-pointed. 4 In Maxwell’s system, programs gained ﬁtness as they ran,
i.e., each time a program correctly processed a ﬁtness case, its ﬁtness was
incremented. Tournament selection was then performed. If all members of
the tournament had used the same number of CPU quanta, then the ﬁtter
program was the winner. If, however, one program had used less CPU than
the others (and had a lower ﬁtness) then it was restarted and run until it
had used as much CPU as the others. Then ﬁtnesses were compared in the
normal way.
Teller (1994) had a similar but slightly simpler approach: every indi-
vidual in the population was run for the same amount of time. When the
allotted time elapsed a program was aborted and an answer extracted from
it, regardless of whether it had terminated or not. Teller called this an “any
time” approach. This suits graph systems like Teller’s PADO (Section 7.2.2)
or linear GP (Chapter 7.1) where it is easy to designate a register as the
output register. The answer can then be extracted from this register or from
an indexed memory cell at any point (including whilst the programming is
running). Other any time approaches include (Spector and Alpern, 1995)
and (Langdon and Poli, 2008).
A simple technique to speed up the evaluation of complex ﬁtness func-
tions is to organise the ﬁtness function into stages of progressively increasing
computational cost. Individuals are evaluated stage by stage. Each stage
contributes to the overall ﬁtness of a program. However, individuals need

4 When a program is check-pointed, suﬃcient information (principally the program
counter and stack) is saved so that it can later be restarted from where it was stopped.
Many multi-tasking operating systems do something similar.

94 95 96 97 98 99 100 101 102 103 104