A Field Guide to Genetic Programming

13 Troubleshooting GP

13.2 Can you Trust your Results?

Since GP is a stochastic search algorithm, different runs may yield
different results. Because of this, one needs to be very careful in making
inferences regarding the degree of success of the system from a small set
of runs.
It is possible, for example, to run a GP system 10 times on a particular
problem, observe that all 10 runs failed to find a solution, and conclude that
GP cannot solve the problem. However, if the success probability is, say, 5%
with a particular choice of parameters and representation, the probability of
doing 10 runs and all of them failing is 0.95^10, or almost 60%! So, the
failure to solve the problem in these 10 runs should not come as a surprise,
even though there’s a reasonable chance that you would find a solution if you
did more runs.
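
The arithmetic behind this example can be checked with a few lines of Python. This is only a sketch, and it assumes that runs are independent and share the same success probability; the function names are illustrative and are not part of any GP library.

```python
import math


def prob_all_fail(p_success: float, n_runs: int) -> float:
    """Probability that n_runs independent runs all fail."""
    return (1.0 - p_success) ** n_runs


def runs_needed(p_success: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one success in n runs) >= confidence."""
    if not 0.0 < p_success < 1.0:
        raise ValueError("p_success must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve (1 - p)^n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_success))


if __name__ == "__main__":
    # The 5% success rate and 10 runs are the figures used in the text.
    print(f"P(all 10 runs fail) = {prob_all_fail(0.05, 10):.3f}")
    print(f"Runs for a 95% chance of a success: {runs_needed(0.05)}")
```

Under these assumptions, roughly six times as many runs as the original 10 would be needed before a complete absence of successes became strong evidence against a 5% success rate.
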
For precisely this reason, it is very important to do enough runs and
use appropriate statistical tests to ensure that conclusions are statistically
significant.
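
One simple way to see how little a small set of runs pins down is to put a confidence interval around the observed success rate. The sketch below uses the Wilson score interval, which is a common choice for binomial proportions but is our assumption here, not a method prescribed by the text; the function name and the 95% level (z = 1.96) are likewise illustrative.

```python
import math


def wilson_interval(successes: int, runs: int, z: float = 1.96):
    """Wilson score interval for a success probability.

    z = 1.96 corresponds to an approximate 95% confidence level.
    Returns a (lower, upper) pair clipped to [0, 1].
    """
    if runs <= 0 or not 0 <= successes <= runs:
        raise ValueError("need runs > 0 and 0 <= successes <= runs")
    p_hat = successes / runs
    denom = 1.0 + z * z / runs
    centre = (p_hat + z * z / (2 * runs)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / runs + z * z / (4 * runs * runs))
    ) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Zero successes in 10 runs, as in the example discussed above.
    low, high = wilson_interval(0, 10)
    print(f"95% interval for the success rate: [{low:.3f}, {high:.3f}]")
```

With 0 successes in 10 runs the upper end of this interval is a little under 28%, so such a result is compatible with a success rate far above zero.
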
GP runs can often be very time consuming, especially if the fitness
function is computationally expensive. While parallel and distributed computing
(see Section 10.4) can significantly speed up the process, tools from the
design of experiments literature (Bartz-Beielstein, 2006) can also be used to
reduce the number of different runs that are necessary to explore the space
in a statistically sound manner.
A common GP application is classification, e.g., evolving a program or
function that can classify patient biopsy data into two categories: cancerous
or benign. There are numerous pitfalls in this type of work, such as using
all the available data as training data, thereby leaving nothing to use for
validating your evolved solution on unseen data. There is a broad literature
on this and related subjects, and numerous tools such as cross-validation
that one can use when not enough data are available. (See, for example,
Hastie, Tibshirani, and Friedman (2001).) The aim must be to ensure that
your results can be trusted to work in the real world, rather than in just the
synthetic environment created by the fitness cases you chose.
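
As an illustration of the cross-validation idea mentioned above, the following sketch splits a data set into k folds so that every fitness case is held out exactly once. It is a minimal version under simple assumptions (shuffled, unstratified folds); the function name, the fixed seed and the example sizes are illustrative and do not come from the text.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Indices are shuffled once with a fixed seed, then dealt into k folds.
    Each fold serves as the held-out test set exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("need 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(n_samples=10, k=5):
        # A GP run would use only `train` as fitness cases, and the evolved
        # program would then be scored on the unseen `test` cases.
        print(f"train on {sorted(train)}, validate on {sorted(test)}")
```

Averaging the held-out scores over the k folds gives an estimate of performance on unseen data without permanently setting aside part of a small data set.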

13.3 There are No Silver Bullets

When working on real problems there are not likely to be any silver bullets.
No technique (including GP) is likely to solve all instances of an NP-hard
problem in an amount of time that grows linearly with the size of the
problem. GP has proven extremely successful in a wide variety of domains (e.g.,
Chapter 12) but that doesn’t mean that it will work immediately or easily
in every domain, or even that it is the best tool for a specific domain.
While some of the successes in the field have been “easy”, most were the
