A Field Guide to Genetic Programming
13 Troubleshooting GP
13.2 Can you Trust your Results?
Since GP is a stochastic search algorithm, different runs may yield
different results. Because of this, one needs to be very
careful in making inferences regarding the degree of success of the system
from a small set of runs.
It is possible, for example, to run a GP system 10 times on a particular
problem, observe that all 10 runs failed to find a solution, and conclude that
GP cannot solve the problem. However, if the success probability is, say, 5%
with a particular choice of parameters and representation, the probability of
all 10 independent runs failing is 0.95^10, or almost 60%! So, the failure to solve
the problem in these 10 runs should not come as a surprise, even though
there’s a reasonable chance that you would find a solution if you did more
runs.
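The arithmetic behind this example can be checked with a short sketch. It assumes that runs are independent and share a single per-run success probability; the function names are illustrative and are not part of any GP system.

```python
import math


def prob_all_runs_fail(p_success: float, n_runs: int) -> float:
    """Probability that n_runs independent runs all fail,
    assuming each run succeeds with probability p_success."""
    if not 0.0 <= p_success <= 1.0 or n_runs < 0:
        raise ValueError("need 0 <= p_success <= 1 and n_runs >= 0")
    return (1.0 - p_success) ** n_runs


def runs_needed(p_success: float, confidence: float) -> int:
    """Smallest number of independent runs for which the chance of
    at least one success is at least `confidence`."""
    if not 0.0 < p_success < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_success and confidence must lie strictly in (0, 1)")
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_success))


if __name__ == "__main__":
    print(f"P(10 failures | p = 5%) = {prob_all_runs_fail(0.05, 10):.3f}")
    print(f"runs for a 95% chance of a success: {runs_needed(0.05, 0.95)}")
```

Under these assumptions, a 5% per-run success rate calls for 59 runs before there is a 95% chance of seeing even one success.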
For precisely this reason, it is very important to do enough runs and
use appropriate statistical tests to ensure that conclusions are statistically
significant.
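One simple way to see how little a small set of runs pins down is to put a confidence interval around the observed success rate. The sketch below uses the Wilson score interval for a binomial proportion. This is one reasonable choice among several rather than a method the text prescribes, and it again assumes independent runs; the function name is illustrative.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success probability.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if runs <= 0 or not 0 <= successes <= runs:
        raise ValueError("need runs > 0 and 0 <= successes <= runs")
    p_hat = successes / runs
    denom = 1.0 + z * z / runs
    centre = (p_hat + z * z / (2 * runs)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / runs + z * z / (4 * runs * runs)
    ) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(successes=0, runs=10)
    print(f"0 successes in 10 runs: success rate in [{low:.3f}, {high:.3f}]")
```

With 0 successes in 10 runs the upper limit is roughly 28%, so such a result is entirely compatible with a true success rate of 5% and says little about whether GP can solve the problem.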
GP runs can often be very time consuming, especially if the fitness function
is computationally expensive. While parallel and distributed computing
(see Section 10.4) can significantly speed up the process, tools from the
design of experiments literature (Bartz-Beielstein, 2006) can also be used to
reduce the number of different runs that are necessary to explore the space
in a statistically sound manner.
A common GP application is classification, e.g., evolving a program or
function that can classify patient biopsy data into two categories: cancerous
or benign. There are numerous pitfalls in this type of work, such as using
all the available data as training data, thereby leaving nothing to use for
validating your evolved solution on unseen data. There is a broad literature
on this and related subjects, and numerous tools such as cross-validation
that one can use when not enough data are available (see, for example,
Hastie, Tibshirani, and Friedman, 2001). The aim must be to ensure that
your results can be trusted to work in the real world, rather than in just the
synthetic environment created by the fitness cases you chose.
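As a minimal sketch of the cross-validation idea mentioned above, the following splits the indices of the available fitness cases into k folds, so that every case is held out exactly once. The function name and the fixed seed are illustrative assumptions. The GP run itself is not shown: it would be trained on each `training` list and then scored on the matching `held_out` list.

```python
import random
from typing import List, Tuple


def k_fold_splits(n_cases: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (training, held_out) pairs of case indices.

    Each index in range(n_cases) appears in exactly one held_out list.
    """
    if not 2 <= k <= n_cases:
        raise ValueError("need 2 <= k <= n_cases")
    indices = list(range(n_cases))
    # Shuffle with a private, seeded generator so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, held_out))
    return splits


if __name__ == "__main__":
    for training, held_out in k_fold_splits(n_cases=10, k=5):
        print("train on", sorted(training), "validate on", sorted(held_out))
```

Averaging the held-out scores over the k folds gives an estimate of performance on unseen data without permanently setting aside part of a small data set.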
13.3 There are No Silver Bullets
When working on real problems there are not likely to be any silver bullets.
No technique (including GP) is likely to solve all instances of an NP-hard
problem in an amount of time that grows linearly with the size of the problem.
GP has proven extremely successful in a wide variety of domains (e.g.,
Chapter 12) but that doesn’t mean that it will work immediately or easily
in every domain, or even that it is the best tool for a specific domain.
While some of the successes in the field have been “easy”, most were the