Page 126 - 49A Field Guide to Genetic Programming
112 12 Applications
to individual problems; unveil unexpected relationships among vari-
ables; and sometimes GP can discover new concepts that can then be
applied in a wide variety of circumstances.
Finding the size and shape of the ultimate solution is a major
part of the problem. If the form of the solution is known, then
alternative search mechanisms that work on fixed size representations
(e.g., genetic algorithms) may be more efficient because they won’t
have to discover the size and shape of the solution.
Significant amounts of test data are available in computer-
readable form. GP (like most other machine learning and search
techniques) benefits from having significant pools of test data. At a
minimum there needs to be enough data to allow the system to learn
the salient features, while leaving enough at the end to use for valida-
tion and over-fitting tests. It is also useful if the test data are as clean
and accurate as possible. GP is capable of dealing gracefully with
certain amounts of noise in the data (especially if steps are taken to
reduce over-fitting), but cleaner data make the learning process easier
for any system, GP included.
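
The idea of keeping some data back for validation and over-fitting tests can be sketched as follows. This is only an illustration under stated assumptions: the helper names (split_cases, mean_abs_error), the 30% hold-out fraction, the noisy linear data and the hand-written candidate standing in for an evolved program are all hypothetical and not taken from the text.

```python
import random

def split_cases(cases, validation_fraction=0.3, seed=0):
    """Shuffle the fitness cases and hold out a fraction for validation."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    # The first n_val shuffled cases are held out; the rest are for training.
    return shuffled[n_val:], shuffled[:n_val]

def mean_abs_error(model, cases):
    """Average absolute difference between model output and target."""
    return sum(abs(model(x) - y) for x, y in cases) / len(cases)

# Hypothetical data: noisy samples of y = 2x + 1.
noise = random.Random(1)
cases = [(x, 2 * x + 1 + noise.gauss(0, 0.1)) for x in range(50)]
train, validation = split_cases(cases)

# Stand-in for a program produced by a GP run (assumption for illustration).
candidate = lambda x: 2 * x + 1

train_err = mean_abs_error(candidate, train)
val_err = mean_abs_error(candidate, validation)
print(f"training error {train_err:.3f}, validation error {val_err:.3f}")
```

A validation error much larger than the training error would suggest that the evolved program has fitted noise in the training cases rather than the salient features of the data.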
There are good simulators to test the performance of tentative
solutions to a problem, but poor methods to directly obtain
good solutions. In many domains of science and engineering, sim-
ulators and analysis tools have been constructed that allow one to
evaluate the behaviour and performance of complex artifacts such as
aircraft, antennas, electronic circuits, control systems, optical systems,
games, etc. These simulators contain enormous amounts of knowledge
of the domain and have often required several years to create. These
tools solve the so-called direct problem of working out the behaviour
of a solution or tentative solution to a problem, given the solution it-
self. However, the knowledge stored in such systems cannot be easily
used to solve the inverse problem of designing an artifact from a set
of functional or performance requirements. A great advantage of GP
is that it can connect to simulators and analysis tools and
“data-mine” the simulator to solve the inverse problem automatically.
That is, the user need not specify (or know) much about the form of
the eventual solution before starting.
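
The coupling between a search process and a simulator can be sketched roughly as below. To keep the example short it makes several assumptions that are not from the text: the "simulator" is a toy projectile integrator (simulate), a design is a fixed (angle, speed) pair rather than an evolved program, and the search is a plain evolutionary loop with truncation selection and Gaussian mutation rather than full tree-based GP. The point is only that the search calls the simulator to solve the direct problem for each candidate and uses the result as fitness, thereby attacking the inverse problem.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2

def simulate(design, dt=0.01):
    """Toy simulator (direct problem): range of a projectile for a design."""
    angle_deg, speed = design
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    while True:
        x += vx * dt
        vy -= G * dt
        y += vy * dt
        if y <= 0.0:  # the projectile has landed
            return x

def evolve(target, pop_size=20, generations=30, seed=0):
    """Search for a design whose simulated range matches the target."""
    rng = random.Random(seed)

    def random_design():
        return (rng.uniform(1.0, 89.0), rng.uniform(1.0, 50.0))

    def mutate(design):
        angle, speed = design
        angle = min(89.0, max(1.0, angle + rng.gauss(0, 2.0)))
        speed = min(50.0, max(1.0, speed + rng.gauss(0, 1.0)))
        return (angle, speed)

    def error(design):
        # Fitness: how far the simulated behaviour is from the requirement.
        return abs(simulate(design) - target)

    population = [random_design() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        parents = population[: pop_size // 4]  # best quarter survives unchanged
        children = [mutate(rng.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = min(population, key=error)
    history.append(error(best))
    return best, history

best, history = evolve(target=40.0)
print("best design (angle in degrees, speed in m/s):", best)
print("error of first and last best:", history[0], history[-1])
```

Because the best designs are copied unchanged into the next generation, the best error recorded in history never gets worse from one generation to the next. In a real application the toy integrator would be replaced by a call into the domain simulator, and the fixed pair of numbers by whatever representation the GP system evolves.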
Conventional mathematical analysis does not, or cannot, provide
analytic solutions. If there is a good exact analytic solution, one
probably wants to use it rather than spend the energy to evolve what
is likely to be an approximate solution. That said, GP might still be
a valuable option if the analytic solutions have undesirable properties
(e.g., unacceptable run times for large instances), or are based on