A Field Guide to Genetic Programming

112                                                 12 Applications


                 to individual problems; unveil unexpected relationships among
                 variables; and, sometimes, GP can discover new concepts that can
                 then be applied in a wide variety of circumstances.

             Finding the size and shape of the ultimate solution is a major
                 part of the problem. If the form of the solution is known, then
                 alternative search mechanisms that work on fixed-size representations
                 (e.g., genetic algorithms) may be more efficient because they won’t
                 have to discover the size and shape of the solution.

             Significant amounts of test data are available in computer-readable
                 form. GP, like most other machine learning and search techniques,
                 benefits from having significant pools of test data. At a
                 minimum there needs to be enough data to allow the system to learn
                 the salient features, while leaving enough at the end to use for
                 validation and over-fitting tests. It is also useful if the test data are as clean
                 and accurate as possible. GP is capable of dealing gracefully with
                 certain amounts of noise in the data (especially if steps are taken to
                 reduce over-fitting), but cleaner data make the learning process easier
                 for any system, GP included.
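
As a rough illustration of holding data back for validation and over-fitting tests, the sketch below splits a pool of test cases into a training part and a validation part, then compares a candidate's error on each. The helper names (`split_data`, `mean_abs_error`), the 30% hold-out fraction, the synthetic noisy data and the stand-in `candidate` function are all assumptions made for this example; they are not prescribed by the text.

```python
import random

def split_data(cases, validation_fraction=0.3, seed=0):
    """Shuffle the cases and hold back a fraction for validation."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    # First n_val shuffled cases are held out; the rest are used for learning.
    return shuffled[n_val:], shuffled[:n_val]

def mean_abs_error(model, cases):
    """Average absolute difference between model output and recorded target."""
    return sum(abs(model(x) - y) for x, y in cases) / len(cases)

# Hypothetical noisy data set: y = 2x + 1 plus a little Gaussian noise.
noise = random.Random(1)
cases = [(x, 2 * x + 1 + noise.gauss(0, 0.1)) for x in range(100)]
train, validation = split_data(cases)

# Stand-in for a program produced by a GP run on the training cases.
candidate = lambda x: 2 * x + 1

train_err = mean_abs_error(candidate, train)
val_err = mean_abs_error(candidate, validation)

# A validation error far above the training error would hint at over-fitting.
print(f"training error {train_err:.3f}, validation error {val_err:.3f}")
```

In a real run the fitness function would see only `train`; `validation` would be consulted after evolution, or periodically, to check that the evolved program generalises.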

             There are good simulators to test the performance of tentative
                 solutions to a problem, but poor methods to directly obtain
                 good solutions. In many domains of science and engineering,
                 simulators and analysis tools have been constructed that allow one to
                 evaluate the behaviour and performance of complex artifacts such as
                 aircraft, antennas, electronic circuits, control systems, optical systems,
                 games, etc. These simulators contain enormous amounts of knowledge
                 of the domain and have often required several years to create. These
                 tools solve the so-called direct problem of working out the behaviour
                 of a solution or tentative solution to a problem, given the solution
                 itself. However, the knowledge stored in such systems cannot be easily
                 used to solve the inverse problem of designing an artifact from a set
                 of functional or performance requirements. A great advantage of GP
                 is that it is able to connect to simulators and analysis tools and to
                 “data-mine” the simulator to solve the inverse problem automatically.
                 That is, the user need not specify (or know) much about the form of
                 the eventual solution before starting.
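
The idea of using a simulator as the fitness function can be sketched as follows. This is a deliberately minimal, mutation-only evolutionary loop over expression trees, not a full GP system (there is no population and no crossover). The `simulate` function is a hypothetical stand-in for a real domain simulator: it solves the direct problem by scoring a candidate, and the search loop queries it repeatedly to approach the inverse problem. The function set, terminal set, hidden target and all parameter values are assumptions chosen for illustration.

```python
import operator
import random

# Illustrative function and terminal sets (assumptions, not from the text).
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth):
    """Grow a random expression tree represented as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree for a given value of x."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def mutate(rng, tree, depth=2):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    name, left, right = tree
    if rng.random() < 0.5:
        return (name, mutate(rng, left, depth), right)
    return (name, left, mutate(rng, right, depth))

def simulate(tree):
    """Hypothetical simulator: score a candidate design (lower is better).

    The search loop never sees the target behaviour directly; it only
    receives the score, as it would from a real analysis tool.
    """
    xs = [float(x) for x in range(-5, 6)]
    target = lambda x: x * x + 2.0 * x + 1.0
    return sum(abs(evaluate(tree, x) - target(x)) for x in xs)

def evolve(seed=0, generations=200, offspring=20):
    """Keep the best tree found so far; accept only strict improvements."""
    rng = random.Random(seed)
    best = random_tree(rng, 3)
    best_score = simulate(best)
    for _ in range(generations):
        for _ in range(offspring):
            child = mutate(rng, best)
            score = simulate(child)
            if score < best_score:
                best, best_score = child, score
    return best, best_score

best, best_score = evolve()
print(best, best_score)
```

Note that nothing in `evolve` fixes the size or shape of the result in advance: trees grow or shrink as mutation and the simulator's scores dictate.
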

             Conventional mathematical analysis does not, or cannot, provide
                 analytic solutions. If there is a good exact analytic solution, one
                 probably wants to use it rather than spend the energy to evolve what
                 is likely to be an approximate solution. That said, GP might still be
                 a valuable option if the analytic solutions have undesirable properties
                 (e.g., unacceptable run times for large instances), or are based on