Page 154 - 49A Field Guide to Genetic Programming
140 13 Troubleshooting GP
In addition to reporting your results, make sure you also discuss their
implications. If, for example, what GP has evolved means the customer can
save money or could improve their process in some way, then this should
be highlighted. Also be careful not to construct excessively complex
explanations for your observations. It is very tempting to say “X is probably
due to Y”, but for this to be believable one should at least have made some
attempt to check whether Y is indeed taking place, and whether modulating or
suppressing Y in fact modulates or suppresses X.
Finally, the most likely outcomes of a text that is badly written or badly
presented are: 1) your readers will misunderstand you, and 2) you will have
fewer readers. Spell checkers can help with typos, but whenever possible
one should ensure a native English speaker has proofread the text.
13.12 Convince your Customers
For any work in science, engineering, industry or commerce to make an
impact, it must be presented in a form that can convince others of the validity
of its results and conclusions. This might include a pitch within a
corporation seeking continued financial support for a project, the submission
of a research paper to a journal, or the presentation of a GP-based product to
potential customers.
The burden of proof is on the users of GP, and it is important to use the
customer’s language. If, for example, GP discovers that a particular chemical
is important in a reaction or drug design, one should make this stand out
during the presentation. A great advantage of GP over many AI techniques
is that its results are often simple equations. Ensure these are
intelligible to your customer, e.g., by simplifying them. Also make an effort to
present your results using your customer’s terminology. Your GP system
may produce answers as trees, but if the customers use spreadsheets, consider
translating the tree into a spreadsheet formula. Alternatively, your
customer may not be particularly interested in the details of the solution,
but instead care a great deal about which inputs the evolutionary process
tended to use.
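
As an illustration of the two ideas above, the following is a minimal Python
sketch, not a definitive implementation. It assumes a hypothetical tree
encoding (nested tuples of the form (operator, left, right), with strings for
input names and numbers for constants) and a hypothetical mapping from input
names to spreadsheet cells; neither comes from any particular GP system. It
translates an evolved tree into a spreadsheet formula and counts how often
each input appears in it.

```python
from collections import Counter

# Hypothetical encoding: a tree is either a tuple (operator, left, right),
# a string naming an input variable, or a numeric constant.
OPS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def to_formula(node, cells):
    """Return a spreadsheet expression for the tree rooted at `node`."""
    if isinstance(node, tuple):
        op, left, right = node
        # Parenthesise every subexpression so precedence is never ambiguous.
        return f"({to_formula(left, cells)}{OPS[op]}{to_formula(right, cells)})"
    if isinstance(node, str):
        return cells[node]   # input variable -> cell reference
    return repr(node)        # numeric constant


def input_usage(node, counts=None):
    """Count how many times each input variable occurs in the tree."""
    counts = Counter() if counts is None else counts
    if isinstance(node, tuple):
        for child in node[1:]:
            input_usage(child, counts)
    elif isinstance(node, str):
        counts[node] += 1
    return counts


# Illustrative tree for x*x + y/2 and an assumed cell layout.
tree = ("add", ("mul", "x", "x"), ("div", "y", 2.0))
cells = {"x": "A2", "y": "B2"}

formula = "=" + to_formula(tree, cells)
usage = input_usage(tree)
print(formula)
print(dict(usage))
```

If the GP system uses protected division, the plain "/" in the formula will
not behave the same way when the denominator is zero, so the translation would
need a guard there. Summing the usage counts over the best individuals of many
runs gives a rough picture of which inputs evolution tended to rely on.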
Also, one should try to discover how the customers intend to validate
GP’s answer. Do not let them invent some totally new data which has
nothing to do with the data they supplied for training (“just to see how well
it does...”). Avoid customers with contrived data: GP is not omnipotent
and knows nothing about things it has not seen. At the same time you
should be scrupulous about your own use of holdout data. GP is a very
powerful machine learning technique, and with this comes the ever-present
danger of over-fitting. One should never allow performance on data reserved
for validation to be used to choose which answer to present to the customer:
once holdout data influence that choice, they no longer give an unbiased
estimate of how the answer will perform on unseen data.
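
One way to respect this rule is to keep a third portion of the data purely for
choosing between candidate answers, so that the reserved holdout set is
consulted only once, after the choice has been made. The Python sketch below
illustrates this under stated assumptions: the split fractions, the toy data
and the three candidate models (stand-ins for the best individuals of separate
GP runs) are all hypothetical, and the function names are introduced here for
illustration only.

```python
import random


def mse(model, rows):
    """Mean squared error of `model` over (input, target) pairs."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def three_way_split(rows, train_frac=0.6, select_frac=0.2, seed=0):
    """Shuffle the rows and split them into train, selection and holdout."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_select = round(len(rows) * select_frac)
    return (rows[:n_train],
            rows[n_train:n_train + n_select],
            rows[n_train + n_select:])


# Toy data for y = 2x + 1 (illustrative only).
data = [(x, 2 * x + 1) for x in range(50)]
train, select, holdout = three_way_split(data)

# Stand-ins for the best individual of each GP run; in practice each run
# would be given only `train` as its fitness cases.
candidates = {
    "run_a": lambda x: 2 * x + 1,
    "run_b": lambda x: 2 * x,
    "run_c": lambda x: x * x,
}

# Choose the answer using the selection set, never the holdout set.
best_name = min(candidates, key=lambda name: mse(candidates[name], select))

# Consult the holdout set once, only to report the chosen answer's error.
holdout_error = mse(candidates[best_name], holdout)
print(best_name, holdout_error)
```

The point of the design is that the holdout error is computed for a single,
already chosen answer; if it were computed for every candidate and used to
pick the winner, it would become part of the fitting process.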