whereas relatively common changes must often have relatively few dele-
terious consequences. Another method, polymorphism phenotyping
(PolyPhen), combines evolutionary conservation with various measures
of biochemical structure and function (Ramensky et al. 2002).
I obtained two unpublished collections of PolyPhen scores for mis-
match repair gene variants and the associated ages of colorectal can-
cer onset. Figure 11.5 presents a preliminary analysis of those data. I
particularly wish to emphasize the importance of using the full age of
onset data. Many analyses simply classify age of onset as early or late,
throwing out the most valuable quantitative aspect of outcome. I have
emphasized throughout this book that age of onset provides the sum-
mary measure of outcome when studying how various causal factors
influence cancer progression.
Figure 11.5a shows the association between single amino acid sub-
stitutions and age of onset. These data came from a survey of the lit-
erature, in which each publication usually reported a single amino acid
variant believed to influence mismatch repair function and age of cancer
onset. These confirmed variants form a generally accepted set of DNA
repair variants with functional consequences, a set on which we could
test the efficacy of the PolyPhen scoring method.
The raw data for Figure 11.5a scatter widely, because so many factors
influence the age of cancer onset for each individual case. I used a sliding
window analysis to illustrate the strong trend in the data (see figure
legend). The result shows a clear tendency for increased PolyPhen score
to predict the association between a substitution and the rate of cancer
progression measured by age of onset.
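
The sliding window idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for Figure 11.5, whose details are given in the figure legend. The sketch assumes that cases are sorted by PolyPhen score and that each window holds a fixed number of consecutive cases; the function name, the window size, and the step are hypothetical choices.

```python
from statistics import mean


def sliding_window_means(scores, ages, window=5, step=1):
    """Smooth a noisy score-versus-age scatter with a sliding window.

    Cases are sorted by score. Each window covers `window` consecutive
    cases, and the function returns one (mean score, mean age) pair per
    window. The window size and step are assumptions for illustration.
    """
    if len(scores) != len(ages):
        raise ValueError("scores and ages must have the same length")
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")

    # Pair each score with its age of onset and order the cases by score.
    pairs = sorted(zip(scores, ages))

    smoothed = []
    # Slide the window across the sorted cases; partial windows are skipped.
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        mean_score = mean([s for s, _ in chunk])
        mean_age = mean([a for _, a in chunk])
        smoothed.append((mean_score, mean_age))
    return smoothed
```

Plotting the returned pairs in place of the raw points suppresses much of the case-to-case scatter, so a trend in age of onset across PolyPhen score becomes easier to see. A wider window gives a smoother curve at the cost of resolution along the score axis.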
The confirmed variants in Figure 11.5a generally had some indepen-
dent evidence that suggested functional consequence for DNA repair
and cancer. If PolyPhen does indeed provide a computational method
for predicting consequence, then the method should also work on nu-
cleotide sequences obtained without any a priori information about the
functional consequence of variant sites.
Figure 11.5b shows unpublished data collected from individuals whose
families have a history of early-onset colorectal cancer. For each
individual, I received the age of colorectal cancer onset and the average
PolyPhen score over all 34 variant amino acid sites in the data set. I
excluded 26 individuals who did not have any variants and so did not
have a predictive PolyPhen score. The remaining 62 individuals each had