Data Correlations

Clinical presentation data and molecular data have been compared and displayed by M. Hossack in the following hypothesis-generating graphical representations.

∆∆G measures the difference between the energy required to fold a protein carrying a mutation and the energy required to fold the wild-type (‘normal’) protein. Cells seek to minimize the energy required for their activities. A positive ∆∆G value, indicating an increase in required energy, suggests that the mutation is disruptive to the cell. A negative ∆∆G value indicates that the mutation has reduced the energy required to fold the protein, which is usually beneficial.

The age of onset of a patient’s symptoms is one way of measuring disease severity. A patient whose symptoms appear early in childhood is considered to have a more severe case of disease than a patient whose symptoms appear in adolescence or adulthood.

Therefore, ∆∆G and age of onset would be expected to demonstrate a negative relationship: the higher the ∆∆G value, the earlier the onset of symptoms.
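
One way to check the direction of such a relationship is to compute a correlation coefficient for paired ∆∆G and age-of-onset values. The sketch below is a minimal illustration only and is not necessarily how the graphs here were produced. The function name `pearson_r`, the units, and the sample numbers are hypothetical placeholders, not patient data.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values (NOT real patient data)
ddg_values = [0.5, 1.2, 2.0, 3.1]      # ∆∆G per mutation (units assumed: kcal/mol)
onset_ages = [30.0, 22.0, 12.0, 4.0]   # age of onset in years

r = pearson_r(ddg_values, onset_ages)
print(f"r = {r:.2f}")  # r is negative for these placeholder values
```

A coefficient near -1 would indicate a strong negative relationship, a value near 0 little or no linear relationship, and a value near +1 a strong positive one.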


It is possible that any change in the energy required for folding could have a negative impact on protein function. Analyzing the ‘absolute value’ or ‘magnitude’ of ∆∆G (|∆∆G|) provides information about the size of the change in required energy, regardless of whether it is an increase or a decrease.

If the magnitude of the energy change is what determines disease severity, then |∆∆G| would be expected to display a positive relationship with severity, with larger magnitudes corresponding to more severe disease.
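
Taking the magnitude simply discards the sign of each value, so a large decrease and a large increase in folding energy are treated alike. A minimal sketch follows. The helper name `magnitudes` and the sample values are hypothetical and are not drawn from the data described here.

```python
def magnitudes(ddg_values):
    """Return |∆∆G| for each value, discarding the sign."""
    return [abs(value) for value in ddg_values]


# Hypothetical placeholder values (NOT real patient data)
ddg_values = [-2.4, -0.3, 0.8, 3.1]

print(magnitudes(ddg_values))  # [2.4, 0.3, 0.8, 3.1]
```

Here -2.4 and 3.1 both count as large changes, even though one is a decrease and the other an increase.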


PROVEAN (Protein Variation Effect Analyzer) is an online tool that predicts the effect of a mutation on protein function. A ‘PROVEAN score’ is calculated for each mutation based on the prevalence of variation at the mutation site in other sequences. PROVEAN states that a score at or below -2.5 indicates that a mutation has a deleterious effect on protein function.

These variables would be expected to demonstrate a negative relationship.
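
The -2.5 cutoff described above can be applied to a list of scores in a few lines. This is a sketch, not part of PROVEAN itself. The function name `classify_provean` and the sample scores are hypothetical, and treating a score exactly equal to the cutoff as deleterious is an assumption of this example.

```python
PROVEAN_CUTOFF = -2.5  # cutoff quoted in the text above


def classify_provean(score, cutoff=PROVEAN_CUTOFF):
    """Label a score 'deleterious' at or below the cutoff, otherwise 'neutral'."""
    # Assumption: a score exactly equal to the cutoff counts as deleterious
    return "deleterious" if score <= cutoff else "neutral"


# Hypothetical scores for illustration (NOT real PROVEAN results)
for score in (-6.1, -2.5, -0.4):
    print(score, classify_provean(score))
```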