The stature of men may be said to vary usually between limits of 62 and 76 inches, the average height being about 69 inches. In the complete absence of heredity in stature we should find that fathers of any given height, say 62 or 63 or 76 inches, would have sons of no particular height but of all heights, with an average of 69 inches, the same as in the whole group. If, on the other hand, stature were completely heritable from one generation to the next, the whole generations being the units compared, then 62, 63, or 76 inch fathers would have sons all 62, 63, or 76 inches tall respectively. When we examine the actual details of the resemblance we find that neither of these possibilities is realized. What we do find is that fathers below or above the average height have sons whose average height is also below or above the general average, but not so far below or above it as were the fathers.

If we measured a large number of pairs of fathers and sons with respect to stature we should find each generation with a variability such as that illustrated in Fig. 3 for the stature of mothers, the limits here, however, being about 62 and 76 inches. But if we measured all the sons of 62-inch fathers they would be found to vary, say, from 62 to only 69 inches, averaging about 66 inches. Similarly 63-inch fathers would have sons from 62 to 70 inches tall, averaging about 66.5 inches, and 76-inch fathers might have sons from 69 to 76 inches in height, averaging about 72 inches, and so on for fathers of all heights. In general, then, we may say that fathers with a characteristic of a certain plus or minus deviation from the average of the whole group have sons who on the whole deviate in the same direction but less widely than the fathers. Variability enters, however, so that some few of the sons deviate as widely as, or even more widely than, the fathers, while others deviate less widely from the average of the whole group. This is the general and very important statistical fact of regression.
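The figures quoted above can be set side by side to make the pattern plain. What follows is only an illustrative sketch in Python, using the approximate averages given in the text (69 inches for the whole group; sons of 62, 63, and 76 inch fathers averaging about 66, 66.5, and 72 inches). The names `GROUP_MEAN`, `son_means`, and `deviations` are chosen here for illustration and are not part of the original account.

```python
# Approximate figures quoted in the text, in inches; illustrative only.
GROUP_MEAN = 69.0
son_means = {62.0: 66.0, 63.0: 66.5, 76.0: 72.0}  # father's height -> sons' average


def deviations(father_height, son_mean, group_mean=GROUP_MEAN):
    """Return (father's deviation, sons' average deviation) from the group mean."""
    return father_height - group_mean, son_mean - group_mean


for father, sons in son_means.items():
    d_father, d_sons = deviations(father, sons)
    same_direction = d_father * d_sons > 0      # both below or both above average
    less_widely = abs(d_sons) < abs(d_father)   # sons nearer the average than fathers
    print(f"fathers {d_father:+.1f} in., sons {d_sons:+.1f} in., "
          f"same direction: {same_direction}, less widely: {less_widely}")
```

For each class of fathers the sons' average deviation has the same sign as the fathers' but a smaller size, which is the statement of regression in numerical form.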

The phenomenon of regression may be made somewhat clearer by the aid of a simple diagram (Fig. 10). Here are plotted first the heights, by inches, of a group of fathers, giving the series of dots joined by the diagonal AB. Next are plotted the average heights of the sons of each class of fathers: 62-inch fathers give 66-inch sons, 63-inch fathers 66.5-inch sons, 64-inch fathers 67-inch sons, and so on for all the classes of fathers. These dots are then joined by the line EF. This is the regression line. Had there been no regression in stature, the different classes of fathers would have had sons averaging just the same as themselves, and the line representing the heights of the sons would have coincided with the line AB. Or if regression had been complete, the fathers of any class would have had sons averaging about 69 inches, just the same as the average of the whole group, and the line representing their heights would have had the position of CD in the diagram. As a matter of fact, however, neither of these possibilities is realized, and an actual series of data approximates the regression line EF. A similar relation has been found for many characters other than stature.
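The position of the line EF between AB and CD can be expressed as a slope. The sketch below, in Python, fits a straight line by ordinary least squares to the three class averages quoted in the text; least squares is used here as a convenient assumption, since the text does not say how the line was drawn, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Class averages quoted in the text (inches).
fathers = [62.0, 63.0, 64.0]
son_averages = [66.0, 66.5, 67.0]

slope, intercept = fit_line(fathers, son_averages)
print(f"slope of EF: {slope}")   # 0.5 for these three points

# AB (no regression) would have slope 1; CD (complete regression) slope 0.
assert 0.0 < slope < 1.0
```

A slope of 1 would reproduce the diagonal AB and a slope of 0 the level line CD; the quoted averages give a slope of one half, lying between the two.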

Fig. 10.—Diagram illustrating the phenomenon of regression.
Explanation in text.

The fact of regression is of considerable importance for the theory of evolution as well as for the subject of Eugenics, when the phenomena of heredity are described in this statistical manner in whole groups without paying attention to particular individuals. Regression is found in all characteristics observed in this way, psychic as well as purely physical. "The father [i. e., fathers] with a great excess of the character contributes [contribute] sons with an excess, but a less excess of it; the father [fathers] with a great defect of the character contributes [contribute] sons with a defect, but less defect of it."

Now, whatever the actual extent of this regression is in a group we need to know how uniformly it occurs for all the classes of different deviations from the general average, that is, we need to know whether the extreme groups regress to the same relative extent as do those nearer the general average; and, further, we need to know how nearly the sons of fathers of any certain height are grouped about their own average. In other words, we should know, first, whether the regression of the sons of 62 and 76 or 67 and 71 inch fathers is proportionately the same in each case, and, second, to what extent the sons of 62-inch fathers vary, whether they vary as do the fathers of 62-inch sons, and so for each group. This kind of information we get by calculating what is called the coefficient of heredity. The calculation of this coefficient is a complicated process which it is unnecessary to describe here. It must suffice to say that a numerical coefficient can readily be determined, which will express the average closeness and regularity of the relationship between all the plus and minus deviations from the group average in fathers and the corresponding plus and minus deviations from the group average of their sons with respect to a given characteristic. This coefficient of heredity may vary between 0.0 and 1.0. When it is 0.0 there is, on the whole, no regularity in the relationship, i. e., no heredity; when it is 1.0 there is, on the whole, complete regularity, i. e., heredity is complete. Neither of these values is ever actually found in determining coefficients of heredity in the parental relation; these are usually between 0.3 and 0.5. It should be emphasized again that this comparison is between whole groups and not between individuals, and that it fails to allow for the distinction between fluctuations and true variations. And, further, it should be noted that the information derived from such a coefficient is defective in that it takes into account only the relationship between the son and one parent; the maternal relation is just as important but this has to be determined separately. There is no satisfactory method of determining the relation between children and both parents at the same time.
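Although the calculation is not described here, a rough idea of it may be had from the following sketch in Python. It assumes that the coefficient is the ordinary product-moment correlation between the fathers' deviations and the sons' deviations from their respective group averages; that identification is an assumption of this sketch, not a statement of the text. The nine pairs of heights are invented purely for illustration, and `product_moment` is a hypothetical name.

```python
import math


def product_moment(xs, ys):
    """Product-moment correlation of paired measurements (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]   # plus and minus deviations, first group
    dy = [y - mean_y for y in ys]   # plus and minus deviations, second group
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical father/son pairs (inches); not data from the text.
father_heights = [64, 66, 67, 68, 69, 70, 71, 72, 74]
son_heights = [68, 66, 70, 67, 69, 71, 70, 72, 68]

r = product_moment(father_heights, son_heights)
print(f"coefficient for the invented pairs: {r:.2f}")
```

For these invented pairs the value comes out at about 0.45, within the range of 0.3 to 0.5 said to be usual for the parental relation; a different set of pairs would of course give a different value.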

The coefficient of heredity is, therefore, an abstract numerical value which gives us a fairly precise estimate as to the probable closeness of the relation between deviations from the group average of any character in two groups of relatives. The coefficient of correlation is, in general, a measure of the relation between two different characteristics or conditions in a single group of individuals. The method of its determination and its limiting values are the same as for the coefficient of heredity.

By experience the coefficients of heredity and correlation in general are found to have the following significance:

0.00—no relation.
0.00-0.10—no significant relation.
0.10-0.25—low; relation slight though appreciable.
0.25-0.50—moderate; relation considerable.
0.50-0.75—high; relation marked.
0.75-0.90—very high; relation very marked.
0.90-1.00—nearly complete.
1.00—complete relation.
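The scale above can be applied mechanically. The sketch below, in Python, is one possible reading of it: since adjoining classes share their boundary values, a boundary value is assigned to the higher class, and the size of the coefficient is taken without regard to sign. Both choices are assumptions of this sketch, and `describe_coefficient` is a hypothetical name.

```python
def describe_coefficient(value):
    """Return the verbal grade of the scale for a coefficient between -1 and 1."""
    size = abs(value)  # assumption: grade by size alone, ignoring the sign
    if size > 1.0:
        raise ValueError("a coefficient cannot exceed 1.00 in size")
    if size == 0.0:
        return "no relation"
    if size == 1.0:
        return "complete relation"
    # Upper limit of each class; a boundary value falls in the next class up.
    bands = [
        (0.10, "no significant relation"),
        (0.25, "low; relation slight though appreciable"),
        (0.50, "moderate; relation considerable"),
        (0.75, "high; relation marked"),
        (0.90, "very high; relation very marked"),
    ]
    for upper, label in bands:
        if size < upper:
            return label
    return "nearly complete"


print(describe_coefficient(0.4))   # moderate; relation considerable
```

On this reading the parental coefficients of 0.3 to 0.5 mentioned earlier fall in the moderate class, with 0.5 itself counted as the lowest value of the high class.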

One further point remains to be considered, which applies not so much to coefficients of heredity as to coefficients of correlation in general, i. e., to the relatedness of two different characters or series of events in a single group of cases or individuals. This is that coefficients of correlation may be either positive or negative. That is, the real limits of the value of the coefficient are plus one and minus one. The example given above of stature of fathers and sons gives a positive coefficient. Whenever the deviation from the average of one group is accompanied in the second group by a deviation in the same direction, the coefficient is positive. A negative correlation means that deviation from the average in a given direction in the first group is accompanied in the second group by a deviation in the opposite direction. If we imagine that as one measurement increased above its average a second related measurement decreased below its average the correlation in such a case would be negative. For instance, if we measured the relation between the number of berry pickers employed and the quantity of berries remaining unpicked, in a number of different fields we would get a negative correlation coefficient. Some organisms are formed in such a way that increase in one dimension, such as length, is associated with decrease in another, such as breadth; measurement of the relatedness of these dimensions would give a coefficient of correlation that might be very high, indicating a considerable relation in the deviations, but it would be negative. In an instance of negative correlation the relation is that of "the more the fewer." As we shall see presently, a negative correlation may be just as important and significant as a positive correlation.
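The berry-picking instance may be worked through with invented figures. The Python sketch below assumes the product-moment method of calculation, which the text does not itself specify, and the five fields with their counts are hypothetical; `signed_correlation` is likewise a name chosen for illustration.

```python
import math


def signed_correlation(xs, ys):
    """Product-moment correlation (assumed method); may be positive or negative."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical fields: pickers employed and quarts of berries left unpicked.
pickers = [2, 4, 6, 8, 10]
unpicked = [90, 75, 60, 40, 20]

r_berries = signed_correlation(pickers, unpicked)
print(f"pickers against berries unpicked: {r_berries:.2f}")

# More pickers go with fewer berries remaining, so the sign is minus.
assert r_berries < 0
```

The coefficient for these figures is close to minus one: the relation is very regular, but a deviation above the average in one measurement goes with a deviation below the average in the other.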