FREQUENCY DISTRIBUTIONS AND PROBABILITY

Let the reader keep a note of the number of trumps held by himself and partner in a large number of games of whist (the cards being cut for trump). In 200 hands he may get such results as the following:

No. of trumps in his own and partner’s hands:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
No. of times this hand was held:                0   0   0   1   9  29  53  52  35  14   6   1   0   0

He should note also the number of times that trumps were spades, clubs, diamonds, and hearts: he will get some such results as the following: spades, 46; clubs, 53; diamonds, 51; hearts, 50.

The numbers in the lower line of the first series form a “frequency distribution,” for they tell us the frequency of occurrence of the hands indicated in the numbers above them. “No. of trumps” is the independent variable, and “no. of times these nos. of trumps were held” is the dependent variable.

A frequency distribution represents the way in which the results of a series of experiments differ from the mean result. A particular result is expected from the operation of one, or a few, main causes. But a number of other relatively unimportant causes lead to the deviation of a number of results from this mean or characteristic one. Yet since one, or a few, main causes are predominant, the majority of the results of the experiment will approximate closely to the mean; and a relatively small proportion will deviate by varying amounts on either side of the mean. If a pack of cards were shuffled so that all the suits were thoroughly intermixed, then we should expect the trumps to be divided as equally as possible among the four players. But a number of causes lead to irregularities in this desired uniform distribution, and so the results of a large number of deals deviate from the mean result. It is possible, by an application of the theory of probability, to calculate ideal, or theoretical, frequency distributions, basing our reasoning on the considerations suggested above. We then find that the observed and calculated frequency distributions may be very much alike.
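
The calculation of a theoretical distribution can be sketched for the whist example. The sketch rests on assumptions that are not stated in the text: that the pack is perfectly shuffled, and that the trump suit is fixed independently of the deal (as when the cards are cut for trump), so that the number of trumps among the 26 cards of a partnership follows the hypergeometric law. The name trump_probability is illustrative only.

```python
from math import comb

HANDS = 200  # number of hands recorded in the example above
observed = [0, 0, 0, 1, 9, 29, 53, 52, 35, 14, 6, 1, 0, 0]


def trump_probability(k):
    """Chance that the 26 cards of a partnership hold exactly k of the 13 trumps.

    Assumes a perfectly shuffled pack of 52 cards (hypergeometric law).
    """
    return comb(13, k) * comb(39, 26 - k) / comb(52, 26)


# Calculated (theoretical) number of hands holding 0, 1, ..., 13 trumps.
expected = [HANDS * trump_probability(k) for k in range(14)]

for k, (obs, exp) in enumerate(zip(observed, expected)):
    print(f"{k:2d} trumps: observed {obs:3d}, calculated {exp:6.1f}")
```

Under these assumptions the calculated series is exactly symmetrical about a mean of 6.5 trumps, and it may be set beside the observed series to judge how far the two agree.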

In biological investigation, far more than in physical investigation, we deal with mean results. It is, however, just as important to consider the individual divergences from the mean as to consider the mean itself. We want to know the mean result, and the way in which, and the extent to which, the individual results diverge from it.
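
As a minimal sketch, the mean of the observed whist series, and one common measure of the divergence from it, may be computed as follows. The choice of the standard deviation as that measure is an assumption made here; the text does not name any particular measure.

```python
from math import sqrt

# Observed series from the whist example: index = number of trumps held,
# value = number of hands in which that number was held.
observed = [0, 0, 0, 1, 9, 29, 53, 52, 35, 14, 6, 1, 0, 0]

total = sum(observed)

# Mean number of trumps per hand, weighting each value by its frequency.
mean = sum(k * n for k, n in enumerate(observed)) / total

# Mean squared divergence from the mean, and its square root.
variance = sum(n * (k - mean) ** 2 for k, n in enumerate(observed)) / total
std_dev = sqrt(variance)

print(f"hands recorded:     {total}")
print(f"mean trumps held:   {mean:.3f}")
print(f"standard deviation: {std_dev:.3f}")
```

The mean summarizes the characteristic result, while the standard deviation summarizes how widely the individual hands are scattered about it.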

There is a mean or “ideal” result, but we must think of a great number of small independent causes which make the results actually obtained diverge from this mean. If these small un-co-ordinated causes are just as likely to make the results less than the mean as greater than the mean, we shall obtain a frequency distribution resembling the one given above, in that the variations from the mean are nearly equal on both sides of the mean. But if the general tendency of the small un-co-ordinated causes is to make the results, on the whole, greater than the mean, then the frequency distribution will be “one-sided”; that is, if we represent it by a curve, the curve will be asymmetrical. Asymmetrical curves are those most frequently obtained in statistical investigations in biology.
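
The contrast between the symmetrical and the one-sided case can be illustrated with a simple model, which is an assumption made here and not a method given in the text: suppose each result is the sum of 12 small independent causes, each of which either pushes the result up by one unit or leaves it alone. The function names and the particular chances 0.5 and 0.2 are hypothetical choices for illustration.

```python
from math import comb


def binomial_distribution(n, p):
    """Probability of exactly k upward pushes from n independent causes,
    each pushing upward with chance p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def skewness(probs):
    """Third standardized moment; zero for a symmetrical distribution."""
    mean = sum(k * q for k, q in enumerate(probs))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    third = sum((k - mean) ** 3 * q for k, q in enumerate(probs))
    return third / var**1.5


# Causes equally likely to act either way, versus causes with unequal chances.
even = binomial_distribution(12, 0.5)
biased = binomial_distribution(12, 0.2)

print(f"equal chances:   skewness {skewness(even):+.3f}")
print(f"unequal chances: skewness {skewness(biased):+.3f}")
```

In this model, equal chances give a skewness of zero, that is, a symmetrical distribution, while unequal chances give a skewness different from zero, so that the distribution trails off further on one side of the mean than on the other.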