Now suppose we take 1000 such digits instead of 10. We can say nothing more about the larger number, with demonstrative certainty, than we could before about the smaller. If they were unequal to begin with (i.e. if they were not all the same) then the average must be intermediate, but more than this cannot be proved arithmetically. By comparison with such purely arithmetical considerations there is what may be called a physical fact underlying our confidence in the growing stability of the average of the larger number. It is that the constituent elements from which the average is deduced will themselves betray a growing uniformity:—that the proportions in which the different digits come out will become more and more nearly equal as we take larger numbers of them. If the proportions in which the 1000 digits were distributed were the same as those of the 10, the averages would be the same. It is obvious therefore that the arithmetical process of obtaining an average goes a very little way towards securing the striking kind of uniformity which we find to be actually presented.
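This growing uniformity of the proportions lends itself to a simple computational check. The sketch below (Python; the helper name and sample sizes are merely illustrative, and the digits are supplied by a pseudo-random generator rather than drawn from tables) prints, for successively larger samples, the average of the digits and the gap between the most and least frequent digit; as the sample grows the gap shrinks and the average settles near 4.5.

```python
# A minimal sketch: as the number of random digits grows, the proportions of
# the ten digits become more nearly equal, and the average steadies accordingly.
import random
from collections import Counter

def digit_proportions(n, seed=0):
    """Draw n pseudo-random digits 0-9; return each digit's proportion and the average."""
    rng = random.Random(seed)
    digits = [rng.randint(0, 9) for _ in range(n)]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / n for d in range(10)}, sum(digits) / n

for n in (10, 1000, 100000):
    props, avg = digit_proportions(n)
    spread = max(props.values()) - min(props.values())
    print(f"n={n:6d}  average={avg:.3f}  spread of digit proportions={spread:.3f}")
```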

§ 20. There is another way in which the same thing may be put. It is sometimes said that whatever may have been the arrangement of the original elements the process of continual averaging will necessarily produce the peculiar binomial or exponential law of arrangement. This statement is perfectly true (with certain safeguards), but it is not in any way opposed to what has been said above. Let us take for consideration the example above referred to. The arrangement of the individual digits in the long run is the simplest possible. It would be represented, in a diagram, not by a curve but by a finite straight line, for each digit occurs about as often as any other, and this exhausts all the ‘arrangement’ that can be detected. Now, when we consider the results of taking averages of ten such digits, we see at once that there is an opening for a more extensive arrangement. The totals may range from 0 up to 90, and therefore the average will have 91 values from 0 to 9; and what we find is that the frequency of these values is determined according to the Binomial[5] or Exponential Law. The most frequent result is the true mean, viz. 4.5, and from this the frequencies diminish in each direction towards 0 and 9, which will each occur but once (on the average) in 10^10 occasions.
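These figures can be checked exactly by a short computation. The following sketch (Python; the repeated-convolution routine is only one convenient way of doing the arithmetic) builds up the distribution of the total of ten independent digits and confirms the 91 possible values, the most frequent average of 4.5, and the 1 in 10^10 frequency of either extreme.

```python
# A hedged sketch: the exact distribution of the total (and hence the average)
# of ten independent digits 0-9, built up by repeated convolution.
from fractions import Fraction

def total_distribution(k=10):
    """Probability of each possible total of k uniform digits 0-9."""
    dist = {0: Fraction(1)}
    for _ in range(k):
        new = {}
        for total, p in dist.items():
            for d in range(10):
                new[total + d] = new.get(total + d, Fraction(0)) + p / 10
        dist = new
    return dist

dist = total_distribution()
print(len(dist))                      # 91 possible totals, 0 up to 90
print(max(dist, key=dist.get) / 10)   # most frequent average: 4.5
print(dist[0], dist[90])              # each extreme occurs with probability 1/10**10
```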

The explanation here is of the same kind as in the former case. The resultant arrangement, so far as the averages are concerned, is only ‘necessary’ in the sense that it is a necessary result of certain physical assumptions or experiences. If all the digits tend to occur with equal frequency, and if they are ‘independent’ (i.e. if each is associated indifferently with every other), then it is an arithmetical consequence that the averages, when arranged in respect of their magnitude and prevalence, will display the Law of Facility above indicated. Experience, so far as it can be appealed to, shows that the true randomness of the selection of the digits,—i.e. their equally frequent recurrence, and the impartiality of their combination,—is very fairly secured in practice. Accordingly the theoretic deduction that, whatever may have been the original Law of Facility of the individual results, we shall always find the familiar Exponential Law asserting itself as the law of the averages, is fairly justified by experience in such a case.
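The assertion that the law of the averages is largely independent of the original Law of Facility can likewise be illustrated by simulation. In the sketch below (Python; the lopsided distribution, the sample size of 50, and the number of trials are arbitrary choices made for illustration) the single results are drawn from a markedly unsymmetrical arrangement, yet the averages heap up in a roughly symmetrical, bell-shaped fashion about their mean.

```python
# A hedged sketch: averages of independent draws from a deliberately lopsided
# distribution nevertheless crowd into the familiar bell-shaped form.
import random
from statistics import mean

rng = random.Random(1)
skewed = [0] * 6 + [1] * 2 + [5] + [9]   # a lopsided individual "law of facility"

# 5000 averages, each of 50 independent draws from the lopsided distribution
averages = [mean(rng.choice(skewed) for _ in range(50)) for _ in range(5000)]

# Crude text histogram of the averages
lo, hi, bins = min(averages), max(averages), 12
width = (hi - lo) / bins or 1
counts = [0] * bins
for a in averages:
    counts[min(int((a - lo) / width), bins - 1)] += 1
for i, c in enumerate(counts):
    print(f"{lo + i * width:5.2f} | {'#' * (c // 50)}")
```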

The further discussion of certain corrections and refinements is reserved to the following chapter.

§ 21. In regard to the three kinds of average employed to test the amount of dispersion,—i.e. the mean error, the probable error, and the error of mean square,—two important considerations must be borne in mind. They will both recur for fuller discussion and justification in the course of the next chapter, when we come to touch upon the Method of Least Squares, but their significance for logical purposes is so great that they ought not to be entirely passed by at present.
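For definiteness, the three measures may be computed side by side on a small invented set of errors. In the sketch below (Python) the mean error is taken as the average magnitude of the errors, the probable error as the magnitude exceeded as often as not (here, the median of the absolute errors), and the error of mean square as the square root of the mean of the squared errors; the identification of the probable error with the sample median is an interpretive assumption made for illustration only.

```python
# A hedged illustration of the three measures of dispersion named above;
# the error values are invented purely for the example.
from statistics import mean, median
from math import sqrt

errors = [0.3, -1.2, 0.8, -0.4, 1.5, -0.1, 0.6, -0.9]   # deviations from the true value

mean_error     = mean(abs(e) for e in errors)        # mean error: average magnitude
probable_error = median(abs(e) for e in errors)      # probable error: exceeded as often as not
error_mean_sq  = sqrt(mean(e * e for e in errors))   # error of mean square (root-mean-square)

print(mean_error, probable_error, error_mean_sq)
```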

(1) In the first place, then, it must be remarked that in order to know what in any case is the real value of an error we ought in strictness to know what is the position of the limit or ultimate average, for the amount of an error is always theoretically measured from this point. But this is information which we do not always possess. Recurring once more to the three principal classes of events with which we are concerned, we can readily see that in the case of games of chance we mostly do possess this knowledge. Instead of appealing to experience to ascertain the limit, we practically deduce it by simple mechanical or arithmetical considerations, and then the ‘error’ in any individual case or group of cases is obviously found by comparing the results thus obtained with that which theory informs us would ultimately be obtained in the long run. In the case of deliberate efforts at an aim (the third class) we may or may not know accurately the value or position of this aim. In astronomical observations we do not know it, and the method of Least Squares is a method for helping us to ascertain it as well as we can; in such experimental results as firing at a mark we do know it, and may thus test the nature and amount of our failure by direct experience. In the remaining case, namely that of what we have termed natural kinds or groups of things, not only do we not know the ultimate limit, but its existence is always at least doubtful, and in many cases may be confidently denied. Where it does exist, that is, where the type seems for all practical purposes permanently fixed, we can only ascertain it by a laborious resort to statistics. Having done this, we may then test by it the results of observations on a small scale. For instance, if we find that the ultimate proportion of male to female births is about 106 to 100, we may then compare the statistics of some particular district or town and speak of the consequent ‘error,’ viz.