by repetitions of extremes in the individuals from which the averages were obtained. But intermediate results can be got at in two ways, viz. either by intermediate individuals, or by combinations of individuals in opposite directions. In the case of the Binomial Law of Error this tendency to thicken towards the centre was already strongly predominant in the individual values before we took them in hand for our average; but owing to this characteristic of combinations we may lay it down (broadly speaking) that any sort of average applied to any sort of law of distribution will give a result which bears the same general relation to the individual values that the dotted lines above bear to the black line.[6]

§ 18. This being so, the speculative advantages of one method of combining, or averaging, or reducing, our observations, over another method,—irrespective, that is, of the practical conveniences in carrying them out,—will consist solely in the degree of rapidity with which it tends thus to cluster the result about the centre. We shall have to subject this merit to a somewhat further analysis, but for the present purpose it will suffice to say that if one kind of average gave the higher dotted line in the figure on [p. 479] and another gave the lower dotted line, we should say that the former was the better one. The advantage is of the same general kind as that which is furnished in algebraical calculation, by a series which converges rapidly towards the true value as compared with one which converges slowly. We can do the work sooner or later by the aid of either; but we get nearer the truth by the same amount of labour, or get as near by a less amount of labour, on one plan than on the other.

As we are here considering the case in which the individual observations are supposed to be grouped in accordance with the Binomial Law, it will suffice to say that in this case there is no doubt that the arithmetical average is not only the simplest and easiest to deal with, but is also the best in the above sense of the term. And since this Binomial Law, or something approximating to it, is of very wide prevalence, a strong primâ facie case is made out for the general employment of the familiar average.
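The superiority here claimed for the arithmetical average can be illustrated numerically. The following sketch (an editorial illustration, not part of the text; all sample sizes and seeds are my own choices) compares the arithmetical mean with another familiar reduction, the median, on normally distributed errors: the results of the mean cluster more tightly about the centre.

```python
# Editorial illustration: for errors following the exponential (normal)
# law, averages taken by the arithmetical mean cluster more closely
# about the true centre than those taken by the median.
import math
import random

random.seed(2)

def sample_sd(xs):
    mu = sum(xs) / len(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

def median(xs):
    s = sorted(xs)
    k = len(s) // 2
    return s[k] if len(s) % 2 else (s[k - 1] + s[k]) / 2

n, trials = 25, 20000          # 25 errors per average, many repetitions
means, medians = [], []
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    means.append(sum(xs) / n)
    medians.append(median(xs))

# The dispersion of the means is the smaller of the two.
print(sample_sd(means) < sample_sd(medians))  # True
```

For normal errors the dispersion of the median exceeds that of the mean by a factor of about √(π/2), so the verdict is decisive whatever the seed.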

§ 19. The analysis of a few pages back carried the results of the averaging process as far as could be conveniently done by the help of mere arithmetic. To go further we must appeal to higher mathematics, but the following indication of the sort of results obtained will suffice for our present purpose. After all, the successive steps, though demanding intricate reasoning for their proof, are nothing more than generalizations of processes which could be established by simple arithmetic.[7] Briefly, what we do is this:—

(1) We first extend the proof from the binomial form, with its finite number of elements, to the limiting or exponential form. Instead of confining ourselves to a small number of discrete errors, we then recognize the possibility of any number of errors of any magnitude whatever.
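The passage from the binomial to the limiting or exponential form in step (1) can be verified numerically. The following sketch (my own illustration, not the author's; the error law is written in the 'modulus' form y = exp(−x²/c²)/(c√π) used in this chapter) shows the binomial probabilities drawing ever closer to the exponential curve as the number of elements grows.

```python
# Editorial check of step (1): standardized binomial probabilities
# approach the exponential (normal) error law as n increases.
import math

def binom_pmf(n, k, p=0.5):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def error_law(x, c):
    # The 'modulus' form of the law of error: y = exp(-x^2/c^2)/(c*sqrt(pi))
    return math.exp(-(x / c) ** 2) / (c * math.sqrt(math.pi))

def max_deviation(n):
    # Centre the binomial at n/2; its sd is sqrt(n)/2, so modulus c = sd*sqrt(2).
    c = (math.sqrt(n) / 2) * math.sqrt(2)
    return max(abs(binom_pmf(n, k) - error_law(k - n / 2, c))
               for k in range(n + 1))

# The greatest discrepancy shrinks steadily as n grows.
print(max_deviation(10) > max_deviation(100) > max_deviation(1000))  # True
```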

(2) In the next place, instead of confining ourselves to the consideration of an average of two or three only,—already, as we have seen, a tedious piece of arithmetic,—we calculate the result of an average of any number, n. The actual result is extremely simple. If the modulus of the single errors is c, that of the average of n of these will be c ÷ √n.
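The rule of step (2) admits of a simple numerical verification. In the sketch below (an editorial addition; the particular values of n and the seed are assumptions of mine) the modulus stands to the standard deviation in the fixed ratio c = σ√2, so it suffices to confirm that the dispersion of the averages is that of the single errors divided by √n.

```python
# Editorial check of step (2): if single errors have modulus c, the
# average of n of them has modulus c / sqrt(n).  Since the modulus is a
# fixed multiple (sqrt(2)) of the standard deviation, we verify the
# equivalent statement about standard deviations.
import math
import random

random.seed(0)

def sample_sd(xs):
    mu = sum(xs) / len(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

sigma = 1.0          # sd of a single error; its modulus is sigma * sqrt(2)
n = 25               # number of errors in each average
trials = 20000

averages = [sum(random.gauss(0, sigma) for _ in range(n)) / n
            for _ in range(trials)]

# The observed dispersion should be close to sigma / sqrt(n) = 0.2.
print(round(sample_sd(averages), 2))  # 0.2
```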

(3) Finally we draw similar conclusions in reference to the sum or difference of two averages of any numbers. Suppose, for instance, that m errors were first taken and averaged, and then n similarly taken and averaged. These averages will be nearly, but not quite, equal. Their sum or difference,—these, of course, are indistinguishable in the end, since positive and negative errors are supposed to be equal and opposite,—will itself be an ‘error’, every magnitude of which will have a certain assignable probability or facility of occurrence. What we do is to assign the modulus of these errors. The actual result again is simple. If c had been the modulus of the single errors, that of the sum or difference of the averages of m and n of them will be

c √(1/m + 1/n).
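This third rule may likewise be tried by experiment. In the sketch below (again an editorial illustration, with m, n, and the seed chosen by me) two averages are repeatedly formed and their difference taken; the dispersion of these differences agrees with the stated modulus, the modulus being throughout √2 times the standard deviation.

```python
# Editorial check of step (3): the difference of an average of m single
# errors and an average of n has modulus c * sqrt(1/m + 1/n), i.e. a
# standard deviation of sigma * sqrt(1/m + 1/n).
import math
import random

random.seed(1)

sigma, m, n, trials = 1.0, 9, 16, 20000

diffs = []
for _ in range(trials):
    avg_m = sum(random.gauss(0, sigma) for _ in range(m)) / m
    avg_n = sum(random.gauss(0, sigma) for _ in range(n)) / n
    diffs.append(avg_m - avg_n)

mean = sum(diffs) / trials
sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / trials)

predicted = sigma * math.sqrt(1 / m + 1 / n)   # about 0.417
print(round(sd, 2), round(predicted, 2))
```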

§ 20. So far, the problem under investigation has been of a direct kind. We have supposed that the ultimate mean value or central position has been given to us; either à priori (as in many games of chance), or from more immediate physical considerations (as in aiming at a mark), or from extensive statistics (as in tables of human stature). In all such cases therefore the main desideratum is already taken for granted, and it may reasonably be asked what remains to be done. The answers are various. For one thing we may want to estimate the value of an average of many when compared with an average of a few. Suppose that one man has collected statistics including 1000 instances, and another has collected 4000 similar instances. Common sense can recognize that the latter are better than the former; but it has no idea how much better they are. Here, as elsewhere, quantitative precision is the privilege of science. The answer we receive from this quarter is that, in the long run, the modulus,—and with this the probable error, the mean error, and the error of mean square, which all vary in proportion,—diminishes inversely as the square root of the number of measurements or observations. (This follows from the second of the above formulæ.) Accordingly the probable error of the more extensive statistics here is one half that of the less extensive. Take another instance. Observation shows that “the mean height of 2,315 criminals differs from the mean height of 8,585 members of the general adult population by about two inches” (v.
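The arithmetic behind the comparison of the 1000 and the 4000 instances can be set out explicitly. The following fragment (an editorial addition; the constant 0.4769 relating the probable error to the modulus is the standard one, but the value of c here is an arbitrary assumption) confirms that quadrupling the observations halves the probable error.

```python
# Editorial check: the probable error varies inversely as the square
# root of the number of observations, so 4000 instances yield half the
# probable error of 1000.
import math

def probable_error(c, n):
    # The probable error is a fixed multiple (about 0.4769) of the
    # modulus c / sqrt(n) of the average of n observations.
    return 0.4769 * c / math.sqrt(n)

c = 1.0
ratio = probable_error(c, 4000) / probable_error(c, 1000)
print(round(ratio, 10))  # 0.5
```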