(I.) For one plan we may make a direct appeal to experience, by collecting sets of statistics and observing what is their law of distribution. As remarked above, this has been done in a great variety of cases, and in some instances to a very considerable extent, by Quetelet and others. His researches have made it abundantly convincing that many classes of things and processes, differing widely in their nature and origin, do nevertheless appear to conform with a considerable degree of accuracy to one and the same[5] law. At least this is made plain for the more central values, for those, that is, which are situated most nearly about the mean. With regard to the extreme values there is, on the other hand, some difficulty. For instance, in the arrangements of the heights of a number of men, these extremes are rather a stumbling-block; indeed it has been proposed to reject them from both ends of the scale on the plea that they are monstrosities, the fact being that their relative numbers do not seem to be by any means those which theory would assign.[6] Such a plan of rejection is, however, quite unauthorized, for these dwarfs and giants are born into the world like their more normally sized brethren, and have precisely as much right as any others to be included in the formulæ we draw up.
Besides the instance of the heights of men, other classes of observations of a somewhat similar character have been already referred to as collected and arranged by Quetelet. From the nature of the case, however, there are not many appropriate ones at hand; for when our object is, not to illustrate a law which can be otherwise proved, but to obtain actual direct proof of it, the collection of observations and measurements ought to be made upon such a large scale as to deter any but the most persevering computers from undergoing the requisite labour. Some of the remarks made in the course of the note on the opposite page will serve to illustrate the difficulties which would lie in the way of such a mode of proof.
We are speaking here, it must be understood, only of symmetrical curves: if there is asymmetry, i.e. if the Law of Error is different on different sides of the mean, a comparatively very small number of observations would suffice to detect the fact. But, granted symmetry and rapid decrease of frequency on each side of the mean, we could generally select some one species of the exponential curve which should pretty closely represent our statistics in the neighbourhood of the mean. That is, where the statistics are numerous we could secure agreement; and where we could not secure agreement the statistics would be comparatively so scarce that we should have to continue the observations for a very long time in order to prove the disagreement.
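The scarcity of evidence in the tails may be made concrete by a rough calculation. The sketch below assumes, purely for illustration, that the exponential law in question is the normal (Gaussian) curve, and reckons how many out of n observations would be expected to lie more than k standard deviations from the mean. The function name and the sample size of 4000 are illustrative choices, not figures taken from the text.

```python
import math

def expected_tail_count(n, k):
    """Expected number, out of n normal observations, lying more than
    k standard deviations from the mean (both sides together)."""
    # Two-sided tail probability of the normal law: P(|Z| > k) = erfc(k / sqrt(2)).
    return n * math.erfc(k / math.sqrt(2))

# Assumed sample size of 4000, chosen only for illustration.
for k in (1, 2, 3, 4):
    print(k, round(expected_tail_count(4000, k), 2))
```

On these assumptions well over a thousand observations fall beyond one standard deviation, while fewer than one is expected beyond four, so that a disagreement at that distance could hardly be established without a far longer series.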
§ 6. Allowing the various statistics such credit as they deserve, for their extent, appropriateness, accuracy and so on, the general conclusion which will on the whole be drawn by almost every one who takes the trouble to consult them, is that they do, in large part, conform approximately to one type or law, at any rate for all except the extreme values. So much as this must be fully admitted. But that they do not, indeed we may say that they cannot, always do so in the case of the extreme values, will become obvious on a little consideration. In some of the classes of things to which the law is supposed to apply, for example, the successions of heads and tails in the throws of a penny, there is no limit to the magnitude of the fluctuations which may and will occur. Postulate as long a succession of heads or of tails as we please, and if we could only live and toss long enough for it we should succeed in getting it at length. In other cases, including many of the applications of Probability to natural phenomena, there can hardly fail to be such limits. Deviations exceeding a certain range may not be merely improbable, that is of very rare occurrence, but they may often from the nature of the case be actually impossible. And even when they are not actually impossible it may frequently appear on examination that they are only rendered possible by the occasional introduction of agencies which are not supposed to be available in the production of the more ordinary or intermediate values. When, for instance, we are making observations with any kind of instrument, the nature of its construction may put an absolute limit upon the possible amount of error. 
And even if there be not an absolute limit under all kinds of usage it may nevertheless be the case that there is one under fair and proper usage; it being the case that only when the instrument is designedly or carelessly tampered with will any new causes of divergence be introduced which were not confined within the old limits.
Suppose, for instance, that a man is firing at a mark. His worst shots must be supposed to be brought about by a combination of such causes as were acting, or prepared to act, in every other case; the extreme instance of what we may thus term ‘fair usage’ being when a number of distinct causes have happened to conspire together so as to tend in the same direction, instead of, as in the other cases, more or less neutralizing one another's work. But the aggregate effect of such causes may well be supposed to be limited. The man will not discharge his shot nearly at right angles to the true line of fire unless some entirely new cause comes in, as by some unusual circumstance having distracted his attention, or by his having had some spasmodic seizure. But influences of this kind were not supposed to have been available before; and even if they were we are taking a bold step in assuming that these occasional great disturbances are subject to the same kind of laws as are the aggregates of innumerable little ones.
We cannot indeed lay much stress upon an example of this last kind, as compared with those in which we can see for certain that there is a fixed limit to the range of error. It is therefore offered rather for illustration than for proof. The enormous, in fact inconceivable magnitude of the numbers expressive of the chance of very rare combinations, such as those in question, has such a bewildering effect upon the mind that one may be sometimes apt to confound the impossible with the higher degrees of the merely mathematically improbable.
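Some notion of the magnitudes involved may be had from the penny of § 6. The sketch below assumes a fair penny and independent throws; it gives the chance that n given throws all show heads, and the average number of throws needed before a run of n heads first appears, for which the standard result 2^(n+1) − 2 is used. The function names are illustrative only.

```python
from fractions import Fraction

def chance_of_run(n):
    """Chance that n given throws of a fair penny all show heads."""
    return Fraction(1, 2 ** n)

def expected_tosses_until_run(n):
    """Average number of throws until n heads first occur in succession
    (standard result for a fair coin: 2**(n + 1) - 2)."""
    return 2 ** (n + 1) - 2

# Run lengths chosen only for illustration.
for n in (10, 50, 100):
    print(n, chance_of_run(n), expected_tosses_until_run(n))
```

The numbers grow so fast that a run of a hundred heads, though strictly possible, lies beyond any practicable course of tossing, which is the sense in which the merely improbable is apt to be confounded with the impossible.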
§ 7. At the time the first edition of this essay was composed writers on Statistics were, I think, still for the most part under the influence of Quetelet, and inclined to overvalue his authority on this particular subject: of late however attention has been repeatedly drawn to the necessity of taking account of other laws of arrangement than the binomial or exponential.
Mr Galton, for instance,—to whom every branch of the theory of statistics owes so much,—has insisted[7] that the “assumption which lies at the basis of the well-known law of ‘Frequency of Error’… is incorrect in many groups of vital and social phenomena…. For example, suppose we endeavour to match a tint; Fechner's law, in its approximative and simplest form of sensation = log stimulus, tells us that a series of tints, in which the quantities of white scattered on a black ground are as 1, 2, 4, 8, 16, 32, &c., will appear to the eye to be separated by equal intervals of tint. Therefore, in matching a grey that contains 8 portions of white, we are just as likely to err by selecting one that has 16 portions as one that has 4 portions. In the first case there would be an error in excess, of 8; in the second there would be an error, in deficiency, of 4. Therefore, an error of the same magnitude in excess or in deficiency is not equally probable.” The consequences of this assumption are worked out in a remarkable paper by Dr D. McAlister, to which allusion will have to be made again hereafter. All that concerns us here to point out is that when the results of statistics of this character are arranged graphically we do not get a curve which is symmetrical on both sides of a central axis.
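Mr Galton's arithmetic may be checked by a short sketch. It assumes Fechner's law in the simple form quoted, sensation = log stimulus, with tints doubling at each step; the function name and the choice of base 2 are illustrative assumptions. Equal steps upward and downward on the scale of sensation are converted into errors on the scale of the stimulus.

```python
import math

def stimulus_errors(target, sensation_step, base=2.0):
    """Errors in excess and in deficiency, on the stimulus scale, that
    correspond to equal steps up and down on the log (sensation) scale."""
    over = target * base ** sensation_step   # tint chosen one step too light
    under = target / base ** sensation_step  # tint chosen one step too dark
    return over - target, target - under

excess, deficiency = stimulus_errors(8, 1)
print(excess, deficiency)  # 8.0 4.0

# The true value is the geometric, not the arithmetic, mean of the two errors' tints.
print(math.sqrt(16 * 4))   # 8.0
```

The two equally likely mistakes, 16 and 4, thus lie at unequal distances from 8 on the ordinary scale, though at equal distances on the logarithmic one.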
§ 8. More recently, Mr F. Y. Edgeworth (in a report of a Committee of the British Association appointed to enquire into the variation of the monetary standard) has urged the same considerations in respect of prices of commodities. He gives a number of statistics “drawn from the prices of twelve commodities during the two periods 1782–1820, 1820–1865. The maximum and minimum entry for each series having been noted, it is found that the number of entries above the ‘middle point,’ half-way between the maximum and minimum,[8] is in every instance less than half the total number of entries in the series. In the twenty-four trials there is not a single exception to the rule, and in very few cases even an approach to an exception. We may presume then that the curves are of the lop-sided character indicated by the accompanying diagram.” The same facts are also ascertained in respect to place variations as distinguished from time variations. To these may be added some statistics of my own, referring to the heights of the barometer taken at the same hour on more than 4000 successive days (v. Nature, Sept. 2, 1887). So far as these go they show a marked asymmetry of arrangement.
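The test which Mr Edgeworth applies is easily stated as a procedure: note the maximum and minimum of a series, take the point half-way between them, and count the entries lying above it. The sketch below follows that description; the function name is illustrative, and the prices are invented for the example, not drawn from his tables.

```python
def share_above_midrange(series):
    """Count the entries lying strictly above the point half-way between
    the minimum and the maximum of the series."""
    lowest, highest = min(series), max(series)
    middle_point = (lowest + highest) / 2
    above = sum(1 for x in series if x > middle_point)
    return above, len(series)

# Hypothetical prices, invented for illustration only.
prices = [10, 11, 11, 12, 12, 13, 14, 16, 20, 30]
above, total = share_above_midrange(prices)
print(above, total)  # 1 10
```

Where the count above the middle point falls short of half the total, as here, the entries are crowded towards the lower end of the range and the curve is lop-sided in the sense described.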