A System of Logic, Ratiocinative and Inductive - John Stuart Mill

§ 2. Suppose that we are required to take a ball from a box, of which we only know that it contains balls both black and white, and none of any other color. We know that the ball we select will be either a black or a white ball; but we have no ground for expecting black rather than white, or white rather than black. In that case, if we are obliged to make a choice, and to stake something on one or the other supposition, it will, as a question of prudence, be perfectly indifferent which; and we shall act precisely as we should have acted if we had known beforehand that the box contained an equal number of black and white balls. But though our conduct would be the same, it would not be founded on any surmise that the balls were in fact thus equally divided; for we might, on the contrary, know by authentic information that the box contained ninety-nine balls of one color, and only one of the other; still, if we are not told which color has only one, and which has ninety-nine, the drawing of a white and of a black ball will be equally probable to us. We shall have no reason for staking any thing on the one event rather than on the other; the option between the two will be a matter of indifference; in other words, it will be an even chance.
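The case of the box with ninety-nine balls of one color and one of the other can be checked by simple arithmetic. The following is only a sketch: the function name `chance_of_white` is hypothetical, and the equal weighting of the two possible compositions is an assumption standing in for our having no ground to prefer either.

```python
from fractions import Fraction

def chance_of_white(compositions):
    """Average the chance of drawing white over equally credible compositions.

    Each composition is a (white, black) pair of ball counts. Giving every
    composition the same weight is an assumption: it encodes having no
    ground for expecting one rather than another.
    """
    weight = Fraction(1, len(compositions))
    return sum(weight * Fraction(white, white + black)
               for white, black in compositions)

# Either 99 white and 1 black, or 1 white and 99 black; we are not told which.
p_white = chance_of_white([(99, 1), (1, 99)])
print(p_white)  # 1/2
```

Under that assumption the result is the same one-half that an equally divided box would give, although nothing has been supposed about the balls being equally divided in fact.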

But let it now be supposed that instead of two there are three colors—white, black, and red; and that we are entirely ignorant of the proportion in which they are mingled. We should then have no reason for expecting one more than another, and if obliged to bet, should venture our stake on red, white, or black with equal indifference. But should we be indifferent whether we betted for or against some one color, as, for instance, white? Surely not. From the very fact that black and red are each of them separately equally probable to us with white, the two together must be twice as probable. We should in this case expect not-white rather than white, and so much rather that we would lay two to one upon it. It is true, there might, for aught we knew, be more white balls than black and red together; and if so, our bet would, if we knew more, be seen to be a disadvantageous one. But so also, for aught we knew, might there be more red balls than black and white, or more black balls than white and red, and in such case the effect of additional knowledge would be to prove to us that our bet was more advantageous than we had supposed it to be. There is in the existing state of our knowledge a rational probability of two to one against white; a probability fit to be made a basis of conduct. No reasonable person would lay an even wager in favor of white against black and red; though against black alone or red alone he might do so without imprudence.
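The odds of two to one against white follow from counting the colors alone. A minimal sketch, in which the function name `odds_against` is hypothetical and every color is assumed to be held equally probable:

```python
from fractions import Fraction

def odds_against(color, colors):
    """Return the odds against `color` when all colors are held equally probable."""
    if color not in colors:
        raise ValueError(f"{color!r} is not among the possible colors")
    p = Fraction(1, len(colors))   # chance assigned to the chosen color
    return (1 - p) / p             # chance of all the rest, divided by that chance

print(odds_against("white", ["white", "black", "red"]))  # 2
print(odds_against("white", ["white", "black"]))         # 1
```

The first figure is the two to one against white when it is set against black and red together; the second is the even wager that may be laid against black alone.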

The common theory, therefore, of the calculation of chances, appears to be tenable. Even when we know nothing except the number of the possible and mutually excluding contingencies, and are entirely ignorant of their comparative frequency, we may have grounds, and grounds numerically appreciable, for acting on one supposition rather than on another; and this is the meaning of Probability.

§ 3. The principle, however, on which the reasoning proceeds, is sufficiently evident. It is the obvious one that when the cases which exist are shared among several kinds, it is impossible that each of those kinds should be a majority of the whole: on the contrary, there must be a majority against each kind, except one at most; and if any kind has more than its share in proportion to the total number, the others collectively must have less. Granting this axiom, and assuming that we have no ground for selecting any one kind as more likely than the rest to surpass the average proportion, it follows that we can not rationally presume this of any, which we should do if we were to bet in favor of it, receiving less odds than in the ratio of the number of the other kinds. Even, therefore, in this extreme case of the calculation of probabilities, which does not rest on special experience at all, the logical ground of the process is our knowledge—such knowledge as we then have—of the laws governing the frequency of occurrence of the different cases; but in this case the knowledge is limited to that which, being universal and axiomatic, does not require reference to specific experience, or to any considerations arising out of the special nature of the problem under discussion.

Except, however, in such cases as games of chance, where the very purpose in view requires ignorance instead of knowledge, I can conceive no case in which we ought to be satisfied with such an estimate of chances as this—an estimate founded on the absolute minimum of knowledge respecting the subject. It is plain that, in the case of the colored balls, a very slight ground of surmise that the white balls were really more numerous than either of the other colors, would suffice to vitiate the whole of the calculations made in our previous state of indifference. It would place us in that position of more advanced knowledge, in which the probabilities, to us, would be different from what they were before; and in estimating these new probabilities we should have to proceed on a totally different set of data, furnished no longer by mere counting of possible suppositions, but by specific knowledge of facts. Such data it should always be our endeavor to obtain; and in all inquiries, unless on subjects equally beyond the range of our means of knowledge and our practical uses, they may be obtained, if not good, at least better than none at all.[177]

It is obvious, too, that even when the probabilities are derived from observation and experiment, a very slight improvement in the data, by better observations, or by taking into fuller consideration the special circumstances of the case, is of more use than the most elaborate application of the calculus to probabilities founded on the data in their previous state of inferiority. The neglect of this obvious reflection has given rise to misapplications of the calculus of probabilities which have made it the real opprobrium of mathematics. It is sufficient to refer to the applications made of it to the credibility of witnesses, and to the correctness of the verdicts of juries. In regard to the first, common sense would dictate that it is impossible to strike a general average of the veracity and other qualifications for true testimony of mankind, or of any class of them; and even if it were possible, the employment of it for such a purpose implies a misapprehension of the use of averages, which serve, indeed, to protect those whose interest is at stake, against mistaking the general result of large masses of instances, but are of extremely small value as grounds of expectation in any one individual instance, unless the case be one of those in which the great majority of individual instances do not differ much from the average. In the case of a witness, persons of common sense would draw their conclusions from the degree of consistency of his statements, his conduct under cross-examination, and the relation of the case itself to his interests, his partialities, and his mental capacity, instead of applying so rude a standard (even if it were capable of being verified) as the ratio between the number of true and the number of erroneous statements which he may be supposed to make in the course of his life.

Again, on the subject of juries or other tribunals, some mathematicians have set out from the proposition that the judgment of any one judge or juryman is, at least in some small degree, more likely to be right than wrong, and have concluded that the chance of a number of persons concurring in a wrong verdict is diminished the more the number is increased; so that if the judges are only made sufficiently numerous, the correctness of the judgment may be reduced almost to certainty. I say nothing of the disregard shown to the effect produced on the moral position of the judges by multiplying their numbers, the virtual destruction of their individual responsibility, and weakening of the application of their minds to the subject. I remark only the fallacy of reasoning from a wide average to cases necessarily differing greatly from any average. It may be true that, taking all causes one with another, the opinion of any one of the judges would be oftener right than wrong; but the argument forgets that in all but the more simple cases, in all cases in which it is really of much consequence what the tribunal is, the proposition might probably be reversed; besides which, the cause of error, whether arising from the intricacy of the case or from some common prejudice or mental infirmity, if it acted upon one judge, would be extremely likely to affect all the others in the same manner, or at least a majority, and thus render a wrong instead of a right decision more probable the more the number was increased.
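The mathematicians' argument, and the reversal of it, can both be put in figures. The sketch below is an illustration only: it assumes that the judges decide independently and that each is right with the same chance `p`, which are the very suppositions here called in question, and the function name `majority_correct` and the values 0.6 and 0.4 are hypothetical.

```python
from math import comb

def majority_correct(n, p):
    """Chance that more than half of n judges are right.

    Assumes the judges err independently and that each is right with the
    same chance p; n is taken to be odd so that no tie can occur.
    """
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101):
    favorable = majority_correct(n, 0.6)    # each judge oftener right than wrong
    unfavorable = majority_correct(n, 0.4)  # the proposition reversed
    print(n, round(favorable, 3), round(unfavorable, 3))
```

When each judge is oftener right than wrong, the chance of a right majority rises with the number of judges; when the proposition is reversed, the same arithmetic makes a wrong decision more probable the more the number is increased. A cause of error common to all the judges is not represented here at all, since it contradicts the assumed independence.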

These are but samples of the errors frequently committed by men who, having made themselves familiar with the difficult formulæ which algebra affords for the estimation of chances under suppositions of a complex character, like better to employ those formulæ in computing what are the probabilities to a person half informed about a case than to look out for means of being better informed. Before applying the doctrine of chances to any scientific purpose, the foundation must be laid for an evaluation of the chances, by possessing ourselves of the utmost attainable amount of positive knowledge. The knowledge required is that of the comparative frequency with which the different events in fact occur. For the purposes, therefore, of the present work, it is allowable to suppose that conclusions respecting the probability of a fact of a particular kind rest on our knowledge of the proportion between the cases in which facts of that kind occur, and those in which they do not occur; this knowledge being either derived from specific experiment, or deduced from our knowledge of the causes in operation which tend to produce, compared with those which tend to prevent, the fact in question.

Such calculation of chances is grounded on an induction; and to render the calculation legitimate, the induction must be a valid one. It is not less an induction, though it does not prove that the event occurs in all cases of a given description, but only that out of a given number of such cases it occurs in about so many. The fraction which mathematicians use to designate the probability of an event is the ratio of these two numbers; the ascertained proportion between the number of cases in which the event occurs and the sum of all the cases, those in which it occurs and in which it does not occur, taken together. In playing at cross and pile, the description of cases concerned are throws, and the probability of cross is one-half, because if we throw often enough cross is thrown about once in every two throws. In the cast of a die, the probability of ace is one-sixth; not simply because there are six possible throws, of which ace is one, and because we do not know any reason why one should turn up rather than another—though I have admitted the validity of this ground in default of a better—but because we do actually know, either by reasoning or by experience, that in a hundred or a million of throws ace is thrown in about one-sixth of that number, or once in six times.
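The fraction described above, the number of cases in which the event occurs divided by the whole number of cases, may be illustrated by counting aces in a long run of throws. This is a sketch under a stated assumption: the die is simulated by a pseudo-random generator that treats all six faces alike, so it presupposes the very regularity which actual experience would have to establish. The function name `frequency_of_ace` and the fixed seed are hypothetical conveniences.

```python
import random

def frequency_of_ace(throws, seed=0):
    """Proportion of simulated throws showing ace.

    The die is an assumption: a seeded pseudo-random generator in which
    each of the six faces is equally likely.
    """
    rng = random.Random(seed)
    aces = sum(1 for _ in range(throws) if rng.randint(1, 6) == 1)
    return aces / throws

for throws in (100, 10_000, 100_000):
    print(throws, frequency_of_ace(throws))
```

In a short run the proportion may stand some way from one-sixth; in the longer runs it settles near that value, which is the sense in which ace is thrown about once in six times.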

§ 4. I say, “either by reasoning or by experience,” meaning specific experience. But in estimating probabilities, it is not a matter of indifference from which of these two sources we derive our assurance. The probability of events, as calculated from their mere frequency in past experience, affords a less secure basis for practical guidance than their probability as deduced from an equally accurate knowledge of the frequency of occurrence of their causes.