A System of Logic: Ratiocinative and Inductive, 7th Edition, Vol. II - John Stuart Mill

When a judgment applied to an individual instance is grounded on two approximate generalizations taken in conjunction, the propositions may co-operate towards the result in two different ways. In the one, each proposition is separately applicable to the case in hand, and our object in combining them is to give to the conclusion in that particular case the double probability arising from the two propositions separately. This may be called joining two probabilities by way of Addition; and the result is a probability greater than either. The other mode is, when only one of the propositions is directly applicable to the case, the second being only applicable to it by virtue of the application of the first. This is joining two probabilities by way of Ratiocination or Deduction; the result of which is a less probability than either. The type of the first argument is, Most A are B; most C are B; this thing is both an A and a C; therefore it is probably a B. The type of the second is, Most A are B; most C are A; this is a C; therefore it is probably an A, therefore it is probably a B. The first is exemplified when we prove a fact by the testimony of two unconnected witnesses; the second, when we adduce only the testimony of one witness that he has heard the thing asserted by another. Or again, in the first mode it may be argued that the accused committed the crime, because he concealed himself, and because his clothes were stained with blood; in the second, that he committed it because he washed or destroyed his clothes, which is supposed to render it probable that they were stained with blood. Instead of only two links, as in these instances, we may suppose chains of any length. A chain of the former kind was termed by Bentham [35] a self-corroborative chain of evidence; the second, a self-infirmative chain.

When approximate generalizations are joined by way of addition, we may deduce from the theory of probabilities laid down in a former chapter, in what manner each of them adds to the probability of a conclusion which has the warrant of them all.

In the early editions of this treatise, the joint probability arising from the sum of two independent probabilities was estimated in the following manner. If, on an average, two of every three As are Bs, and three of every four Cs are Bs, the probability that something which is both an A and a C is a B, will be more than two in three, or than three in four. Of every twelve things which are As, all except four are Bs by the supposition; and if the whole twelve, and consequently those four, have the characters of C likewise, three of these will be Bs on that ground. Therefore, out of twelve which are both As and Cs, eleven are Bs. To state the argument in another way; a thing which is both an A and a C, but which is not a B, is found in only one of three sections of the class A, and in only one of four sections of the class C; but this fourth of C being spread over the whole of A indiscriminately, only one-third part of it (or one-twelfth of the whole number) belongs to the third section of A; therefore a thing which is not a B occurs only once, among twelve things which are both As and Cs. The argument would in the language of the doctrine of chances, be thus expressed: the chance that an A is not a B is 1/3, the chance that a C is not a B is 1/4; hence if the thing be both an A and a C, the chance is 1/3 of 1/4 = 1/12.
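The early-edition reckoning described above can be followed step by step in a short Python sketch. The function name `early_edition_estimate` and the variable names are illustrative assumptions, not anything from the treatise. The code reproduces only the arithmetic of this paragraph, which the text goes on to revise.

```python
from fractions import Fraction

# Proportions taken from the text: two of every three As are Bs,
# and three of every four Cs are Bs.
p_b_given_a = Fraction(2, 3)
p_b_given_c = Fraction(3, 4)


def early_edition_estimate(p, q):
    """Multiply the two chances of NOT being a B, then subtract from one.

    This mirrors the early-edition argument only (hypothetical helper name).
    """
    chance_not_b = (1 - p) * (1 - q)  # 1/3 of 1/4 = 1/12
    return 1 - chance_not_b


estimate = early_edition_estimate(p_b_given_a, p_b_given_c)
print(estimate)  # 11/12
```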

It has, however, been pointed out to me by a mathematical friend, that in this statement the evaluation of the chances is erroneous. The correct mode of setting out the possibilities is as follows. If the thing (let us call it T) which is both an A and a C, is a B, something is true which is only true twice in every thrice, and something else which is only true thrice in every four times. The first fact being true eight times in twelve, and the second being true six times in every eight, and consequently six times in those eight; both facts will be true only six times in twelve. On the other hand if T, although it is both an A and a C, is not a B, something is true which is only true once in every thrice, and something else which is only true once in every four times. The former being true four times out of twelve, and the latter once in every four, and therefore once in those four; both are only true in one case out of twelve. So that T is a B six times in twelve, and T is not a B, only once: making the comparative probabilities, not eleven to one, as I had previously made them, but six to one.
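The corrected counting may be sketched in Python as follows. The helper names `corrected_odds` and `corrected_probability` are assumptions made for illustration. The sketch takes the two grounds to be independent, as the passage supposes, and compares the cases in which both point truly with those in which both point falsely.

```python
from fractions import Fraction


def corrected_odds(p, q):
    """Return (share of cases where T is a B, share where T is not a B).

    p and q are the proportions of As and of Cs that are Bs.
    Hypothetical helper, following the counting in the passage above.
    """
    both_hold = p * q              # 8/12 narrowed to 6/8 of those, i.e. 6/12
    both_fail = (1 - p) * (1 - q)  # 4/12 narrowed to 1/4 of those, i.e. 1/12
    return both_hold, both_fail


def corrected_probability(p, q):
    """Probability that T is a B, among the cases where T is both A and C."""
    both_hold, both_fail = corrected_odds(p, q)
    return both_hold / (both_hold + both_fail)


favourable, unfavourable = corrected_odds(Fraction(2, 3), Fraction(3, 4))
print(favourable / unfavourable)                            # 6  (six to one)
print(corrected_probability(Fraction(2, 3), Fraction(3, 4)))  # 6/7
```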

It may be asked what happens in the remaining cases, since in this calculation seven out of twelve cases seem to have exhausted the possibilities. If T is a B in only six cases of every twelve, and a not-B in only one, what is it in the other five? The only supposition remaining for those cases is that it is neither a B nor not a B, which is impossible. But this impossibility merely proves that the state of things supposed in the hypothesis does not exist in those cases. They are cases that do not furnish anything which is both an A and a C.

To make this intelligible, we will substitute for our symbols a concrete case. Let there be two witnesses, M and N, whose probabilities of veracity correspond with the ratios of the preceding example: M speaks truth twice in every thrice, N thrice in every four times. The question is, what is the probability that a statement, in which they both concur, will be true. The cases may be classed as follows. Both the witnesses will speak truly six in every twelve times; both falsely once in twelve times. Therefore, if they both agree in an assertion, it will be true six times, for once that it will be false. What happens in the remaining cases is here evident; there will be five cases in every twelve in which the witnesses will not agree. M will speak truth and N falsehood in two cases of every twelve; N will speak truth and M falsehood in three cases, making in all five. In these cases, however, the witnesses will not agree in their testimony. But disagreement between them is excluded by the supposition. There are, therefore, only seven cases which are within the conditions of the hypothesis; of which seven, veracity exists in six, and falsehood in one. Resuming our former symbols, in five cases out of twelve T is not both an A and a C, but an A only, or a C only. The cases in which it is both are only seven, in six of which it is a B, in one not a B, making the chance six to one, or 6/7 and 1/7 respectively.
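The classification of the twelve cases can be checked by enumeration. The following Python sketch is a minimal illustration under the stated supposition that the two witnesses speak truly or falsely independently of each other. The variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Veracity of the two witnesses, as given in the text.
m_truth = Fraction(2, 3)
n_truth = Fraction(3, 4)

# Number of cases in every twelve for each combination
# (M speaks truly?, N speaks truly?).
cases_per_twelve = {}
for m_true, n_true in product([True, False], repeat=2):
    share_m = m_truth if m_true else 1 - m_truth
    share_n = n_truth if n_true else 1 - n_truth
    cases_per_twelve[(m_true, n_true)] = share_m * share_n * 12

both_true = cases_per_twelve[(True, True)]     # 6
both_false = cases_per_twelve[(False, False)]  # 1
disagree = cases_per_twelve[(True, False)] + cases_per_twelve[(False, True)]  # 2 + 3

# Only the cases of agreement fall within the hypothesis.
agree = both_true + both_false
chance_true = both_true / agree

print(both_true, both_false, disagree)  # 6 1 5
print(chance_true)                      # 6/7
```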

In this correct, as in the former incorrect computation, it is of course presupposed that the probabilities arising from A and C are independent of each other. There must not be any such connexion between A and C, that when a thing belongs to the one class it will therefore belong to the other, or even have a greater chance of doing so. Otherwise the not-Bs which are Cs may be, most or even all of them, identical with the not-Bs which are As; in which last case the probability arising from A and C together will be no greater than that arising from A alone.

When approximate generalizations are joined together in the other mode, that of deduction, the degree of probability of the inference, instead of increasing, diminishes at each step. From two such premises as Most A are B, Most B are C, we cannot with certainty conclude that even a single A is C; for the whole of the portion of A which in any way falls under B, may perhaps be comprised in the exceptional part of it. Still, the two propositions in question afford an appreciable probability that any given A is C, provided the average on which the second proposition is grounded, was taken fairly with reference to the first; provided the proposition, Most B are C, was arrived at in a manner leaving no suspicion that the probability arising from it is otherwise than fairly distributed over the section of B which belongs to A. For though the instances which are A may be all in the minority, they may, also, be all in the majority; and the one possibility is to be set against the other. On the whole, the probability arising from the two propositions taken together, will be correctly measured by the probability arising from the one, abated in the ratio of that arising from the other. If nine out of ten Swedes have light hair, and eight out of nine inhabitants of Stockholm are Swedes, the probability arising from these two propositions, that any given inhabitant of Stockholm is light-haired, will amount to eight in ten; though it is rigorously possible that the whole Swedish population of Stockholm might belong to that tenth section of the people of Sweden who are an exception to the rest.
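The Stockholm example can be set out numerically, with the two extreme possibilities placed beside the fair estimate. This Python sketch is an illustration only. It counts light-haired Swedes among the inhabitants, since the premises say nothing of the hair of the others, and it assumes the Swedes of Stockholm are few enough to fall entirely within the exceptional tenth in the worst case. The variable names are hypothetical.

```python
from fractions import Fraction

# Premises from the text.
swedes_light_haired = Fraction(9, 10)   # nine out of ten Swedes
stockholmers_swedish = Fraction(8, 9)   # eight out of nine inhabitants

# Extreme possibilities the premises leave open.
worst_case = Fraction(0)                # every Stockholm Swede is an exception
best_case = stockholmers_swedish * 1    # every Stockholm Swede is light-haired

# Fair distribution: one probability abated in the ratio of the other.
fair_case = stockholmers_swedish * swedes_light_haired

print(worst_case, fair_case, best_case)  # 0 4/5 8/9
```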

If the premises are known to be true not of a bare majority, but of nearly the whole, of their respective subjects, we may go on joining one such proposition to another for several steps, before we reach a conclusion not presumably true even of a majority. The error of the conclusion will amount to the aggregate of the errors of all the premises. Let the proposition, Most A are B, be true of nine in ten; Most B are C, of eight in nine: then not only will one A in ten not be C, because not B, but even of the nine-tenths which are B, only eight-ninths will be C: that is, the cases of A which are C will be only 8/9 of 9/10, or four-fifths. Let us now add Most C are D, and suppose this to be true of seven cases out of eight; the proportion of A which is D will be only 7/8 of 8/9 of 9/10, or 7/10. Thus the probability progressively dwindles. The experience, however, on which our approximate generalizations are grounded, has so rarely been subjected to, or admits of, accurate numerical estimation, that we cannot in general apply any measurement to the diminution of probability which takes place at each illation; but must be content with remembering that it does diminish at every step, and that unless the premises approach very nearly indeed to being universally true, the conclusion after a very few steps is worth nothing. A hearsay of a hearsay, or an argument from presumptive evidence depending not on immediate marks but on marks of marks, is worthless at a very few removes from the first stage.
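The progressive dwindling can be traced by multiplying the links one after another. The Python sketch below uses the three proportions from the text. The helper `links_until_below` and its uniform nine-in-ten chain are assumptions added for illustration and are not taken from the treatise.

```python
from fractions import Fraction
from itertools import accumulate
from operator import mul

# Most A are B (9/10), Most B are C (8/9), Most C are D (7/8).
links = [Fraction(9, 10), Fraction(8, 9), Fraction(7, 8)]

# Probability that an A reaches each successive term of the chain.
running = list(accumulate(links, mul))
print([str(r) for r in running])  # ['9/10', '4/5', '7/10']


def links_until_below(p, threshold):
    """Count equal links of strength p needed to fall below threshold.

    Hypothetical helper; assumes 0 < p < 1 so that the loop terminates.
    """
    probability = Fraction(1)
    count = 0
    while probability >= threshold:
        probability *= p
        count += 1
    return count


# Illustrative assumption: every link holds in nine cases of ten.
print(links_until_below(Fraction(9, 10), Fraction(1, 2)))  # 7
```

On that supposition the conclusion ceases to be true of even a majority at the seventh remove.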

[§ 7.] There are, however, two cases in which reasonings depending on approximate generalizations may be carried to any length we please with as much assurance, and are as strictly scientific, as if they were composed of universal laws of nature. But these cases are exceptions of the sort which are currently said to prove the rule. The approximate generalizations are as suitable, in the cases in question, for purposes of ratiocination, as if they were complete generalizations, because they are capable of being transformed into complete generalizations exactly equivalent.