A System of Logic, Ratiocinative and Inductive - John Stuart Mill

There is some difference, however, in the degree of certainty of the proposition, Most A are B, according as that approximate generalization composes the whole of our knowledge of the subject, or not. Suppose, first, that the former is the case. We know only that most A are B, not why they are so, nor in what respect those which are differ from those which are not. How, then, did we learn that most A are B? Precisely in the manner in which we should have learned, had such happened to be the fact that all A are B. We collected a number of instances sufficient to eliminate chance, and, having done so, compared the number of instances in the affirmative with the number in the negative. The result, like other unresolved derivative laws, can be relied on solely within the limits not only of place and time, but also of circumstance, under which its truth has been actually observed; for, as we are supposed to be ignorant of the causes which make the proposition true, we can not tell in what manner any new circumstance might perhaps affect it. The proposition, Most judges are inaccessible to bribes, would probably be found true of Englishmen, Frenchmen, Germans, North Americans, and so forth; but if on this evidence alone we extended the assertion to Orientals, we should step beyond the limits, not only of place but of circumstance, within which the fact had been observed, and should let in possibilities of the absence of the determining causes, or the presence of counteracting ones, which might be fatal to the approximate generalization.

In the case where the approximate proposition is not the ultimatum of our scientific knowledge, but only the most available form of it for practical guidance; where we know, not only that most A have the attribute B, but also the causes of B, or some properties by which the portion of A which has that attribute is distinguished from the portion which has it not, we are rather more favorably situated than in the preceding case. For we have now a double mode of ascertaining whether it be true that most A are B; the direct mode, as before, and an indirect one, that of examining whether the proposition admits of being deduced from the known cause, or from any known criterion, of B. Let the question, for example, be whether most Scotchmen can read? We may not have observed, or received the testimony of others respecting, a sufficient number and variety of Scotchmen to ascertain this fact; but when we consider that the cause of being able to read is the having been taught it, another mode of determining the question presents itself, namely, by inquiring whether most Scotchmen have been sent to schools where reading is effectually taught. Of these two modes, sometimes one and sometimes the other is the more available. In some cases, the frequency of the effect is the more accessible to that extensive and varied observation which is indispensable to the establishment of an empirical law; at other times, the frequency of the causes, or of some collateral indications. It commonly happens that neither is susceptible of so satisfactory an induction as could be desired, and that the grounds on which the conclusion is received are compounded of both. 
Thus a person may believe that most Scotchmen can read, because, so far as his information extends, most Scotchmen have been sent to school, and most Scotch schools teach reading effectually; and also because most of the Scotchmen whom he has known or heard of could read; though neither of these two sets of observations may by itself fulfill the necessary conditions of extent and variety.

Although the approximate generalization may in most cases be indispensable for our guidance, even when we know the cause, or some certain mark, of the attribute predicated, it needs hardly be observed that we may always replace the uncertain indication by a certain one, in any case in which we can actually recognize the existence of the cause or mark. For example, an assertion is made by a witness, and the question is whether to believe it. If we do not look to any of the individual circumstances of the case, we have nothing to direct us but the approximate generalization, that truth is more common than falsehood, or, in other words, that most persons, on most occasions, speak truth. But if we consider in what circumstances the cases where truth is spoken differ from those in which it is not, we find, for instance, the following: the witness’s being an honest person or not; his being an accurate observer or not; his having an interest to serve in the matter or not. Now, not only may we be able to obtain other approximate generalizations respecting the degree of frequency of these various possibilities, but we may know which of them is positively realized in the individual case. That the witness has or has not an interest to serve, we perhaps know directly; and the other two points indirectly, by means of marks; as, for example, from his conduct on some former occasion; or from his reputation, which, though a very uncertain mark, affords an approximate generalization (as, for instance, Most persons who are believed to be honest by those with whom they have had frequent dealings, are really so), which approaches nearer to a universal truth than the approximate general proposition with which we set out, viz., Most persons on most occasions speak truth.

As it seems unnecessary to dwell further on the question of the evidence of approximate generalizations, we shall proceed to a not less important topic, that of the cautions to be observed in arguing from these incompletely universal propositions to particular cases.

§ 5. So far as regards the direct application of an approximate generalization to an individual instance, this question presents no difficulty. If the proposition, Most A are B, has been established, by a sufficient induction, as an empirical law, we may conclude that any particular A is B with a probability proportioned to the preponderance of the number of affirmative instances over the number of exceptions. If it has been found practicable to attain numerical precision in the data, a corresponding degree of precision may be given to the evaluation of the chances of error in the conclusion. If it can be established as an empirical law that nine out of every ten A are B, there will be one chance in ten of error in assuming that any A, not individually known to us, is a B: but this of course holds only within the limits of time, place, and circumstance, embraced in the observations, and therefore can not be counted on for any sub-class or variety of A (or for A in any set of external circumstances) which were not included in the average. It must be added, that we can guide ourselves by the proposition, Nine out of every ten A are B, only in cases of which we know nothing except that they fall within the class A. For if we know, of any particular instance i, not only that it falls under A, but to what species or variety of A it belongs, we shall generally err in applying to i the average struck for the whole genus, from which the average corresponding to that species alone would, in all probability, materially differ. And so if i, instead of being a particular sort of instance, is an instance known to be under the influence of a particular set of circumstances, the presumption drawn from the numerical proportions in the whole genus would probably, in such a case, only mislead. A general average should only be applied to cases which are neither known, nor can be presumed, to be other than average cases.
Such averages, therefore, are commonly of little use for the practical guidance of any affairs but those which concern large numbers. Tables of the chances of life are useful to insurance offices, but they go a very little way toward informing any one of the chances of his own life, or any other life in which he is interested, since almost every life is either better or worse than the average. Such averages can only be considered as supplying the first term in a series of approximations; the subsequent terms proceeding on an appreciation of the circumstances belonging to the particular case.
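
The caution respecting averages may be illustrated by a small numerical sketch. The counts and the names species_1 and species_2 below are invented for the purpose and are no part of the original argument; they are chosen only so that the whole genus yields nine in ten, while the species taken separately yield different proportions.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only:
# for each species of A, (number found to be B, number observed).
species_counts = {
    "species_1": (58, 60),
    "species_2": (32, 40),
}

# Average struck for the whole genus A.
genus_b = sum(b for b, _ in species_counts.values())
genus_total = sum(total for _, total in species_counts.values())
genus_average = Fraction(genus_b, genus_total)

# Chance of error in assuming that an A, of which nothing else is known, is a B.
chance_of_error = 1 - genus_average

# Average belonging to each species taken alone.
species_average = {
    name: Fraction(b, total) for name, (b, total) in species_counts.items()
}

print("genus:", genus_average, "chance of error:", chance_of_error)
for name, average in species_average.items():
    print(name, average)
```

On these supposed figures the genus gives nine in ten, but an instance known to belong to species_2 would be B in only four cases out of five, so that the general average, applied to it, would mislead.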

§ 6. From the application of a single approximate generalization to individual cases, we proceed to the application of two or more of them together to the same case.

When a judgment applied to an individual instance is grounded on two approximate generalizations taken in conjunction, the propositions may cooperate toward the result in two different ways. In the one, each proposition is separately applicable to the case in hand, and our object in combining them is to give to the conclusion in that particular case the double probability arising from the two propositions separately. This may be called joining two probabilities by way of Addition; and the result is a probability greater than either. The other mode is, when only one of the propositions is directly applicable to the case, the second being only applicable to it by virtue of the application of the first. This is joining two probabilities by way of Ratiocination or Deduction; the result of which is a less probability than either. The type of the first argument is, Most A are B; most C are B; this thing is both an A and a C; therefore it is probably a B. The type of the second is, Most A are B; most C are A; this is a C; therefore it is probably an A, therefore it is probably a B. The first is exemplified when we prove a fact by the testimony of two unconnected witnesses; the second, when we adduce only the testimony of one witness that he has heard the thing asserted by another. Or again, in the first mode it may be argued that the accused committed the crime, because he concealed himself, and because his clothes were stained with blood; in the second, that he committed it because he washed or destroyed his clothes, which is supposed to render it probable that they were stained with blood. Instead of only two links, as in these instances, we may suppose chains of any length. A chain of the former kind was termed by Bentham[195] a self-corroborative chain of evidence; the second, a self-infirmative chain.
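
Why joining by way of Ratiocination yields a less probability than either premise may be seen from a short sketch. It proceeds on an assumption not stated in the passage above: that the conclusion is reached only through the intermediate link, so that the chances of the successive links are multiplied together. The function name chain_probability and the two fractions are hypothetical, chosen merely for illustration.

```python
from fractions import Fraction

def chain_probability(*links):
    """Multiply the probabilities of the successive links of a chain.

    Assumption of this sketch: the conclusion is counted only when every
    link holds, so the chances of the links are multiplied together.
    """
    result = Fraction(1)
    for probability in links:
        result *= probability
    return result

# Hypothetical figures: most C are A (8 in 9); most A are B (9 in 10).
most_c_are_a = Fraction(8, 9)
most_a_are_b = Fraction(9, 10)

this_c_is_b = chain_probability(most_c_are_a, most_a_are_b)
print(this_c_is_b)

# The chain is weaker than its weakest link.
print(this_c_is_b < min(most_c_are_a, most_a_are_b))
```

Since every link is a fraction less than unity, each additional link can only lower the product, which agrees with the description of such a chain as self-infirmative.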

When approximate generalizations are joined by way of addition, we may deduce from the theory of probabilities laid down in a former chapter, in what manner each of them adds to the probability of a conclusion which has the warrant of them all.

If, on an average, two of every three As are Bs, and three of every four Cs are Bs, the probability that something which is both an A and a C is a B, will be more than two in three, or than three in four. Of every twelve things which are As, all except four are Bs by the supposition; and if the whole twelve, and consequently those four, have the characters of C likewise, three of these will be Bs on that ground. Therefore, out of twelve which are both As and Cs, eleven are Bs. To state the argument in another way; a thing which is both an A and a C, but which is not a B, is found in only one of three sections of the class A, and in only one of four sections of the class C; but this fourth of C being spread over the whole of A indiscriminately, only one-third part of it (or one-twelfth of the whole number) belongs to the third section of A; therefore a thing which is not a B occurs only once, among twelve things which are both As and Cs. The argument would, in the language of the doctrine of chances, be thus expressed: the chance that an A is not a B is ⅓, the chance that a C is not a B is ¼; hence if the thing be both an A and a C, the chance is ⅓ of ¼ = ¹⁄₁₂.[196]
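
The arithmetic of this paragraph can be checked mechanically. The sketch below merely reproduces the computation as stated in the text, on its supposition that the grounds are independent; the function name combine_by_addition is an invented label, and nothing here is offered as a general treatment of the combination of evidence.

```python
from fractions import Fraction

def combine_by_addition(*probabilities):
    """Combine separately applicable grounds as the text does.

    The chance that the thing is not a B is taken as the product of the
    chances of its not being a B on each ground separately; this supposes
    the grounds to be independent of each other.
    """
    chance_not_b = Fraction(1)
    for probability in probabilities:
        chance_not_b *= 1 - probability
    return 1 - chance_not_b

two_in_three_a_are_b = Fraction(2, 3)
three_in_four_c_are_b = Fraction(3, 4)

both_a_and_c = combine_by_addition(two_in_three_a_are_b, three_in_four_c_are_b)
print(both_a_and_c)      # 11/12
print(1 - both_a_and_c)  # 1/12
```

The result exceeds both two in three and three in four, and every further independent ground supplied to the function raises it again.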

In this computation it is of course supposed that the probabilities arising from A and C are independent of each other. There must not be any such connection between A and C, that when a thing belongs to the one class it will therefore belong to the other, or even have a greater chance of doing so. Otherwise the not-Bs which are Cs may be, most or even all of them, identical with the not-Bs which are As; in which last case the probability arising from A and C together will be no greater than that arising from A alone.
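
The effect of such a connection may be shown by counting. The sketch follows the text's own way of reckoning, in which a thing that is both an A and a C fails to be a B only when it lies in the exceptional section on both grounds. The nine numbered things, the equal proportions of one-third, and the function name chance_of_b are all hypothetical, chosen so that the two sets of exceptions can coincide exactly.

```python
from fractions import Fraction

def chance_of_b(total, exceptions_by_a, exceptions_by_c):
    """Proportion of things that are B, reckoned as in the text.

    A thing is counted as not a B only if it falls among the exceptions
    on both grounds, that is, in the intersection of the two sets.
    """
    not_b = exceptions_by_a & exceptions_by_c
    return 1 - Fraction(len(not_b), total)

total = 9                   # nine things, each both an A and a C (hypothetical)
exceptions_by_a = {0, 1, 2}  # one-third are not B, regarded as As

# Exceptions regarded as Cs, spread over the whole indiscriminately.
spread_exceptions_by_c = {0, 3, 6}
# Exceptions regarded as Cs, identical with those regarded as As.
coincident_exceptions_by_c = {0, 1, 2}

independent_case = chance_of_b(total, exceptions_by_a, spread_exceptions_by_c)
dependent_case = chance_of_b(total, exceptions_by_a, coincident_exceptions_by_c)
from_a_alone = 1 - Fraction(len(exceptions_by_a), total)

print(independent_case, dependent_case, from_a_alone)
```

When the exceptions are spread indiscriminately the two grounds together give eight in nine; when they coincide, the two together give only two in three, which is no more than A gives alone.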