Encyclopaedia Britannica, 11th Edition, 'Evangelical Church Conference' to 'Fairbairn, Sir William' / Volume 10, Slice 1 - Various

In the award of scholarships, &c., it should be definitely decided whether the scholarship is to be awarded (1) for attainment, in which case the examination-test pure and simple may suffice, or (2) for promise, in which case personal information and a curriculum vitae are necessary. To take a simple instance: a candidate partly educated in Germany may obtain more marks in German at a scholarship examination than another who is more gifted, but whose opportunities have been less; the question at once arises, are the examiners to take the circumstances of the candidate into account or not? It is understood that at the colleges of the older universities such circumstances are considered. It must again be decided whether the financial circumstances of candidates are to be taken into account; are scholarships intended as prizes, or as a means of enabling poor students to obtain a university education? In some cases wealthy students have been known to return the emoluments of scholarships. In many universities of the United States there is a definite understanding that emoluments shall only be accepted by those needing them. It would not be difficult to ask candidates to make a confidential declaration on this subject on entrance and to establish in Great Britain a tradition similar to that of the United States, and steps in this direction have been taken both at Oxford and Cambridge (Lord Curzon of Kedleston, University Reform, p. 86).

A special allowance may be made for age. In certain scholarship examinations held formerly by the London County Council a percentage was added to the marks of each candidate proportionate to the number of months by which his age fell short of the maximum age for entry. The whole subject of entrance scholarships at English schools and universities, and especially their tendency to produce premature specialization, has recently been much discussed.
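
The age allowance described above amounts to a simple proportional adjustment. The following is a minimal sketch only: the rate of 0.5% per month is an assumed illustrative figure (the percentage actually used is not stated here), and the names `age_adjusted_mark`, `max_age_months` and `percent_per_month` are hypothetical.

```python
def age_adjusted_mark(raw_mark, age_months, max_age_months, percent_per_month=0.5):
    """Add to raw_mark a percentage proportional to the months below the maximum age.

    percent_per_month is an assumed illustrative rate, not a figure from the text.
    """
    if age_months > max_age_months:
        raise ValueError("candidate exceeds the maximum age for entry")
    # Number of months by which the candidate's age falls short of the maximum.
    months_short = max_age_months - age_months
    # Percentage added to the raw mark, proportional to that shortfall.
    bonus_percent = months_short * percent_per_month
    return raw_mark * (1 + bonus_percent / 100)


# A candidate six months under the maximum age, with 200 raw marks,
# receives a 3% addition under the assumed rate.
print(age_adjusted_mark(200, age_months=150, max_age_months=156))
```

A candidate at the maximum age receives no addition, so the adjustment favours only the younger entrants.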

6. The Organization and Conduct of Examinations.—The organization and conduct of examinations, in such a way that each candidate shall be treated in precisely the same way as every other candidate, is a complex matter, especially where several thousand candidates are concerned. The greatest precautions must be taken to ensure the secrecy of the examination papers before the examination, and the effective isolation of individual candidates during the examination. The supervision should be adequate to remove all temptation to copying. The hygienic conditions should be such as to reduce the strain to a minimum. The question of the mental fatigue produced by examinations has been studied by certain German observers, but has not yet been fully investigated.

7. Marking, Classification and Errors of Detail.—In applying a single test in a qualifying examination it would be sufficient to mark candidates as passing or failing. But examinations consist as a rule of a number of tests, each one of which is complex; and a mark is recorded in respect of each test or portion of a test in order to enable the examining body to estimate the performance, considered as a whole, of the candidate. At Oxford the marks are not numerical, but the papers are judged as of this or that supposed “class,” and various degrees of merit are indicated by the symbols α, β, γ, δ, to which the signs + or − may be prefixed, according as they are above or below a certain standard within each class. At Cambridge, numerical marks are used. The advantage of numerical marks is that they are more easily manipulated than symbols; the disadvantage, that they produce the false impression that merit can be estimated with mathematical accuracy. Professor F.Y. Edgeworth, in two papers on “The Statistics of Examinations” and the “Element of Chance in Competitive Examinations” (Journal of the Royal Statistical Society, 1888 and 1890), has dealt with the subject, although on somewhat limited lines. His investigations show clearly that with candidates near the border-line of failure, which must necessarily be fixed at a given point (subject to certain allowances, where more than one subject is considered), the element of chance necessarily enters largely into the question of pass and failure. The fact may be stated in this way:—the general efficiency of the test being granted, it is true to say that the large majority of those who pass an examination will be superior in efficiency to those who fail; but a few of those who fail may be superior to a few of those who pass. These errors are not peculiar to the examination system, they are inherent in all human judgments. It is necessary to allow for them in considering the failure of an individual candidate as an index of inefficiency.
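
The part played by chance near the border-line may be illustrated numerically. The sketch below is not a reproduction of Edgeworth's method; it assumes, purely for illustration, that an observed mark equals the candidate's true merit plus a normally distributed error, and the pass mark of 50 and error of 5 marks are invented figures.

```python
import math


def pass_probability(true_merit, pass_mark, error_sd):
    """Chance that the observed mark reaches pass_mark.

    Assumes observed mark = true_merit + normal error with standard
    deviation error_sd (an illustrative model, not taken from the text).
    """
    z = (true_merit - pass_mark) / error_sd
    # Cumulative normal distribution expressed through the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Candidates far from the border-line are classified almost with certainty;
# those close to it pass or fail largely by chance.
for merit in (40, 48, 50, 52, 60):
    print(merit, round(pass_probability(merit, pass_mark=50, error_sd=5), 3))
```

Under these assumptions a candidate whose true merit lies two marks below the line still passes fairly often, and one two marks above it fails about as often, which is the sense in which a few of those who fail may be superior to a few of those who pass.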

The element of chance, which prevails in the region on either side of the border between pass and failure, obviously prevails equally on either side of the border between “classes,” where candidates are classified; it has been suggested by Dr Schuster that numerical order should accompany classification so as to avoid the creation of an artificial gap between the last candidate in one class and the highest in the next. Edgeworth’s objection to such an argument is that the number of uncertainties is far less when candidates are classed than when they are placed in ostensible order of merit.

The difficulties of comparison of marks are further complicated when students take different subjects and it is necessary to compare their merit by means of marks allotted by different examiners and added together. In a pass examination the question has to be considered how far, if at all, excellence in one subject shall compensate for deficiency in another, a question which is indeterminate until the precise object of the whole examination is formulated. In the competitive examination for the Indian civil service, places are allotted on the aggregate of marks obtained in a number of subjects selected by the candidate from a list of thirty-two. The successful candidates are compared a year later on the results of another examination in which there is again a choice, though a much more limited one. The order of merit in the two examinations is, as a rule, very different.
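
How far two orders of merit differ may be expressed by a rank correlation. The sketch below uses Spearman's coefficient, a measure not mentioned in the text and introduced here only as one conventional choice; the function name and the candidates are hypothetical.

```python
def spearman_rho(first_order, second_order):
    """Spearman rank correlation between two orders of merit.

    Each argument lists the same distinct candidates, best first (no ties).
    Returns 1.0 for identical orders and -1.0 for exactly reversed orders.
    """
    if sorted(first_order) != sorted(second_order):
        raise ValueError("both orders must contain the same candidates")
    n = len(first_order)
    if n < 2:
        raise ValueError("at least two candidates are required")
    # Position of each candidate in the second examination.
    second_rank = {candidate: i for i, candidate in enumerate(second_order)}
    # Sum of squared differences between the two positions of each candidate.
    d_squared = sum((i - second_rank[c]) ** 2 for i, c in enumerate(first_order))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Invented example: four candidates placed in two successive examinations.
print(spearman_rho(["A", "B", "C", "D"], ["B", "A", "D", "C"]))
```

A value near 1 would indicate that the second examination largely confirms the first; a value near 0 would correspond to orders of merit that are "very different."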

Two further points may be noted. An examiner may have underestimated the time required to answer the questions which he has set; this will be obvious if with a large number of candidates (say 300 or 400) none approaches the maximum mark. In this case the maximum should be reduced. Again, it is generally recognized to be undesirable to give marks for a smattering. In order to avoid this various devices are adopted. The simplest is to award a proportion of marks (say 10 to 15, or even 20%) for “general impression.” In some examinations, unless say 20% or more marks are obtained for a particular subject, no credit is given for the paper in that subject. Latham (The Action of Examinations, 1877, p. 490) describes other numerical adjustments used to meet this difficulty, especially that used in English civil service examinations. The numerical results of the civil service examinations are reduced so as to conform to a certain symmetrical “frequency-curve,” of which the abscissae represent percentages of marks between definite limits and the ordinates the number of candidates obtaining marks between those limits. C.E. Fawsitt (The Education of the Examiner, Royal Philosophical Society of Glasgow, 1905) shows that frequency-curves deduced from actual investigation of class-marks are not symmetrical, but have two maxima corresponding to the performance of “non-workers” and of “workers.” In pass examinations of a well-known character there is a maximum just beyond the pass mark, this being the point of efficiency at which many students aim.
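
Two of the devices mentioned above lend themselves to a short illustration: the rule that a subject earns no credit unless a minimum proportion of its marks is obtained, and the tabulation of results into a frequency-curve. The following is a sketch under stated assumptions, not the procedure of any actual examining body; the function names, subjects and marks are invented, and the 20% threshold is the figure given above by way of example.

```python
def aggregate_with_threshold(marks, maxima, threshold=0.20):
    """Sum subject marks, giving no credit for a subject scored below
    threshold * maximum (the rule against rewarding a smattering)."""
    total = 0
    for subject, mark in marks.items():
        if mark >= threshold * maxima[subject]:
            total += mark
    return total


def frequency_table(percentages, band_width=10):
    """Count candidates in successive percentage bands.

    The bands are the abscissae of the frequency-curve and the counts its
    ordinates. band_width is assumed to divide 100 exactly; a mark of
    exactly 100 is placed in the highest band.
    """
    n_bands = 100 // band_width
    counts = [0] * n_bands
    for p in percentages:
        if not 0 <= p <= 100:
            raise ValueError("percentages must lie between 0 and 100")
        band = min(int(p // band_width), n_bands - 1)
        counts[band] += 1
    return counts


# Invented candidate: Greek falls below 20% of its maximum and earns nothing.
marks = {"Latin": 30, "Greek": 10, "Mathematics": 80}
maxima = {"Latin": 100, "Greek": 100, "Mathematics": 100}
print(aggregate_with_threshold(marks, maxima))

# Invented class-marks with one cluster of low marks and one of high marks.
print(frequency_table([12, 15, 18, 22, 61, 64, 66, 68, 72, 100]))
```

A table with two separate peaks, as in the invented class-marks, is the shape Fawsitt reports for "non-workers" and "workers"; whether real results show it can only be settled by tabulating them.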

8. The Object and Efficiency of Examinations, and their Indirect Effects.—In order to estimate the efficiency of an examination as a test, the precise question should be asked in each case—what is it intended to test? Much of the evil attributed to, and resulting from, examinations is due to the fact that this question has not been definitely put, and that a test legitimate for certain purposes has been used for others to which it is unsuited. Examinations are suited in the first instance for the purpose for which they were originally designed in medieval universities—the test of technical and professional capacity; it has never been proposed to abolish qualifying examinations for doctors, pharmaceutical chemists, &c.; the tests applied are (or should be) direct tests of capacity carried out under conditions as nearly as possible like those of actual practice. If a student can auscultate correctly, or make up a prescription, at an examination, he will in all probability be able to do so in other circumstances.

Examinations as tests of the knowledge of isolated facts are necessarily of relatively small value, because the memory of such facts is transient; and memorization of a large number of facts for examination purposes is generally admitted to be specially transient; the “knowledge-test,” considered apart from a test of capacity, is in fact not a test of permanent knowledge, but of the power of retaining facts for a length of time which it is impossible to estimate and which with some candidates extends over a few weeks only. When used as tests of “general culture,” examinations, in the view of Paulsen, based on a study of German education, not only fail in their purpose, but tend to destroy the faculties which it is desired to develop (Geschichte des gelehrten Unterrichts, ii. 684 et seq.); to prepare ready answers to the numberless questions which an examiner may ask on a large variety of subjects is to paralyse the natural and free activity of the mind (cf. A.C. Benson on the results of English secondary classical education, From a College Window, 3rd ed., 1906, pp. 154-177). If pushed to its logical conclusion the view of Paulsen must, it is submitted, lead to the complete abandonment at examinations of tests of “knowledge” as distinguished from direct tests of capacity. Thus isolated questions on details of grammar would disappear from papers on the mother-tongue and on foreign languages, in which the test would consist mainly or entirely of composition and translation. Erudition would be tested by the power of writing, at leisure, a dissertation on some subject selected by the examiners or the candidate or, in the case of a teacher, by the delivery of a lecture on the subject. At the French agrégation candidates are given twenty-four hours for the preparation of a lecture of this kind. Such examinations would test the “skill in the manipulation of facts which is the true sign of a trained intelligence” (cf. K. Pearson, “The Function of Science in the Modern State,” Ency. Brit. 10th ed. xxxii. Prefatory essay). They might possibly be supplemented by easy oral examinations to test both range of knowledge and readiness of mind. But in the case of a pupil who had passed through a good secondary school it would be as safe to rely for supplementary information under this head on the testimony of his teachers, as it is to rely on their evidence with regard to the fundamental and all-important element on which no examination supplies direct information—personal character.

The main arguments of those opposed to the examination system may be summarized as follows: (i.) Examinations tend to destroy natural interests and exclude from the attention of the pupil all matters outside the purview of the examination (they would not do so if examinations were so limited in character that preparation therefor could absorb only a fraction of the pupil’s time); (ii.) they tend to cultivate a personal judgment where no personal basis of judgment is possible (this argument, directed mainly against the Oxford essay system, applies not to examinations in general, but to the character of the subjects set for essays); (iii.) competitive examinations on the home and Indian civil services scheme tend to diffuse mental energy over too many subjects (but see (xviii.) below); (iv.) examinations, especially competitive examinations, tend to become more and more difficult, difficulty being confused with efficiency—this has shown itself with the Cambridge mathematical tripos, in which for years questions of increasing difficulty were set on relatively unimportant subjects, until the examination was reformed (reply: all examinations should be overhauled periodically); (v.) they tend to paralyse the powers of exposition, all statements of knowledge being thrown into a form suitable, not for an uninstructed person, but for one who already possesses it, the examiner (this tendency should be counteracted by definite training in composition); (vi.) the sample of knowledge and capacity yielded at an examination is frequently not a fair sample; it is liable to extreme variations in a favourable sense, if the candidate happens to have prepared the precise questions asked; in an unfavourable sense, if the candidate is suffering from misfortune or from accidental ill-health, the latter, owing to the periodic function, occurring much more frequently in the case of women than of men—[the reform of examination methods may remove to a great extent the element of chance in questions set; in a competitive examination it is impossible to allow for ill-health; in a qualifying examination it is difficult to make any allowance unless the examination is definitely conducted in whole or in part by the teachers, and the past record of the candidate is taken into account (cf. Paulsen, The German Universities, pp. 344-345)]; (vii.) examinations of several hundred candidates at a time cannot be rationally conducted so as to be equally fair to the individuality of all candidates; the individual test is the only complete one (it is admitted that examinations on a large scale necessarily involve a margin of error; but this error may be reduced to a minimum, especially by a combination of oral and practical with written work); (viii.) the multiplicity of school examinations required for different reasons produces confusion in our secondary education (there is a growing tendency to admit equivalence of “school-leaving” and entrance examinations; thus entrance examinations of Oxford, Cambridge and London, and the Northern Universities Joint Board are interchangeable under certain conditions); (ix.) the multiplicity of examinations tends to “underselling” (the success of the London examinations in medicine proves that a high standard attracts candidates as well as a low one; possibly intermediate standards may be killed in the competition; it is by no means obvious that a uniform system of examinations would conduce to efficiency); (x.) examinations produce physical damage to health, especially in the case of women-students (on this point more statistical evidence is needed; see, however, Engelmann quoted by G. Stanley Hall, Adolescence, 1905, ii. 588 et seq.); (xi.) examinations have in England mechanically cast the education of women into the same mould as that of men, without reference to the different social functions of the two sexes (the remedy is obvious); (xii.) it is unjustifiable to give a man a university position on the results of his performance in the examination room, a practice common in England though almost unknown on the continent; a just estimate of a man’s powers in research or for teaching can only be properly based on his performance. The present system merely leads to the transmission of the sterile art of passing examinations. (At Oxford and Cambridge many fellowships are now awarded on the results of examination; it is sometimes stated, in defence of this system, that young men cannot be expected to carry out research in classics or philosophy.)