The advantage of such a well-selected “team” of tests is not so much that it selects various grades of ability more accurately than supervisors could select it after many months of experience in trying to train the new material, but that the tests make a satisfactory classification immediately, which saves the salaries and time that would otherwise be spent on applicants who would certainly fail in the training period. Even with the very best coefficients of correlation between the tests and actual demonstrated ability in the trade or position, the tests will not be infallible. On the other hand, no supervisor’s judgment would be infallible, either. And the supervisor would be much more likely to make errors through personal likes and dislikes than the impersonal tests could possibly be.
The tests are an invaluable aid, when they are themselves chosen with the scientific care outlined above, although it would be a short-sighted policy for any firm to trust entirely to the results of intelligence tests in the employment of its personnel. Appearance, voice, education, manners, physical size, and many other qualities are sometimes quite as important as the degree of intelligence, and the intelligence tests do not measure other elements of personality than the mental qualities.
Warning should also be given against using a particular set of intelligence tests, selected because they show high correspondence with ability in salesmanship, for example, as a measure of the intellectual qualities of candidates for some other position. Sets of tests, selected because they have been found accurate in classifying soldiers or school children for instruction, may not be of maximum usefulness in classifying machinists or business managers. The Mentimeter tests offer a wide variety, from which it is proposed that only those shall be used which have actually proved useful in classifying candidates for the particular task concerned. There is no reason to believe that exactly the same type of intelligence is required in all positions.
Having chosen certain promising tests for experiment, having proved the validity of these tests by checking up the relation of their results to the true abilities of a group of old employees or persons whose relative capacities are known perfectly, and having selected those tests whose results relate most directly to intellectual ability and least directly to one another, one may begin to employ the tests thus selected for the sorting and classification of new recruits or applicants. The question which will at once confront the reader who is not experienced in the employment of statistics of this sort is “How shall the test results be recorded and interpreted?”
The answer to the question regarding test records is that the exact score of each person should be kept for each test to which that person is “exposed.” One difficulty with the records kept of certain other group intelligence tests is that only the final total score is retained, while all the wealth of detail furnished by the different tests included in the series is lost. The total score on a series of six or eight intelligence tests is worth keeping, but the separate scores on each of the six or eight may prove to be even more illuminating than the total score. Two candidates may make the same total score on a series of tests but the one may make his points chiefly in memory tests with little help from the tests calling for complex thought, while the other may do very poorly in the memory work and very well in the thought tests. If only the total score on the series were retained, the usefulness of the series would be practically destroyed for many purposes.
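The point about retaining separate scores can be illustrated with a small sketch. The candidates, test names, and scores here are invented purely for illustration and are not taken from any Mentimeter record; the helper names `total_score` and `subtotal` are likewise hypothetical.

```python
# Hypothetical records: each candidate's score on every test is kept,
# not merely the total. All names and numbers are invented.
records = {
    "Candidate X": {"memory_1": 9, "memory_2": 8, "reasoning_1": 2, "reasoning_2": 1},
    "Candidate Y": {"memory_1": 2, "memory_2": 1, "reasoning_1": 9, "reasoning_2": 8},
}


def total_score(scores):
    """Sum of all separate test scores for one candidate."""
    return sum(scores.values())


def subtotal(scores, prefix):
    """Sum of the scores on tests whose name begins with the given prefix."""
    return sum(value for name, value in scores.items() if name.startswith(prefix))


for name, scores in records.items():
    print(name,
          "total:", total_score(scores),
          "memory:", subtotal(scores, "memory"),
          "reasoning:", subtotal(scores, "reasoning"))
```

Both invented candidates reach the same total, yet the separate scores show that one earned his points in memory work and the other in reasoning, a difference that would vanish if only the total were kept.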
For the interpretation of the result recorded on any test, one will need to use some short but intelligible scheme for stating the true relation of the score of any individual to the scores of the remainder of his group or to the scores of the other group of old employees used as a standard in selecting the tests to be regularly employed. It is not always safe to say merely that Mr. K—— is below the average of his group. As an extreme case of how unjust this might be, let us suppose that in one of the Mentimeter tests, A made a score of 0; B made a score of 2; C, a score of 1; D, 2; E, 3; F, 0; G, 10; H, 2; I, 3; J, 9; and K, 3. The average score of this small group, obtained by adding the eleven scores and dividing by 11, is 3.18. Mr. K—— therefore obtained a score which was below the average of the group, even though fewer than 20 per cent. of his group made better scores than he. The average score is too much influenced by extremely low or extremely high scores.
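The arithmetic of this example may be checked with a short sketch. The scores are those quoted above; the variable names are merely illustrative.

```python
# Scores of the eleven individuals, as quoted in the text.
scores = {"A": 0, "B": 2, "C": 1, "D": 2, "E": 3, "F": 0,
          "G": 10, "H": 2, "I": 3, "J": 9, "K": 3}

total = sum(scores.values())      # sum of the eleven scores
mean = total / len(scores)        # the average score of the group

# Those who made a better score than K.
better_than_k = [person for person, s in scores.items() if s > scores["K"]]
share_better = len(better_than_k) / len(scores)

print(total)                          # 35
print(round(mean, 2))                 # 3.18
print(better_than_k)                  # ['G', 'J']
print(round(100 * share_better, 1))   # 18.2
```

K's score of 3 falls below the average of 3.18 only because the two scores of 9 and 10 pull the average upward; only two of the eleven persons actually surpassed him.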
To arrive at a proper perspective for interpreting the score of any individual, it is necessary first of all to have a distribution of the scores made by all the persons in the group with which the individual is to be compared. Such a distribution should show how frequently each possible score was made. The table below illustrates the idea of a distribution, using as material the scores quoted above for eleven individuals tested by a Mentimeter test. This table shows that one person had a score of 10, that one other had a score of 9, and that 3 was the next highest score made. The mode, or most common score, in this distribution is a 2 or a 3, each having been made by three persons, which fact makes K’s score of 3 appear as quite typical of his group. The modal or most frequent score is a really useful score with which to compare the record of any individual, although it is not as safe a measure of the central tendency of a distribution as is the median score.

DISTRIBUTION

| SIZE OF SCORE | FREQUENCY |
|---|---|
| 10 | 1 |
| 9 | 1 |
| 8 | 0 |
| 7 | 0 |
| 6 | 0 |
| 5 | 0 |
| 4 | 0 |
| 3 | 3 |
| 2 | 3 |
| 1 | 1 |
| 0 | 2 |
| Total | 11 |
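
The tabulation above, and the mode drawn from it, can be reproduced with a brief sketch. It uses `collections.Counter` from the Python standard library; the variable names are illustrative only.

```python
from collections import Counter

# Scores of individuals A through K, as quoted in the text.
scores = [0, 2, 1, 2, 3, 0, 10, 2, 3, 9, 3]

# Count how frequently each score was made.
frequency = Counter(scores)

# Print every possible score from the highest made down to 0,
# including those scores which nobody made (frequency 0).
for size in range(max(scores), -1, -1):
    print(size, frequency[size])

# The mode is the most frequent score; here two scores tie for it.
highest_frequency = max(frequency.values())
modes = sorted(s for s, f in frequency.items() if f == highest_frequency)
print(modes)  # [2, 3]
```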
The median score of a distribution is the middle score, than which there are just as many larger as smaller. The median score is found by beginning at one end of a distribution and counting through half of the frequencies. To count through half of the eleven frequencies in the above distribution would bring us into the midst of the three who had scores of 2, and therefore 2 is the median score with which K’s score, or the score of any other individual, should be compared.
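The counting procedure just described may be sketched as follows. The function name `median_score` is hypothetical, and the sketch assumes, as in this example, that half the frequencies falls within a score step rather than exactly on the boundary between two steps.

```python
from collections import Counter

# Scores of individuals A through K, as quoted in the text.
scores = [0, 2, 1, 2, 3, 0, 10, 2, 3, 9, 3]


def median_score(scores):
    """Begin at the low end and count through half of the frequencies."""
    frequency = Counter(scores)
    half = len(scores) / 2          # 5.5 for eleven persons
    counted = 0
    for size in sorted(frequency):  # lowest score first
        counted += frequency[size]
        if counted >= half:         # half the group has now been passed
            return size


print(median_score(scores))  # 2
```

Two persons scored 0 and one scored 1, making three; the next three persons scored 2, and the count of 5.5 is reached among them.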
The reader who is mathematically inclined may wish to find the median point in the distribution, the point which bisects the distribution. To find this, one needs to study his facts carefully and make such assumptions as seem most probable for the facts which are not perfectly apparent. For example, of the three persons who scored 2 points, one individual may have had the third problem thought out and have been in the very act of writing the correct answer to it when the time was up, while another may have just finished problem two without having begun to read the third problem, and the third person may have been right in the middle of his thought about problem three. Not knowing what the exact truth is, we may assume that of the three who had a score of 2, one’s true score was between 2 and 2.33, another’s was between 2.33 and 2.67, and the third’s was between 2.67 and 3.00.
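Under the assumption just stated, that the persons making a given score are spread evenly over the step from that score up to the next, the median point may be located by ordinary linear interpolation. The following is only a sketch under that assumption; the function name `median_point` and the parameter `width` are illustrative.

```python
from collections import Counter

# Scores of individuals A through K, as quoted in the text.
scores = [0, 2, 1, 2, 3, 0, 10, 2, 3, 9, 3]


def median_point(scores, width=1.0):
    """Point bisecting the distribution.

    Assumes a recorded score s stands for a true score anywhere from
    s up to s + width, with the persons spread evenly over that step.
    """
    frequency = Counter(scores)
    half = len(scores) / 2              # 5.5 of the eleven frequencies
    below = 0                           # persons counted below this step
    for size in sorted(frequency):
        in_step = frequency[size]
        if below + in_step >= half:
            # Go far enough into this step to complete half the count.
            return size + width * (half - below) / in_step
        below += in_step


point = median_point(scores)
print(round(point, 2))  # 2.83
```

Three persons lie below the step beginning at 2, so 2.5 more must be counted out of the three persons within that step, which places the median point five sixths of the way from 2.00 to 3.00.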