CONTENTS

CHAPTER
I	Science Versus Guesswork

II	The Applications of Psychological Tests

III	What These Tests Measure

IV	Standards for Mental Tests

V	Different Types of Mental Tests

VI	Mental Tests in the Army

VII	Psychological Tests in Education

VIII	Mental Tests in Industry

IX	How to Use the Mentimeter Tests

X	The Mentimeter Tests

XI	Trade Tests or Tests of Skill

Appendices

MEASURE YOUR MIND

CHAPTER I
SCIENCE VERSUS GUESSWORK

There are two ways, and only two, in which we can find out what a machine is capable of doing. One of these is to try it out, to “put it through its paces” by using it for every sort of work which it is expected to perform and observing whether or not it does what we want it to do. The other way is to measure it (or to take the measurements of it as supplied by its responsible manufacturer) and compare these with the measurements of the essential parts of machines with the performance of which we are already familiar.

Unless it is a brand-new type of machine, designed to do something that has never before been done by machinery, or to do it by a different mechanical method, there is obviously a great saving of time and money in buying a machine from specifications that ensure the correct performance of its expected duty, as compared with the other plan of first buying the machine and then trying it out in practice to see whether it will do what we want done.

The manufacturer or business man who would purchase machinery of any sort without first making certain that its dimensions, speeds, weight, power-consumption, controls, and the materials used in its construction were such as to adapt it precisely to the work he expected to do with it would speedily bankrupt his business. It takes but a moment’s thought for the reader to prove this to himself.

On the other hand, however, we find business men constantly employing men and women to perform specific duties without applying any tests or measurements, other than the most rudimentary ones, to determine in advance whether the person so employed is fitted for the work he or she is expected to do. And as every employer knows, one of the most costly wastes in almost every business or manufacturing establishment is the expense of constantly “breaking in” new employees to take the places of those who have left or have been dismissed because they were found, after trial, not to be fitted for the duties to be done.

Because the installation of machinery of any kind involves an initial outlay of money, it long ago became apparent to everybody that the “trial and error” method of buying machines or other commodities was wasteful and ruinous. It was not until recent years, however, that the closer study of operating costs disclosed the fact that the expense of “labour turnover,” that is to say the proportion of employees in any given business whose places have to be filled annually, is one of the heaviest avoidable drains on income. This was long overlooked because no capital investment is involved in the initial employment of labour. The cost of training new employees is much larger, it is now learned, in most businesses, than is generally understood, not only in the direct outlay in salary or wages before the new employee has mastered the duties of the new position as well as he or she is able, but in loss through spoiled materials, reduced individual output, and often in the slowing down of an entire chain of manufacturing operations through the inability of the inexperienced worker to maintain the pace of the rest of the links in the chain.
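The definition of labour turnover given above reduces to simple arithmetic, which may be sketched as follows. The figures, and the names `turnover_rate` and `annual_replacement_cost`, are illustrative assumptions and are not drawn from any actual business; the flat cost per replacement in particular is a simplification of the several losses just enumerated.

```python
def turnover_rate(places_refilled: int, positions: int) -> float:
    """Proportion of positions whose occupants had to be replaced in a year."""
    if positions <= 0:
        raise ValueError("positions must be positive")
    return places_refilled / positions


def annual_replacement_cost(places_refilled: int, cost_per_replacement: float) -> float:
    """Total yearly outlay on breaking in new employees (assumed flat cost each)."""
    return places_refilled * cost_per_replacement


# Hypothetical shop: 200 positions, 150 of which had to be refilled in the year,
# at an assumed training-and-spoilage cost of 50 dollars per new employee.
rate = turnover_rate(150, 200)
cost = annual_replacement_cost(150, 50.0)
print(f"turnover: {rate:.0%}, cost: ${cost:,.0f}")
```

Even so crude a reckoning shows why the expense escapes notice: no single item is a capital outlay, yet the total recurs every year.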

If, then, as so often happens, it is found after experiment that the new employee is not capable of performing the work efficiently, the whole process must be repeated. The employee who has failed leaves, is dismissed, or is transferred to another department, and a new and equally inexperienced worker employed to fill the vacancy, with the whole cost of training to be incurred over again. Even though the new worker may be experienced in the particular class of work to be done, there is an appreciable loss due to the unavoidable frictions and hesitations that occur whenever a worker is being fitted into a new environment.

There is, moreover, no guarantee that even an experienced person in a special sort of work is fitted to do that particular work as well as it can be done or should be done. He or she may have got into that sort of work by accident. That is usually the way in which a boy or girl begins a business or industrial career. He or she may have continued in it merely because the experience gained in the first job enables its possessor to pass the superficial scrutiny of foremen, managers, or others who employ “help” in the first instance. But just as all the experience and training in the world will not make a Paderewski out of a person who was not born with the precise combination of sensory and nervous qualities that the master musician possesses, though almost any one with ten fingers and an ear for harmony can be taught to play the piano after a fashion, so it is true that while in the all-important business of earning a living almost anybody can be trained to do most of the ordinary manufacturing and business operations, after a fashion, it is only those who were born with certain combinations of nerve endings and sensory apparatus who can be trained into first-rate salesmen, or expert tool-makers. And this holds true all the way down the line, to the simplest and most automatic operations necessary in business and industry.

Individuals themselves are seldom aware of their own capacities; even less generally of their own limitations. Occasionally, by accident, a man or a woman finds at the right time the opportunity to do precisely the things he or she is best fitted to do. Often the individual’s strong personal instincts or inclinations lead him or her to seek opportunity to do certain kinds of work without any clear understanding why that sort of work appeals while other kinds do not. Few human beings analyze their inclinations closely. Yet it may be and frequently is the case that the work one most strongly desires to undertake is not that in which he or she is best fitted to succeed. The inclination may be counterbalanced by inhibitions of which neither the possessor nor his or her employer becomes aware until repeated failure has demonstrated the lack of adaptability, sometimes after it is, or seems to be, too late to take up another occupation. Then the worker usually drifts into the ranks of “casuals,” constantly moving from job to job, chronically “out of work”; the ready dupe of agitators and the prophets of social unrest and revolution; disheartened, anti-social, and perennially unhappy; the most expensive sort of an employee in any position, no matter how small the wage—yet a human being, and, as such, entitled to liberty and the pursuit of happiness!

That is an extreme picture. Yet if such tragedies occur (as every reader knows from his own observation and experience they do occur too often) among those who have voluntarily chosen their own lines of work, how much more frequently must they occur among those whose daily occupations have been determined for them, not through any voluntary choice or intelligent guidance but solely through the accident of having been “thrown into” certain jobs when they were young?

That is the way in which the vast majority of individuals have their careers shaped for them. The world of business and industry and of the professions is full of blacksmiths who ought to be carpenters, indifferent lawyers who would have made good dentists, teachers who are failures because they should have been trained as stenographers, good cooks who have been spoiled to make mediocre shop attendants, and so on through the list of possible occupations. Within every business organization, moreover, there are grades and degrees of requirements and responsibilities into which some employees may fit perfectly, others less perfectly and others not at all, though all be drawn from the same group or from those performing the same general class of service. Here, as in the matter of original employment, the general custom of dealing with the human element in industry is the wasteful “hire-and-fire” system, analogous to the purchasing of machinery or equipment without first ascertaining whether it will do the work, and scrapping it when it fails.

We found out long ago that we couldn’t afford to do that sort of thing with machinery. We are just beginning to find out that it is even more expensive to do it with the human element in industry.

It would perhaps be going too far to claim that the whole problem of the “labour turnover” arises from the effort to fit square pegs into round holes, but it is certain that a very large share of all human troubles, industrial unrest, discontent, inefficiency and unhappiness is traceable to the lack of proper adjustment between the man and the job, and this in turn is due in large part to the failure to determine in advance the fitness of the particular individual for the particular task.

What is needed, obviously, is a measure of human capacities, just as we have means of measuring every phase of the machine’s capacities.

Just as we measure a machine by the most precise gauges and tests available, why not measure the human individual by the most precise means we are able to apply?

The word “measure” in the preceding paragraph does not mean, either in the case of the machine or of the man, the gross dimensions of length, breadth, and thickness; these are equally immaterial, in most cases, whether the subject of measurement be a man or a machine. One measures a machine to determine its capacity for certain work, and is little concerned about its characteristics that have no bearing upon those qualities that fit it for those particular duties. So the measurements of a human being whose capacity for certain duties is to be determined must be of those qualities which enable him or her to perform according to a certain pre-determined requirement.

These qualities, in man, woman, or child, can be measured; not with the precision with which an engineer measures the parts of a machine that must fit within a thousandth of an inch, but with sufficient accuracy to determine quickly, inexpensively, and simply whether a given individual has the capacity to learn and perform any given task or class of work.

To explain how these tests can be made, how science can be and is being substituted for guesswork in the selection of human beings for jobs and of jobs for human beings, just as science has displaced guesswork in the selection of material commodities, is the purpose of this book.

Let us first point out clearly the difference between science and guesswork. The vast majority of jobs are filled by guesswork. The farmer who hires a field hand, the housewife who employs a cook, the foreman who takes on a new “hand” in the factory, and even employers hiring persons for more responsible positions, all do it, to a greater degree than they imagine, by guesswork. They may make inquiries, more or less thorough depending upon the compensation and responsibility involved, of persons who are reputed to know by observation something of the candidate’s qualifications. Unless the individual under consideration be flagrantly and patently unfit the reports thus obtained are almost always favourable. In many cases no effort is made even to obtain such reports.

Many persons who regard themselves as intelligent employ men and women for all sorts of delicate operations and confidential and responsible relations as a result of observation alone; yet observation alone will tell no more about a man or a woman than it will about an automobile—the shape and the colouring.

When you observe a human being you can determine certain physical characteristics, such as size, complexion, colour of eyes and hair, soundness of teeth, shape of body and head, contour of face, features, and expression. You make up your mind that you like the person or you do not. But as for determining by means of anything your unaided observation discloses whether or not the person under examination is qualified either to perform or to learn how to perform efficiently a given task or set of tasks, you might as well expect to discover the hillclimbing power of an automobile by merely looking at it.

Yet that is precisely the way in which, in the vast majority of cases, the supremely important work of fitting individuals and jobs together is done in the world of business and industry.

True, the prospective employer usually asks a few questions, but the applicant’s manner and tone of voice have usually as much to do with the final decision as the actual replies.

Men and women are usually hired, in short, on their looks and on the impressions made at a single short interview. That it is too much to expect persons so selected to fit into even the simplest sort of a business or industrial organization should be obvious to every intelligent person; that sometimes they do fit should be no less obviously recognized as largely accidental.

We do not recognize the absurdity of this method of selecting persons for particular positions, partly because this is the only way most of us have ever known and partly because there is in almost every human being a secret or subconscious belief in his own peculiar powers of judging others by means of surface indications.

The fallacy of the belief that one may arrive at accurate conclusions as to individual capacity and characteristics by merely looking at the individuals concerned has been well set forth by Prof. L. M. Terman of Stanford University. Much of the popular belief in the efficacy of this method, Doctor Terman believes, is due to the fact that the public does not know that the pretensions of the pseudo-science of “phrenology” were long ago shown to be unwarranted. Phrenology holds that definite and constant relations exist between certain mental traits and the contour of the head. Phrenologists teach, for example, that one’s endowment in such traits as intelligence, combativeness, sympathy, tenderness, honesty, religious fervour, and courage may be judged by the prominence of various parts of the skull. While the sincerity of Gall, the French physiologist of a century ago who invented the so-called science, and of his followers, is not to be questioned, the pretensions of phrenology itself have been thoroughly exploded. It has been demonstrated that traits like those above mentioned do not have separate and well-defined seats in the brain and that skull contour is not a reliable index of the brain development beneath.

“In the underworld of pseudo-science, however,” says Professor Terman, “phrenology and kindred fakes survive. Hundreds of men and women still earn their living by ‘feeling bumps on the head,’ reading character from the lines of the hand, etc.

“But if the rating of men by pseudo-science is misleading, perhaps science is still unnecessary. It may be argued that mental traits can be rated accurately enough for all practical purposes on the basis of ordinary observation of one’s behaviour, speech, and appearance. We are constantly judging people by this offhand method, because we are compelled to do so. Consequently we all acquire a certain facility in handling the method. For ordinary purposes it is infinitely better than nothing. A skilful observer can estimate roughly the height of an airplane; but if we would know its real height we must use the methods of science and perform a mathematical computation.

“The trouble with the observational method is its lack of a universal standard of judgment. One observer may use a high, another a low standard of comparison. A four-story building in the midst of New York’s ‘sky-scrapers’ looks very low; placed in the midst of a wide expanse of one-story structures it would look very tall.

“Moreover, we are easily misled by appearances. The writer knows a young man who looks so foolish that he is often mistaken by casual acquaintances for a mental defective. In reality he is one of the half dozen brightest students in a large university. Another man who in reality has the mentality of a ten-year-old child is so intelligent looking that he was able to secure employment as a city policeman.

“Language is a great deceiver. The fluent talker is likely to be overrated, the person of stumbling or monosyllabic speech to be underrated. Similar errors are made in judging the intelligence of the sprightly and the stolid, the aggressive and the timid, etc. Our tendency is also to overestimate the intellectual quality of our friends and to underestimate that of persons we do not like.

“If the method of offhand judgment were reliable, different judges would agree in their ratings of the same individual. When the judges disagree it is evident that not all can be correct. When intelligence is rated in this way wide differences of opinion invariably appear. Twenty-five members of a university class who had worked together intimately for a year were asked to rate the individuals of the class from 1 to 25 in order of intelligence. The result was surprising. Almost every member of the class was rated among the brightest by someone, and almost every member of the class among the dullest by someone. Doubtless the judges were misled by all sorts of irrelevant matters, such as personal appearance, fluency of speech, positiveness of manner, personal likes and dislikes, etc.

“The method of personal estimate is much better than the method of external signs (phrenology), but to be reliable it must be supplemented by a method which is objective, that is, a method which is not influenced by the personal bias of the judge or by such irrelevant factors as the appearance, speech, or bearing of the one to be rated. Such is the method of intelligence tests.”

It would, of course, as Professor Terman points out, be absurd to contend that it is impossible to arrive at a rough estimate of an individual’s capacities and character by observation, as it is absurd to pretend that accurate measurements of an individual’s capacities can be made by the same method. There are men who have by long experience learned to judge on very brief contacts the possibilities of applicants for positions. Actually, what such employers do is to apply, though crudely and unscientifically, a limited number of tests which might fairly be classed as psychological. Out of a long experience they have accumulated an accurate knowledge of the work to be done and of the general type of individual who has been found best qualified to perform that work. This sort of ability, however, is acquired solely through long experience, and even then it can only be acquired by men or women who themselves possess certain mental qualities, which might easily be gauged and classified, the possession of which enables them to accumulate and utilize experience in this manner.

This sort of ability can by no means be transferred from one individual to another by description or by mere training. It is precisely like the ability which an experienced automobile repair man possesses, that enables him to tell by a quick inspection and after only a few minutes of observation what are the principal things the matter with a car and what service it is probably able to perform. But a repair man cannot tell anybody else how to size up an automobile at a glance, because the only way any one can learn to do it is by going through the same process of taking automobiles apart and putting them together again for a period of years. And as everyone who has ever had occasion to deal with automobile repairs is aware, the most experienced repair men are seldom positive that they know just what is wrong and all that is wrong without applying precise measurements and painstaking tests.

It is easy enough to determine that a delicate, small-boned, slender person is not the best type to employ for digging coal, loading freight cars, or other arduous manual labour. There are, of course, many classes of occupations the fitness or unfitness for which of a particular individual must be determined in the first place by that individual’s physical characteristics. So far the observation method suffices. But the very fact that every industry and business is full of misfits and that it is a matter of common knowledge that the most difficult problem the employer has to face is that of finding the right person for each particular job that calls for anything more than mere physical strength, is the best evidence that even the most experienced and accurate observers are far from infallible in their judgments of individual capacities.

For that matter, there is no infallible test. No true scientist claims infallibility. The possibility of error is always present wherever the human element is involved. It is a safe assumption that any method or estimate that purports to be infallible is fraudulent. There is in almost every human mind a lurking, subconscious belief in the possibility of perfection. It is this which makes humanity credulous when claims of infallibility are plausibly presented.

It is extremely difficult to satisfy by logic and reason the type of mind that is strongly influenced by glittering generalities and emphatic, though unsupported, assertions. It is equally difficult to convince the skeptic whose mind is closed to the introduction of new thoughts and who, in his self-satisfaction with his own mental limitations, rejects every fact that does not tally with his preconceived ideas.

This book is written neither for the super-skeptical nor the ultra-credulous. It makes no pretension to infallibility, nor does any scientifically trained psychologist pretend that there has yet been evolved a method of measuring every dimension and capacity of the human mind beyond the possibility of error. The methods described in this book are the fruit of years of experiment, research, and practical application of the results of experiment and research, and are designed to reflect the development of the science of psychology in its application to mental measurements as closely as it is possible to do so within the limits of a single volume written primarily for the reader who has no special scientific training along psychological lines.

The reader who is not prepared and willing to examine facts and at least to take all the ascertainable facts into consideration before forming his conclusions is not likely to be interested. The scientific method of character analysis or mental measurement is based upon the comparison of the largest possible collection of ascertained facts. Guesswork has no place in it. Psychology has small dealings with intuition and instinct, nor is it in any way derived from magic or concerned with the occult. There are no unfathomable mysteries. There is no fact about the operation of the human mind which cannot be subjected to scientific investigation and measurement by any intelligent person. The scientific method requires that every conclusion must square with the results obtained by the experimental application of all related facts or be discredited as worthless. Theories have no place in science, except as something to be disproved if possible, and a single fact which does not square with any theory disproves the theory.

The scientific method of mental measurement has passed the theoretical stage. It has squared with the facts wherever it has been intelligently applied. It has been demonstrated in a wide range of business and industrial applications, in education and in its use in determining the qualities and fitness of officers and men in the Army and Navy. What it offers is the shortest, simplest, and most accurate means available of determining human capacities and qualities.

Professor Terman has admirably summarized the advantages of the scientific method of testing intelligence, as follows:

“1. It gives us a universal standard of comparison. The result is absolutely uninfluenced by the general intellectual level of the group with which the subject to be rated happens to be associated. It is like measuring the height of a house instead of estimating it by comparison with the height of surrounding buildings.

“2. It multiplies enormously the significance of mental performance. It does this by making fine distinctions which would be overlooked by the method of offhand judgment. It is like placing a smeared glass under a microscope and discovering that the smear is a complicated network of organic matter.

“3. The test method is objective; that is, free from the influence of personal bias. It gives approximately the same verdict to-day, next week, or next year. It does not change its opinion. More important still, the verdict will be approximately the same whoever makes the test, whether a relative, a stranger, a friend, or an enemy, provided only that the rules of procedure be rigidly followed.

“4. The test result is little influenced by the subject’s educational advantages. In this it differs greatly from offhand judgment, which so easily mistakes the results of schooling for real intelligence. The test method probes beneath the veneer of education and gives an index of raw ‘brain power.’ For example, a young woman who had been stolen in early childhood by gypsies and had spent her life with them was given the Binet-Simon intelligence test. She had never attended school a day in her life and had only learned to read by bribing a little school girl to teach her the alphabet; yet she made a higher score than the average found for two hundred high school pupils who were given the same test.

“No wonder,” Professor Terman concludes, “mentality tests have acquired such a wide vogue in the ten years since Binet gave to the world the first successful intelligence scale. In that time they have demonstrated their usefulness in the study of the feeble-minded, in the grading of school children, in determining the mental responsibility of offenders, and in the selection of employees. Their largest and most useful applications have been in the mental classification of men in the United States Army.”

CHAPTER II
THE APPLICATIONS OF PSYCHOLOGICAL TESTS

The intelligent reader has by this time begun to see for himself some of the possibilities opened up by the use of scientific mental tests, and to perceive their applicability in a wide diversity of fields. In later chapters specific examples of such applications are given in detail, and suggestions offered for still other uses of the tests which are contained in this book.

The usefulness of all mental tests, whether those which are offered in this volume under the general title of “The Mentimeter,” or others that may be set up with equal scientific precision, depends upon, or at least is greatly enhanced by, the most complete understanding of the underlying principles on the part of the person who undertakes to apply them. The purpose of this and the next three succeeding chapters is to make these principles so plain that by the time the reader has reached the tests themselves he will have a perfectly clear understanding, not only of what the Mentimeter tests are but of why they have been put into the form in which he finds them, and of how their use will enable him to gauge human intelligence and capacity with a greater degree of accuracy than he has found possible by other means. If, perchance, psychologists find in this volume much that is to them elementary, it should be kept in mind that it is only through the widest possible spread of sound understanding of psychological principles that the wider application of them in the ordinary walks of life can be brought about. That, the authors take it, is the great end toward which scientific psychologists are aiming, and that is the aim of this book.

The general purpose of psychological tests is to determine how individuals compare with one another in mental capacity, or with standards based upon the capacities of individuals known to possess certain qualities. Thus, it may be desirable, as it frequently is, to determine the relative abilities of the individuals of a certain group, like a school or college class, clerks employed in a similar form of work, a number of applicants for a certain position for which only the most capable among the candidates is desired, or the soldiers of a particular company or regiment. What is required here is a method of grading these individuals with reference to one another, by means of tests which need not necessarily have any relation to any external standard of mental perfection.

The process here is as if one were to be told to pick out of all the automobiles in a garage the best one, the next best, and so on, classifying these particular cars and no others with relation to one another and not with respect to any standards of automobile perfection introduced from outside. None of the cars might be perfect; perhaps the best one of the lot has leaky piston-rings and none of them will climb a 10 per cent. grade on high gear. It is a comparatively easy matter, however, to devise a few simple tests that will grade a dozen or fifty automobiles in regard to their relative ability to climb hills, carry loads, and perform the other services expected of an automobile. The one that will climb hills the best may not also be the one that will carry the heaviest load or travel the most miles on a gallon of gasolene, but out of such a series of group tests any one interested can readily determine which of all the automobiles in the group is the best general purpose car, which the poorest, and about where the others grade with reference to these two extremes.

That is about the process that a man engaged in the automobile trucking business would use in determining which one of the cars he has available is best adapted for a particular piece of hauling that is to be done. He wants to know which of his cars he can rely upon for any one of many different sorts of service, but he particularly wants to know all the time which of them are worth spending money on for repairs and improvements and general overhauling and which are either so poorly constructed in the first place or so hopelessly out of repair that it is cheaper to scrap them than to spend any more money on trying to make them fit for service.

In other words, the automobile owner needs to know which of his cars, however poor its present ability, has such inherent qualities as to justify the belief that it can be made more efficient by proper attention and reasonable expenditure of money.

Now, that is precisely what the employer of workers, the commander of soldiers, the teacher of a class or any one else charged with responsibility for the performance of any sort of tasks by any group of human beings, wants to know about the individuals under his direction. He should know, or be able to determine readily, not only what each of the individuals can actually do and which ones can do particular tasks better than the others, but also the relative capacities of the entire group, so that he can determine, as in the case of the automobiles which have been used as an illustration, which of them are most worth spending time and effort upon in the expectation that they will learn to do even more difficult tasks, and which of them are so hopelessly incapable that nothing is to be expected of them except the simplest routine performances.
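The grading of a group by reference to its own members alone, as in the garage illustration, may be sketched as follows. This is a minimal sketch under stated assumptions: the car names, the three tests, and the scores are invented for illustration, and averaging the ranks is only one simple way of combining several tests; it is not offered as the method of the Mentimeter tests themselves.

```python
from statistics import mean

# Hypothetical scores for four cars on three tests (higher is better).
scores = {
    "car_a": {"hill_climb": 7, "load": 9, "mileage": 5},
    "car_b": {"hill_climb": 9, "load": 6, "mileage": 6},
    "car_c": {"hill_climb": 4, "load": 5, "mileage": 9},
    "car_d": {"hill_climb": 3, "load": 4, "mileage": 4},
}


def rank_within_group(scores):
    """Return members ordered from best to poorest by average rank across tests."""
    tests = list(next(iter(scores.values())).keys())
    ranks = {name: [] for name in scores}
    for test in tests:
        # Rank 1 goes to the highest score on this test; ties are broken by name.
        ordered = sorted(scores, key=lambda name: (-scores[name][test], name))
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    average_rank = {name: mean(r) for name, r in ranks.items()}
    return sorted(average_rank.items(), key=lambda item: (item[1], item[0]))


for name, avg in rank_within_group(scores):
    print(name, round(avg, 2))
```

In these invented figures the best hill-climber is not the car that carries the heaviest load, yet the combined ranking still singles out one best general-purpose car and one poorest, with the others graded between them.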

Now, the man who operates a fleet of automobile trucks does not stop when he has made a comparison of the vehicles in his garage with one another, but is constantly comparing the performance of each with standards established by other cars, machines of different makes, and with new machines. There exists, and he is constantly conscious of its existence, an ideal standard of performance for automobiles to which his cars must conform as nearly as possible if their service is to be satisfactory.

So, in measuring human capacity, it is not enough to compare the individuals of a group with one another, though this is essential and for some purposes temporarily sufficient; there are available standards based upon the actual achievements of individuals of known capacity by which the mental powers of any and all individuals may be gauged. It will readily be seen that the employer of a number of persons (engaged, let us say, in some specific mechanical or clerical operation) needs to know not only whether some of these are capable of being trained to do better work and some so incapable of further training that it would be cheaper to discharge them and fill their places with more intelligent persons, but also how any particular group of employees compares in average intelligence, and how each one of the group compares in individual intelligence, with the average or higher-than-average capacity of those outside of his own particular business establishment who are engaged in similar work.

This is a matter of dollars and cents to the employer. If he can obtain a standard that is universal or nearly so, that tells him, in fact, what all of the employers in his line of business have found to be the average or the limits of mental capacity possessed by workers of a particular class, then he is in a position to determine whether he is getting equally good service for the wages which he pays as is obtained by other employers requiring similar service.

To illustrate concretely: in an office employing twenty stenographers on correspondence, it is not enough for the employer to know which of these stenographers is the most competent and which the least, and whether the less competent are incapable merely because they are beginners or because they lack the necessary mental capacity ever to become competent. He should also be able to measure the mental capacity of the entire group by some standard based upon the performance of thousands and tens of thousands of stenographers of known degrees of ability. He may discover that the most competent of his entire staff is only as capable as the average of good stenographers everywhere. Obviously, his business is handicapped by having a stenographic force which is inferior in capacity and, consequently, in accuracy, speed, and other essential requirements, to the average of stenographic office staffs in business generally. Once this has been determined, the intelligent employer will proceed to replace the stenographers who are incapable of improvement, as indicated by the tests applied, with stenographers who respond to the standard tests with a score well above the average.
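
The comparison just described, between the members of one office and a general standard, may be put in modern terms as a short Python sketch. It is an illustration only, not a procedure given in this book: the function name `percentile_rank`, the scores, and the size of the reference group are all invented for the example, and a real standard would rest upon thousands of records rather than ten.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(score, norm_scores):
    """Percent of the reference group scoring below `score` (ties count as half)."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)           # reference scores strictly lower
    ties = bisect_right(ordered, score) - below   # reference scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)

# Hypothetical standard: scores of a general reference group of stenographers.
norm_scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# Hypothetical scores of three members of one office staff.
office_scores = {"A": 60, "B": 52, "C": 44}

# Standing of each member against the outside standard, not against one another.
ranks = {name: percentile_rank(s, norm_scores) for name, s in office_scores.items()}
best = max(office_scores, key=office_scores.get)

for name, rank in ranks.items():
    print(f"Stenographer {name}: percentile rank {rank:.1f}")
print(f"Best of the staff is {best}, at percentile rank {ranks[best]:.1f} of the standard")
```

With these invented figures the best member of the staff stands below the middle of the reference group, which is the situation described above: ranking the staff only among themselves would never have disclosed it.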

So, too, with the teacher. It is comparatively easy for the teacher to classify his or her pupils into bright, stupid, and mediocre, through observation alone. What is more important, however, is to determine several things about each pupil which observation alone does not tell. Are the stupid ones really stupid or merely inattentive? Have they the necessary mental capacity to perform the assigned work of the class or are they simply lazy? Few teachers can answer this question; none with any degree of accuracy without the application of scientific tests of mental capacity. Are the bright children really bright by comparison with other children of the same age and school grade or do they merely shine by contrast with the dull members of the class? This question can by no means be answered accurately except by the application of mental capacity tests. In another chapter some of the concrete applications of mental tests in education are described at length. The point to be emphasized here is that the measurement of the mental capacities of any group should be based upon standards that will not only determine the relative capacities of the members of the group but will, at the same time, compare them all with standards that reflect the known average and maximum capacities of all others of similar age and environment.

The purpose of these tests might be summed up somewhat as follows:

To measure, by comparison with the group average or with very carefully determined standards, some of the infinite number of qualities and abilities, the possession of which by the individual renders him more or less susceptible to education and training or more or less capable of successfully performing certain actions requiring conscious direction from the mind.

It requires no special argument to point out how a general application of tests that disclose actual mental capacities might profoundly affect our judgment of men of all classes and walks of life. Were it possible to ticket and catalogue the whole human race in accordance with the capacity of each individual as disclosed under properly devised psychological tests, we would no longer permit the superficial absence of polish and taste to blind us to the inherent powers and capacities of the self-made man, nor, on the other hand, would we be so ready to assume that the well-dressed, fluent talker, no matter how prepossessing in appearance and manner, was necessarily able and worthy of confidence. Likewise, once such a classification became universal, it is conceivable that many business men and others who are prone to criticize the universities and their products would be more tolerant of the recent graduate, whose mental capacity is in no wise reflected by the particular variety of contents with which his mind has been filled in college.

Besides the application of scientific mental tests as already indicated, in business and industry and in education, by the employer or the teacher, there is another and important use to which standardized tests, based upon determined capacities of groups and individuals of known ability, may be put. This is the use of such tests by the individual upon himself for the purpose of determining his own mental capacity in a particular direction or of a particular kind as compared with the mental capacity of others. The man or woman bent on self-improvement or advancement may thus, within certain limits, assess by the application of standardized tests his or her own mental quality and capacity.

Again it is unnecessary to point out the advantage to the young man or young woman endeavouring to decide upon a career or to determine what particular course of study to pursue or line of business to enter, in being enabled to obtain an accurate gauge of his or her own qualities, powers, and limitations. Taste and inclination are no safe guides to life unless there is coupled with them inherent capacity for the competent exercise of the faculties which make the gratification of one’s individual tastes and inclinations possible. Thus it may be that the individual’s inclinations and tastes run strongly toward music, toward art in any of its various forms, but that physical and mental inhibitions, the presence or absence of which may be readily determined, make it impossible for the possessor of such tastes to hope to be able to perform creditably the acts which a successful artist or musician must perform.

Properly devised and applied psychological tests may conceivably disclose the existence of mental powers and capacities unsuspected or neglected because overshadowed by strong inclinations in other directions; early knowledge of the possession of such capacities may easily direct their possessors into fields in which they can thrive and prosper and achieve far greater happiness and contentment than would ever be possible through a lifetime of striving to do that for which they are not fitted by inheritance.

CHAPTER III
WHAT THESE TESTS MEASURE

The most natural question and one that is frequently asked is:

“What, precisely, do psychological tests measure?”

It is a question that is easier to ask than to answer.

It is simple enough to say that mental tests are designed to measure the natural or inherent mental capacity of the individual, but in order to approach a clear understanding of just what this means we must first define what is meant by the term “mental capacity.”

As a matter of scientific fact, the term “mental capacity” can hardly be regarded as accurate, although it is the best term we have to describe the qualities which determine the individual’s ability to perform acts requiring conscious thought. Psychological and biological science no longer regard the human mind as something different from or in any way apart from the human body. The idea that there is such an entity as a mind that operates even in the slightest degree without reference to and independent of the physical body must be dismissed, if we are to grasp clearly the principles and methods of mental tests.

To the psychologist the mind is merely a specialized organ of the physical body. The intangible something, which is what is usually meant when persons speak of the human mind, is merely the sum of all the sensations, feelings, and judgments resulting from the delicate adjustment of an almost infinite number of nerve fibres which in themselves are a part of the physical body. One may have at birth a plentiful supply or a poor supply of potential nerve endings which are ready to be organized and coördinated by experience and training, but unless one has the opportunity to learn from study and experience, the desirable connections may never be developed.

The maximum capacity of the mind in any particular field is, therefore, practically determined by physical inheritance of an abundant supply of nerve endings. Thus, it may be that one individual is born with two or three times as many nerve terminals connecting at the point at the back of the eyeball where the optic nerve—which is simply a bundle or rope of nerve fibres—is attached to the mechanical apparatus upon which the reflection of objects passing before the field of vision is registered. Such an individual’s powers of observation are normally greater than those of the person of less fortunate heredity in this respect, whose lesser number of terminals of the optic nerve fibres limit his powers of optical perception and observation. Thus, one person may see at a glance a hundred details, all of which register sharply upon his consciousness, while another sees only the gross outlines and shadows, and in between is the average person who sees some details but not all.

It is well known to psychologists and biologists, although not generally understood by those who have not made a special study of these sciences, that mental capacity does not change or increase materially after the individual has reached maturity. It may be diminished through accident or disease, but the chief increase in adult life is in the volume and variety of stored-up impressions. The average girl of eighteen or boy of twenty has reached the approximate limit of his or her mental capacity. The mental tank will never grow much larger. It may be half empty or almost entirely vacant, but unless at the average age of university sophomores scientific mental tests prove the individual to be possessed of average or better than average mental capacity, it is futile to expect any great intellectual development to take place in later life.

But while the maximum capacity of the mind depends upon physical inheritance, the actual ability which is necessarily reflected in the scores made by a person subjected to mental tests is determined by the number and variety of nerve connections that have actually been made by environment or training. Inheritance sets the maximum limit, but as a matter of practice this maximum is never reached, or at least is so seldom reached by any individual that it can hardly be said of any human being that he has developed his mind in any direction to the utmost limit of its capacity. What we actually measure in scientific mental tests is a complex of natural or inherent abilities plus the results of education and training; because, while it is possible to a considerable extent to eliminate by properly devised tests a record of the individual’s acquired knowledge, it is practically impossible to distinguish between acquired and inherent mental ability.

Note carefully the distinction between mental ability and mental capacity. Mental ability in any individual is always less than his mental capacity. If, therefore, the mental ability as determined by scientific tests reaches the highest point on the scale of measurement, whatever that may be, it follows that the mental capacity of the individual making a perfect score is even greater than the scale is designed to measure, and how much greater can only be determined by setting up new tests based upon higher standards.

The result of any scientific test simply indicates the wealth of nerve connections that are ready to be made when the stimulus necessary to their establishment is applied. It must be understood that no one having a sound claim to the possession of scientific knowledge can contend that there are tests in existence that actually measure with complete precision the inherited as distinguished from the acquired mental characteristics. It is not conceded, however, that such precise measurements cannot be made if at any time it becomes necessary or desirable to do so. For all practical present-day purposes it is sufficient that psychological tests shall measure mental qualities which are manifested by the individual’s ability to express them by action or speech. The classification of individuals relative to one another and with reference to the possession of a particular mental ability or group of abilities is, therefore, necessarily based upon their relative ability to express in some intelligible and unmistakable fashion their mental power and qualities.

Back of this power of expression may lie hidden and undreamed-of capacities of which the individual himself may be vaguely conscious but of which he can give no outward manifestation. It may be, for example, that an individual is gifted with unusual powers of perception through the eyes, ears, and the senses of touch, smell, and taste but that he is deficient in nerve fibres and connections controlling the voluntary muscles by which human beings translate sensations into action and speech. This is hardly likely, as a physiological fact, to occur; the individual born with rich nerve endings in one part of the physical body is more likely to have a proportionate supply of nerve endings in all other parts of the body than to be deficient in one part and amply supplied in another. As rare exceptions, however, there are individuals who in infancy have, through accident or disease, lost certain groups of nerve connections while retaining unusually rich groups in other parts of the body. There is, of course, the most famous case in modern history, that of Helen Keller, whose auditory and optical nerve connections were lost through disease in early infancy, but whose unusual inherent mental capacity has been able to demonstrate itself through other and extraordinary means as a result of training and education.

But in ordinary life, if a man or a woman has some mental quality which does not express itself in an action which other persons can see or hear and know about, then it is not socially important. It is of consequence only to the individual and it is of little social service to undertake to measure these obscure and unexpressed and inexpressible capacities, as they can never, until they find means of expression, affect the individual’s ability or efficiency in any occupation. It is not that these things cannot be measured. The case of Helen Keller is one demonstration that they can be measured. Anything whatever that makes a difference in the way different individuals act is conceivably measurable, although it may not at the present time be capable of exact calculation because it has not been worth anybody’s time and effort to undertake to measure it.

To repeat, and possibly to make the preceding paragraphs more clear, let us recapitulate the different mental qualities to which reference has been made.

First, mental capacity. This is what the individual has inherited. It is the size of the tank into which sensations, perceptions, all that makes up the sum of knowledge, are poured throughout his life, by his education and his experience. While this capacity in the case of any individual can doubtless be measured, it is not necessary to measure it precisely but merely to determine whether it is large enough for the purposes in view.

Second, mental ability. This is the sum of experience and education within the limits of the individual’s mental capacity. It is represented by the individual’s ability to express himself in speech or action in the performance of any one of a number of specific acts. This mental ability can be quite definitely measured, and the possession of a certain degree of mental ability demonstrates the possession of a mental capacity greater than the ability which the individual has already reached.

Third, acquired knowledge. It is not the purpose of tests of mental capacity to measure acquired knowledge, although for many purposes it is desirable to measure the individual’s acquired knowledge in addition to his inherent ability, and in a still larger number of instances the most practical way of arriving at a fairly accurate estimate of an individual’s ability involves, among other tests, an examination into the extent of the knowledge which he has acquired through observation or training along lines definitely related to his particular occupation or pursuit in life.

The ordinary and standardized school and university examinations, civil-service examinations, etc., which have long been the accepted test of the individual’s ability, do not, and do not purport to, measure anything more than this last item, that of acquired knowledge. But while certain gross dimensions of individual capacity may be roughly estimated from the results of a written or an oral examination based entirely upon the subject’s stored-up knowledge, it is a matter of common knowledge, and almost every reader will be able to furnish examples out of his own experience, that such tests are frequently totally misleading. Professor Terman has reported on a comparison of the results of civil-service examinations for policemen and firemen in a California city with scientific tests applied to the individuals who successfully passed the civil-service examinations. The results were in many instances astounding. Men of such low mental capacity that they might almost be classed as feeble-minded were found to have passed with a fair degree of satisfaction the simple knowledge and physical tests set up by the city and to have obtained appointments to these responsible posts as guardians of the city’s property and lives.

While it is, therefore, the object of scientific mental tests to exclude as far as possible the acquired abilities resulting from education and environment and the knowledge that has been stored up through observation and training, it is found in practice that for all ordinary purposes it is sufficient to measure a complex of native and acquired abilities. The purpose of these tests is, in short, to discover what the individual is actually able to do, regardless of the source of that ability, provided, however, that the test of ability is so devised as to make a clear distinction between mere feats of memory and the actual exercise of original thought.

Now, it must be obvious that for the measurement of anything so complex and multi-dimensioned as the human mind, no single test or scale can be established. One cannot measure the power of visual perception, for example, by the same scale that is used to measure attentiveness or initiative. As a matter of fact, psychologists no longer attempt to classify human abilities as narrowly as was once the popular practice. It is almost impossible for even an expert psychologist to be sure he knows just what qualities and all the qualities any particular test measures. This is because modern psychologists no longer group reactions into general functions such as memory, attention, reason, etc., but simply describe accurately the stimulus given and the conditions under which it was given and then describe just as accurately what the reaction is. The test may be built up, for example, to measure ability to recognize and classify words, but it will also depend upon ability to read the directions, ability to attend closely to horizontal and vertical lines and upon many other correlated abilities. Any test may measure primarily a particular mental dimension or ability but it is quite certain that the resulting score will be influenced by numberless other factors than the one that the examiner is most interested in measuring.

But since one of the very best tests of intelligence is, of course, the degree to which one is able to profit by social contacts and the breadth and variety of the individual’s stored-up impressions, these extraneous or collateral qualities, which every test also more or less successfully measures in addition to the particular quality or mental dimension under direct examination, furnish useful data in arriving at a conclusion which is, after all, the main purpose sought, as to the individual’s actual abilities and potential powers.

In order, however, to get at a really useful record of the mental capacity of an individual, we must apply a variety of tests and out of the sum total of the results of these tests we are able much more accurately to gauge the degree of possession of the qualities for which we are seeking than could possibly be done by any single test, no matter how skilfully constructed. Here again science confronts the popular human demand for a panacea. But just as in medicine only the quack offers a cure-all, so, in other fields, science has no single standard to offer by which all results in a given field may be accomplished, and psychology cannot now or at any time in the future pretend that by a single method or a single measurement mental capacity can be gauged.
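
How the results of several tests might be brought together into one estimate can be sketched briefly in Python. This is only one simple way of doing it, assumed for the sake of illustration and not drawn from this book: each raw score is expressed as its distance from a reference average in units of the reference spread, and the several results are then averaged with equal weight. The test names, averages, spreads, and scores are all hypothetical.

```python
from statistics import mean

def composite_score(raw_scores, norms):
    """Average of per-test standard scores; equal weighting is an assumption."""
    standard_scores = [
        (raw_scores[test] - average) / spread
        for test, (average, spread) in norms.items()
    ]
    return mean(standard_scores)

# Hypothetical reference values for three tests: (average, spread).
norms = {
    "word_recognition": (50.0, 10.0),
    "number_series": (30.0, 5.0),
    "form_matching": (20.0, 4.0),
}

# Hypothetical raw scores of one person on the same three tests.
person = {"word_recognition": 60.0, "number_series": 30.0, "form_matching": 16.0}

composite = composite_score(person, norms)
print(f"Composite standing: {composite:+.2f}")
```

In this invented case the person stands well above the average on one test, exactly at it on another, and well below it on the third. Any one of the three taken alone would give a misleading picture, while the composite places him at about the average.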

To come back to an analogy used in a previous chapter, you cannot measure all the qualities of an automobile with a ten-foot rod. Your ten-foot rod will tell you whether the wheel base is 120 inches or more or less than that. It will not tell you how much above or below 120 inches. If it be necessary for you to know that, you must provide yourself with a longer or more minutely graded measuring implement; but because the ten-foot rod does not at a glance disclose to you all that you wish to know about a particular automobile, you do not, therefore, either discredit the ten-foot rod as a measuring implement or declare that the automobile cannot be measured except by the unaided human eye.

The limitations of the ten-foot rod are perfectly obvious to you; and so, too, are the complexities of the automobile, which require a variety of instruments and tests for their proper gauging and measurement. So before you undertake to form a judgment as to the ability of a particular automobile, you either measure it yourself or, as a matter of practice, you have it measured for you by a competent engineer. You do not necessarily inquire, if you have confidence in the engineer, as to precisely what dimensions and what materials he found in every part of the car, but you respect his conclusions, knowing that they are based upon the most precise and accurate measurements possible with the aid of such instruments as science has been able to devise, and you are satisfied that the conclusions form an accurate estimate of the machine’s qualities.

The engineer who sets out to measure an automobile in all of its capacities and powers must provide himself with tachometers for measuring the engine’s revolutions, dynamometers for testing its tractive force, micrometer calipers for gauging the bore and the stroke, thermometers for measuring its temperature, galvanometers for testing its magneto and battery, and hundreds of other instruments, the readings of which must be assembled and studied by means of complex, comparative mathematical formulas before he can tell you what a particular automobile will do.

The human mind, it must be apparent to every reader, is not less complex than the automobile. On the contrary, it is infinitely more complex and subject to an infinitely wider range of variations. As has been pointed out above, it is not necessary for practical, every-day purposes to measure every possible variation and every one of the infinite number of dimensions of any human mind in order to ascertain the individual’s ability to succeed in the ordinary pursuits of life. But even in our ordinary, every-day affairs and contacts, in the simplest forms of employment, there are called into play such a number of different sorts of ability and mental power that there must be applied, if one is really to know of what a particular individual is capable, a large variety of tests of different kinds for measuring different powers. And for the mental measurement of individuals whose work calls for the highest development and capacity, a still larger variety of tests must be applied.

It is not always possible, and in fact it is extremely difficult, to devise tests that do not to some degree measure the mental content resulting from education and experience, in the effort to measure the mental capacity which limits and controls one’s education and experience. The qualities that determine capacity are inherent in the individual. One is born with them or is not born with them. In their whole infinite variety they are not all possessed by any one individual, and the particular grouping of mental qualities which any one person inherits is probably not possessed by any other person living or who has ever lived. Yet while individuals differ so completely that it can truthfully be said that Nature never cast two persons in the same mold, there are qualities possessed by all intelligent persons, the simpler and more elemental expressions of which are absolutely essential to intelligent life and existence, and these can be so grouped, classified, measured, and standardized as to provide a scale whereby the inherent capacity with respect to these important and essential qualities may be determined equally in the case of the totally illiterate, untrained labourer or artisan and the highly trained, educated product of a university postgraduate course.

As a matter of practical, every-day common sense, one does not expect to find, nor does one find, except as a rare exception, an individual engaged in menial or purely physical labour who is endowed with inherent mental capacity comparable to that of the university graduate. A person possessing such capacities moves out from the ranks of labour in spite of educational handicaps; the history of American business and industry is full of the romantic stories of men who have achieved success as organizers and administrators, though in many cases absolutely illiterate. Properly applied psychological tests would pass over all or nearly all of the acquired knowledge of such individuals about their particular business and related matters, and neglect also the bulk, at least, of the acquired knowledge of the university man, and so compare merely what might be called two naked brains, the native intelligence of each being the only thing to be measured. As has been pointed out, it is difficult or almost impossible to devise tests that entirely strip the layers of acquired knowledge from the raw mental powers beneath them, but for the practical purposes of the application of psychology and psychological tests in the affairs of every-day life, this can be done within a reasonable percentage of error.

CHAPTER IV
STANDARDS FOR MENTAL TESTS

To test or measure mental capacity or any of the dimensions and powers of the human mind, two preliminary steps are necessary.

First, it must be determined what particular powers or qualities of the mind it is desired to measure.

Second, there must be prepared a standard or scale that is, primarily at least, adapted to the measurement of those particular qualities.

While it is, in practice, as has been heretofore pointed out, impossible entirely to segregate a particular mental quality or power from all the other abilities and capacities possessed by a particular individual, it is possible to select certain characteristics or abilities which, by the degree of their presence or absence, give a fair index of certain mental dimensions or capacities, and to devise tests that, when taken together, will measure these “key-abilities” and so reflect the general ability and capacity of the subject. The standards by which the results of such tests are gauged must necessarily, therefore, be such as have been shown, by experiment and experience, to give the closest possible measurement of the individual’s ability in these particular directions, by enabling the examiner to compare each subject’s performance under the test, or series of tests, with the records made under precisely similar tests by individuals and groups of known ability.

Mental capacity tests may be devised that will measure certain mental qualities of an infant who has not yet learned to talk, and by thus providing a comparison between this particular child’s capacities and the average of children of the same age, enable parents and physicians to determine in what direction efforts looking toward its mental development may most helpfully be undertaken. Thus we may test the infant’s power of observation and perception of shapes, of colours, of sounds and familiar objects before it is able to talk, measuring these by standards derived from experience with similar tests applied to a large number of healthy, normal infants, and by this means determining whether the subject is above or below the normal average for its age and if so in what respects.

At the other end of the scale of mental development, let us assume, is the possessor of the degree of Doctor of Philosophy from any of the great universities, since this is the principal degree the possession of which tends to show the possession of unusual mental powers, if not necessarily of wisdom. By applying to a large number of Ph.D.’s tests which are designed to require for their successful performance the utmost use of all their inherent mental abilities, and arriving at an average of performance by tabulating and comparing the degrees or percentages of perfection achieved by all of the individuals so tested, a standard is set up by which to measure the mental capacity of any individual or group of individuals of superior, or presumably superior, intelligence. By such a standard there may be measured also the mental capacity of men and women who have never seen the inside of a university, but whose education has been acquired in the course of their business and professional activities. This is so because what is measured is not acquired knowledge, but the ability to acquire knowledge, which is quite a different thing.
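
The setting up of a standard from the records of a group of known ability, and its use upon a person outside that group, may be sketched in Python as follows. The sketch rests on assumptions not stated in this book: the standard is taken to be simply the average and the spread (standard deviation) of the reference scores, and the names `build_standard` and `standing`, as well as every figure, are invented for illustration.

```python
from statistics import mean, stdev

def build_standard(reference_scores):
    """Summarize a reference group of known ability as (average, spread)."""
    return mean(reference_scores), stdev(reference_scores)

def standing(score, standard):
    """Distance of one score from the reference average, in units of the spread."""
    average, spread = standard
    return (score - average) / spread

# Hypothetical scores of a reference group on one test.
reference = [70, 75, 80, 85, 90]
standard = build_standard(reference)

# Hypothetical score of a person who never attended a university,
# judged by the same standard.
outsider = standing(90, standard)
print(f"Reference average {standard[0]}, spread {standard[1]:.2f}")
print(f"Standing of the outsider: {outsider:+.2f}")
```

The same standard serves for the person within the reference group and the person without it, since what is compared is performance upon the test and not the schooling that preceded it.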

The simplest way to measure the capacity of a circular tank is to pump it full of water and then measure the water as it is drawn off. But it would be absurd to contend that because there has never been any water pumped into the tank it is therefore impossible to determine how much water it would hold. And what the Doctor of Philosophy has got out of his university course is comparable to the water in the tank. The university may have assisted, and if its faculty were competent undoubtedly did assist him, in discovering earlier in life than he otherwise would have discovered the actual capacity of his mental tank. But there are probably as many men of equal mental capacity whose mental tanks have never been filled with the particular kind of intellectual fluid that the Ph.D. carries about with him, whose capacity there is no other means of measuring than by the application of mental tests based upon the known capacities of Doctors of Philosophy.

The process of measuring the human mind is, indeed, precisely like the process by which an engineer measures an automobile, as was pointed out in the preceding chapter. Back of the tests that are applied to the automobile to determine its abilities and capacities there must lie a mass of very definite, exact knowledge of all automobiles or all types of automobiles already in existence and whose capacities and limitations are already definitely known. It is of no service to ascertain that the engine cylinders are of four-inch bore and that the piston has a six-inch stroke, unless it is well known what the possession of a given number of cylinders of that particular bore and stroke signifies as to the ability or capacity of an automobile engine. That knowledge has been acquired by the observation and measurement over a period of years of the performance of many automobiles of varying cylinder sizes and number of cylinders, and the comparison of each size and type with all the others.

Similarly, it is of no service to apply a test of any kind to a human being unless we have, in the first place, determined just what particular abilities or capacities we want to measure, and, in the second place, possessed ourselves of knowledge as to the significance of these capacities, after they have been measured.

Here, again, the reader should keep constantly in mind the warnings set forth in the preceding chapter and try to think of mental abilities and qualities not as detached, separate, sharply defined parts of a mental whole (as the engine, transmission and bearings of the wheels of an automobile are detachable, separate entities) but rather as qualities so intermingled and connected by an infinite number of attachments to all the other mental qualities and abilities that no one particular ability can be measured separately or even positively delimited by any sort of test. Even if this could be done in the case of one individual, the process would have to be repeated in each separate, individual case, as in no two human beings is there found exactly the same combination and correlation of the manifold manifestations of conscious sensation and thought that together make up the human intelligence.

But having determined just what qualities and abilities it is desired to measure, we must set up a standard of measurement by which to compare the indicated ability of each individual examined, or we shall have nothing as a result of our test but a mass of information, of the significance of which we cannot possibly be aware. This standard, for some purposes, may be merely a composite record of the performances of a particular group or class examined simultaneously and under the same conditions. That is to say, if all that is required is to determine which individual of a group has the greatest ability in certain directions (and by inference the greatest capacity for further development along similar lines) then all that is necessary is to apply a test that will give a comparative measurement of the intelligence of this particular group. But if the purpose is to ascertain how a particular individual, or the average of a group of individuals, compares in particular kinds of capacity with the average or the most highly developed persons of the same status, education, occupation, or age, then the standard by which the subject must be measured must be one derived from the observation and measurement of the mental capacities of as large a number as possible of individuals engaged in all sorts of occupations and of all degrees and grades of educational attainment. And even where the purpose is merely to determine the relative qualifications and capacities of a particular limited group, it is as a matter of practice desirable, it might almost be said necessary, to compare the performance of each individual of the group with a standard previously fixed and determined as a result of a much broader series of observations and experiments than can be made within the limits of any group to which it is practicable to apply any given set of tests as a whole.

This is true for two reasons. First, without such an outside standard of comparison all that is determined by the application of even the most carefully devised tests to any group is that certain individuals are more and certain others are less able in particular ways than the average of the group. The net result is of service, but of nowhere near the service of a record of the same individuals’ performances graded in accordance with their approach to conformity with a universal standard. For example, one might take two, three, or a dozen automobiles on a speedway and quite readily determine which was the fastest and which the slowest, but unless one were possessed of certain standards of measurements that in themselves have no relation whatever to automobiles, the net result would be of little consequence and of no value whatever in comparing any one of these cars with another automobile that had not taken part in the particular test. In this case, two standards are requisite, namely, distance and time. The length of the course must be definitely ascertained. The time required for each automobile under test to cover the course must be accurately recorded.

Now we have a record of performance that compares at all times with universal standards. If we add another automobile to the group we do not need again to run all the cars, including the new one, along the speedway to determine where the added member of the group ranks with reference to the others; we can apply to it alone a test based upon the universal standards of time and distance with which we have already compared the others, and the new one falls instantly into its proper rank among its fellows. So, too, we are enabled by this means to compare any member of the group with any automobile anywhere in the world, the performance of which has been gauged by these same universal standards of time and distance, and we are thus able to tell, not only how each particular car ranks with reference to the limited group of cars, but how it ranks with reference to all cars of all kinds or of a particular type so far as these have been tested by the universal standard.
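
The speedway illustration can be put in a few lines of code. The sketch below is only an illustration: the car names, the course length, and the times are invented, and the function name `speed_mph` is hypothetical. It shows that once each run is recorded against the universal standards of distance and time, a newly added car can be ranked without running the others again.

```python
# Illustrative sketch only: the cars, course length, and times are invented.

COURSE_MILES = 2.0  # assumed length of the measured speedway course


def speed_mph(distance_miles, seconds):
    """Convert a recorded run into miles per hour."""
    return distance_miles / (seconds / 3600.0)


# Times (in seconds) already recorded for the original group.
recorded = {"car_a": 120.0, "car_b": 90.0, "car_c": 144.0}
speeds = {name: speed_mph(COURSE_MILES, t) for name, t in recorded.items()}

# A new car is timed alone over the same measured course.
speeds["car_d"] = speed_mph(COURSE_MILES, 100.0)

# Rank every car, fastest first, without rerunning the earlier ones.
ranking = sorted(speeds, key=speeds.get, reverse=True)
print(ranking)
```

Because every speed is expressed in the same units, the same figures could equally be compared with those of any car timed elsewhere over a measured course.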

So in testing groups of individuals as to their intelligence or mental capacity, the use of universal standards of comparison makes the relative grading of the members of the group with reference to each other just as easy and simple as though the only standard were that of the group’s collective performance, and at the same time furnishes a record of the performance of each individual member of the group by which he or she may be readily compared with the members of any new group to which he or she may be at some subsequent time attached, and at all times with the general run of men or women of the same or differing social, economic, vocational, or educational status.

It is in the determination of these universal standards and the preparation of tests, the results of which indicate the individual’s relative approximation to these standards, that the scientific training of the psychologist comes principally into play. Rough standards for testing the more obvious mental capacities might be set up by any intelligent person who would take the pains to collect the essential data. These standards would not, however, be universal unless they were based upon research and experimentation covering as broad a field as that in which the psychologists have been working for many years. Nor would they, except by accident, be as simple and as accurate as the universal standards compiled by scientifically trained persons. For just as the average untrained individual cannot form an accurate or even an approximately accurate estimate of another person’s character and abilities by observation alone, so persons untrained in the study of the human mind are prone to be misled by the obvious and to lay undue emphasis upon external indications which do not, as a matter of scientific fact, actually signify what they are popularly believed to indicate. The scientific psychologist’s training enables him to eliminate to a large extent the non-essentials and to include, in the establishment of standards of mental measurement and the preparation of tests or methods of applying these standards, many facts which, to the untrained mind, do not at once present themselves as important elements.

Even in the simplest of mechanical operations every workman knows that it is not safe to trust to the accuracy of homemade measuring implements. In the absence of a try-square made by a responsible manufacturer in conformity with the universal standard right angle, even the most expert carpenter will refuse to run the risk of error until he has either obtained a new standard from the hardware store or by the application of geometrical science and the exercise of careful and painstaking technical skill constructed for himself a new try-square that conforms, without the variation of a hair’s-breadth, to the universal standard to which he must work. Still less would a good machinist undertake to gauge the close tolerances of an automobile bearing with a homemade micrometer. He knows it is not sufficient merely to have a perfect fit of this particular bearing, which might be worked out by rule of thumb, but that it is essential that the dimensions of the bearing, down to within a thousandth of an inch, must conform to the universal standards for automobile bearings, and that the best implement with which to test the degree of conformity to the universal standard is the standardized micrometer, prepared by specialized methods and produced only by the exercise of highly trained technical skill. Once given such implements of precision, any good workman can readily apply all the scientific intelligence that went into the devising of the standards and the preparation of the methods of applying them.

So, once there are at hand scientifically devised standards with which the mental qualities of any individual may be gauged and compared, and tests have been prepared for the scientific measurement of these qualities with reference to the established standards, the application of these tests to individuals may be made by anybody sufficiently intelligent to grasp their purport and follow directions exactly. It is not necessary, in other words, even for the testing of the most complex and highly developed mental powers, that the actual application of the test be made by the scientific psychologist. It is possible, and it has been the purpose in the preparation of the tests which are presented in this book, to devise mental tests which, if applied precisely as indicated in the instructions accompanying them, will yield the same results in the hands of the wholly untrained examiner as though the actual administration of the tests had been made by the scientist who devised them.

It must not be thought that the result of any test is always 100 per cent. accurate. Even good workmen sometimes make errors in the use of the most precise scientific instruments. Even though constructed with the most painstaking care, according to the truest scientific formulas and by men of the highest technical training and skill, the mechanical instruments of precision are occasionally found to be inaccurate. If this is the case with material implements and dimensions which are finite, concrete, and tangible, how much greater is the liability to error in dealing with the intangible, infinite, and more or less abstract qualities of the human mind. The scientific psychologist is, after all, merely another human being, and as such equally liable with all other human beings to human error. Of no man or woman can it be said that he or she is infallible, and as every one who applies a psychological test is human, and so liable to error either in its application or the reading of its result, conclusions drawn from the results of any particular test should be accepted as accurate only when they have been checked by the results of other tests applied to the same subject, and substantial conformity of the results of one to those of the others has been obtained. For this reason, among others, no single test can be expected to yield definite and complete information as to any particular individual’s mental capacity or ability, whether gauged by the universal standard or by group comparison. It has, therefore, been necessary to establish, as preliminary to the preparation of the Mentimeter tests, a variety of standards, and to prepare a considerable number of tests under each of these standards, all or most of which must be used in each instance if anything approaching scientific accuracy is to be reflected in the resulting scores.
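
The rule of accepting a conclusion only after several tests substantially agree can be sketched as follows. The scores, the scale of 0 to 100, and the tolerance of ten points are assumptions chosen for illustration; the text does not say how close the results must be.

```python
# Illustrative sketch: scores and tolerance are assumed, not taken from the text.

def substantially_conform(scores, tolerance=10.0):
    """Return True when every score lies within `tolerance` of the mean."""
    if len(scores) < 2:
        return False  # a single test cannot be checked against others
    mean = sum(scores) / len(scores)
    return all(abs(s - mean) <= tolerance for s in scores)


consistent = [72.0, 68.0, 75.0]    # three tests that roughly agree
inconsistent = [72.0, 40.0, 75.0]  # one result far from the others

print(substantially_conform(consistent))    # True
print(substantially_conform(inconsistent))  # False
```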

As has previously been pointed out, however, the scientific method is incomparably freer from the liability to error than any method of determining human ability and capacity that depends upon unaided personal observation. How completely this has been demonstrated in practice in a wide range of fields is set forth in subsequent chapters. To yield results of maximum accuracy, however, scientific mental tests must be used only with reference to the standards on which they are based.

Lest it has not been made clear already to the reader how the method of establishing mental standards of comparison operates, let us again briefly try to point out just what is meant by a universal standard of mental capacity.

It is a comparatively simple matter, involving merely a considerable amount of painstaking search and the expenditure of a good deal of time, to find, let us say, a thousand engineers, each of whom has demonstrated in the course of his professional practice that he possesses unusual ability to project and design bridges and viaducts. Let us suppose that we wish to take the average capacity of these thousand engineers as the standard by which to measure every budding engineer in the technical schools with reference to the capacity of each to become a planner and designer of bridges and viaducts.

The scientific psychologist must first familiarize himself with the essentials of that combination of artistic, technical, and mathematical skill which makes a great engineer. This is not a simple or easy task to begin with, and to accomplish it calls for the exercise of highly trained mental powers on the part of the investigator as well as a thorough understanding of the operation of the various processes of the human mind. Then there must be devised methods by which, as simply and yet as precisely as possible, each of these thousand engineers of known capacity may be tested as to the degree in which he possesses the various abilities, the sum total of which is the measure of his capacity as an engineer. It may be necessary to make these tests over a period of years, and the tests themselves may and probably will require frequent revision and amendment as it is found in the course of their application that some of them are unnecessary and others inadequate. If it is found that any of the tests so applied is readily fulfilled by every subject examined, the effort is made to increase the difficulty of the test, until it has reached a stage where the perfect performance of all its requirements is barely within the reach of the ablest and most competent of all the engineers under examination. Indeed, some of the tests may be so difficult that none of those examined may conform precisely to the set requirements. In respect of some classes of tests this is, in fact, desirable, as what is being sought is an average of group capacity, and if any considerable percentage exhibit a capacity greater than can be measured by the tests set there arises an element of doubt as to the accuracy of the average combined score, since some of those contributing to it have obviously greater mental ability than can be measured by the particular scale used.
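
One way to picture the check described above is to count how many of those examined reach the top of the scale. In the sketch below the scores, the maximum of 50, and the threshold of five per cent are all hypothetical; the text gives no figures.

```python
# Illustrative sketch: all numbers are invented for the example.

def ceiling_fraction(scores, max_score):
    """Fraction of subjects who reached the maximum possible score."""
    return sum(1 for s in scores if s >= max_score) / len(scores)


def needs_harder_items(scores, max_score, threshold=0.05):
    """Flag a test when too many subjects exhaust the scale."""
    return ceiling_fraction(scores, max_score) > threshold


easy_test = [50, 50, 48, 50, 47, 50, 49, 50, 50, 46]
hard_test = [41, 38, 44, 29, 35, 47, 40, 33, 36, 42]

print(needs_harder_items(easy_test, 50))  # True
print(needs_harder_items(hard_test, 50))  # False
```

A test flagged in this way would be made more difficult and tried again, since an average built on scores that exhaust the scale understates the ability of the ablest subjects.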

Once, however, tests have been applied to the supposititious thousand expert engineers, and the performance of each of them in each test has been given its proper place in the scale, and an average struck, there has come into existence a preliminary standard, which, however, before being offered for general use in the testing of engineering students and others, must first be tried out by experimental application on as many individuals and groups as are available, and their performance with reference to the standard checked up by all other means available. It may be, and quite frequently is, the case that this preliminary try-out of a standard results in the elimination of some of its elements, the modification of others, and the necessary preparation of a new series of tests based upon the altered standards. But in this fashion, in the course of time and as the result of the combined effort of many trained minds, there is at last set up a standard which is substantially universal in its application, and by which it may readily be determined whether or not any particular individual possesses the mental capacity and particular abilities that have been found to be necessary if he is to develop into a competent engineer.
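
The striking of an average to form a preliminary standard can be sketched as below. The test names, the scores, and the idea of reporting a candidate's score as a fraction of the standard are assumptions for illustration, not the procedure actually used in preparing the Mentimeter tests.

```python
from statistics import mean

# Hypothetical scores of known experts on two hypothetical tests.
expert_scores = {
    "spatial": [40, 44, 36, 48],
    "arithmetic": [30, 34, 26, 30],
}

# Preliminary standard: the average expert performance on each test.
standard = {test: mean(scores) for test, scores in expert_scores.items()}


def relative_to_standard(candidate, standard):
    """Express each candidate score as a fraction of the standard."""
    return {test: candidate[test] / standard[test] for test in standard}


student = {"spatial": 21, "arithmetic": 30}
print(standard)
print(relative_to_standard(student, standard))
```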

As psychological tests are more and more widely applied and there is consequently accumulated an increasing volume of data which can be collected, classified, and compared, standards become either more firmly established as a result of experience or subject to modification in the light of the wider range of knowledge. In science nothing is final. What psychology offers to-day is a method of mental tests, the soundness of which in principle is unchallenged, though the application in detail of these principles is subject to constant improvement and refinement.

CHAPTER V
DIFFERENT TYPES OF MENTAL TESTS

The character of any mental test or series of tests is determined primarily, of course, by the purpose for which the test is applied, and, secondarily, by the known or obvious mental limitations of the individual under examination.

Mental tests thus classify themselves, in the first instance, into as many different classes as there are specific purposes to be served by their use, particular kinds or classes of mental ability and capacity to be ascertained, or degrees of previously known mental limitations. Each one of these classifications cuts across all other classifications at some point, so that it is, as a matter of practice, impossible to tabulate or catalogue mental tests in such a way as to separate them into sharply defined or permanently detached groups or classes.

Broadly, all mental tests subdivide at first into tests devised for use with persons of normal mental capacity and development and tests for intelligences that are not fully developed. This is, perhaps, the chief permanent and fixed classification of intelligence tests that can be made, for in a group of tests for the sub-normal mind would be included the entire series of tests adapted for the examination of the mental powers of children of all ages, from earliest infancy to maturity. In fact, the standard method of rating or grading adults of undeveloped or sub-normal intelligence is to classify them by mental age, that is, by the age of the normal children whose performance theirs most nearly matches.

Thus, a man or woman of twenty-five who is able to make a high score in tests which are passed successfully by normal children of eight, but who fails when subjected to tests which a normal child of ten should pass easily, is rated approximately as of mental age nine.
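
The rating in this example can be sketched as a simple rule. The function below is a hypothetical simplification that places the mental age midway between the highest age level passed and the lowest age level failed above it; it is not presented as the scoring method of any published scale.

```python
# Illustrative sketch: a simplified rule, not a published scoring method.

def estimate_mental_age(results):
    """Estimate mental age from a mapping of {age_level: passed}.

    Takes the midpoint between the highest age level passed and the
    lowest age level failed above it.
    """
    passed = [age for age, ok in results.items() if ok]
    if not passed:
        raise ValueError("no age level was passed")
    highest_passed = max(passed)
    failed_above = [age for age, ok in results.items()
                    if not ok and age > highest_passed]
    if not failed_above:
        return float(highest_passed)  # no failure observed above this level
    return (highest_passed + min(failed_above)) / 2


# The adult in the text: succeeds at the eight-year level, fails at ten.
print(estimate_mental_age({8: True, 10: False}))  # 9.0
```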

Cutting across this classification is the arbitrary classification of tests adopted in the psychological work of the United States Army, in which every officer and enlisted man is classified as to his relative intelligence by means of scientific mental tests. The Army tests are of three principal kinds. There is a series of tests, known as the Alpha, designed to measure the intelligence of individuals who can read and write the English language. For those who are either illiterate or whose ability to read or write is confined to some language other than English, there is the Beta series of tests. These may register as high a degree of intelligence as the Alpha tests; the results are merely not expressed in terms of the English language. The third classification in the Army is the individual tests, applied to those who fail to make a satisfactory score under either the Alpha or the Beta tests. This is, in its Army application, a system of tests for the sub-normal adult intelligence. Thus the broad classification first set forth above holds, in substance, in the classification of the Army tests.

Under each of these two broad classifications, and particularly under the first (since in general, every-day practice it is of little service to undertake to analyze minutely the capacities and limitations of the sub-normal mind except in the application of these tests to growing children) there are many possible subdivisions of mental tests, based upon the particular mental qualities which it is desired to measure.

First and most useful generally are general intelligence tests, which must usually be subdivided into a series of related tests. Then, for varying purposes, such as the examination of candidates for particular classes of employment requiring special ability or capacity, there may be applied speed tests, accuracy tests, perception tests, coördination tests, memory tests, mathematical tests, and a wide variety of others. These are tests which primarily measure the subject’s ability to perform certain specific acts under pre-determined conditions, the determination of capacity in excess of that actually demonstrated under test depending upon the facility and accuracy with which the subject responds to the conditions of these tests. Of course, every scientific mental test is based upon the performance of certain acts, since it is only through action of some sort, whether by speech, writing, or the performance of a manual operation, that any one is able to express his mental ability at any time.

But while it is relatively a simple matter to devise tests that satisfactorily indicate the subject’s possession of the more obvious mental powers indicated by such tests as those last listed above, there is another class of mental tests, designed primarily to indicate or determine the possession of the more abstract qualities, the manifestation of which through the individual’s simple and ordinary actions is less obvious to the untrained observer. This is the class of tests that are designed to measure the degree in which an individual possesses such qualities as moral sense, form perception, the power to reason from cause to effect, poetic discrimination, ability to understand complicated instructions, judgment, and a sense of the right relationship of things and ideas. It is as important, if one is to arrive at a true measure of any individual’s mental capacity, that he be tested as to his possession of these more or less abstract qualities, as it is to determine his possession of concrete abilities. In other words, the normal mind of an intelligent adult is capable of dealing intelligently with ideas and abstractions. The mentality that does not respond with a certain degree of readiness to ideal conceptions is to that extent sub-normal. The only possible way of determining the possession of unusual or super-normal mental capacity is by means of the demonstration that its possessor grasps readily and responds unhesitatingly to the presentation of abstract concepts.

The demonstration itself must, of course, be concrete. Unless the individual possessing extraordinary mental power is able, as Kipling phrases it, to

. . . . press the logic of a fact
To its ultimate conclusion in unmitigated act,

it is of no social consequence whatever that he may possess the mental catholicity of a Shakespeare. There is no place in the modern world for “mute, inglorious Miltons.”

Indeed, it may be questioned whether a “mute, inglorious Milton” ever existed. The world is full of people who regard themselves as “unappreciated.” Everyone is familiar with the unfortunate type that is forever seeking sympathy, constantly on the lookout for friendly shoulders on which to sob out the sad tale of the world’s harshness. Under psychological tests the preponderant majority of this type of individual is clearly demonstrated to be mentally deficient or sub-normal in some important respect. The occasional individual of normal mental capacity who fails to demonstrate that capacity by the performance of specific acts is merely mentally lazy. In other words, it may be set forth as a sound conclusion, capable of scientific proof, that mental capacity in the healthy, normal individual always finds means of expressing itself in concrete and socially useful ways, whenever its possessor actually desires so to utilize his mental powers.

In the devising and preparation of tests intended to measure the less obvious of the mental powers, a considerable degree of ingenuity and the greatest amount of scientific care and technical skill are required. To the person untrained in psychology, tests designed to measure the possession of the more abstract powers frequently look childish, if not positively silly. Since it is essential, in the case of Army officers and men, to determine as nearly as it may be done by simple and easily applied tests their possession of a wide variety of mental qualities, some of the elements of the Army Alpha test appear to the concrete type of mind to be futile, if not absurd. But any comprehensive system of mental tests must include, as there have been included in the Mentimeter tests presented in this volume, a considerable proportion which do not on their face appear to be directed toward the disclosure of the ordinary and useful mental capacities. It is of vital importance, if the results of any given series are to give an adequate picture of the actual abilities and possibilities of the subject examined, that tests of this character be included among them.

Each of the possible classes of mental tests may be set to any one of an infinite number of standards. General intelligence tests, for example, may be set to the standard of the average university graduate, so that the result when applied to any individual gives a fair estimate of the subject’s intelligence as compared with that of those who have demonstrated the possession of mental capacity sufficient to complete satisfactorily a university course. Or the standard may be that of the average lawyer, the average high school pupil, the average normal child of any age or school grade, the average skilled mechanic, the average labourer, or the average child below the age of speech. And, in practice, what is measured is, after all, general intelligence.

Intelligence, as has been frequently pointed out, while it does not depend upon the individual’s ability to read and write, is so generally accompanied by the definite and intimate knowledge of the symbols which we call letters, words, and figures, and of their meaning, that in the great majority of cases in which it is desired to apply the test of intelligence this can best be done, or at least most readily be done, by the use of these familiar symbols; in other words, by tests which involve only the acts of reading and writing. If intelligence may be defined as the intellectual power of adaptation to environment, a complete test of intelligence determines the individual’s ability to recognize the situation in which he finds himself, perceive his own relation to the situation, analyze it, and arrive at a conclusion as to what he should do next; then put that conclusion into effect by means of a concrete act. Thus one may learn a great deal about an individual’s mental capacity by observing his conduct when he misses a train. But since it is not practicable to apply this method of inquiry in every case, the next best thing is to ask the question, “What would you do if you missed your train?” To answer it, the subject must exercise his sense of reality upon the situation, size it up, and plan his reaction.

Since all life is made up of situations in which the individual places or finds himself and from which he must extricate himself, and since the broader the mental capacity, the more easily will the individual meet situations as they arise, the ideal mental test is one that presents a situation such as does or might occur in real life, and requires the subject to extricate himself, or at least to indicate his first and immediate impulse toward action should such a situation arise.

Since the purpose of mental tests is primarily to determine intelligence rather than the possession of physical qualities, it is conceivable that, in many situations, properly devised questions may give a fairer view of the subject’s mental capacity than would observation of the same individual in action in a real situation. Thus a person of the highest intelligence and mental capacity might be deficient in physical courage, so that if we could observe him in action on unexpectedly meeting a highway robber armed with a revolver we might be able to deduce from his actions absolutely no criteria upon which to form a sound judgment as to his mental powers; the same subject, asked the question, “What would you do if held up by a footpad?” might exhibit in his answer unusual ability to perceive quickly and reason soundly to an intelligent conclusion—in other words, to demonstrate his possession of considerable mental capacity.

All properly constructed mental tests are, therefore, in effect, attempts to reproduce or project upon a laboratory scale situations such as the subject is or may be called upon to meet in actual life. It is obvious that ability to analyze quickly and propound immediately the correct course of action when the situation presented is unusual and outside the range of every-day experience indicates clearly the possession of mental ability greater than is required to meet only ordinary and familiar situations. The theory of the mental test as a reproduction in miniature of actual situations is thus commented on by Daniel W. La Rue:

“It is useless to ask a savage what he would do if he missed his train, or an old bachelor what he would do when the baby cried, or a green soldier how he will behave when a shell bursts near him. Further, just which of many millions of situations are so important, or so typical, or so closely correlated with a web of others, similar or dissimilar, that they should be admitted among the select few that form a test? The answer is coming as a slow deposit from the stream of experience and experiment.”

Doctor La Rue, pursuing the same theme, points out with sound philosophy the necessity for grading mental tests to fit the apparent or previously known mental level of the subject.

“We must beware how we use a high-level test to measure low-level intelligence. If our scales are set to weigh nothing less than a hundred pounds or upward, we cannot tell accurately the weight of an eighty-pound man. In particular, since devisers of tests are usually expert in the use of literary symbols, and since ordinary test conditions limit seriously the possible variety of responses open to the subject, we slide easily into the belief that a dextrous manipulation of symbols is the prime display of intelligence. No doubt it is true that in an ideally developed brain the language centres (tracts) are well webbed up with every other trait-tract. Ideally, to experience anything is to be able to utter it. But the stammering lover is matched by the stammering thinker, and there certainly may be intelligent action without the power to put it adequately into words. Probably Cæsar is the only great general who could describe a battle as finely as he could plan it or fight it. Words without deeds, deeds without words: we must be prepared for both. Our old test question, ‘Why should we judge a person by what he does rather than by what he says?’ applies to the test itself.”

Because a percentage of persons, either through unfamiliarity with the English language or through lack of skill in expressing themselves through word and number symbols, do not respond to tests based on the use of words, any comprehensive scheme of mental tests must contain a proportion of tests the response to which may be made without the use of written, printed, or spoken words. Of such a nature were the Army Beta tests, already referred to, and there will be found in the Mentimeter tests presented in this volume a considerable number of forms that fall into this class of tests. To the person accustomed to dealing chiefly with words and ideas, it is not always readily apparent that proof of a high degree of intelligence can be obtained by means of tests which do not employ these familiar symbols. As a matter of practical fact, however, results which check up very closely with every other means of determining the subject’s intelligence were quite uniformly obtained through the use of the Beta tests in the Army, and similar success has been achieved through the application of tests of the same general character in industry and education.

There is another general class of tests to which only passing reference need be made here. This is the class of trade tests, in which by a combination of oral examination and specific performance the precise ability or degree of skill of the subject in a given occupation or trade is determined. Although frequently confused with psychological tests, this class of tests does not properly come within the scope of mental tests in the sense of being chiefly measurements of intelligence. It has been found, however, in practice that the individual’s native intelligence or inherent mental capacity has, in most occupations, a very decided bearing upon the degree of skill which he or she can attain, even in the simpler mechanical operations. Because of this fact, as well as because the value of trade tests in industry is of increasing importance, some of the principles underlying the construction of trade tests and their application are discussed briefly in a later chapter.

CHAPTER VI
MENTAL TESTS IN THE ARMY

The United States of America entered the World War under conditions of emergency which demanded the maximum of efficiency in the work of military preparation, with the minimum of effort. France was virtually broken; England was tired; Russia was demoralized and disrupted, and Italy was doing very little more than holding her own. The mere drilling and conditioning of the nearly three millions of men which the Nation had called to arms were not sufficient to meet the requirements of the task assumed. America was expected to develop, almost overnight, a fighting force capable of meeting and defeating a Teutonic military machine which had come to be known as the most powerful and skillful in the world.

The gravity of the situation forbade experiments with hit-or-miss methods. It was imperative that no round pegs be placed in square holes. Each one of those nearly three million American soldiers had to be placed where he would be of greatest service. Some simple, quick method of distribution was needed. It was perfectly obvious that these men could not be equally good material for soldiers or officers. Out of so great a number it was reasonably certain that men could be found especially qualified to perform each one of the particular tasks which the infinitely complex scheme of organization of a modern army requires.

It was in accordance with the law of probabilities that there would be contained in this mass of soldier material men highly skilled in every one of the more than seven hundred distinct and specific trades and handicrafts in which artisans were needed for the successful maintenance of the fighting forces in the field. The drag-net of the selective service system was certain to gather in its meshes men who were natural leaders and many more men who could only follow. From every city block, every crossroads hamlet, every village street would come those who could teach and those who could only learn. It was inevitable, moreover, that in this huge aggregation of human beings there would be a percentage of the wholly unteachable, the mentally stunted, fit only to be hewers of wood and drawers of water and sure to be a detriment and handicap to any military organization whatsoever.

In a lesser degree the same generalizations applied to the human raw material admitted to the various officers’ training courses; even though a fairly high minimum of educational attainment was required of all candidates, there was bound to be a wide range of military value between the best and the poorest of this officer material.

Psychology, the science that deals with the human mind, offered the only possible short-cut to the ultimate goal of the placement of every individual in the Army at the point where his efficiency would be greatest. The processes of the selective draft had weeded out the larger portion of the physically unfit. The draft questionnaire, as finally revised, provided for a rough preliminary classification of men according to their own estimates of themselves. But something more was needed—some system for passing the entire Army, officers and men, through a series of graduated sieves, as it were, so cunningly devised, and operated with such scientific precision as to tag, label, and index each and every one so exactly that as little as possible would be left for experience to disclose as to his qualifications for his particular part of the Army’s job.

On April 6, 1917, the United States Congress declared the existence of a state of war with Germany. On that same date there was being held in Boston a meeting of a group of psychologists known as the “Experimentalists,” among whom was Dr. Robert M. Yerkes, president of the American Psychological Association. On receipt of news that America was at last at war, all regular business of the meeting was suspended and those present resolved themselves into an informal committee to consider ways and means by which the psychologists of America could best serve their country.

On the evening of that day, as the result of many conferences, the president of the association asked the council to authorize him to appoint committees on various phases of applied psychology for the purpose: first, of enlisting the coöperation of every trained psychologist in America, including the entire membership of the American Psychological Association; and, second, of determining precisely what service the psychologists could best perform. The proposal met with an immediate response and Doctor Yerkes and his committee went to work.

The Army General Staff was skeptical at first, but Doctor Yerkes and his associates overcame this skepticism and by midsummer of 1917 the Division of Psychology of the Medical Department of the United States Army, with Doctor Yerkes at its head with the rank of major, was actively functioning, and the Committee on Classification of Personnel in the Army had been established and was demonstrating, to the surprise of the General Staff and the War Department, the possibility of determining by scientific means the relative military value and proper military assignment of the officers and men of the Army. By the end of 1917 psychology, as applied to war, had so far justified itself that the Surgeon General reported complete success in achieving the desired results, which he stated, concisely, to be: (a) to aid in segregating the mentally incompetent, (b) to classify men according to their mental ability, and (c) to assist in selecting competent men for responsible positions.

The programme of the Division of Psychology of the Medical Department included mental tests for all recruits during a two-weeks detention period. These intelligence ratings, as they were officially termed, aimed to aid:

(1) In the discovery of men whose superior intelligence suggested their consideration for advancement;

(2) In the prompt selection and assignment to development battalions of men who were so inferior mentally that they were suited only for special assignments;

(3) In forming organizations of uniform mental strength where such uniformity was desired;

(4) In forming organizations of superior mental strength where such superiority was demanded by the nature of the work to be performed;

(5) In selecting suitable men for various army duties or for special training in colleges or technical schools;

(6) In the early formation of training groups within a company in order that each man might receive instruction and drill according to his ability to profit thereby;

(7) In the early recognition of slow-thinking minds which might otherwise be mistaken for stubborn or disobedient characters;

(8) In eliminating from the army those men whose low-grade intelligence rendered them either a burden or a menace to the service.

In three systems of tests in use between May 1 and October 1, 1918, in the United States Army, approximately one million three hundred thousand men were tested.

The test first applied to all, men and officers, who could read English, was known as the “Alpha.” This was a group test. It required only fifty minutes and could be given to groups as large as 500. The test material was so arranged that each of its 212 questions might be answered without writing, merely by underlining, crossing out or checking. The papers later were scored by means of stencils, so that nothing was left to the personal judgment of those who did the scoring. The mental rating which resulted therefore was wholly objective.
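The stencil principle can be illustrated with a short sketch in Python. It assumes, purely for illustration, that each item has one correct mark and earns one point; the item numbers, the marks, and the names STENCIL and score_paper are hypothetical and are not taken from the Army materials.

```python
# Hypothetical answer key ("stencil"): item number -> the one correct mark.
# These items and marks are invented for illustration only.
STENCIL = {1: "b", 2: "d", 3: "a", 4: "c", 5: "a"}


def score_paper(marks: dict[int, str]) -> int:
    """Count the items whose mark matches the stencil.

    Items left blank (absent from `marks`) earn nothing. No judgment by
    the scorer enters: the same paper always yields the same score.
    """
    return sum(1 for item, correct in STENCIL.items() if marks.get(item) == correct)


paper = {1: "b", 2: "a", 3: "a", 5: "a"}  # item 4 left blank
print(score_paper(paper))                 # 3
```

Whoever lays the key over the paper arrives at the same count, which is the sense in which such a rating may be called objective.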

The “Beta” test was used for foreigners and illiterates. It could be given to groups of from 75 to 200 and required approximately fifty minutes. Success in the Beta test did not depend upon knowledge of English, as the instructions were given entirely by pantomime and demonstration. It measured general intelligence through the use of concrete or picture material instead of the printed language. It also was scored by stencils and yielded an objective rating.

Both the Alpha and the Beta tests were known as Group tests because of the large number of men to whom they could be given simultaneously. Those men who failed in the Group tests were given Individual tests in which the instructions were given by a trained psychologist working with one soldier at a time in a quiet private office. These Individual tests were of two sorts: one for men who understood English, and the other for men without education and frequently without knowledge of the English language. The Individual tests served as a check upon the Group tests which had preceded them. No man was recommended for discharge or for labour battalions until after he had been individually examined by a psychologist who spent from a half hour to an hour and a half with him, attempting to determine whether or not the results of the Group tests could be relied upon.

To determine the relative intelligence of five hundred men in fifty minutes by a method so completely objective that no part of the resulting classification is based on the individual judgment or opinion of either the examiner or any of the men themselves is certainly a practical application of psychological science. Simple as the Alpha test was, its practical working out and reduction to an exact scientific formula was the work of hundreds of highly trained minds for many months. In its concrete application it looks like a children’s game, but the results are so reliable as to be almost uncanny in the precision with which they tally with the conclusions reached in the same cases as a result of long and intimate observation.

(For full details of the Alpha test the reader is referred to Appendix B to this volume.)

The highest score a man could make in the Alpha test was 212. This is an absolutely perfect score, a correct answer or response to every one of the 212 questions or examples; but any man who made a score above 135 was given the highest possible rating, Grade A, in the mental schedule. There were seven ratings in all: A, above 135; B, which included those making 104 to 135; C, plus, which took in those down to a score of 75; C, for those scoring from 45 to 74; C, minus, for those with scores of 25 to 44; D, for the ones who gave from 15 to 24 correct answers; and D, minus, for those who were unable to answer correctly more than 14 out of the 212 questions.
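The schedule just described amounts to a simple table of cut-offs, which may be sketched in Python as follows. The cut-offs are those quoted in the paragraph above; the short labels such as "C+" for "C, plus" and the names ALPHA_CUTOFFS and alpha_rating are conveniences assumed for this illustration.

```python
# Lowest score that earns each rating, as quoted in the text, highest first.
# "A, above 135" is read here as a score of 136 or more.
ALPHA_CUTOFFS = [
    (136, "A"),
    (104, "B"),
    (75, "C+"),
    (45, "C"),
    (25, "C-"),
    (15, "D"),
    (0, "D-"),
]
MAX_SCORE = 212  # a perfect paper


def alpha_rating(score: int) -> str:
    """Return the letter rating for an Alpha score between 0 and 212."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must lie between 0 and {MAX_SCORE}")
    for floor, rating in ALPHA_CUTOFFS:
        if score >= floor:
            return rating
    raise AssertionError("unreachable: the last cut-off is 0")


for s in (212, 135, 80, 60, 30, 20, 9):
    print(s, alpha_rating(s))
```

Listing the cut-offs from the highest downward lets the first match decide the rating, so each boundary is stated once only.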

Now for the proof! Here is an official report of one of many comparisons made between the results of the psychological tests and the actual observations and personal knowledge of men by their officers.

The commanding officers of ten different organizations, representing various arms of service in one camp, were asked to designate (a) the most efficient men in their organizations, (b) the men of average value and (c) the men so inferior that they were barely able to perform their duties. The officers had been with these men from six to twelve months and knew them exceptionally well. The total number of men rated was 965, about equally divided between the three classifications.

After the officers’ ratings had been made, the men were given the Alpha test, and the comparison of results showed that the average score recorded in this test by the men the officers had graded as “best” was approximately twice as high as that of the men the officers termed their poorest. Of men scoring C, minus, in the Alpha test, 70 per cent. were those classed by the officers as their poorest men and only 4.4 per cent. were among those whom the officers regarded as best. Of all the men whose scores were above C, plus, 55.5 per cent. had been graded by their officers as their best men and only 15 per cent. as among their poorest soldiers.

In another camp 765 men of a regular infantry regiment, who had been with their officers for several months, were graded by their officers in five classes, according to their practical military value. Seventy-six of these men were rated either A or B by the Alpha test; all but nine of these had been graded “one” and “two” by their officers, and none of them had been placed in the lowest grade.

Out of 238 of these soldiers who scored D or D, minus, in the psychological test, all but eight had been placed in the three lowest grades by their officers. The psychological ratings and the ratings of the company commanders were identical in 49.5 per cent. of all cases. In 88.4 per cent. of the cases the agreement was within one step, and in only seven tenths of 1 per cent. was there a disagreement between the psychological test results and the officers’ ratings of more than two steps.
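Figures of this kind (identical ratings, agreement within one step, disagreement of more than two steps) can be computed from paired ratings as in the following Python sketch. It assumes that both ratings have already been placed on one common numbered scale of steps, which the report does not describe; the function name agreement_summary and the eight sample pairs are invented for illustration and are not Army data.

```python
def agreement_summary(test_grades: list[int], officer_grades: list[int]) -> dict[str, float]:
    """Compare two ratings of the same men, both on one numbered scale.

    Returns the fraction of men rated identically, the fraction whose two
    ratings differ by at most one step, and the fraction differing by
    more than two steps.
    """
    if not test_grades or len(test_grades) != len(officer_grades):
        raise ValueError("need two non-empty lists of equal length")
    gaps = [abs(t - o) for t, o in zip(test_grades, officer_grades)]
    n = len(gaps)
    return {
        "identical": sum(g == 0 for g in gaps) / n,
        "within_one_step": sum(g <= 1 for g in gaps) / n,
        "more_than_two_steps": sum(g > 2 for g in gaps) / n,
    }


# Made-up ratings for eight men (1 = highest grade, 5 = lowest).
by_test = [1, 2, 3, 4, 5, 3, 2, 4]
by_officer = [1, 3, 3, 4, 2, 2, 2, 5]
print(agreement_summary(by_test, by_officer))
# {'identical': 0.5, 'within_one_step': 0.875, 'more_than_two_steps': 0.125}
```

The "within one step" figure includes the identical cases, which is why it is always at least as large as the first figure.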

Here is another comparison. Sixty company commanders each named his ten best and ten poorest privates. Without any knowledge on the part of the psychological examiners in this or in any other of the comparative tests as to the ratings the officers had given the men, the Alpha gave the grade of D or D, minus, to 57.5 per cent. of those picked as the poorest and placed all but a fractional percentage of 1,118 men in the same classes in which they had been placed by their officers on the basis of observation and experience.

Those who failed in the Group tests were given individual attention by the clinical psychologist. The examination here was frequently by the Stanford Revision of the Binet test or by the Yerkes-Bridges Point Scale. For men who could not understand the instructions and the language necessary for taking these two tests a series of specially devised performance tests, consisting chiefly of picture puzzles, cubes, squares, crescents, and other forms cut from wood, were provided. The assumption was that a man who has not intelligence enough to place a triangular block in a perfectly obvious triangular hole, or to piece together the six or seven parts which, when properly assembled, make up the figure of a man or a ship is so hopelessly deficient mentally as to be not only of no value, but a positive detriment to the Army. In many instances fully grown men with the mentality of children seven or eight years old were thus weeded out from among the recruits who had successfully passed the physical tests and been inducted into the service. Men making the D, minus, or E score in either the Alpha or the Beta tests were graded as of very inferior intelligence; D, minus, men were held to be fit for regular service but the E men were recommended for service in the development battalions or for discharge.

About 15 per cent. of all the soldiers examined were scored in the D class. They were ranked as of inferior intelligence, likely to be fairly good soldiers but slow in learning, short on initiative, requiring more than the usual amount of supervision, and unable to rise above the grade of private. Most D, minus, and E men were below the mental age of ten years; few men making a psychological score of D had the intelligence of the average normal fourteen-year-old boy. About 20 per cent. of the 1,500,000 soldiers examined by the psychological method made the score of C, minus, which indicated low average intelligence. These men were good soldiers, however, and did satisfactory work in routine matters. The C men, those of average intelligence, included about 25 per cent. of the drafted men and furnished a fair proportion of non-commissioned material.

Those in the C, plus, rating, which indicated high average intelligence, included from about 15 to 18 per cent. of all the soldiers examined. This group provided not only a large amount of non-commissioned officer material, but an occasional soldier whose qualities of leadership and power to command fitted him for a commission.

A man who made a score of B in the Alpha test was graded as of superior intelligence. Between 8 and 10 per cent. of all soldiers examined made the B score. This group included a large proportion of men of the commissioned officer type and a very large proportion of men fit for the higher non-commissioned officers’ details.

Only 4 to 5 per cent. of the men in the Army made the score of A in the Alpha test, which means that they were able to answer in the given time, correctly, more than 135 of the 212 questions in the test. These were men of very superior intelligence—indeed, of marked intellectuality. Men of this mental type who had any leadership ability whatsoever made the various grades of commissioned officers.

The practical application of the psychological tests covered a very wide range. The highest intelligence among enlisted men was required in the Field Artillery, Machine-Gun Battalions, and Signal Corps. Men of the lowest grade of intelligence served as labourers, teamsters, and in other non-combatant service, while men only slightly below the average performed the duties of an infantryman satisfactorily.

By the application of the mental tests it was found possible to bring up the average of particular companies, regiments, and detachments, by exchanging men of high mentality from one regiment for an equal number of men of the lower mental grade from another regiment in which the average of ability was low. A great saving of time and energy was made possible by being able to determine that a particular soldier, on the strength of his psychological score, was qualified to become a good artilleryman, machine gunner, or signal-corps man, or what not. If only in preventing the loading up of combatant divisions with men qualified only for the service of supply, the work of the psychologists made possible the elimination of incalculable delay in getting our overseas contingent ready to fight.

The intelligence tests used in the Army were admittedly imperfect at many points. They were especially designed for and adapted to the testing of a very much larger group than is ever likely again to be subjected to any single test or series of tests, and so, for most civilian purposes, these Alpha and Beta tests cannot be taken as a fair or complete system of ascertaining all the facts which mental tests ought to disclose. But at the time and for their particular purpose they functioned admirably, as all persons familiar with the result obtained will concede.

CHAPTER VII
PSYCHOLOGICAL TESTS IN EDUCATION

Just as intelligence tests in the Army have developed a new appreciation of the significance of analyses of intelligence as a means of selecting the right man for the right place in the military machine, so have scientifically devised mental tests emphasized the possibilities of more rapid and satisfactory progress in our educational activities.

The application of psychology to the measurement of teaching methods in institutions of learning is of comparatively recent origin. Up to ten years ago we had been able to make very little use of tests for the measurement of intelligence in schools, colleges, and universities. We were fighting blindly, as it were, to overcome the problems which faced us at every turn. We had no concrete guide, for instance, in our efforts to select proper courses of study for children and adults of various mental capacities, nor could we decide upon uniform efforts toward the disposal of such questions as vocational guidance, schoolroom procedure, juvenile delinquency, promotional schemes, retardation of children, and the proper treatment of sub-normal and gifted pupils.

The retardation problem, for example, has become serious. Statistics indicate that from one third to one half of the children in the public schools of the United States fail to advance with the speed expected of them. Ten to 15 per cent. are retarded two years or more. Five to 8 per cent. do not come within three years of the state of development set as a standard. More than 10 per cent. of the $500,000,000 spent every year in this country for school instruction purposes is used for reteaching children what they already have been “taught” but have failed to learn.

Many efforts toward reform have borne some fruit but have proved disappointing. The innovations by which it was supposed that the evils in existing systems could be completely cured (new methods of instruction, altered promotion methods, increased attention to children’s health, and others) were less effective, experiments have shown, than was generally anticipated by the educators who put these theories into operation. They fell short because the reformers fell into the error of assuming that, under the right conditions, all children would be equally, or almost equally, capable of making satisfactory progress. They failed to take into account the fact that there are more than two classes of school children and that they cannot be graded merely as “feeble-minded” and “normal.” There are all degrees of intelligence, ranging from idiocy on the one hand to genius on the other, and any efforts toward improvement of conditions must be applied with full recognition of such differences.

There are wide differences among normal human beings in mental inheritance and these differences affect to a marked degree the capacity of men, women, and children to profit from instruction. Just as the Army had to allow for differences in mental capacity, so must the schools differentiate courses of study in such a way that each pupil will be allowed to study in a manner that is easy for him, whether that manner be rapid or slow.

Dr. Lewis M. Terman, Professor of Education at Stanford University, in California, who writes with more authority than any other author on the application of psychological tests in schools, emphasizes the fact that little progress can be made toward the correction of present evils until we acquire a more scientific knowledge of the material with which we deal. This phase of the problem perhaps suggests the only practical way toward solution.

Intelligence tests in schools and higher institutions have been given a wide range of application, but in virtually every instance the results have justified the claim of superiority for these tests over other methods of classifying students. In some instances positively startling developments have been noted.

Of particular interest, from the viewpoint of educators who already are convinced of the value of intelligence rating in educational institutions, is the report of experiments at Public School No. 64, New York City. The object was to select, group, and train a number of children of very superior intelligence, in an attempt toward the solution of the grading problem.

The experiment was suggested by a survey made several years ago by a psychologist employed by the Public Education Association. Among a number of so-called average children was W. H., a boy. W. H.’s mental age measured about two years ahead of his age in years. His physical development was superior to that of the average child of his grade; consequently he became an interesting subject to study. He was promoted as soon as he acquired the essential features of the work in each grade, and, without any conscious effort on his part, he accomplished the work of nine grades in two years. W. H. was especially fond of athletics and outdoor sports. He took his school work as a matter of course and showed no indication of special interest in books or study. By the time he had reached the fifth grade several other boys of approximately the same ability had been discovered.

One day the psychologist, the principal, and one of the assistants discussed the possibility of forming a class composed of children similarly gifted. Special classes for defective children, with a course of study adapted to their needs, had been in existence for some time. Why not organize special classes for children at the other end of the scale, composed of those showing the highest grade of intelligence? Surely these children, society’s greatest assets, were entitled to progress at the speed that was desirable and normal to them. If defective children of ungraded classes were worthy of a course of study peculiarly adapted to their limitations, certainly an enriched curriculum must be provided to meet the needs of children whose capabilities extended to the highest degree of attainment.

The initial selection of children was made from the 5A class of W. H.’s associates, from other fifth- and sixth-grade classes in the school, and from similar grades of Public School No. 15, a neighbouring school for girls. The aim was to choose an equal number of boys and girls from four or five grades. The selection was limited to grades 4B through 6B. The basis of selection was determined by the following factors:

1—The age-grade standard was considered. Those children were selected who were below the normal age for the grade and whose school records showed a standing of general excellence for successive terms.

2—The evidence of superior ability as displayed in oral recitation during visits made by the psychologist and the assistant to the principal.

3—An analytical inspection of school record cards.

4—Two boys, H. R. and R. P., had received prizes in Wanamaker’s drawing competitions. Both of these boys passed the required intelligence tests.

5—A few interesting incidents were the means of discovering some other eligible candidates.

One Sunday evening, while the teacher who later became the instructor of this new special class was visiting the Christodora House, a neighbouring settlement, the leader of the evening hour asked the children the difference between God and guard. A boy, E. R., defined the words in such concise and perfect English that the attention of the visitors became centred on him. Later he was promoted from a school he was attending to Public School No. 64 and was admitted to the class of children of superior intelligence.

E. R. was a fatalist. He told an interested visitor, who questioned him as to how he came to be admitted to the class, that it was fate that he was chosen. He said he had been indifferent about attending the “Children’s Hour” at which his ability had been noticed, but that his brother had urged him to go. “You see,” said E. R., “if I hadn’t gone I might never have been chosen for this class.”

A bright, aggressive-looking boy entered the principal’s office one afternoon and asked the principal if he had room in his class for a “bright 6A boy.” He said he lived in the district of School No. 64 and had heard there were classes for children of excellent record. His report card showed an A-A record and he was admitted. The final issue was determined by the showing of the pupils in intelligence tests devised by Dr. Lewis M. Terman, and by their social traits. Two children who had the necessary qualifications otherwise were not considered because of several unfavourable traits of character.

The foregoing instances are cited to indicate some of the ways in which children were selected for the class. The next factor considered was the choice of a teacher. It was necessary that she show high intelligence or she would not be able to attack the problems which such a class would present. The principal had no standardized test by which to measure her ability but he was guided by many of the principles of general excellence that marked the selection of the pupils.

From a group of eighty-four he tried to select a teacher who showed initiative and ability to meet new situations, both intellectually and socially; who sympathized with and understood the orthodox training of these children; who would lead them to follow high standards of American ideals and customs; and whose scholarship was superior, especially in language. All these virtues, in addition to a zest and zeal for the experiment, were embodied in Miss G.

The next important step was to devise a curriculum for the class, which became known as the Terman Class, because the tests used in selecting it had been suggested by Doctor Terman. The grades that represented the first term were 4B through 6B; the second term 6A through 7B; and the third term, 7A through 8B. Formal grammar and arithmetic were assigned sequentially as outlined in the city syllabus. The class in general studied contemporary history, based upon the World War, from newspapers and periodicals, and, whenever possible, these events were related to or associated with past history. Geography was studied in relation to history and then extended until the world geography as outlined in the course of study was acquired.

An extended amount of reading was assigned. The supplementary lists issued by Professors Baker and Abbott, of Teachers College, Columbia University; the reading list of the Ethical Culture School; and the list issued by Doctor Leland, Director of Libraries, were used as guides.

Music, drawing, and physical training were taken by the class as general exercises. These covered the grade requirements. The composition of plays, songs, and dances for special programmes also was undertaken. The privilege of observing plants and live animals, their care, habits, and manner of reproduction, was provided in the nature-study room of the school. Some of the boys were given manual training in the shops of the prevocational school after the regular session of the academic department. The class attended the senior assemblies of the school at least once a week and as many more times as the educational activities of the school permitted. The privileges enjoyed outside the classroom educated these children socially in ways that few pupils of large and congested schools may experience.

One period a week was spent in the reading and study of assigned subjects in the Tompkins Square Public Library. Children were made acquainted with all departments of the library and its facilities. Reference books, magazines, and newspapers were at their service. The children were permitted to use a club room in the Christodora House once a week for musical and social exercises. A gymnasium was at their disposal in this institution two periods a week, and one of the Christodora House’s workers was assigned to teach the cooking club of the class. Another social worker taught a quartette of the class how to play the violin. Two boys who showed aptitude in art were given additional instruction after school at the “Boys’ Club,” a neighbouring institution. The class was taken on excursions to the Metropolitan Museum of Art, the New York Public Library, the Jumel Mansion and Dyckman House (to study colonial furnishings and historical material), and the Museum of Natural History, as well as on a sight-seeing yacht trip around Manhattan Island and to theatre parties, campfire parties, and flower shows.

During the first term of six months the progress ranged from one to four grades. No pressure of any kind was brought to bear. The children were allowed to advance as soon as they acquired the work of each grade. The younger children reaped the advantage of the experience of associating with those a trifle older. This privilege perhaps accounted for the greater rate of progress by the younger pupils. During the first term the average progress was two and two thirds grades and during the subsequent terms two grades were accomplished each term.

The suggestion, of course, is obvious, that the general application of psychological tests of intelligence to school children everywhere would reveal similar exceptional mentalities in many schools and classes, and that we have at last, in tests of this character, an accurate method of distinguishing between mere parrot-like ability to memorize and repeat lessons and actual mental capacity. That there must result, from the wider application of the scientific method of mental measurement, a general regrading of school pupils, if not indeed a general reorganization of existing schemes and systems of education, goes almost without saying.

The use of intelligence tests for college entrance has shown satisfactory results in several institutions. In one in particular, the Carnegie Institute of Technology of Pittsburgh, a group of the freshman girls in the Margaret Morrison Carnegie School for Girls was experimented on with such success that the results have been widely discussed.

All of the 114 freshmen were high school graduates. The first-year course, on which the instructors based their estimates of the students, contains the following subjects: physics, sewing, history, English, drawing and colour, hygiene, chemistry, foods, accounting, and social ethics.

Six mental tests were used, designed to answer the following questions:

(1) Can we demonstrate that we can reduce the number of students who are dropped for poor scholarship or placed on probation for poor scholarship by the use of our mental tests for admission?

(2) How do our mental test ratings of all the students compare with the faculty opinion about the general ability of the students?

The first criterion referred only to those who were pronounced as failures and dropped from college for inability to do college work, or placed on probation as doubtful students with two thirds of the regular programme. The second criterion had reference to the whole class, including the good students. A letter was sent to all members of the faculty asking them to indicate the student’s general ability as compared to the general ability of the class. A list of names, with ten numbered spaces after each name, was appended. The tests which agreed fairly well with the pooled judgment of the faculty were retained. The tests which failed in this regard were either improved or cancelled. When the returns were complete the instructor’s estimate was determined for each student and was used as a criterion for the tests.

The tests were analyzed both by correlation methods referring to the group as a whole, and by inspection of scatter diagrams referring to individual students. By devising a critical score it was possible to arrive at a mental-test rating. The results of this system of rating indicated, according to Prof. L. L. Thurstone, of the Carnegie Institute, that:

(a) Seven out of eleven failures could have been eliminated at the beginning of the year.

(b) Eight out of seventeen students placed on probation for poor scholarship should have been eliminated at the beginning of the year.

(c) Not one of the students who were below the critical mental-test rating was acceptable as a student. All of them should have been spared the discouragement which comes from failure and should have been advised to take up some other work.

(d) None of the acceptable students scored below the lower critical mental-test rating.

(e) All of the freshmen rated high by the faculty were above the average in the mental-test rating.

(f) Mental tests have been demonstrated to constitute a useful criterion for admission to college.
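The screening rule behind findings (a) to (d) can be sketched in a few lines of Python. This is only an illustration of the idea of a critical score: the ratings, the cut-off of 50, and the outcome labels below are invented, since the report quoted here gives neither the individual ratings nor the critical score actually used.

```python
# Hypothetical records: (mental-test rating, outcome at end of year).
# These values are illustrative only, not data from the Carnegie study.
STUDENTS = [
    (38, "failed"),
    (44, "failed"),
    (47, "probation"),
    (58, "failed"),
    (61, "probation"),
    (66, "acceptable"),
    (72, "acceptable"),
    (85, "acceptable"),
]

CRITICAL_SCORE = 50  # assumed cut-off; the source does not state its value


def screen(students, critical_score):
    """For each outcome, return (number below the critical score, total)."""
    below = {}
    total = {}
    for rating, outcome in students:
        total[outcome] = total.get(outcome, 0) + 1
        if rating < critical_score:
            below[outcome] = below.get(outcome, 0) + 1
    return {outcome: (below.get(outcome, 0), total[outcome]) for outcome in total}


if __name__ == "__main__":
    for outcome, (caught, count) in screen(STUDENTS, CRITICAL_SCORE).items():
        print(f"{outcome}: {caught} of {count} below the critical score")
```

A well-chosen critical score is one below which many failures and probationers fall but no acceptable student does, which is the pattern findings (c) and (d) describe.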

In October, 1918, first-year men in Brown University were given two series of psychological tests, an interval of several days separating the administration of Series I and II. Emphasis was placed upon thought and accuracy, rather than upon speed. Two hundred and ten students of the same University took the Alpha test of the Army in January, 1919. Of these men, 103 also had taken the Brown University tests, Series I and II. This made a comparison possible.

Two hundred and twelve men took Series I. Both the average and median were 66 on the basis of 100 as a maximum score. One hundred and seventy-eight men, all of whom had taken Series I, took Series II. It was administered after the students had begun military training of a rigorous nature and when they were far from fresh. The composite score of Series I and Series II, made from the records of one hundred and seventy-eight men who had taken both tests, showed that the Brown University Series proved as good a measure of scholastic standing as did the Army test for military fitness.

Prof. Stephen S. Colvin, of Brown University, writing on these psychological tests, says that, beyond the evidence obtained by correlating the test results with the students’ academic marks, there is further indication that the psychological tests proved of considerable value in showing the probable success of a student in his academic work.

During the first half of the year, eighty students were reported as doing unsatisfactory work. Of these eighty students, thirteen had received a score of “good” or “very good” in the psychological tests; fourteen had received an average score; while in the cases of fifty-three the score was either “poor,” or “very poor.” During the second term, thirty-four men were reported as doing considerably above average grade. Of those thus reported, five ranked “superior” in their psychological tests; nineteen “very good”; seven “good”; two “average”; and one “poor.”
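The proportions implied by the first set of figures are easy to work out. The short Python sketch below uses only the counts given in the paragraph above; the function name `shares` is merely a label chosen for this illustration.

```python
# Counts taken from the paragraph above: psychological-test ratings of the
# eighty students reported for unsatisfactory work in the first half-year.
UNSATISFACTORY = {
    "good or very good": 13,
    "average": 14,
    "poor or very poor": 53,
}


def shares(counts):
    """Return each category's percentage of the total."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    for label, pct in shares(UNSATISFACTORY).items():
        print(f"{label}: {pct:.2f} per cent.")
```

Roughly two thirds of the students reported for unsatisfactory work had thus been rated "poor" or "very poor" by the tests.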

Interesting results were noted in intelligence tests at the University of Illinois on March 6, 1919, when nearly 3,500 students, who were distributed in twenty-four different halls, were examined simultaneously. The Army test (Alpha) was used. Various members of the faculty, including deans, volunteered for special preparatory training to act as examiners and alternate examiners. It was an interesting spectacle to witness eminent men voluntarily in the rôle of students and being “tested.”

In a summary of the results of the tests, Dr. David Spence Hill says:

“The smallness of difference between median scores of classes within each college of the large groups of students is insignificant. As between freshmen, sophomores, juniors, and seniors the extreme difference was less than 2 per cent. in the college of literature, arts and sciences; less than 4 per cent. in the colleges of engineering, and of agriculture; about 5 per cent. in the colleges of commerce, and less than 3 per cent. in the three years of the graduate school. Differences as small as these are safely to be accounted for by chance or by variations of one kind and another.”

The report of the value as a whole of the intelligence test, signed by members of the University staff, says, in part:

“On the whole, the experiment performed by the energetic coöperation of nearly four thousand university people may be regarded as remarkably successful for the purposes intended. If for no other reasons, it has been worth while as a study of a device used already upon nearly two millions of men engaged as soldiers in the great historic undertaking—the World War. It has been a means of self-revelation to many persons on the campus. When the statistics are all worked out in careful detail we shall obtain new insight into some educational problems.”

At Hamline University, St. Paul, Minn., the Alpha test was given to 74 men and 145 women, but reports on the results of the test are confined to 61 men and 145 women. The median score was 129 for the men and 133 for the women. The higher level for women was accounted for by the fact that there were more seniors and juniors among the women than among the men. The medians for these two classes of women were 138 and 150 respectively, but for the men in the same classes, 132 and 130 respectively. A somewhat higher standing for women was evident when the entire series of tests was considered, although the mathematical problems in the tests were harder for the women.

In questions of practical judgment, disarranged sentences and analogies, all of which involved nimbleness of wit, the women showed superiority to the men. In questions of general information, however, the men established a lead over the women, but of only 2.5 per cent.

Prof. Gregory D. Walcott, who reports the tests at Hamline, is not convinced that the Alpha tests, designed for military purposes, are the best for determining the fitness of students for college work. He says, however, that the degree of correlation obtained in the Hamline tests indicates that the Alpha tests are of tremendous value.

Intelligence tests are being used at regular intervals at the University of Rochester. The method of application is described as follows by Louis A. Pechstein, Professor of Psychology at the University:

“We call our freshmen to the campus a week early. The introductory week is given largely to lectures on college ethics and problems of study. During the first day of the week I give all the entrants both the Alpha and the Otis Group Intelligence tests. The marks and groupings are turned into the office and, so far as possible, we shall make up several representative classes of men supposedly of the same general mental make-up.

“During the first term we shall test the entire student body and then begin to correlate with teachers’ opinions and grade records. In no sense are we committed, but we shall try to influence our programme making and section determination by the testing results. Then I shall issue a report to each student regarding his standing, apparent strong and weak processes, and try to help him in his development.”

Other reports from schools, colleges, and universities indicate the widespread adoption of intelligence tests in determining the probable measure of success which a student will attain in his studies, or whether he is fitted, mentally, for the career he contemplates.

The group tests of intelligence have demonstrated their value in educational work to such an extent that, following the lead of Columbia University, a large number of prominent American universities and colleges are employing tests of intellectual ability as at least partial substitutes for the time-honoured college entrance examinations. Instead of requiring each prospective student to take an examination in which he would be required to demonstrate that he remembered the facts learned in high school, the present scheme is to examine the men who desire to enter college by means of the psychological tests designed to measure general fitness and intelligence. The theory behind this movement is that men should be allowed to enter college provided their intelligence and mental capacity are such as would enable them to profit by the instruction, regardless of whether such men could recall the required percentage of the facts taught them by their high school teachers.

This same philosophy will undoubtedly spread very widely through the high schools and elementary schools as well as through the colleges. A child should be allowed to undertake that work for which he is fitted by nature and intellectual capacity, regardless of what his past academic training may have been. It is unreasonable to require young men who, because of some accident, left school early in life and have continued their education through their own efforts, to go back and begin with younger pupils a course of study, which will have very little practical value to them, before they are allowed to undertake the professional courses they desire and are capable of undertaking at once. The group-examination method, which is employed by the majority of the Mentimeter tests, has been the greatest possible stimulus to the employment of intelligence examinations, because of the great saving of time which it effects over the method of individual examinations.

CHAPTER VIII
MENTAL TESTS IN INDUSTRY

The case for scientific mental tests as a prerequisite to the employment of beginners in business and industry has been well put by Dr. Henry C. Link. In addressing a convention of California railroad men, Doctor Link said:

“Would you, gentlemen, enter into a contract to buy material from a concern, the excellence of whose product you had grave reason to doubt? Would you place orders to the extent of three and one half millions of dollars a year, waive inspection of material, accept whatever was offered you, and make no effort to get your money’s worth? You would not—not if you expected to hold your job. And yet, that is what you are doing with respect to the public education system of California. In 1916 the railroads of this state paid in operative taxes $7,151,583. Of this sum 51 per cent., or $3,647,300, was used for purposes of public education.

“The boys and girls sent you from the public schools you take into your service, sometimes after a perfunctory mental examination, generally with none; in other words, you waive inspection, and then complain of the character of material after it has reached you and been paid for.”

It is, of course, in the case of the untried beginner in business or industrial life, the boy or girl fresh from school who has as yet had no opportunity to discover or to demonstrate his or her ability or capacity, that the application of scientific mental tests is most essential.

The skilled worker of long experience, master of his craft or of one or another of the specialized mechanical operations that enter so largely into modern industrial processes, has already found a definite place in the scheme of things and a simple trade or performance test is all that is required to indicate where that place is. For the present, at least, we are concerned with the worker of this class only long enough to point out, in passing, that a generally adopted scheme of intelligence measurement might have disclosed the possession by any individual of this group of abilities that would have given him a broader field and a happier and more useful existence, had he and those responsible for giving him a start in life been made aware of them early enough. Even to-day, when he has been engaged in his narrowly limited field of work for the better part of his active working life, he may have latent or undeveloped mental capacity such as would qualify him for more important, better-paid employment were some means provided for disclosing its existence.

There is, in fact, no degree or kind of employment for which a more intelligent and satisfactory selection of employees cannot be made by means of properly devised mental tests, accurately applied, than by any other method now in use. Under the direction of Dr. Walter Dill Scott the Carnegie School of Scientific Salesmanship of Pittsburgh has demonstrated the usefulness of the scientific method when applied not only in the selection and training of salesmen but for the choosing of men qualified for the most important executive positions in large industrial and business establishments. A large number, possibly as many as a hundred, of the largest industrial corporations of America have already (1919) adopted in whole or in part some system of scientific mental tests for the classification and grading of present employees, the selection of new employees, and the filling of vacancies by promotion. It is the unanimous testimony, whenever a properly devised system of tests has been applied in accordance with scientific methods and without prejudice, that the actual saving in time and expense as well as in the disorganization resulting from a heavy “labour turnover” has in every case been highly profitable from the employer’s viewpoint, while it almost goes without saying that the benefit to the employee in being accurately placed in the position in which he is best fitted by his natural mental endowment and capacity to function makes for individual contentment and satisfaction and for steadier and presumably higher earning power than the old hit-or-miss method could possibly do.

Next to the beginner in industry or business, the boy or girl starting his or her vocational career, the class to which the application of scientific mental tests is of the greatest benefit to employer and worker alike is the large group of unskilled, untrained workers, men and women of no particular trade, the “floaters” and seasonal workers, who turn their hands to whatever employment opportunity offers without developing especial skill at any one recognized trade or occupation.

In our modern industrial system, a very considerable part of the personnel of our factories, shops, and stores consists of this class of untrained workers. They try their hands at many things and fail in most. They constitute the majority of those who respond to “Help Wanted” advertisements and are willing to try any sort of work; their chief occupation in life is hunting for jobs.

This need not remain forever true. Because there is not in general use any intelligent or accurate method of determining whether or not any one of these unskilled, untrained workers possesses the elementary mental capacities requisite for a particular sort of employment, it is not surprising that most of them fail to make good in the jobs into which they are indiscriminately shovelled. Yet the great majority of them do possess mental capacity of a nature and degree which, once it is ascertained, indicates their definite fitness for some particular sort of work no less than it does their definite unfitness for many other kinds of work which they are prone to undertake.

Just as war conditions brought into the Army an enormous mass of young men whose capacity and special abilities had to be determined by scientific tests before they could be assigned to the places where they could most usefully serve in the military scheme of things, so the same exigency of war brought into the industries of the country, largely centred upon the production of munitions of war, millions of women without industrial experience or vocational training but upon whose efforts the nation had mainly to rely for the output of weapons, ammunition, military equipment and accessories without which the Army and Navy could not have functioned. In a large class of plants engaged in munition production the chief demand was for sufficient muscular strength, with a slight modicum of intelligence, for the operation of automatic machinery. But in the vitally important work of inspecting, testing, and sorting the finished product of even the most highly perfected automatic machines and in many of the more delicate operations of assembling and adjusting devices and apparatus made up of a number of more or less complicated parts, intelligence and mental capacity of several different kinds and ranging up to fairly high degrees were called for.

In a number of the larger munitions establishments scientific mental tests were adopted for the selection and assignment to particular tasks of the women workers. Wherever this was done it was found that the output was increased, a higher average of quality maintained, and the labour turnover greatly reduced.

In one of the largest groups of munitions plants at Bridgeport, Conn., there was worked out, under the direction of Dr. Henry C. Link, a system of scientific mental tests which checked up so closely with the actual results obtained by the most skilful workers that their adoption for the examination of all applicants for these positions resulted in very definite time and money savings and increase in plant efficiency.

Two types of work, conducted side by side in the same room, were settled upon as the most fruitful fields for the first experiment. The work chosen was that of inspecting shells before they had been loaded, and of gauging them for head-thickness. This work was being done by 330 girls, two thirds of whom were engaged in inspection and one third in gauging.

The work of inspecting shells was done at a table constructed like an upturned, shallow box. Upon this table was dumped a large box of brass shells, not yet loaded, and all of exactly the same kind. The work of each girl was to inspect these shells and throw out those that were defective. A girl would first gather up a handful of shells, being careful to have them all pointing in the same direction. Then she would put both hands around the shells and turn them up so as to expose their insides. She would then look down into every shell for dents, scratches, stains, and other very minute defects. When any such defect was discovered the shell was extracted from the pile and thrown into one of three or four “scrap” boxes. The entire handful was then turned over and the head of every shell examined for various defects. The shells were then held in a horizontal position on the left hand and allowed to roll from the pile into the right hand. Each shell, in rolling, exposed its lateral surface and was closely scrutinized for scratches, dents, oil stains, and other defects. The good ones were taken in the right hand and dropped into a pocket at the right side of the table, through which they fell into a box below.

This operation required good eyesight (in order to distinguish defects, which frequently were so minute as to be indistinguishable to all but the best of eyes); keen visual discrimination (the ability to determine, with a few glances, which shells were defective); quick reaction (ability to extract, as quickly as seen, the defective shell and toss it into the appropriate box); accuracy of movement (ability to pick out the right shell from a closely held handful); steadiness of attention (ability to prevent bad shells from slipping by or unduly lengthening the operation).

A set of eight tests was selected for the body of the experiment. The first was a simple eyesight test. The second was a card sorting test. The subject was given a pack of 49 cards, upon the face of each one of which from 7 to 12 letters were distributed promiscuously. Twenty of the cards contained the letter “O” and the rest did not. The subject was asked to sort these into two piles, those which had “O” on them and those which had not. The time required for this performance was taken and the number of errors recorded. The object of the test was to bring out the subject’s ability to pick out the essential element from a more or less heterogeneous collection of elements, and also, in some measure, to bring out the deftness of the subject in handling cards.
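The scoring of the card-sorting test, as described, comes down to two quantities: the time taken and the number of cards put in the wrong pile. A minimal Python sketch follows. The function name, the way the subject's decisions are recorded, and the four-card pack in the example are assumptions made for illustration; the actual pack had 49 cards.

```python
def score_card_sort(cards, placed_in_o_pile, seconds):
    """Score one card-sorting trial.

    cards            -- list of strings, the letters printed on each card
    placed_in_o_pile -- list of bools, True where the subject put the card
                        in the "has an O" pile
    seconds          -- time taken for the whole pack
    Returns (seconds, errors), the two quantities the test recorded.
    """
    if len(cards) != len(placed_in_o_pile):
        raise ValueError("one decision is needed per card")
    errors = sum(
        ("O" in letters) != placed
        for letters, placed in zip(cards, placed_in_o_pile)
    )
    return seconds, errors


if __name__ == "__main__":
    # A tiny hypothetical pack of four cards (the real pack had 49).
    pack = ["AXOQTRM", "BKLPWZE", "OHNDSUV", "CFGJYIR"]
    decisions = [True, False, False, False]  # third card misplaced
    print(score_card_sort(pack, decisions, seconds=12.5))
```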

The third test was a cancellation test. The subject was requested to cross out, with a pencil, every 7. The fourth was a simple “Easy Directions” test. The fifth was a number-checking test, in which the subject was asked to place a check opposite every group which contained both a 7 and a 1. The sixth test was a tapping test, in which the subject was required to push down, as rapidly as possible, a telegraph key to which was attached a counter. The number of recorded thrusts over a period of one minute constituted a record for that performance. The seventh test was an accuracy test. This was given with the aid of a brass plate with nine holes, graduated in size from ½ inch to ⅛ inch in diameter. The subject was asked to take a brass-pointed pencil and insert it into each hole, beginning with the largest and continuing through the smaller ones, until the pointer touched the brass side of one of them. The brass-pointed pencil was wired in circuit with the brass plate containing the holes so that, whenever the brass point touched the side of the hole or any part of the brass plate, an electric contact was made which produced a click in a telephone receiver which the subject held to her ear. At the start of the test, the subject was instructed to put the brass pencil into each hole in succession until she heard a click in her ear, when she was to start all over again. The speed of the subject’s movements was controlled by a metronome, so as to allow thirty trials per minute. This test occupied from two to three minutes.

The eighth test was a steadiness test. This consisted of two brass bars about twelve inches long, set so as to form a long, horizontal V. The subject was asked to take the brass pointer and pass it along between these two bars. The farther she went, the narrower became the space between the bars. As soon as the brass pointer touched one of the bars it produced a click in the telephone receiver. The point at which this brass pointer touched was then read on a scale on the lower bar. Each subject was given fifteen trials and the last ten were averaged and constituted the subject’s average.
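The steadiness score described here is simply the mean of the last ten of fifteen scale readings. A small Python sketch, with a hypothetical function name and readings in whatever units the scale on the lower bar was marked:

```python
def steadiness_score(readings):
    """Average the last ten of fifteen scale readings.

    The first five trials are discarded, following the description
    in the text; the remaining ten are averaged.
    """
    if len(readings) != 15:
        raise ValueError("the test as described calls for fifteen trials")
    last_ten = readings[5:]
    return sum(last_ten) / len(last_ten)


if __name__ == "__main__":
    # Hypothetical readings, invented for illustration.
    trials = [4, 5, 5, 6, 6, 7, 8, 7, 9, 8, 8, 7, 9, 8, 9]
    print(steadiness_score(trials))
```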

These eight tests were given to seventy-three girls, fifty-two of whom were inspectors and twenty-one gaugers. The scores in the tests were compared with the average daily work of the girls. This average was obtained by recording the number of pounds of shells inspected by the girls and the number of hours required for the work. It was found that the inspectors who inspected the largest number of shells in a given time attained the largest scores in the tests, thereby indicating the value of the tests in determining whether an applicant for work as an inspector had the mental capacity for the work.

The same tests were given to the twenty-one girls engaged in gauging the head-thickness of shells. This work does not require the use of the eyes. The operator simply picks up a handful of shells and, with or without looking, tries the head of each shell on a gauge. The gauge is a piece of steel with two notches or openings. The shells which are too small pass through the first opening and fall into a box of rejects below. Those that do not fall through are tried on the second opening and, if they pass through, they are of the right size. If they fail to pass through they are too large and are thrown aside. The operator sits in front of her gauge and tries each shell at one opening and then another, just as rapidly as she can move her hands up and down.
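The decision the gauger makes with each shell can be written out as a short rule. The Python sketch below is an illustration only: the function name and the dimensions in the example are hypothetical, and it assumes that a head passes an opening when it is no thicker than that opening.

```python
def gauge_shell(head_thickness, first_opening, second_opening):
    """Classify a shell by the two-notch gauge described above.

    All three arguments are in the same unit. The first opening is the
    narrower one: a head that passes it is too small. A head stopped by
    the first opening but passing the second is of the right size.
    Assumption: a head passes an opening if it is no thicker than it.
    """
    if first_opening >= second_opening:
        raise ValueError("the first opening must be narrower than the second")
    if head_thickness <= first_opening:
        return "too small"
    if head_thickness <= second_opening:
        return "right size"
    return "too large"


if __name__ == "__main__":
    # Hypothetical dimensions in thousandths of an inch.
    for thickness in (58, 62, 66):
        print(thickness, gauge_shell(thickness, first_opening=60, second_opening=64))
```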

The tests showed, in this instance, an entirely different set of correlations. The comparative correlation scores follow:

TESTS	INSPECTORS	GAUGERS
Card Sorting	.55	.05
Tapping	.14	.52
Cancellation	.63	.17
General Intelligence	.14	.18
Number Group Checking	.72	-.19

Perfect agreement between average daily work and score in the test would be indicated by a correlation score of 1.00, while lack of relationship would be indicated by a correlation of 0 or nearly 0.
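For readers who wish to see how such a figure is arrived at, the following Python sketch computes the ordinary product-moment (Pearson) correlation between two lists of numbers. The report does not say which formula was used in the Bridgeport experiment, so the choice of the Pearson coefficient is an assumption, and the scores in the example are invented.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least two items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical test scores and daily output for four workers.
    test_scores = [10, 20, 30, 40]
    daily_output = [15, 25, 35, 45]
    print(correlation(test_scores, daily_output))
```

When the worker with the higher test score always has proportionately higher output, as in this invented example, the result is 1.00; reversing the order of one list gives -1.00.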

The correlation for the gaugers in the tapping test (.52) showed that those who tapped fastest also turned out the most work. This seems reasonable since, in the operation of gauging, speed of movement and endurance are the chief factors. In the visual discrimination tests, such as card sorting, cancellation, and number group checking, the correlations for the inspectors were higher. This quality, however, was not necessary to successful operation in gauging.

In other operations the results of these tests proved their value as a factor in eliminating blunders in the employment office. Girls who seemed, from observation, to possess the very qualities necessary for one or another operation, frequently puzzled their superiors by their failure to perform some highly important operation of their work. The eight tests would have demonstrated this particular inability and would have saved thousands of dollars lost through delay and mistakes. Similar results were obtained in experiments with men workers.

In almost every industrial enterprise, clerical work of some kind or another is necessary, and a problem of universal interest has developed around the selection of clerks. The time required to “break in” new employees runs from two weeks to two months, according to the nature of the routine, and this process invariably is very expensive. By means of standardized mental tests the whole process may be greatly simplified.

In an experiment recently reported tests were given to fifty-two men and women engaged in clerical and near-clerical work. An aggregate number of 440 tests was given. The manager of the department had made a study of these people and had attempted to rate them as to their actual ability.

The tests were classified under the head of tests for technique and tests for intelligence. By technique is meant the speed and accuracy shown by clerks in sorting tickets and papers, posting and adding columns of figures, indexing and filing, and in other routine clerical operations. The term intelligence is interpreted to designate the facility and success with which a clerk could master new tasks and follow directions about new work assigned from time to time. The clerk’s technique was indicated by steadiness, arithmetic, card sorting, and substitution-of-letters tests. The intelligence tests included a “hard-directions” test and an “abstract-relations” test, similar to those given in the Mentimeter in this volume.

When all the tests had been given the results were computed and tabulated so as to bring out the following points: (1) the rank of each individual with reference to all the rest; (2) the relation of each of four groups to each other; (3) the relation between technique and intelligence. The results were then submitted to the office head, who compared them with his records and with his own opinion of the relative merits of the various individuals. This comparison showed a very marked agreement between the testimony of the tests and the rankings of the office manager.

The results of these tests so impressed the office manager that he decided to give them to all incoming clerks. One of the first candidates to be examined was a young woman who had recently been interviewed by one of the office heads. The candidate was so unprepossessing in appearance that in spite of signs testifying to her intelligence, the office head was in doubt as to the advisability of hiring her. The psychological tests were applied. When this was done the young woman did remarkably well in every test. She was then hired, and proved herself so ready and capable that it was decided to train her for the work of an office assistant. In six weeks she had mastered the routine of four different kinds of work. This was a striking instance in which the testimony of the tests belied the testimony of observation.

Although there were certain inadequacies in the tests applied, as well as in the judgments obtained from office heads, the value of the results became more and more clear with each passing month. For example, 188 clerks recommended on the basis of the tests and followed up at intervals of one month for a period of three months were estimated as follows:

Percentage of those called good by their superiors

At the end of one month	75%
At the end of two months	89%
At the end of three months	92%

Another series of interesting experiments to determine the mental capacity of workers in industry was directed at stenographers, typists, and comptometrists. The work of these operators has been specialized by the use of a standard machine, and in applying tests to this kind of work it was necessary, therefore, to take into consideration two important factors: first, the skill already acquired by the worker at a certain machine; second, the aptitude which the worker possessed for improvement in the use of the machine.

Relevant tests were given to two senior classes of more than three hundred girls and boys in a commercial high school, to seventy-six pupils in two business schools, to a group of twenty-two office typists, to another group of nineteen stenographers, to over four hundred candidates for positions as typists and stenographers, to three groups of more than one hundred and forty comptometrists; and finally, to more than one hundred and twenty candidates for comptometry. More than one thousand persons were tested and more than five thousand tests were given.

Tests for typists included copying, spelling, substitution, and the Trabue Completion test. In the copying and spelling tests, office forms were used. A number of words, purposely misspelled in characteristic fashion, were mingled with words correctly spelled, and the applicant was asked to check off those incorrectly spelled. It was discovered, in the substitution test, that if an applicant without much previous experience in typing does very well in the test, the indication is that she has the necessary aptitude or potential ability to become a good typist with practice. The success of the applicant in the Trabue Completion test indicated his or her ability to complete sentences parts of which are missing. The ability to do this is a great advantage to the typist and one which will increase her capacity.

The Trabue Completion test also proved valuable in determining the ability of stenographers. The most important test probably, for a stenographer, is of her ability to take and transcribe dictation. Tests were given as nearly as possible at the speed which was best adapted to the applicant’s ability. The results were then graded on the basis of the total time consumed and the amount of work done correctly.

In experiments for determining the ability of computing-machine operators various tests were used. One of the most important was a mental-arithmetic test. This was designed to determine the applicant’s fundamental knowledge of arithmetic. Another was a numerical substitution test. In each of the tests conducted the scores of the applicants were compared with the rankings made previously by department heads, and in most instances there was an agreement of sufficient approximation to indicate the value of the tests.

Although still in its infancy, as it were, so far as its practical application in industry goes, the scientific method of mental measurement, wherever and whenever applied in accordance with true psychological principles and by standards and methods devised by trained psychologists, has so completely demonstrated its economic value and social usefulness that its general adoption, as these facts become more generally known, seems inevitable.

CHAPTER IX
HOW TO USE THE MENTIMETER TESTS

The Mentimeter tests differ from the Alpha and Beta tests of the United States Army, from the Otis test, and from any other system of tests now available, chiefly in their flexibility. Rather than present to the public a fixed and invariable group of eight or ten tests to be used wherever a measure of general intelligence is wanted, as has been done in other cases, the present authors have chosen to present a wide variety of tests, from which each reader may select for his own use those which actually give the best results.

It is not probable that exactly the same tests would select men of high intelligence in the graduate work of a university as would be needed to select the intelligent men in a logging camp in the wilds of Canada or our own Northwest. The present authors do not profess to know just how much of each mental trait is required to make up a perfect superior intelligence, and for that reason they have not attempted to propose any single group of tests as the best measure of intelligence. The reader is asked to “try out” such tests in the Mentimeter series as seem to him to offer greatest promise of usefulness, and then to make up his own “team of tests” in such manner as will best reveal the kind of intelligence in which he is interested.

For the benefit of those who wish some suggestions as to the tests which would probably be most useful in the main lines of work to which intelligence tests may be applied, the authors here propose certain tentative or suggestive lists which would seem to them to offer great promise of successful use. For the classification of clerical workers in business and industry, the following tests should at least be given thorough trial:

MENTIMETER NO.	TITLE
6.	Completion of Form Series
7.	Checking Identity of Numbers
8.	Digit-Symbol Substitution
9.	Completion of Number Relation Series
16.	Naming Opposites
23.	Completion of Sentences
24.	Analogies
28.	Arithmetic Reasoning

It is possible, of course, that some employer who makes the trial will find a half dozen other tests that show more accurate results in classifying clerical workers than will be shown by any test in the above list, but such a thing will probably not happen, for the type of test which has been useful in similar situations will probably prove useful again. If such a thing did happen, however, the employer would be foolish and unscientific to retain the list suggested above when he knew of a better list.

In the classification of the intelligence of labourers, the authors would suggest that the following tests be given fair trial:

MENTIMETER NO.	TITLE
2.	Pictorial Absurdities
3.	Maze Threading
5.	Dividing Geometric Figures
6.	Completion of Form Series
9.	Completion of Number Relation Series
18.	Range of Information
28.	Arithmetic Reasoning
29.	Practical Judgment

For classifying public school pupils according to their general intellectual power and ability to learn, the authors propose that the following tests be employed until a different selection has been proved to be superior:

MENTIMETER NO.	TITLE
2.	Pictorial Absurdities
3.	Maze Threading
8.	Digit-Symbol Substitution
16.	Naming Opposites
20.	Reading Directions
23.	Completion of Sentences
28.	Arithmetic Reasoning
29.	Practical Judgment

The reader’s attention is invited to the following list of tests, which are more strictly educational tests than tests of intelligence:

MENTIMETER NO.	TITLE
10.	Addition
17.	Spelling
19.	Reading: Vocabulary
21.	Reading: Interpretation
25.	Handwriting
26.	English Composition
27.	Poetic Discrimination
28.	Arithmetic Reasoning

The most profitable list from the point of view of social entertainment would seem to be the following:

MENTIMETER NO.	TITLE
2.	Pictorial Absurdities
3.	Maze Threading
5.	Geometrical Figures
6.	Completion of Form Series
18.	Range of Information
20.	Reading Directions
22.	Disarranged Sentences
23.	Completion of Sentences
24.	Analogies
27.	Poetic Discrimination
29.	Practical Judgment
30.	Logical Conclusions

Whatever the purpose for which the tests are to be used, the best results can be obtained only by securing from the original publishers the carefully printed forms prepared by the authors of the tests. Mimeographed copies of test blanks or privately printed blanks are certain to differ so much from the true form that the results obtained therewith cannot be directly compared with the official results.

Long experience has likewise demonstrated, fairly clearly, that the best results will be obtained in any industrial organization or educational staff by making one person chiefly responsible for the proper administration of the intellectual and educational measurements. If a personnel director is at hand who can study his tests just as scientifically as he studies his men, progress and improvement in the methods and results are inevitable.

Measurements of intelligence are by no means the only or final criteria by which the successful personnel manager wins success in his work and saves money for his employers. He makes use of every piece of information about his men that it is possible for him to pick up anywhere. The trade tests particularly offer a wide field in which measurements of intelligence may be supplemented and made more useful. Of two men who are to-day working in the same trade, receiving the same wages and making the same score on their trade tests, that one is more promising who has the higher intelligence score. On the other hand, of two equally intelligent men, as measured by the intelligence tests, that one who has attained within a given time the higher proficiency in his trade is superior.

The chief value of the group intelligence tests will probably always be in the classification of large groups of persons into smaller, well-defined groups, the members of which groups may then be studied more carefully and by more exact methods in the hands of a trained psychologist, if necessary. Until the group method of examination was developed, making it possible to test the intellectual ability of every employee without tremendous expense in time and money, it would have been most foolish to talk about maintaining a continuous inventory of the mental strength of an organization, and yet such an inventory is now possible—just as possible as the record of the condition and capacity of each machine owned by the company.

Prospective users of the Mentimeters need to bear in mind that mental powers are far less constant in their amounts than are the dimensions and measurements of a piece of steel or lumber. Even the length of a steel rail varies between winter and summer, but the variation that occurs in the strength of mental connections from day to day or from hour to hour is very much greater than the variations of the steel rail. Except by chance one would not obtain exactly the same score a second time in taking a Mentimeter test, or any other test of mental ability. Being for the most part constructed on the “increasing difficulty” plan, however, the Mentimeters will prove much less influenced by recency of drill and nearness to the lunch hour than will most other tests, especially less than those speed tests which measure how many simple tasks one can do within a given time limit. The Mentimeter ideal is to test power rather than speed.

No single set of tests should be used as final and conclusive in the public schools with regard to the kind of work which a given boy or girl should undertake. The Mentimeter tests may be used as a first “drag-net,” but those caught in this net should then be carefully studied by the most refined methods known to psychologists before being recommended for particular types of special instruction or sent to special schools. One of the most hopeful signs in the entire educational field is the number of cities that are employing psychologists to follow up the results of group examinations in the schools. Many of the state universities have established bureaus to serve the local communities [1] in such matters. The very finest measurements are of no avail unless something is done about the results disclosed.

[1]. There has recently been established in Teachers College, Columbia University, New York City, a Bureau of Educational Service, the Director of which would be glad to answer questions or advise with any one interested in measuring intelligence or educational results, regardless of the state or community in which one may live.

For each of the Mentimeter tests, the authors have classified the possible scores into five general groups: Superior, High Average, Average, Low Average, and Inferior. This classification is very rough and should not be wrongly interpreted. An individual who is tested with three or four or more of the Mentimeter tests should not be expected to receive the same classification in each test. In the Handwriting test, for example, a person might well be expected to make a rating of “Superior” in quality of writing while making only “Low Average” in speed of writing. The same person might well make a score on the test of Poetic Discrimination which would classify him as “Inferior.” Although there is a tendency for people who are superior in one line to have high abilities in other lines, it is only a general tendency, which will not hold good in all cases and with regard to all varieties of ability.

For the most accurate scientific work the reader will probably disregard entirely the fivefold classification of scores mentioned above. The finer distinctions made by the numerical scores will be studied, and interpretations will be made for the specific purposes of the examiner. It is probable, for example, that comparatively few children at the age of eight years would be classified as being better than “Inferior,” if these rough general classifications were to be the only record kept of performance on these tests. On the other hand, very few clerical workers of proved ability and success would make a classification as low as “Average,” except possibly in a few specialized-ability tests. The important point to be considered by the teacher of a second-grade class, or by an employer of clerical workers, or by any other person who wishes to make serious use of these tests, is the relation of the scores in the test to the relative abilities of the persons in the special group tested. The tentative classification of scores made at the end of each section of the chapter which follows this is for human beings in general and will not fit well any specialized group of persons.

In order to assist readers who have no statistical training in the evaluation for their special purposes of any particular Mentimeter test, a few pages will be devoted to an elementary statement of how to try out scientifically the relationship between a test, on the one hand, and demonstrated ability in any special line of endeavour, on the other. It may be stated here again that not all traits of mind are important in every task that must be done in life. Some positions require only a little intellectual ability while others require a great deal, and some tasks require very great development of a few traits which may be very little called for in other equally important tasks. The authors have used their best judgment as to which tests will probably select the type of persons needed in a certain type of position, but the judgments of other equally experienced men would be just as good. The final proof of reliability in a test can come only by actual trial of that test upon men of various degrees of demonstrated ability in the trade or profession concerned. What follows is a statement of how to measure this correspondence between demonstrated degree of success and score in a test, or between the scores of the same persons in two or more different tests.

No measure of relationship between success in life and success in a test can be any more accurate than the original measures of success from which the calculation is made. If the measures of success in life are unreliable, then the measure of their relationship to success in a test will be even more unreliable. The more definite and certain one can be of his measures of success, the more reliable will his measure of relationship be.

In productive labour, especially where payment is based upon the number of standard articles produced in a day, or upon the number of standard operations performed in a given time, the records of actual performance are probably the best measures of success available as a standard against which to judge the reliability of a test. The record for one day or for one week would be less reliable usually than the record for a month or a longer period.

In many business organizations and industries there is no such satisfactory standard of success as individual production records, and in such cases it is necessary to make use of the judgments of foremen, supervisors, or superintendents. These are far less satisfactory records of efficiency and are subject to gross errors and prejudices, but they are the only available measures of many workers. If the rating as to ability is the consensus of the judgments of two or more supervisors, each making his rating without any reference to that made by any other person, the result is much more reliable than the rating of any single supervisor would be.

Very grave errors creep into a rating of efficiency where the ratings are made by different supervisors, each supervisor rating only a few men. Even where a detailed schedule of qualities is listed, each to be given a definite weight or importance in making up the total rating, as in the Army Rating Scale, the degree of ability which one man’s experience leads him to call “Average” will call forth a rating of “Superior” from another equally able supervisor whose experience has been with slightly different people. If individuals A, B, and C are rated by the first supervisor and individuals D, E, and F by the second, it is not at all safe to assume that C is rated fairly in relation to D. Only when two individuals are rated by the same supervisors upon the same scale and under the same conditions is it legitimate or safe to assume that their relative abilities are well indicated by the ratings.

Assuming that the reader has obtained a reliable order of merit for the individuals he is using as a check upon the value of the Mentimeter tests, no test should be considered useful which does not result in approximately this same order of merit. The tests are, of course, so short and so crude that it is not to be expected that any test will, except by chance, show exactly the same order of ability as the production records or supervisor’s ratings furnish, but some tests will show much closer correspondence than others. Those tests which correspond most closely should be employed, while those tests which do not correspond at all should not be employed, regardless of any statement of the authors or any preconceived ideas of the reader as to what tests ought to foretell ability in any particular line of work. The proof of a test or of any method of prognostication lies in the degree to which it actually arranges people in the order of their relative efficiency in the tasks for which one seeks to foretell success.

A mere glance at a record such as that shown below for twenty-eight sixth-grade pupils would show that there was a real relationship between the scholarship marks, the teacher’s estimate of intelligence, and the results of educational measurements taken by an outsider.

SCORES AND RATINGS OF SIXTH-GRADE CLASS


NAME OF PUPIL	EDUCATIONAL MEASUREMENTS SCORE (NO. OF ERRORS)	TEACHER’S RANKING OF INTELLIGENCE (1 IS BRIGHTEST)	SUMMARY OF TEACHER’S MARKS IN SCHOLARSHIP
Adelaide	36.	19	85
Ruth	16.5	15	90
Alexander	25.5	7	93
LaMonte	46.5	6	93
Earl	76.5	18	77

Joseph	20.5	20	85
Amadeo	75.	14	85
Leo	48.	3	93
William	53.5	9	82
Isabel	25.	21	76

Ida	36.5	4	94
Hazel	15.	10	90
Frederick	65.	26	86
Charles	58.5	13	85
Edward	30.	1	95

Benjamin	62.5	24	76
Bruce	56.	22	87
Alden	55.	12	87
George	60.5	17	87
Alice	29.	11	88

Almira	15.5	5	96
Helen	16.5	2	90
Elizabeth	65.5	23	75
Amelia	24.5	8	92
Edwin	19.	16	89

Robert	67.	28	71
Edna	47.	27	78
Samuel	72.	25	80

The things which are not so evident at a glance are the degrees of relationship between these three types of measures. Is the relation of educational measurements to the teacher’s estimates greater than the relation of the measurements to the marks in scholarship given by the teacher? In order to measure precisely the relative degrees of correspondence between various measures and estimates of the abilities of individuals, it is quite evident that something more accurate and exact than mere inspection is necessary.

For an explanation of the method by which the exact relationship may be worked out mathematically between the results of a test and the true abilities of the individuals tested, the reader is referred to pages [326]–331 in the appendix. The discussion which will be found there of the method of calculating a coefficient of coördination will not be difficult to understand nor will the method be difficult of application for any one who wishes to measure the exact reliability of any of the Mentimeter tests or of any other test. For many purposes such a record as is shown on the preceding page, giving the score of the individual in each test used, will reveal the essential facts regarding the correspondence between test results and demonstrated ability. The reader should be cautious, however, about accepting a conclusion drawn from casual observation of such a table as that shown on the preceding page without checking up the accuracy of this conclusion by actually working out the coefficient of coördination according to the method shown in the appendix.
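For the reader who wishes to try such a check at once, a minimal sketch follows. It is an assumption on our part, not the appendix method: it uses Spearman's rank correlation as a stand-in for the coefficient described in the appendix, and the names `average_ranks` and `rank_correlation` are merely illustrative. The figures are those of the sixth-grade table above. A value near +1 or −1 indicates close correspondence; a value near 0 indicates little or none.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(xs, ys):
    """Spearman's coefficient: the ordinary correlation of the two sets of ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Columns of the sixth-grade table, in the order the pupils are listed.
errors = [36, 16.5, 25.5, 46.5, 76.5, 20.5, 75, 48, 53.5, 25, 36.5, 15, 65, 58.5,
          30, 62.5, 56, 55, 60.5, 29, 15.5, 16.5, 65.5, 24.5, 19, 67, 47, 72]
teacher_rank = [19, 15, 7, 6, 18, 20, 14, 3, 9, 21, 4, 10, 26, 13,
                1, 24, 22, 12, 17, 11, 5, 2, 23, 8, 16, 28, 27, 25]
marks = [85, 90, 93, 93, 77, 85, 85, 93, 82, 76, 94, 90, 86, 85,
         95, 76, 87, 87, 87, 88, 96, 90, 75, 92, 89, 71, 78, 80]

# Few errors and a rank near 1 both mean "good", so agreement shows as a
# positive value here; high marks mean "good", so agreement with errors
# shows as a negative value. Compare the two by their absolute size.
errors_vs_rank = rank_correlation(errors, teacher_rank)
errors_vs_marks = rank_correlation(errors, marks)
print(f"errors vs. teacher's ranking: {errors_vs_rank:+.2f}")
print(f"errors vs. scholarship marks: {errors_vs_marks:+.2f}")
```

Comparing the absolute sizes of the two printed values answers, for this class, the question whether the measurements agree more closely with the teacher's estimates or with her marks.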

When the reader has tried out, upon a fairly large group of persons of known ability, the Mentimeter tests which seem to him to promise greatest usefulness, and when he has made his calculations and discovered which tests actually do classify his people most accurately, it will then be possible for him to make an intelligent scientific selection of tests for practical use. Let us suppose, for example, that an employer wishes to have a set of tests whereby he may select intelligent sales-girls. By giving the ten or twelve tests which seem most hopeful for the purpose to fifty or sixty saleswomen, who have been in his employ long enough to demonstrate their relative degrees of ability and intelligence, the five or six tests may be chosen whose results show the closest relation to their demonstrated ability for intelligent salesmanship.

The results obtained by the separate tests chosen should also be compared, for two tests may measure practically the same mental trait and have a very high coördination with each other. In such a case, it would seem almost a useless waste to retain in the group two tests which measured the same phase of ability. The one of the pair which showed the less close relationship to the true ranking might be dropped from the list without much loss to the total effectiveness of the group of tests. A group of tests thus carefully selected would prove very helpful and effective in the selection of untrained material for training or in the classification of experienced employees according to their intellectual qualifications for the type of position held by the people on whom the validity of the tests had been proved.
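The weeding-out just described can be sketched roughly as follows. Everything in this sketch is hypothetical: the function name `select_team`, the limit of 0.8 above which two tests are treated as measuring the same trait, and all the coefficients shown, which are invented for illustration and are not results obtained with the Mentimeter tests.

```python
def select_team(validity, between, max_overlap=0.8, team_size=5):
    """Take tests in order of their agreement with demonstrated ability,
    skipping any test that agrees too closely with one already chosen."""
    team = []
    for test in sorted(validity, key=validity.get, reverse=True):
        if len(team) == team_size:
            break
        too_close = any(
            between.get(frozenset((test, kept)), 0.0) > max_overlap
            for kept in team
        )
        if not too_close:
            team.append(test)
    return team


# Hypothetical coefficients between each test and the true ranking.
validity = {
    "Analogies": 0.62,
    "Completion of Sentences": 0.58,
    "Naming Opposites": 0.55,
    "Maze Threading": 0.30,
}
# Hypothetical coefficients between pairs of tests; unlisted pairs count as 0.
between = {
    frozenset(("Analogies", "Completion of Sentences")): 0.85,
    frozenset(("Analogies", "Naming Opposites")): 0.40,
    frozenset(("Completion of Sentences", "Naming Opposites")): 0.45,
}

team = select_team(validity, between, team_size=2)
print(team)
```

With these invented figures the second-best test is passed over, because it duplicates the best one, and the third-best takes its place in the team.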

The advantage of such a well-selected “team” of tests is not so much that it selects various grades of ability more accurately than supervisors could select it after many months of experience in trying to train the new material, but that the tests make a satisfactory classification immediately, which saves the salaries and time of those applicants who would certainly fail in the training period. Even with the very best coefficients of coördination between the tests and actual demonstrated ability in the trade or position, the tests will not be infallible. On the other hand, no supervisor’s judgment would be infallible, either. And the supervisor would be much more likely to make errors through personal likes and dislikes than the impersonal tests could possibly be.

The tests are an invaluable aid, when they are themselves chosen with the scientific care outlined above, although it would be a short-sighted policy for any firm to trust entirely to the results of intelligence tests in the employment of its personnel. Appearance, voice, education, manners, physical size, and many other qualities are sometimes quite as important as the degree of intelligence, and the intelligence tests do not measure other elements of personality than the mental qualities.

Warning should also be given against using a particular set of intelligence tests, selected because they show high correspondence with ability in salesmanship, for example, as a measure of the intellectual qualities of candidates for some other position. Sets of tests, selected because they have been found accurate in classifying soldiers or school children for instruction, may not be of maximum usefulness in classifying machinists or business managers. The Mentimeter tests offer a wide variety, from which it is proposed that only those shall be used which have actually proved useful in classifying candidates for the particular task concerned. There is no reason to believe that exactly the same type of intelligence is required in all positions.

Having chosen certain promising tests for experiment, having proved the validity of these tests by checking up the relation of their results to the true abilities of a group of old employees or persons whose relative capacities are known perfectly, and having selected those tests whose results relate most directly to intellectual ability and least directly to one another, one may begin to employ the tests thus selected for the sorting and classification of new recruits or applicants. The question which will at once confront the reader who is not experienced in the employment of statistics of this sort is “How shall the test results be recorded and interpreted?”

The answer to the question regarding test records is that the exact score of each person should be kept for each test to which that person is “exposed.” One difficulty with the records kept of certain other group intelligence tests is that only the final total score is retained, while all the wealth of detail furnished by the different tests included in the series is lost. The total score on a series of six or eight intelligence tests is worth keeping, but the separate scores on each of the six or eight may prove to be even more illuminating than the total score. Two candidates may make the same total score on a series of tests but the one may make his points chiefly in memory tests with little help from the tests calling for complex thought, while the other may do very poorly in the memory work and very well in the thought tests. If only the total score on the series were retained, the usefulness of the series would be practically destroyed for many purposes.

For the interpretation of the result recorded on any test, one will need to use some short but intelligible scheme for stating the true relation of the score of any individual to the scores of the remainder of his group or to the scores of the other group of old employees used as a standard in selecting the tests to be regularly employed. It is not always safe to say merely that Mr. K—— is below the average of his group. As an extreme case of how unjust this might be, let us suppose that in one of the Mentimeter tests, A made a score of 0; B made a score of 2; C, a score of 1; D, 2; E, 3; F, 0; G, 10; H, 2; I, 3; J, 9; and K, 3. The average score of this small group, obtained by adding the eleven scores and dividing by 11, is 3.18. Mr. K—— therefore obtained a score which was below the average of the group, even though fewer than 20 per cent. of his group made better scores than he. The average score is too much influenced by extremely low or extremely high scores.
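The arithmetic of this example may be checked with a short sketch. The scores are those quoted above; the variable names are merely illustrative.

```python
from statistics import mean

# Scores quoted in the text for individuals A through K.
scores = {"A": 0, "B": 2, "C": 1, "D": 2, "E": 3, "F": 0,
          "G": 10, "H": 2, "I": 3, "J": 9, "K": 3}

average = mean(scores.values())
k_score = scores["K"]

# Count the persons who scored strictly higher than K.
better = sum(1 for s in scores.values() if s > k_score)
share_better = 100 * better / len(scores)

print(f"average score: {average:.2f}")
print(f"K below the average: {k_score < average}")
print(f"per cent. scoring above K: {share_better:.1f}")
```

Only G and J scored above K, yet their two high scores are enough to lift the average above his score of 3.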

To arrive at a proper perspective for interpreting the score of any individual, it is necessary first of all to have a distribution of the scores made by all the persons in the group with which the individual is to be compared. Such a distribution should show how frequently each possible score was made. The table below illustrates the idea of a distribution, using as material the scores quoted above for eleven individuals tested by a Mentimeter test. This table shows that one person had a score of 10, that one other had a score of 9, and that 3 was the next highest score made. The mode, or most common score, in this distribution is a 2 or a 3, which fact makes K’s score of 3 appear as quite typical of his group. The modal or most frequent score is a really useful score with which to compare the record of any individual, although it is not as safe a measure of the central tendency of a distribution as is the median score.

DISTRIBUTION

SIZE OF SCORE	FREQUENCY
10	1
9	1
8	0
7	0
6	0
5	0
4	0
3	3
2	3
1	1
0	2
Total	11
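A distribution of this kind is easily tallied by machine. The following is a minimal sketch using the eleven scores quoted above; it reports every score that ties for the highest frequency, since this distribution has two modes.

```python
from collections import Counter

# The eleven scores quoted in the text, A through K.
scores = [0, 2, 1, 2, 3, 0, 10, 2, 3, 9, 3]

frequency = Counter(scores)

# Print the distribution from the highest possible score down to zero.
for size in range(max(scores), -1, -1):
    print(f"{size:>2}\t{frequency[size]}")
print(f"Total\t{sum(frequency.values())}")

# The mode is the most frequent score; keep every score that ties for first.
top = max(frequency.values())
modes = sorted(s for s, f in frequency.items() if f == top)
print("modal score(s):", modes)
```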

The median score of a distribution is the middle score, than which there are just as many larger as smaller. The median score is found by beginning at one end of a distribution and counting through half of the frequencies. To count through half of the eleven frequencies in the above distribution would bring us into the midst of the three who had scores of 2, and therefore 2 is the median score with which K’s score, or the score of any other individual, should be compared.

The reader who is mathematically inclined may wish to find the median point in the distribution, the point which bisects the distribution. To find this, one needs to study his facts carefully and make such assumptions as seem most probable for the facts which are not perfectly apparent. For example, of the three persons who scored 2 points, one individual may have had the third problem thought out and have been in the very act of writing the correct answer to it when the time was up, while another may have just finished problem two without having begun to read the third problem, and the third person may have been right in the middle of his thought about problem three. Not knowing what the exact truth is, we may assume that of the three who had a score of 2, one’s true score was between 2.00 and 2.33, another’s was between 2.33 and 2.67, and that the third’s was between 2.67 and 3.00.

If we count out the five who scored 3 or higher, we shall still require half of the distance represented by the next highest individual in order to have counted out 5.5 (half of 11). If our assumption is true, then, we shall need to count half way down from 3.00 to 2.67 in order to find the median point, 2.83. The calculation of the median point is not necessary, however, unless there is a very large number of cases in the distribution and unless very accurate comparisons must be made. In passing it may be said that the calculation of the median point at 2.83 is just as sensible and just as accurate as the calculation of the average point at 3.18, and that the median point is a much more useful measure of the distribution than the more commonly used average.
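The same reckoning can be sketched in a few lines. The function name `median_point` is illustrative. It rests on the assumption made above, that the persons obtaining a given score are spread evenly from that score up to the next whole point. It counts from the low end rather than the high end, which leads to the same point.

```python
from collections import Counter
from statistics import median


def median_point(scores, width=1.0):
    """Interpolated median, assuming a score s covers s up to s + width."""
    frequency = Counter(scores)
    half = len(scores) / 2
    counted = 0
    for size in sorted(frequency):
        f = frequency[size]
        if counted + f >= half:
            # Take the needed fraction of the distance covered by this score.
            return size + (half - counted) / f * width
        counted += f
    raise ValueError("no scores given")


scores = [0, 2, 1, 2, 3, 0, 10, 2, 3, 9, 3]

print("median score:", median(scores))
print(f"median point: {median_point(scores):.2f}")
```

Three persons scored below 2, so two and one half of the three who scored 2 must be counted to reach 5.5, which carries the median point five sixths of the way from 2.00 to 3.00.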

The user of the Mentimeter tests will not, under ordinary circumstances, be satisfied with interpreting an individual’s score merely by indicating its direction from the median, mode or average of a group. It will not usually be sufficient to say “He made the modal or most popular score,” or “His score was lower than the average,” or even “His score was higher than the median.” Some indication will be desired as to how much better or poorer a given score is than the median, or just what percentage of the standard group made better scores. An illustration of the method to be employed in such calculations and a review of the method of finding the median is given below in connection with a distribution of scores on one of the Mentimeter tests. (See Mentimeter No. 24, page [234].)

I	II	III	IV
SIZE OF SCORE ANALOGIES TEST	FREQUENCY: NO. OF COLLEGE GRADUATES	TOTAL NO. FROM LOWEST SCORES	TOTAL % FROM LOWEST SCORES
30	2	129	100
29	4	127	98.5
28	10	123	95.3
27	22	113	87.6
26	32	91	70.6

25	20	59	45.8
24	18	39	30.3
23	8	21	16.3
22	4	13	10.1
21	2	9	7.0

20	1	7	5.4
19	2	6	4.7
18	1	4	3.1
17	1	3	2.3
16	...	...	.....

15	1	2	1.6
14	...	...	.....
13	...	...	.....
12	1	1	.8
11	...	...	.....

Total	129

Having distributed the scores obtained by a group of college graduates on the Analogies test, the next important step toward their interpretation is the totaling of the frequencies up to and including those of each possible size, as shown in the third column of the accompanying table. The fourth column is then prepared showing the corresponding percentages of the total number (129) of persons tested, for each of the total frequencies shown in column III. The table as a whole is then to be read from left to right. As an example, one may begin at 20 in the first column and read as follows: “1 college graduate made a score of exactly 20 points, making in all 7 individuals who obtained a score of 20 points or less, which (7) is 5.4 per cent. of the 129 individuals tested.” Dropping the eye to the next percentage below this line in column IV, one can interpret the score of the individual who made a score of 20 as follows: “This is a poor showing for a college graduate, for of 129 college graduates tested only 4.7 per cent. made a lower score.”
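Columns III and IV may be built up mechanically from column II, as in the sketch below. The function names are illustrative. A fresh computation may differ from the printed table in the last decimal place for a few rows (for example, 91 out of 129 comes to 70.5 when rounded to one place).

```python
# Column II of the table above: score -> number of college graduates.
frequency = {30: 2, 29: 4, 28: 10, 27: 22, 26: 32, 25: 20, 24: 18, 23: 8,
             22: 4, 21: 2, 20: 1, 19: 2, 18: 1, 17: 1, 15: 1, 12: 1}


def cumulative_table(frequency):
    """Rows of (score, frequency, total from lowest, per cent. from lowest)."""
    total = sum(frequency.values())
    rows, running = [], 0
    for size in sorted(frequency):
        running += frequency[size]
        rows.append((size, frequency[size], running, 100 * running / total))
    return rows


def per_cent_below(rows, score):
    """Per cent. of the group scoring strictly lower than the given score."""
    lower = [pct for size, _, _, pct in rows if size < score]
    return lower[-1] if lower else 0.0


rows = cumulative_table(frequency)

# Print from the highest score down, as the table is arranged in the text.
for size, f, running, pct in reversed(rows):
    print(f"{size}\t{f}\t{running}\t{pct:.1f}")

print(f"per cent. scoring below 20: {per_cent_below(rows, 20):.1f}")
```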

A very popular method of interpreting a score is to tell in what quarter or, as the statisticians would say, in what “quartile” of the distribution a given score is found. The upper or first quartile of a distribution is the range of scores below which 75 per cent. of those tested have fallen. The second quartile is the range of scores below which 50 per cent. are found but above which 25 per cent. of those tested are found. The third quartile is the range below which only 25 per cent. are found and above which 50 per cent. are found, and the fourth or lowest quartile is the range of scores in which are found the lowest 25 per cent. of the scores made. The first and second quartiles are above the median, while the third and fourth quartiles are below the median. Obviously the individual who scored 20 points in the Analogies test, and is included in the lowest 5.4 per cent. is also in the lowest quartile of the college graduate scores. The point dividing the first and second quartiles is called the 75 percentile, while the point dividing the third and fourth quartiles is called the 25 percentile. As was stated above, the median or 50 percentile divides the second and third quartiles.

Columns III and IV in the foregoing table assist one quite materially in calculating the median and the other percentile points. To find the median, one will need to count half way through the distribution, in this case to count out 64.5 scores (129 ÷ 2 = 64.5). The 20 persons who scored on 25, in the above distribution, are shown by column III to be included in the lowest 59 scores and by column IV to be in the lowest 45.8 per cent. To include 64.5 (or 50 per cent.) of the scores, 5.5 of the 32 individuals who scored on 26 will need to be taken (64.5 − 59 = 5.5); 5.5 is .17 of 32, so it will be necessary to take .17 of the distance (26.0 up to 27.0) represented by a score of 26. This places the 50 percentile or median point at 26.17, if we assume that the 32 individuals obtaining a score of 26 were evenly distributed in their exact values between 26.0 and 27.0, which is the safest assumption one can make about these scores.

The 25 percentile is found by counting out one fourth of the frequencies, beginning with the low-score end of the distribution. In the case of the college graduates’ distribution on the Analogies test, the 25 percentile is 24.63. The 75 percentile, which is found by counting out three fourths of the frequencies from the low-score end or one fourth from the high-score end of the distribution, is 27.26 in the case of the analogies distribution shown above. The “middle 50 per cent.” of the distribution, or the second and third quartiles, lie between 24.6 and 27.3 according to these calculations. One may therefore assert that the typical college graduate, meaning one who is within the two middle quartiles of the college graduate distribution, should be expected to make a score of 24, 25, 26, or 27 on the Analogies test in the Mentimeter series.
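The counting procedure used for the median and for the 25 and 75 percentiles is the same in each case, and may be sketched as a single Python function. The function name `percentile_point` and the dictionary `FREQ` are assumptions made for this illustration; the sketch follows the assumption stated in the text, that the cases at a given score are spread evenly from that score up to the next.

```python
# Column II of the Analogies table: score -> number of college graduates.
FREQ = {30: 2, 29: 4, 28: 10, 27: 22, 26: 32, 25: 20, 24: 18, 23: 8,
        22: 4, 21: 2, 20: 1, 19: 2, 18: 1, 17: 1, 15: 1, 12: 1}


def percentile_point(freq, pct):
    """Return the point below which `pct` per cent. of the cases fall.

    Counts up from the low-score end and interpolates within the score
    at which the required count is reached, treating the cases at a
    score s as evenly spread between s and s + 1.
    """
    total = sum(freq.values())
    target = total * pct / 100.0      # e.g. 129 * 50 / 100 = 64.5 cases
    running = 0                       # cases counted so far
    for score in sorted(freq):
        if running + freq[score] >= target:
            return score + (target - running) / freq[score]
        running += freq[score]
    raise ValueError("pct must lie between 0 and 100")


for pct in (25, 50, 75):
    print(pct, f"{percentile_point(FREQ, pct):.3f}")
```

Carried to three decimals the function gives 24.625, 26.172, and 27.261, which agree with the 24.63, 26.17, and 27.26 reported in the text.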

Occasionally intellectual measurements are reported by tenths, the first tenth being the tenth of the distribution having the highest scores, just as the first quartile is the quarter containing the highest scores. For practical purposes with the Mentimeter tests, however, it is recommended (1) that the score made on each test be recorded, (2) that the median score of the standard group, with which each individual’s score is to be compared, be calculated, and (3) that the percentage of the standard group making lower scores than that individual’s score be used as an interpretation. For these simple interpretations, a table, such as that shown on page [102] for college graduates in the Analogies tests, practically completes the necessary calculations,[2] except for the calculation of the median score. It will be fairly intelligible to describe Henry Smith’s score as follows: “Smith has a score of 24 points as compared with the median score of 26.2 points for his group. Only 16.3 per cent. of the college graduates make a poorer score than Smith, but 69.7 per cent. make a better score.”

[2]. For the purpose of assisting the reader in keeping and interpreting records of the Mentimeter tests, the authors have prepared a record booklet which may be used with the tests to excellent advantage. It will be found economical to use this booklet because of the guide lines, headings, and practical suggestions which it contains, reducing copying and memory work in the calculations to a minimum. It is recommended also that calculating tables or a slide rule be used to calculate the percentages called for in the final column of the distribution tables. Such aids are very desirable because of their contribution to the accuracy of results and to economy of time.
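Where neither calculating tables nor a slide rule is at hand, the percentages used in describing a single score, such as Smith's score of 24, might be obtained as in the following Python sketch. The names `FREQ` and `interpret` are assumptions made for this illustration and are not drawn from the record booklet.

```python
# Column II of the Analogies table: score -> number of college graduates.
FREQ = {30: 2, 29: 4, 28: 10, 27: 22, 26: 32, 25: 20, 24: 18, 23: 8,
        22: 4, 21: 2, 20: 1, 19: 2, 18: 1, 17: 1, 15: 1, 12: 1}


def interpret(freq, score):
    """Return (per cent. scoring lower, per cent. scoring higher) for `score`."""
    total = sum(freq.values())
    lower = sum(n for s, n in freq.items() if s < score)
    higher = sum(n for s, n in freq.items() if s > score)
    return 100.0 * lower / total, 100.0 * higher / total


lower, higher = interpret(FREQ, 24)
print(f"{lower:.1f} per cent. made a poorer score; "
      f"{higher:.1f} per cent. made a better score.")
```

For a score of 24 this gives 16.3 per cent. lower and 69.8 per cent. higher. The second figure differs by a tenth from the 69.7 quoted for Smith, which is what one obtains by subtracting the printed 30.3 of column IV from 100.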

Assuming now that the reader has a fairly clear idea of how to administer and record the results of the Mentimeter tests, the next question to be answered is: “What shall be done about these test records?” Measurement in any field does not change to any appreciable degree the material which has been measured. The surveyor, for example, who measures the area of a field makes very little impression upon the soil over which he passes. A physician who measures the weight of an infant does not thereby increase that weight or diminish it. In the same way the psychologist who applies a Mentimeter test to a filing clerk, does not by that act increase the efficiency of that clerk. Measurements, of themselves, are of no value. Something must be done about the result which is obtained or all of the expense in time and money is of no avail.

The real purpose of a measurement is to tell facts about a situation more exactly and with greater objectiveness than they could be told in a description. A child may seem, on first appearance, to be under weight, but in order to know definitely whether or not that is true it is necessary to measure his age in terms of years, months, and days, to measure his weight in terms of pounds and ounces, and to measure his height in terms of feet and inches. All of these measurements taken together, however, will not hinder the child’s growth or make him develop more rapidly; they merely indicate what his present condition is, without reference to what it may have been in the past or what it may become in the future.

As a sample of the great benefit which may be obtained from knowing mental facts exactly, we may consider the traditions and present status of our public school systems. Education has in the past been pointed, from the very beginning in the kindergarten, toward the high school and the college and ultimately the professional school in which lawyers, physicians, ministers, and teachers were to be prepared. The child who by nature was not inclined toward the consideration of abstract ideas and theories soon found that the schools were not well adapted to his interests.

The percentage of persons in our population who cannot successfully think and work with abstract symbols and verbal ideas is very much greater than most of us have been inclined to believe. We have stated or implied that any boy who would stay in school long enough might fit himself to become a United States Senator or possibly a great newspaper editor, or lawyer. Those pupils who found it impossible to assimilate the type of thing that was offered by the public schools have been eliminated and sent out into the industrial world to find materials which would correspond to their interests.

Educators have still further made the error of saying or implying that it was the inferior people who were thus forced out of school. The authors of the present book wish to assert their belief that the mind of a man whose interests lie in handling people and concrete objects is not at all inferior on that account to the mind of the man who handles ideas and abstract conceptions.

Measures of intelligence have in the past been chiefly those which would be favourable to the abstract thinker. The Alpha test, used in the Army, proved conclusively to those who studied the results most carefully, that fully half of our population can never succeed, even moderately, in the manipulation of abstract ideas. A large proportion of our boys and girls who come to school are absolutely doomed to be unsuccessful and to become discouraged in their attempts to progress in the courses which are commonly given, and yet the public supports these schools, and the administrators of these schools try to claim that they offer “equal opportunity to all.” Actually the kind of opportunity offered can be used effectively by only a small percentage of the pupils. Unless the child has the ability to interpret symbols and juggle ideas he is declared to be inferior and is forced out to learn for himself how to earn a living and to secure his rights.

The Mentimeter tests and other measures of intellectual abilities provide the means whereby pupils may be classified, at the very beginning of their education, according to the degree to which the formal academic training will be assimilated. These tests make it possible to select those who do not think abstractly but who require concrete objects or persons as the material for their mental activity. Unless the public recognizes that it owes an appropriate education to these people just as surely as it does to the academic few, it will not be long until this great group, in which our present schools develop the habit of failure and discontent, will arise to overthrow the injustices which our past aristocratic organization of society has handed on to them.

It is not proposed that certain individuals be selected by the Mentimeter tests and trained psychologists and then condemned to training of a less respectable order than that which is now offered. What is proposed is that by the use of intelligence tests students in schools be classified and placed in classes where they can learn things which it is within their mental power and interests to grasp and which will be of practical value and of social significance in the development of good citizens; rather than to continue, as we have in the past, condemning this large majority of our population to failure in school and elimination from the benefits of public taxation for education.

It is no disgrace for a blind man to be unable to paint beautiful pictures, nor is it considered a great social injustice for a man of ordinary size to be denied the opportunity of serving as a giant in a side show. It should not be considered by any one that being a good valet or mule driver or boot black or street cleaner is a less respectable calling for a man whose mind demands concrete objects for its exercise than the expounding of the gospel or explanation of legal technicalities is to the man whose mind is inclined toward abstract ideas and relationships. If we are to have an effective social organization each person must do the type of thing for which his brain and his physical body fit him, without feeling that he is thereby either inferior or superior to any other person. We must help one another, each supplying that service for which he is best fitted. To continue as we have in the past, encouraging every child to look for a “white-collar job” at the end of his educational career is to foster the monster of discontent and unrest which threatens to destroy the very foundations of modern society.

If the Mentimeter tests which follow can do no more than point out for employers and educators the limits to which those who are dependent upon them can go in the understanding and use of abstract ideas, they will thereby have contributed materially to the happiness and contentment of a weary world. Along with the results of the tests there must, however, be this feeling of responsibility for one another and the recognition of the need for “pulling together” for the common good, each man contributing that for which his inheritance has fitted him, else we shall continue to force men to learn failure and discontent in our schools and thereby destroy the social structure we have been so long in building.

CHAPTER X
THE MENTIMETER TESTS

Tests of the abilities of human beings may be classified upon a great many different bases. It is possible, first of all, to classify them according to the qualities of mind and body which they measure. The reason it is difficult so to classify tests of mental ability is that the mind refuses to be cut up into different parts, each one responsible for a specific characteristic. No test can be solved by the use of one and only one group of intellectual faculties. The results obtained in any mental examination are the complex effects of an immense number of different characteristics. No attempt has therefore been made in the classification of the Mentimeters to say that one measures imagination, another measures attention, and another some other quality. Almost every quality enters to some degree in each test.

It is possible to classify tests according to the subject matter which they contain. The Mentimeter tests are so arranged, where it is possible, as to cover a very wide range of subject matter.

It is possible to classify examinations according to the activity required of the candidate being examined. A number of the Mentimeter tests call for completing a series of objects or ideas, while a number of others call for memory of a certain sort, and still others require discrimination between certain differing elements. These differences in the activity of the candidate examined are not, however, the chief distinctions to be made between the tests.

It is possible to classify measurements according to the number of candidates that may be examined at the same time. Some tests cannot be given readily to more than one person at a time, while other tests can be given to several at the same sitting. In so far as possible, the Mentimeter tests are so arranged that they can be given to large numbers at the same sitting. This makes for economy of time and of effort on the part of the examiner.

It is possible to classify tests according to physical characteristics of the candidate examined, such as tests for infants, tests for children, and tests for adults, or tests for the blind and tests for the deaf. The first test in the Mentimeter series is for infants while the remainder of the tests are intended to measure older people.

Tests may further be classified according to the language capacity of the candidates who are examined. Certain of the Mentimeter tests are for non-English-speaking persons primarily, while others are primarily for those who speak English, and still others for those who read English.

The Mentimeter series of examinations which follows consists of thirty different tests, the majority of which are modifications of tests which have been used previously elsewhere. The first test in the series is to be used as an individual test of very young children. The blank provided furnishes brief suggestions, at each point, of what the procedure should be, and also furnishes a place for the examiner to record the result of his questions and observations.

Each examination booklet in the Mentimeter series has on its title page blanks as follows:

NAME______________________________________

AGE AT LAST BIRTHDAY_______LOCATION_______

The space headed “Location” is to be used to indicate the business or industrial organization or the department of the candidate being examined; or the grade, class, and school of a school pupil. These blanks should always be filled out before the examination begins.

At the middle of the page directions are frequently given with examples to serve in explaining concretely just what the nature of the test is going to be. In the lower right-hand corner of the title page there appears a blank, preceded by the words “Total Score.” This is to be filled out by the examiner after the candidate has marked his paper and after the examiner has scored the results.

Tests numbered from 2 to 10 are classified as tests for non-English-speaking persons. They were designed originally, and can best be used, as group tests, although the directions given on the following pages for these members of the Mentimeter family are usually in terms of an individual examination. If it had been possible to prepare and furnish with this book large charts on which the explanatory samples could be exhibited and the pantomime instructions clearly demonstrated for a group of people at the same time, the instructions would have been printed as for a group examination. Within the confines of a title page of a test booklet only small examples can be presented, and therefore the instructions are for measuring one individual at a time. Any employer, teacher, or supervisor who plans to make use of these tests for non-English-speaking persons would do well to prepare the demonstration material in enlarged form in order to use it in giving the tests to groups of individuals at the same time.

In giving a group test it is practically always necessary to obtain the identifying information called for on the title page before the booklets are opened or turned over. There is a distinct tendency for candidates to try to glance at the pages which follow unless specific directions are given as the papers are distributed that this must not occur.

The procedure in giving Mentimeters 2 to 10 to people who can understand and even read English is very little different from the procedure to be used with the foreign-language-speaking groups.

Mentimeters 11 to 15 cannot be given as group tests because of the great amount of writing which this would entail. Group tests are most efficient when candidates are required to do nothing other than check the correct answers without having to write anything.

Mentimeters 16 to 30 may be given as individual examinations, although they are planned as group examinations and the results obtained from their use as group examinations will be superior to the results obtained from their use as individual examinations.

In giving all of these tests it is very important that the printed forms prepared by the publishers be employed and that the directions which follow be carefully observed. The stencils furnished with the printed test booklets make it possible for a clerk of average mental capacity to mark and score the results of these examinations with great rapidity and with just as much accuracy as could be obtained by specialists working without such stencils. These stencils and the group method make psychological examinations economical of administration.

The list of Mentimeter tests is as follows:
