How to Teach - George D. Strayer; Naomi Norsworthy

Much inaccurate calculation has resulted from misguided attempts to secure a median point with the formula just given, which is applicable only to the location of the median measure. It will be found much more advantageous in dealing with educational statistics to consider only the median point, and to use only the n/2 formula given in a previous paragraph, for practically all educational scales are or may be thought of as continuous scales rather than scales composed of discrete steps.

The greatest danger to be guarded against in considering all scales as continuous rather than discrete is that careless thinkers may refine their calculations far beyond the accuracy which their original measurements would warrant. One should be very careful not to make such unjustifiable refinements in the statement of results as are often made by young pupils when they multiply the diameter of a circle, which has been measured only to the nearest inch, by 3.1416 in order to find the circumference. Even in the ordinary calculation of the average point of a series of measures of length, the amateur is sometimes tempted, when the sum of the values is not evenly divisible by the number of measures in the series, to carry the quotient out to a larger number of decimal places than the original measures would justify. Final results should usually not be refined far beyond the accuracy of the original measures.
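The point about the circle may be made concrete with a short sketch. The diameter of 7 inches is a hypothetical figure chosen only for illustration, and the function name is likewise an assumption. A diameter recorded "to the nearest inch" may lie anywhere within half an inch of the recorded value, so the circumference is known only within a range of about three inches, and a product carried to four decimal places claims far more than the measurement supports.

```python
import math


def circumference_bounds(diameter_nearest_inch, half_step=0.5):
    """Return the lowest and highest circumference consistent with a
    diameter that was measured only to the nearest inch."""
    low = math.pi * (diameter_nearest_inch - half_step)
    high = math.pi * (diameter_nearest_inch + half_step)
    return low, high


d = 7  # hypothetical diameter, measured to the nearest inch

# The pupil's over-refined answer: four decimal places from a one-inch measure.
naive = 3.1416 * d
print(f"over-refined result: {naive:.4f} inches")

# What the measurement actually supports.
low, high = circumference_bounds(d)
print(f"defensible range: {low:.1f} to {high:.1f} inches")

# A statement of the result no finer than the original measure.
reported = round(math.pi * d)
print(f"reported: about {reported} inches")
```

The same caution applies to an average: the quotient may be computed to any number of places, but it should be stated no more finely than the measures that went into it.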

It is of utmost importance in calculating medians and other measures of a distribution to keep constantly in mind the significance of each step on the scale. If the scale consists of tasks to be done or problems to be solved, then "doing 1 task correctly" means, when considered as part of a continuous scale, anywhere from doing 1.0 up to doing 2.0 tasks. A child receives credit for "2 problems correct" whether he has just barely solved 2.0 problems or has just barely fallen short of solving 3.0 problems. If, however, the scale consists of a series of productions graduated in quality from very poor to very good, with which series other productions of the same sort are to be compared, then each sample on the scale stands at the middle of its "step" rather than at the beginning.
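The distinction between the two readings of a step may be sketched as follows. This is only an illustration under stated assumptions: the function name and the labels "magnitude" and "quality" are introduced here for convenience, and the steps are assumed to be one unit wide, which the text does not require of every scale.

```python
def step_interval(score, scale_kind, width=1.0):
    """Return the (lower, upper) stretch of the continuous scale
    covered by a recorded score.

    "magnitude": the score marks the BEGINNING of its step, so a score
                 of 2 covers 2.0 up to 3.0.
    "quality":   the score marks the MIDDLE of its step, so a score
                 of 2 covers 1.5 up to 2.5 (unit-wide steps assumed).
    """
    if scale_kind == "magnitude":
        return (score, score + width)
    if scale_kind == "quality":
        return (score - width / 2, score + width / 2)
    raise ValueError("scale_kind must be 'magnitude' or 'quality'")


# "2 problems correct" on a test of tasks performed.
print(step_interval(2, "magnitude"))  # (2, 3.0)

# A production judged nearest to sample number 2 on a quality scale.
print(step_interval(2, "quality"))    # (1.5, 2.5)
```

Which reading is used shifts every median point by half a step, so the kind of scale should be settled before any calculation is begun.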

Scales of the second kind described in the foregoing paragraph may be designated as "scales for the quality of products," while those of the other variety may be called "scales for magnitude of achievement." In the one case, the child makes the best production he can and measures its quality by comparing it with similar products of known quality on the scale. Composition, handwriting, and drawing scales are good examples of scales for quality of products. In the other case, the scales are placed in the hands of the child at the very beginning, and the magnitude of his achievement is measured by the difficulty or number of tasks accomplished successfully in a given time. Spelling, arithmetic, reading, language, geography, and history tests are examples of scales for magnitude of achievement.

Scores tend to be more accurate on the scales for magnitude of achievement, because the judgment of the examiner is likely to be more accurate in deciding whether a response is correct or incorrect than it is in deciding how much quality a given product contains. This does not furnish an excuse for failing to employ the quality-of-products scales, however, for the qualities they measure are not measurable in terms of the magnitude of tasks performed. It should be observed that the method of employing the quality-of-products scales is "by comparison" (of the child's production with samples reproduced on the scale), while the method of employing the magnitude-of-achievement scales is "by performance" (of the child on tasks of known difficulty).

In this connection it may be well to take one of the scales for quality of products and outline the steps to be followed in assigning scores, making tabulations, and finding the medians of distributions of scores.

When the Hillegas scale is employed in measuring the quality of English composition, it will be advisable to assign to each composition the score of that sample on the scale to which it is nearest in merit or quality. While some individuals may feel able to assign values intermediate to those appearing on the Hillegas scale, the majority of those persons who use this scale will not thereby obtain a more accurate result, and the assignment of such intermediate values will make it extremely difficult for any other person to make accurate use of the results. To be exactly comparable, values should be assigned in exactly the same manner.

The best result will probably be obtained by having each composition rated several times, and if possible, by a number of different judges, the paper being given each time that value on the Hillegas scale to which it seems nearest in quality. The final mark for the paper should be the median score or step (not the median point or the average point) of all the scores assigned. For example, if a paper is rated five times, once as in step number five (5.85), twice as in step number six (6.75), and twice as in step number seven (7.72), it should be given a final mark indicating that it is a number six (6.75) paper.
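The example just given may be worked through in a brief sketch. The step values 5.85, 6.75, and 7.72 are those quoted above; the function name is an assumption made for illustration. The use of `median_low` is also an assumption: with an odd number of ratings it returns the middle step, and with an even number it returns the lower of the two middle steps, a case on which the text gives no rule.

```python
from statistics import mean, median_low

# Hillegas values for the three steps mentioned in the example above.
HILLEGAS_VALUES = {5: 5.85, 6: 6.75, 7: 7.72}


def final_mark(step_ratings):
    """Return the median step among the ratings and its scale value.

    The median step is always one of the steps actually assigned,
    never an intermediate value.
    """
    step = median_low(step_ratings)
    return step, HILLEGAS_VALUES[step]


# One rating at step five, two at step six, two at step seven.
ratings = [5, 6, 6, 7, 7]

step, value = final_mark(ratings)
print(f"final mark: step {step} ({value})")  # final mark: step 6 (6.75)

# For contrast, the average of the step numbers falls between two samples,
# which is why it is not used as the final mark.
print(f"average of step numbers: {mean(ratings)}")
```

Because the final mark is always a step that appears on the scale, marks assigned by different groups of judges remain directly comparable.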

After each composition has been assigned a final mark indicating to what sample on the Hillegas scale it is most nearly equal in quality, proceed as follows:

Make a distribution of the final marks given to the individual papers, showing how many papers were assigned to the zero step on the scale, how many to step number one, how many to step number two, and so on for each step of the scale. We may take as an example the distribution of scores made by the pupils of the eighth grade at Butte, Montana, in May, 1914.