59. Quality of Speech Sounds

Another mistaken assumption, frequently made, is that the speech of non-literary peoples is harsh and its pronunciation more difficult than ours. This belief is purely subjective. When one has heard and uttered a language all one’s life, its sounds come to one’s mouth with a minimum of effort; but unfamiliar vowels and consonants are formed awkwardly and inaccurately. No adult reared in an Anglo-Saxon community finds th difficult. Nor does a French or German child, whose speech habits are still plastic, find long difficulty in mastering the particular tongue control necessary to the production of the th sound. But the adult Frenchman or German, whose muscular habits have settled in other lines, tries and tries and falls back on s or t. A Spaniard, however, would agree with the Anglo-Saxon as to the ease and “naturalness” of th. Conversely, the “rough” ch flows spontaneously out of the mouth of a German or Scotchman, whereas English, French, and Italians have to struggle long to master it, and are tempted to substitute k. German ö and French u trouble us; our “short” u is equally resistant to Continental tongues.

Even a novel position can make a familiar sound strange and forbidding. Most Anglo-Saxons fail on the first try to say ngis; many give up and declare it beyond their capacity to learn. Yet it is only sing pronounced backward. English uses ng finally and medially in words, not initially. Any English speaker can quickly acquire its use in the new position if, to keep from being disconcerted, he follows some such sequence as sing, singing, stinging, ringing, inging, nging, ngis.

So with surd l—Welsh ll—which is ordinary l minus the accompaniment of vocal cord vibrations. A little practice makes possible the throwing on or off of these vibrations, the “voicing” of speech, for any sound, with as much ease as one would turn a faucet on or off. Surd l thereupon flows with the same readiness as sonant l. As a matter of fact we often pronounce it unconsciously at the end of words like little. When it comes at the beginning, however, as in the tribal name usually written Tlingit, Americans tend to substitute something more habitual, such as kl, which is familiar from clip, clean, clear, close, clam, and many other words. The simple surd l has even been repeatedly described quite inappropriately as a “click”; which is about as far from picturing it with correctness as calling it a thump or a sigh; all because it comes in an unaccustomed position.

Combinations of sounds, especially of consonants, are indeed of variable difficulty for anatomical reasons. Some, like nd and ts and pf, have their components telescope or join naturally through being formed in the same part of the mouth. Others, like kw (qu), have the two elements articulated widely apart, but for that reason the elements can easily be formed simultaneously. Still others, like kt and ths, are intrinsically difficult, because the elements differ in place of production but are alike in method, and therefore come under the operation of the generic rule that similar sounds require more effort to join and yet discriminate than dissimilar ones. For much the same reason, it is on the whole easier to acquire the pronunciation of a wholly new type of sound than of one which differs subtly from one already known. Yet in these matters too, habit rather than anatomical functioning determines the reaction. German pf comes hard to adult Anglo-Saxons, English kw and ths to Germans. So far as degree of accumulation of consonants is concerned, English is one of the most extreme of all languages. Monosyllables like tract, stripped (stripd), sixths (siksths), must seem irremediably hard to most speakers of other idioms.

Children’s speech in all languages shows that certain sounds are, as a rule, learned earlier than others, and are therefore presumably somewhat easier physiologically. Sounds like p and t, which are formed with the mobile lips and front of the tongue, normally precede back tongue sounds like k. B, d, g, which are voiced like vowels, tend to precede voiceless p, t, k. Stops or momentary sounds, such as b, d, g, p, t, k, generally come earlier than the fricative continuants f, v, th, s, z, which require a delicate adjustment of lip or tongue—close proximity without firm contact—whereas the stops involve only a making and breaking of jerky contact. But so slight are the differences of effort or skill in all these cases that as a rule only a few months separate the learning of the easier from that of the more difficult sounds; and adults no longer feel the differences. The only sound or class of sounds seriously harder than others seems to be that denoted by the letter r. Not only do children usually acquire r late, but among all races there appears to be a certain percentage of individuals who never learn to form the sound correctly, but substitute one approaching g or w or j or l. The reason is that r stands alone among speech sounds. It is the only one produced by blowing the tongue into a few gross vibrations, which means that this organ must be held in a special condition of laxness and yet elevated so that the flow of breath may bear on it. However, even this inherent difficulty has been insufficient to prevent many languages from changing easier sounds into r.