The number of forms which can be assumed by the waves of sound is naturally limited in kind, while various bodies may emit sounds containing the same harmonic or partial tone. The quality or timbre which depends on the relation and strength of these partial tones, and of the composite form assumed by the sum of their vibrations, constitutes what we have called a peculiar tone. This, as we have seen, is a simple one in the case of the tuning-fork, but in other cases it forms part of a full or complex group. We may find an illustration in the characteristic lines of light which we learn from the spectrum analysis are projected by substances; where we are dealing with a simple elementary substance, the line thrown upon the spectrum is correspondingly simple; where, on the other hand, the substance is compound, its spectrum also is compound, reflecting the several chemical elements of which it is made up. The simple spectrum answers to the simple harmonic or partial tone with its varying pitch and invariable form, just as the compound spectrum answers to the full note or peculiar tone with its characteristic quality and diversified grouping of partial tones. Now, if a body which has a certain peculiar tone is struck by a sound which contains a partial tone in any way similar to this peculiar tone, the body in question vibrates in sympathy, and we hear what is known as a by-note or harmonic. This by-note reacts upon the partial tone which has caused it, strengthening the partial tone and so modifying the quality of the complex sound. If, for instance, we play a note such as C on a violin, the strings of a piano representing C as well as the harmonics allied to it will vibrate in sympathy. Of course the more elastic the body which is struck, the louder and clearer will be the by-note, and of all elastic bodies none are better than those chambers of resonance into which we can divide the air. 
Such chambers of resonance are afforded by wind instruments of all kinds, whose shape determines the peculiar tone they are to emit. If the instrument is so constructed as to change its shape at will, now round, now straight, now broad, now narrow, the number of different chambers of resonance, and consequently the number of different peculiar tones, may be almost indefinitely increased.
It is this variability of form which makes the human throat such a marvellous instrument for the production of manifold sounds. Like most chambers of resonance, it has the hollow reed-like shape which connects it most readily with the primary source of sound. In analyzing the material of language we must never forget that we have to do with the most perfect wind instrument that exists, a wind instrument, too, of infinite pliability and power of change, and thus in constant and ready sympathy with the harmonics that are struck by the other organs of speech.
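The sympathetic vibration described above, in which a sounded C sets vibrating those piano strings that answer to C and to its allied harmonics, may be illustrated by a little arithmetic. What follows is only a sketch under stated assumptions, not a model of any real instrument: the C of 132 Hz, the six string pitches, the tolerance of one per cent, and the names `partials`, `vibrates_in_sympathy` and `responding_strings` are all hypothetical, and each string is treated as an ideal body answering only in its own fundamental tone.

```python
def partials(fundamental_hz, count=8):
    """Return the first `count` partial tones of an ideal vibrating body."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def vibrates_in_sympathy(string_hz, sounding_hz, count=8, tolerance=0.01):
    """True when some partial of the sounding note lies within the relative
    `tolerance` of the string's own (fundamental) tone."""
    return any(
        abs(partial - string_hz) / string_hz <= tolerance
        for partial in partials(sounding_hz, count)
    )


# Hypothetical piano strings (name -> fundamental in Hz), chosen as simple
# ratios of an assumed C of 132 Hz so that the arithmetic stays exact.
PIANO_STRINGS = {
    "C": 132.0,
    "D": 148.5,
    "G": 198.0,
    "c": 264.0,
    "g": 396.0,
    "e'": 660.0,
}


def responding_strings(sounding_hz, strings=PIANO_STRINGS):
    """Names of the strings set vibrating by a note of `sounding_hz`."""
    return [
        name
        for name, hz in strings.items()
        if vibrates_in_sympathy(hz, sounding_hz)
    ]


print(responding_strings(132.0))
```

On these assumptions the strings C, c, g and e' respond, since their tones coincide with the first, second, third and fifth partials of the sounded note, while D and G remain silent because no partial of the note falls upon them.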
We must now pass from the science of acoustics to the science of physiology. We have seen what are the conditions under which musical notes are produced; we have also seen that among these musical notes the utterances of articulate speech have to be classed; we have next to examine into the nature and conformation of the physical organs to which these utterances owe their origin. In the first place, the organs of speech may roughly be divided into three groups: the breathing apparatus, or lungs; the trachea or windpipe, with larynx and bronchial tubes; and the chamber of resonance, or mouth and nose. The lungs provide the material which is worked up into inarticulate noises and articulate sounds by the trachea and chamber of resonance. As long as the breath flows out of the throat and mouth quietly and without interruption, language of any sort is out of the question. The organs of speech are at rest, and all that can be done is to propel the breath with greater or less violence. We may breathe hard through the mouth, we may even make noises like that of snorting through the nose, but as yet there is nothing which can constitute a starting-point for articulate speech.[141] Mere breath, as distinguished from voice, only supplies the material out of which words and sentences may afterwards be created. Voice is breath, acted upon and excited into waves of sound by the organs of the throat and mouth; a larger quantity of air than is needed for simple breathing is rapidly taken into the lungs, and immediately expelled in intermittent gusts, but with varying degrees of force. Almost all the sounds we utter are accompanied by expiration; only such sounds as an occasionally mispronounced ja in Germany or our own surprised Oh! are produced while the breath is being drawn in. Experiment will at once show how difficult it is to pronounce a sound at the same time that this is being done.
The breath, then, is the passive instrument through which language is formed by the trachea and chamber of resonance. This trachea is a long cartilaginous and elastic pipe ending in the bronchial tubes, through which the air is admitted to the lungs. Its upper part is termed the larynx, consisting of five cartilages and situated in the throat. The lowest of these cartilages is the cricoid, which resembles a ring with the broad flat surface turned downwards. Over this comes the cartilago thyroidea or Adam’s apple, with two wings which partly enclose the cartilago cricoidea, and form a link between it and the os hyoideum,[142] or bone of the tongue, which has somewhat of the shape of a horseshoe. The space surrounded by these two cartilages may be compared with a hollow reed, out of the back part of which a piece has been cut. From the base of the latter and the upper rim of the cartilago cricoidea spring two small pyramidal cartilages, the arytenoids, which resemble the horns of an ox and almost touch one another. Their roots are connected with one another and with the cricoid and thyroid cartilages by the so-called processus vocales, which in spite of their name have little to do with the formation of speech. The horns of the arytenoids serve to unite two elastic bands to the opposite surface of the thyroid cartilage. These bands are formed of muscle enveloped with mucous membrane, and are the famous chordæ vocales upon which as upon the strings of a piano the manifold modulations of human language are played. 
So long as they remain, the other vocal organs, not excluding the tongue, may be removed without depriving the patient of the faculty of articulate speech.[143] Their length differs in men and women, in children and adults; the average length in men being about one-third greater than in women, and occasioning the different pitch of male and female voices.[144] The two chordæ vocales run obliquely across the cavity enclosed between the thyroid cartilage and a small projection on the front part of the arytenoid cartilage, an aperture which is called the glottis, or glottis vera. They can be relaxed or contracted at will by the muscles of the cartilages to which they are attached, and a portion of them can even be deadened by pressure from a small protuberance on the under side of the epiglottis. The glottis itself is divided into two parts, one the space between the vocal chords and the lateral thyro-arytenoid and crico-arytenoid cartilages, the other the triangular space between the vocal chords themselves, the latter allowing a passage for breath, the former a passage for voice. Both spaces can of course be narrowed or enlarged by the contraction or relaxation of the vocal chords, and the junction of the latter will close one or both altogether. It is in this secret chamber that the phonetic substance of speech is moulded into shape; the vibrations of the chordæ vocales in the breath of the glottis are the ultimate cause of syllables and words.
Above this chamber of the voice the trachea or windpipe again widens, and a second chamber is formed by two cavities on either side, called the ventricles of the larynx (the ventriculi Morgagni). Each cavity leads, at the back, into a pouch of the mucous membrane called the laryngeal sac and covered with sixty or seventy mucous glands, the secretion from which acts like oil on a piece of machinery by keeping the vocal chords and the surrounding parts in a moist condition. Stretched across the cavities are two thick ligaments, the false vocal chords, like the true chordæ vocales below them. They differ from the vocal chords in having no muscle of their own, but like the latter can contract or enlarge at pleasure the false glottis (glottis spuria), the space, that is, which is enclosed between them. The false glottis, which, like the false vocal chords, takes no part in the creation of language, is shut by an elastic cartilage, called the epiglottis, the lower point of which is attached to the thyroid cartilage immediately above the chordæ vocales, while the upper end broadens out like a leaf and falls over the fissure of the false glottis. This corresponds with the entrance of the larynx. The upper surface of the epiglottis is concave, and in swallowing it is allowed to drop upon the larynx. At other times it may be depressed over the false and true vocal chords.
Such is the machinery whereby breath from the lungs is transformed into voice in its passage through the windpipe; and voice is next taken up by what we have termed the chamber of resonance and modified in various ways. If we may call the glottis the manufactory of voice, we may call the mouth and nose the manufactory of the articulate sounds into which voice is divided. At the back of the epiglottis lies the pharynx, leading into the œsophagus, and the pharynx is bounded on the side of the mouth by the posterior pillar or arcus pharyngo-palatinus, opposite to which is the anterior pillar or arcus glosso-palatinus. Between them are the tonsils, and above these again the uvula, a sort of pendent valve which hangs downwards from the top of the anterior pillar towards the posterior pillar behind. The uvula is attached to a piece of yielding muscle known as the soft palate or velum palati, which with the uvula separates the throat from the entrance to the nostrils. The soft palate can move either backwards or forwards; in pronouncing the guttural (ng) for instance, it is pressed forward against the tongue, shutting off the throat; in pronouncing the vowels, on the other hand, it is pressed backward, and so cuts off the flow of breath to the nose. Above the soft palate comes the arch of the hard palate or roof of the mouth, and below this the tongue with its two roots and pointed tip. The teeth that enclose the mouth, along with their alveolars that form the front wall of the hard palate, have much to do with the formation of specific sounds, while it is hardly necessary to refer to the phonological importance of both nose and lips. As is well known, a leading characteristic of cultivated English is the little use it makes of the latter.
It is now time to consider the precise parts played by these different organs of speech, in producing the various elements of spoken language. We must begin by putting out of sight all inarticulate sounds or noises, such as the clicks of the Bushman or the Hottentot, which have entered into the composition and framework of actual speech. Such inarticulate sounds are but the stepping-stones to real language, the first steps of the ladder, as it were, which were eventually to lead to articulate words. They are the natural cries of man like the natural cries of the animals from which they in no way differ; and just as on the one side the barking of the dog and the mewing of the cat are said to be attempts to imitate the human voice, so on the other hand the inarticulate cries of the infant or “non-speaker” are on the same level as the roar of the lion or the shriek of the cockatoo. We are told that the cynocephalic ape of the Upper Senegal, whose form is depicted on the monuments of ancient Egypt, utters clicks which sometimes contain a distinct d,[145] and the Bushmen themselves show a true instinct when they make the beasts in their fables talk not only with the clicks of the Bushman dialects, but even in the case of some animals with clicks that do not otherwise occur.[146] If we watch the first endeavours of children to speak, we may discover inarticulate noises gradually becoming articulate sounds with definite meanings, and we may even trace a recollection of the first efforts of man to create a language for himself in the guttural aspirates heard for instance in some of the Semitic dialects. Indeed, the name given to the hard breathing (h) by the Greeks, πνεῦμα δασύ or “rough aspirate,” reminds us of the guttural noises, not yet phonetic sounds, made by the child; in forming this sound we jerk out the breath at the same time that we narrow the glottis, adding if we like various degrees of hoarseness by further stopping its free flow. 
The glottal catch, which is heard in Danish after vowels, and according to Mr. Bell is substituted in the Glasgow pronunciation for “voiceless stops,” is really a mere cough. Even the spiritus lenis or soft breathing, heard before a vowel, partakes in some measure of the nature of a noise. It is true that the rough breathing cannot be sung while the soft breathing may be; but this is because in the case of the latter the breath is checked near the vocal chords and can therefore be intoned. Professor Max Müller is doubtless right in holding that all that the Greeks meant by πνεῦμα ψιλόν as opposed to πνεῦμα δασύ was “a negative definition of another breath which is free from roughness,”[147] just as the ĕ-psilon is negatively contrasted with the êta. Neither breathing was regarded as constituting as yet a true sound or “voice.”
The true sounds of language, however, were distinguished but roughly and imperfectly one from the other. Plato, in his Kratylus, divides them into φωνήεντα or “vowels,” and ἄφωνα or “mutes,” these last being further subdivided into semi-vowels which are neither vowels nor mutes (φωνήεντα μὲν οὔ, οὐ μέντοι γε ἄφθογγα) and ἄφθογγα or real mutes. The term ἄφωνα, mutes, afterwards came to be restricted in its sense as a simple equivalent of Plato’s ἄφθογγα, its place being taken by the term σύμφωνα or “consonants,” letters, that is to say, which must be sounded along with a vowel. These consonants were next classed as ἡμίφωνα or semi-vowels (l, m, n, r, and s), ὑγρά or “liquids” which covered all the semi-vowels with the exception of s, and ἄφωνα or “mutes.” The mutes fall into three classes, the ψιλά or “bare” (k, t, p), the δασέα or “aspirates” (kh, th, ph) and the μέσα which stood, as it were, “between” them. The Latin translation of the latter term has given us the mediæ of modern grammars.
Far more thorough-going and scientific were the phonological labours and classification of the Hindu prâtiśâkhyas. Instead of starting from written speech like the Greek grammarians, they had to do with an orally-delivered literature, and hence while the Greeks never got beyond the belief that the tongue, teeth, and lips were the sole instruments of pronunciation, the Hindus had carefully analyzed the organs of speech some centuries before the Christian era, and composed phonological treatises which may favourably compare with those of our own day. They knew, for example, that in sounding the tenues, or hard letters, the glottis is kept open, while in sounding the mediæ, or soft ones, it is closed; they knew also that e and o were diphthongs analyzable into a + i and a + u; and they explained k and g, p and b, as formed by complete contact of the vocal organs. They had noted the repha or “Newcastle burr,” and had divided the nasals into their several classes. The names they gave to the various sounds, and the groups into which they were classified, were descriptive of their mode of formation, like the names similarly applied by modern phonologists. Thus the guttural sibilant formed near the root of the tongue (χ) was called Jihvâmûlîya, “the tongue-root letter,” and the labial sibilant (φ) Upadhmânîya, “to be breathed upon.” The consonants were classed both according to the place where they were formed, and according to their prayatna, or “quality,” the mutes and nasals, for instance, being formed by “complete contact” of the vocal organs, the semi-vowels by “slight contact” (îshat spṛishṭa), the sibilants by “slight opening” (îshad vivṛita), and the vowels by complete opening. A controversy even sprang up among the grammarians as to the extent of this opening of the organs. 
“Some ascribe to the semi-vowels duḥspṛishṭa, imperfect contact, or îshadaspṛishṭa, slight non-contact, or îshadvivṛita, slight opening; to the sibilants nemaspṛishṭa, half-contact; i.e., greater opening than is required for the semi-vowels, or vivṛita, complete opening; while they require for the vowels either vivṛita, complete opening, or aspṛishṭa, non-contact.”[148]
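The classification by prayatna just described may be set out as a small table of classes against degrees of opening. The following is a minimal sketch only, and not a rendering of any prâtiśâkhya: the numeric scale and the names `Prayatna`, `PRAYATNA_OF` and `classes_by_openness` are assumptions introduced for illustration. Only the five classes and four degrees first named in the text are used, and the rival assignments quoted from the grammarians are left out of account.

```python
from enum import IntEnum


class Prayatna(IntEnum):
    """Degrees of opening of the vocal organs, from closest to most open.

    The numbers are an assumed ordering for illustration only.
    """
    COMPLETE_CONTACT = 0
    SLIGHT_CONTACT = 1
    SLIGHT_OPENING = 2
    COMPLETE_OPENING = 3


# Class of sound -> degree of opening, following the first scheme in the text.
PRAYATNA_OF = {
    "mutes": Prayatna.COMPLETE_CONTACT,
    "nasals": Prayatna.COMPLETE_CONTACT,
    "semi-vowels": Prayatna.SLIGHT_CONTACT,
    "sibilants": Prayatna.SLIGHT_OPENING,
    "vowels": Prayatna.COMPLETE_OPENING,
}


def classes_by_openness(table=PRAYATNA_OF):
    """Return the classes ordered from closest contact to fullest opening,
    breaking ties alphabetically."""
    return sorted(table, key=lambda name: (table[name], name))


for name in classes_by_openness():
    print(f"{name}: {PRAYATNA_OF[name].name.lower().replace('_', ' ')}")
```

Arranged in this way the sounds form a single scale running from the mutes and nasals, through the semi-vowels and sibilants, to the vowels; and the controversy among the grammarians amounts to a dispute as to where on that scale the intermediate classes ought to stand.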
Leaving the speculations of the past, let us now pass on to the results which have been obtained by modern research. Thanks to the labours of men like Alexander Ellis, Melville Bell, Helmholtz, Czermak, Brücke, Sweet, and others, the mechanism of speech has been fairly settled; and though many points are still open to discussion, the main facts have been thoroughly ascertained and adequately explained. We have learnt the real nature and causes of those phonetic elements of speech which the old grammarians first tried to separate and classify; we have cleared away the confusion from which even the Vedic scholars of India could not wholly escape, and have discovered that in phonology as elsewhere, the convenient systems of practical life do not bear a close scientific investigation. Even the ordinary distinction of vowels and consonants is exposed to more than one objection. It rests not upon the essential character of the sounds themselves, but upon mere differences of function, and its advocates have to invent a series of semi-vowels or semi-consonants, a name which of itself indicates how incomplete and unsatisfactory the distinction must be. The distinction, indeed, has a basis of fact, but the fact is one which has been misapprehended or overlooked.