If the computer has its language problems, man has them also, to the nth degree. There are about 3,000 tongues in use today; mercifully, scientific reports are published in only about 35 of these. Even so, at least half the treatises published in the world cannot be read by half the world’s scientists. Unfortunately, UNESCO estimates that while 50 per cent of Russian scientists read English, less than 1 per cent of United States scientists return the compliment! The ramifications of these facts we will take up a little later; for now it will be sufficient to consider the language barrier not only to science but also to culture and to the international exchange of good will that can lead to and preserve peace. Esperanto, Ido, and other tongues have been tried as common languages. One recent arrival on the scientific scene, called Interlingua, seems to have considerable merit. It is used in international medical congresses; the proceedings of one of these contained 300,000 words of Interlingua text. But a truly universal language is, like prosperity, always just around the corner. Even the scientific community, recognizing the many benefits that would accrue, can no more adopt Interlingua or another such language than it can settle on the metric system of measurement. Our integration problems are not those of race, color, and creed only.
Before Sputnik our interest in foreign technical literature was not as keen as it has been since. One immediate result of the Russian satellite launching was the amendment of U.S. Public Law 480 to permit money from the sale of American farm products abroad to be used for the translation of foreign technical literature. We are vitally concerned with Russia, but we have also arranged for the translation of thousands of pages of scientific literature from Poland, Yugoslavia, and Israel. Communist China is beginning to produce scientific reports too, and Japanese capability in such fields as electronics is evident in the fact that the revolutionary “tunnel diode” was invented by Esaki in Japan.
It is understandable that we should be concerned with the output of Russian literature, and much attention has been given to the Russian-English translator developed by IBM for the Air Force. It is estimated that the Russians publish a billion words a year, and that about one-third of this output is technical in nature. Conventional translating techniques, in addition to being tedious for the translators, are hopelessly slow, yielding only about 80 million words a year. The untranslated residue thus amounts to more than 900 million words annually, or roughly twelve years of work at present capacity; we are falling behind twelve years each year! Outside of a moratorium on writing, the only solution is faster translation.
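To make that arithmetic explicit, here is a minimal sketch using the figures quoted above (all of them estimates):

```python
# Rough arithmetic behind the "twelve years a year" claim, using the
# estimates quoted above; both figures are approximations.
published_per_year = 1_000_000_000  # Russian output, words per year
translated_per_year = 80_000_000    # conventional translation capacity, words per year

backlog_growth = published_per_year - translated_per_year
years_behind_per_year = backlog_growth / translated_per_year

print(f"Backlog grows by {backlog_growth:,} words a year,")
print(f"about {years_behind_per_year:.1f} years of work at present capacity.")
# -> about 11.5 years of untranslated material accumulates every year
```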
The Air Force translator was a phenomenal achievement. Based on a photoscopic memory (a glass disc 10 inches in diameter capable of storing a 55,000-word Russian-English dictionary in binary code), the system used a “one-to-one” method of translation. The initial result was translation of Russian, at a rate of about 40 words per minute, into an often terribly scrambled and confusing English. The speed was limited not by the memory or the computer itself but by the input, which had to be prepared on tape by a typist. Subsequently a scanning system capable of 2,400 words a minute upped the speed considerably.
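In spirit, the “one-to-one” method is nothing more than an independent dictionary lookup for each word, which is why the output came out scrambled. A minimal sketch, with an invented three-entry dictionary standing in for the 55,000-word disc:

```python
# Minimal sketch of "one-to-one" word-for-word translation. The tiny
# dictionary below is invented for illustration; the photoscopic disc
# held some 55,000 entries.
DICTIONARY = {
    "nauka": "science",
    "dvizhetsya": "moves",
    "vperyod": "forward",
}

def translate_word_for_word(sentence):
    # Each word is replaced independently; no attention is paid to word
    # order, case endings, or grammar, hence the scrambled English.
    return " ".join(DICTIONARY.get(word, f"[{word}]")
                    for word in sentence.lower().split())

print(translate_word_for_word("Nauka dvizhetsya vperyod"))
# -> science moves forward
```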
Impressive as the translator was, its impact was dulled after a short time when it was found that a second “translation” was required of the resulting pidgin English, particularly when the content was highly technical. As a result, work is being done on more sophisticated translation techniques. Making use of predictive analysis and of “lexical buffers,” which store all the words in a sentence for syntactical analysis before final printout, scientists have improved the translation a great deal. In effect, the computer studies the structure of the sentence, determining whether modifiers belong with subject or object and checking for the most probable grammatical form of each word as indicated by the other words in the sentence.
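The essential idea of the lexical buffer is simply that nothing is printed until the whole sentence is in hand, so that an ambiguous word can be resolved by its neighbors. A toy sketch, with invented words, senses, and context cues:

```python
# Toy sketch of a "lexical buffer": the entire sentence is held in
# memory so each ambiguous word can be resolved by its neighbors before
# anything is printed. All entries here are invented for illustration.
SENSES = {
    # word -> list of (context cue, translation); None marks the default
    "reka": [(None, "river")],
    "kosa": [("reka", "sandbar"), ("trava", "scythe"), (None, "braid")],
}

def resolve(word, buffer):
    for cue, translation in SENSES.get(word, [(None, word)]):
        if cue is None or cue in buffer:
            return translation

buffer = ["reka", "kosa"]                    # buffer the whole sentence first...
print([resolve(w, buffer) for w in buffer])  # ...then choose senses in context
# -> ['river', 'sandbar']  ("kosa" resolves to "sandbar" because "reka" is present)
```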
The advanced nature of this method of translation requires the help of linguistics experts. Among these is Dr. Sydney Lamb of the University of California at Berkeley, who is developing a computer program for analyzing the structure of any language. One early result of this study was the realization that not enough is actually known of language structure, and that we must backtrack and build a foundation before proceeding with computer translation techniques. Dr. Lamb’s procedure is to feed English text into the computer and let it search for situations in which a certain word tends to be preceded or followed by other words or groups of words. The machine then tries to deduce the grammatical structure, not necessarily correctly. The researcher must help the machine by giving it millions of words to analyze contextually.
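A crude version of that search, tallying which words tend to follow which in running text, might look like the sketch below; the one-line corpus is a stand-in for the millions of words actually required:

```python
# Crude sketch of the co-occurrence search described above: scan running
# text and tally, for each word, which words tend to follow it. From
# such tables a tentative grammatical structure can be guessed at.
from collections import Counter, defaultdict

followers = defaultdict(Counter)

def tally(text):
    words = text.lower().split()
    for first, second in zip(words, words[1:]):
        followers[first][second] += 1

# A stand-in corpus; the actual study fed in millions of words.
tally("the machine reads the text and the machine counts the words")

for word, counts in sorted(followers.items()):
    print(word, "->", counts.most_common(2))
# "the" is followed most often by nouns, a hint toward article-noun structure
```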
What the computer is doing in hours is reproducing the evolution of language and grammar that not only took place over thousands of years but was subject to emotion, faulty logic, and other inaccuracies as well. Also working on the translation problem are the National Bureau of Standards, the Army’s Office of Research and Development, and others. The Army expects to have by 1962 a computer analysis that will handle 95 per cent of the sentences likely to be encountered in translating Russian into English, and it expects to examine foreign technical literature at least as far as the abstract stage.
Difficult as the task seems, workers in the field are optimistic and feel that it will be feasible to translate all languages, even the Oriental ones, which seem to present the greatest syntactical barriers. An indication of success is the announcement by Machine Translations Inc. of a new technique making possible contextual translation at the rate of 60,000 words an hour, a rate challenging the ability of even someone coached in speed-reading! The remaining problem, that of actually reading and evaluating the material once it is translated, has already been raised. This considerable task too may be solved by the computer. The machines have already displayed a limited ability to perform abstracting, eliminating at the outset much material not relevant to the task at hand; one simple approach is sketched below. Another bonus the computer may give us is the ideal international and technical language for composing reports and papers in the first place. A logical question that comes up in the discussion of printed-language translation is that of another kind of translation, from verbal input to print or vice versa, and ultimately from verbal Russian to verbal English. The speed limitation here, of course, is human ability to accept a verbal input or to deliver an output. Within this framework, however, the computer is ready to demonstrate its great capability.
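One simple, admittedly crude, way a machine can abstract is to score each sentence by how heavily it uses the document's most frequent content words and keep the top scorers, in the spirit of early auto-abstracting experiments. The stopword list and sample text below are merely illustrative:

```python
# Sketch of frequency-based auto-abstracting: sentences that concentrate
# the document's most common content words are kept as the abstract.
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "is", "was", "in", "to", "also"}

def tokens(text):
    return [w.strip(".,;") for w in text.lower().split()]

def auto_abstract(text, keep=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(w for w in tokens(text) if w not in STOPWORDS)
    # Score each sentence by the summed frequency of its content words.
    def score(sentence):
        return sum(freq[w] for w in tokens(sentence))
    return sorted(sentences, key=score, reverse=True)[:keep]

text = ("The computer translates text. The computer also abstracts text. "
        "The weather was pleasant.")
print(auto_abstract(text))
# -> ['The computer translates text']  (the weather sentence is dropped)
```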
A recent article in Scientific American asks in its first sentence whether a computer can think. The answer to this old chestnut, the authors say, is certainly yes. They then proceed to argue that, having passed this test, the computer must now learn to perceive if it is to be considered a truly intelligent machine. A computer that could read for itself, rather than requiring human help, would seem to be perceptive and thus qualify as intelligent.
Even early computers such as adding machines printed out their answers. All designers had to do was reverse this process, so that printed human language became the machine’s input as well. One of the first successful implementations of a printed input was the use of magnetic ink characters in the Magnetic Ink Character Recognition (MICR) system developed by General Electric. This technique calls for the printing of information on checks with special magnetic inks. Processed through high-speed “readers,” the ink characters induce electrical currents that the computer can interpret and translate into binary digits.
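At bottom, such a reader matches the electrical waveform each character induces against a set of stored signatures and picks the closest. A much-simplified sketch, with made-up signature values:

```python
# Much-simplified sketch of MICR-style reading: as a magnetized character
# passes the read head it induces a characteristic voltage waveform, and
# the reader picks the stored signature it most resembles. The signature
# values below are made up for illustration.
SIGNATURES = {
    "0": [1, 3, 3, 1],
    "1": [1, 1, 4, 2],
    "2": [2, 4, 1, 1],
}

def distance(a, b):
    # Sum of squared differences between two sampled waveforms.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def read_character(waveform):
    # Choose the digit whose stored signature is nearest the measurement.
    return min(SIGNATURES, key=lambda digit: distance(SIGNATURES[digit], waveform))

measured = [1, 3, 4, 1]          # a noisy reading from the magnetic head
print(read_character(measured))  # -> "0" (the closest stored signature)
```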