We have mentioned the use of such a tape voice in the computerized ground-controlled-approach landing system for aircraft, and the airline reservation system called Unicall in which a central computer answers a dialed request for space in less than three seconds—not with flashing lights or a printed message but in a loud clear voice. It must pain the computer to answer at the snail-like human speed of 150 words a minute, so it salves its conscience by handling 2,100 inputs without getting flustered.
The writer’s dream, a typewriter that has a microphone instead of keys and clacks away merrily while you talk into it, is a dream no longer. Scientists at Japan’s Kyoto University have developed a computer that does just this. An early experimental model could handle a hundred Japanese monosyllables, but once the breakthrough was made, the Japanese quickly pushed the design to the point where the “Sonotype” can handle any language. At the same time, Bell Telephone Laboratories works on the problem from the other end and has come up with a system for a typewriter that talks. Not far behind these exotic uses of digital computer techniques are such things as automatic translation of telephone or other conversations.
Information Retrieval
It has been estimated that some 445 trillion words are spoken in each 16-hour day by the world’s inhabitants, making ours a noisy planet indeed. To bear out the “noisy” connotation, someone else has reckoned that only about 1 per cent of the sounds we make are real information. The rest are extraneous, incidentally telling us the sex of the speaker, whether or not he has a cold, the state of his upper plate, and so on. It is perhaps a blessing that most of these trillions of words vanish almost as soon as they are spoken. The printed word, however, isn’t so transient; it not only hangs around but piles up as well. The pile grows ever deeper, technical writings alone being enough to fill seven 24-volume encyclopedias each day, according to one source. As with our speech, perhaps only 1 per cent of this outpouring of print is of real importance, but this does not necessarily make what some have called the Information Explosion any less difficult to cope with.
The letters IR once stood for infra-red; but in the last year or so they have been appropriated by the words “information retrieval,” one of the biggest bugaboos on the scientific horizon. It amounts to saving ourselves from drowning in the fallout from typewriters all over the earth. There are those cool heads who decry the pushing of the panic button, professing to see no exponential increase in literature, but a steady 8 per cent or so each year. The button-pushers see it differently, and they can document a pretty strong case. The technical community is suffering an embarrassment of riches in the publications field.
While a doubling in the output of technical literature has taken the last twelve years or so, the next such increase is expected in half that time. Perhaps the strongest indication that IR is a big problem is the obvious fact that nobody really knows just how much has been, is being, or will be written. For instance, one authority claims technical material is being amassed at the rate of 2,000 pages a minute, which would result in far more than the seven sets of encyclopedias mentioned earlier. No one seems to know for sure how many technical journals there are in the world; it can be “pinpointed” somewhere between 50,000 and 100,000. Selecting one set of figures at random, we learn that in 1960 alone 1,300,000 different technical articles were published in 60,000 journals. Of course there were also 60,000 books on technical subjects, plus many thousands of technical reports that did not make the formal journals, but still might contain the vital bit of information without which a breakthrough will be put off, or a war lost. Our research expenses in the United States ran about $13 billion in 1960, and the guess is they will more than double by 1970. An important part of research should be done in the library, of course, lest our scientist spend his life re-inventing the wheel, as the saying goes.
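The growth figures quoted above can be checked against one another with a little arithmetic. The sketch below assumes simple compound growth at a constant annual rate, which the text does not claim; the function and variable names are illustrative only.

```python
import math

def doubling_time_years(annual_rate):
    """Years for output to double at a constant compound annual rate."""
    return math.log(2) / math.log(1 + annual_rate)

def annual_rate_for_doubling(years):
    """Constant compound annual rate that doubles output in `years`."""
    return 2 ** (1 / years) - 1

# Figures quoted in the text (compound growth is our assumption).
steady_doubling = doubling_time_years(0.08)        # "a steady 8 per cent"
rate_last_doubling = annual_rate_for_doubling(12)  # doubling in twelve years
rate_next_doubling = annual_rate_for_doubling(6)   # doubling in half that time

# 2,000 pages a minute versus seven 24-volume encyclopedias a day.
pages_per_day = 2000 * 60 * 24
volumes_per_day = 7 * 24
# Pages each volume would need for the two estimates to agree.
break_even_pages_per_volume = pages_per_day / volumes_per_day

print(f"8% a year doubles in {steady_doubling:.1f} years")
print(f"12-year doubling implies {rate_last_doubling:.1%} a year")
print(f"6-year doubling implies {rate_next_doubling:.1%} a year")
print(f"{pages_per_day:,} pages a day against {volumes_per_day} volumes")
print(f"break-even: {break_even_pages_per_volume:,.0f} pages per volume")
```

On these assumptions a steady 8 per cent doubles the output in about nine years, a twelve-year doubling corresponds to roughly 6 per cent a year, and a six-year doubling to roughly 12 per cent. The page-rate estimate comes to 2,880,000 pages a day, which matches the encyclopedia estimate only if each volume ran to some 17,000 pages.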
Specific examples back up this saying. For instance, a scientific project costing $250,000 was completed a few days before an engineer came across practically the identical work in a report in the library. This was a Russian report incidentally, titled “The Application of Boolean Matrix Algebra to the Analysis and Synthesis of Relay Contact Networks.” In another, happier case, information retrieval saved Esso Research & Engineering Co. a month of work and many thousands of dollars when an alert (or lucky) literature searcher came across a Swedish scientist’s monograph detailing Esso’s proposed exploration. Another literature search obviated tests of more than a hundred chemical compounds. Unfortunately not all researchers do or can search the literature in all cases. There is even a tongue-in-cheek law which governs this phenomenon: Mooers’ Law states, “An information system will tend not to be used whenever it is more painful for a customer to have information than for him not to have it.”
As a result, it has been said that if a research project costs less than $100,000 it is cheaper to go ahead with it than to conduct a rigorous search of the literature. Tongue in cheek or not, this state of affairs points up the need for a usable information retrieval system. Fortune magazine reports that 10 per cent of research and development expense could be saved by such a system, and 10 per cent in 1960, remember, would have amounted to $1.3 billion. Thus the prediction that IR will be a $100 million business in 1965 does not seem out of line.
The Center for Documentation at Western Reserve University spends about $6.50 simply in acquiring and storing a single article in its files. In 1958 it could search only thirty abstracts of these articles in an hour and realized that more speed was vital if the Center was to be of value. As a result, a GE 225 computer IR system was substituted. Now researchers go through the entire store of literature (about 50,000 documents in 1960) in thirty-five minutes, answering up to fifty questions for “customers.”
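The size of that improvement is easy to work out. The sketch below is a rough comparison only: it assumes one abstract per document and treats the fifty questions as answered during a single thirty-five-minute pass, which is one reading of the text. The names are illustrative.

```python
def documents_per_hour(documents, minutes):
    """Search rate in documents per hour for a pass of the given length."""
    return documents * 60 / minutes

# Figures quoted in the text.
MANUAL_RATE = 30            # abstracts searched per hour in 1958
STORE_SIZE = 50_000         # documents on file in 1960
PASS_MINUTES = 35           # computer time for one pass over the store
QUESTIONS_PER_PASS = 50     # upper limit quoted for one pass

# Assumption: one abstract per document, so the two rates are comparable.
computer_rate = documents_per_hour(STORE_SIZE, PASS_MINUTES)
speedup = computer_rate / MANUAL_RATE
manual_hours_full_pass = STORE_SIZE / MANUAL_RATE

# Assumption: all fifty questions share the same pass over the store.
minutes_per_question = PASS_MINUTES / QUESTIONS_PER_PASS

print(f"computer rate: {computer_rate:,.0f} documents an hour")
print(f"speedup over the 1958 rate: {speedup:,.0f} times")
print(f"1958 rate would need {manual_hours_full_pass:,.0f} hours a pass")
print(f"machine time per question: {minutes_per_question:.1f} minutes")
```

On these figures the computer covers roughly 85,700 documents an hour, nearly 2,900 times the 1958 rate, and a single pass that once would have taken more than 1,600 hours now serves up to fifty questions at once.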