From the Print Media to the Internet - Marie Lebert

"Anything that can be entered into a computer can be reproduced indefinitely." The Project Gutenberg Philosophy uses this premise to make information, books and other materials available to the general public in forms that the vast majority of computers, programs and people can easily read, use, quote, and search. "Project Gutenberg Etexts are made available in what has become known as 'Plain Vanilla ASCII', meaning the low set of the American Standard Code for Information Interchange (ASCII). The reason for this is that 99% of the hardware and software a person is likely to run into can read and search these files." Plain Vanilla ASCII thus reaches audiences ranging from Apple and Atari users to owners of old homebrew Z80 computers, not to mention Mac, UNIX and mainframe users. Michael Hart explains:

"When we started, the files had to be very small…. So doing the U.S. Declaration of Independence (only 5K) seemed the best place to start. This was followed by the Bill of Rights - then the whole U.S. Constitution, as space was getting large (at least by the standards of 1973). Then came the Bible, as individual books of the Bible were not that large, then Shakespeare (a play at a time), and then into general work in the areas of light and heavy literature and references… By the time Project Gutenberg got famous, the standard was 360K disks, so we did books such as Alice in Wonderland or Peter Pan because they could fit on one disk. Now 1.44 is the standard disk and ZIP is the standard compression; the practical file size is about three million characters, more than long enough for the average book.

However, pictures are still so bulky to store on disk that it will still be a while before we include even the lowres Tenniel illustrations in Alice and Looking-Glass. However we are very interested in doing them, and are only waiting for advances in technology to release a test edition. The market will have to establish some standards for graphics, however, before we can attempt to reach general audiences, at least on the graphics level."
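The two constraints described above, text limited to the low set of ASCII and a book small enough to fit on one disk once compressed, can be illustrated with a short sketch. It is not Project Gutenberg's own tooling: the function names are hypothetical, the "low set" is assumed to mean code points 0 to 127, the disk is assumed to be a 1.44 MB floppy with a raw capacity of 1,474,560 bytes, and Python's zlib (DEFLATE, the method ZIP commonly uses) stands in for a real ZIP archive, whose headers would add a little overhead.

```python
import zlib

# Assumption: raw capacity of a "1.44 MB" floppy (1440 * 1024 bytes).
# Usable space is slightly lower once the file system is accounted for.
FLOPPY_BYTES = 1440 * 1024


def is_plain_vanilla_ascii(text: str) -> bool:
    """Return True if every character is in the low (7-bit) ASCII set."""
    return all(ord(ch) < 128 for ch in text)


def first_non_ascii(text: str):
    """Return (index, character) of the first non-ASCII character, or None."""
    for index, ch in enumerate(text):
        if ord(ch) >= 128:
            return index, ch
    return None


def fits_on_floppy(text: str, capacity: int = FLOPPY_BYTES) -> bool:
    """Estimate whether the compressed text fits within `capacity` bytes.

    Raises UnicodeEncodeError if the text is not plain 7-bit ASCII.
    zlib output only approximates the size of a real ZIP file.
    """
    compressed = zlib.compress(text.encode("ascii"), 9)
    return len(compressed) <= capacity


if __name__ == "__main__":
    sample = "Alice was beginning to get very tired of sitting by her sister."
    print(is_plain_vanilla_ascii(sample))
    print(first_non_ascii("caf\u00e9"))
    print(fits_on_floppy(sample))
```

How well a real book compresses depends on its content, so the figure of about three million characters per disk should be read as Hart's practical estimate rather than something this sketch derives.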

The On-Line Books Page is a directory of books that can be freely read right on the Internet. It was founded in 1993 by John Mark Ockerbloom, a graduate student in computer science at Carnegie Mellon University, Pittsburgh, Pennsylvania, who remains the editor of the pages. It includes: an index of more than 7,000 on-line books on the Internet, which can be browsed by author, by title or by subject; pointers to significant directories and archives of on-line texts; and special exhibits. From the main search page, users can search for four types of media: books, music, art, and video.

"Along with books, The On-Line Books Page is also now listing major archives of serials (such as magazines, published journals, and newspapers), as of June 1998. Serials can be at least as important as books in library research. Serials are often the first places that new research and scholarship appear. They are sources for firsthand accounts of contemporary events and commentary. They are also often the first (and sometimes the only) place that quality literature appears. (For those who might still quibble about serials being listed on a 'books page', back issues of serials are often bound and reissued as hardbound 'books'.)"

Web space and computing resources are provided by the School of Computer Science at Carnegie Mellon University. The On-Line Books Page participates in the Experimental Search System of the Library of Congress. It works with The Universal Library Project, also hosted at Carnegie Mellon University.

In his e-mail to me of September 2, 1998, John Mark Ockerbloom explained how the site began:

"I was the original Webmaster here at CMU CS, and started our local Web in 1993. The local Web included pages pointing to various locally developed resources, and originally The On-Line Books Page was just one of these pages, containing pointers to some books put on-line by some of the people in our department. (Robert Stockton had made Web versions of some of Project Gutenberg's texts.)

After a while, people started asking about books at other sites, and I noticed that a number of sites (not just Gutenberg, but also Wiretap and some other places) had books on-line, and that it would be useful to have some listing of all of them, so that you could go to one place to download or view books from all over the Net. So that's how my index got started.

I eventually gave up the Webmaster job in 1996, but kept The On-Line Books Page, since by then I'd gotten very interested in the great potential the Net had for making literature available to a wide audience. At this point there are so many books going on-line that I have a hard time keeping up (and in fact have a large backlog of books to list). But I hope to keep up my on-line books work in some form or another."