Project Open Book, the Yale University Library's effort to convert 10,000 books from microfilm to digital imagery, is currently in an advanced state of planning and organization. The Yale Library has selected a major vendor to serve as a partner in the project and as systems integrator. In its proposal, the successful vendor helped isolate areas of risk and uncertainty as well as key issues to be addressed during the life of the project. The Yale Library is now poised to decide what material it will convert to digital image form and to seek funding, initially for the first phase and then for the entire project.
The proposal that Yale accepted for the implementation of Project Open Book will provide, at the end of three phases, a conversion subsystem, browsing stations distributed on the campus network within the Yale Library, a subsystem for storing 10,000 books at 200 and 600 dots per inch, and network access to the image printers. Pricing for the system implementation assumes the existence of Yale's campus Ethernet network and its high-speed image printers, and includes other requisite hardware and software, as well as system integration services. Proposed operating costs include hardware and software maintenance, but do not include estimates for the facilities management of the storage devices and image servers.
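To give a rough sense of scale for a storage subsystem holding 10,000 books at 200 and 600 dots per inch, the following back-of-the-envelope sketch estimates raw, uncompressed image sizes. The page dimensions (6 by 9 inches), the 300 pages per book, and the 1 bit per pixel (two-toned images) are illustrative assumptions, not figures from the proposal, and a production system would normally compress the images, so actual requirements could be far lower.

```python
# Illustrative assumptions only: none of these constants come from the proposal.
ASSUMED_PAGE_WIDTH_IN = 6.0
ASSUMED_PAGE_HEIGHT_IN = 9.0
ASSUMED_PAGES_PER_BOOK = 300
ASSUMED_BITS_PER_PIXEL = 1  # two-toned (bitonal) scanning


def uncompressed_bytes_per_page(dpi,
                                width_in=ASSUMED_PAGE_WIDTH_IN,
                                height_in=ASSUMED_PAGE_HEIGHT_IN,
                                bits_per_pixel=ASSUMED_BITS_PER_PIXEL):
    """Raw size of one scanned page in bytes, before any compression."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    # Round up to a whole number of bytes.
    return (pixels * bits_per_pixel + 7) // 8


def uncompressed_bytes_for_collection(dpi,
                                      books=10_000,
                                      pages_per_book=ASSUMED_PAGES_PER_BOOK):
    """Raw size of the whole collection at a single resolution."""
    return uncompressed_bytes_per_page(dpi) * pages_per_book * books


if __name__ == "__main__":
    for dpi in (200, 600):
        per_page = uncompressed_bytes_per_page(dpi)
        total = uncompressed_bytes_for_collection(dpi)
        print(f"{dpi} dpi: {per_page:,} bytes/page, {total / 1e9:,.0f} GB total")
```

Whatever page size is assumed, tripling the linear resolution from 200 to 600 dots per inch multiplies the raw pixel count by nine, which is why the choice of storage resolution bears so heavily on cost.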
Yale selected its vendor partner in a formal process, partly funded by the Commission for Preservation and Access. Following a request for proposal, the Yale Library selected two vendors as finalists to work with Yale staff to generate a detailed analysis of requirements for Project Open Book. Each vendor used the results of the requirements analysis to generate and submit a formal proposal for the entire project. This competitive process not only enabled the Yale Library to select its primary vendor partner but also revealed much about the state of the imaging industry, about the varying corporate commitments to the markets for imaging technology, and about the varying organizational dynamics through which major companies are responding to and seeking to develop these markets.
Project Open Book is focused specifically on the conversion of images from microfilm to digital form. The technology for scanning microfilm is readily available but is changing rapidly. In its project requirements, the Yale Library emphasized features of the technology that affect the technical quality of digital image production and the costs of creating and storing the image library: What levels of digital resolution can be achieved by scanning microfilm? How does variation in the quality of microfilm, particularly in film produced to preservation standards, affect the quality of the digital images? What technologies can an operator effectively and economically apply when scanning film to separate two-up images and to control for and correct image imperfections? How can quality control best be integrated into a digitizing workflow that includes document indexing and storage?
The actual and expected uses of digital images (storage, browsing, printing, and OCR) help determine the standards for measuring their quality. Browsing is especially important, but the facilities available for readers to browse image documents are perhaps the weakest aspect of imaging technology and the most in need of development. As it defined its requirements, the Yale Library concentrated on some fundamental aspects of usability for image documents: Does the system have sufficient flexibility to handle the full range of document types, including monographs, multi-part and multivolume sets, and serials, as well as manuscript collections? What conventions are necessary to identify a document uniquely for storage and retrieval? Where is the database of record for storing bibliographic information about the image document? How are basic internal structures of documents, such as pagination, made accessible to the reader? How are the image documents physically presented on the screen to the reader?
The Yale Library designed Project Open Book on the assumption that microfilm is more than adequate as a medium for preserving the content of deteriorated library materials. As planning in the project has advanced, it is increasingly clear that the challenge of digital image technology and the key to the success of efforts like Project Open Book is to provide a means of both preserving and improving access to those deteriorated materials.
SESSION IV-B
George THOMA
In the use of electronic imaging for document preservation, there are several issues to consider: ensuring adequate image quality, maintaining substantial conversion rates (throughput), providing unique identification for automated access and retrieval, and accommodating bound volumes and fragile material.
To maintain high image quality, image processing functions are required to correct the deficiencies in the scanned image. Some commercially available systems include these functions, while some do not. The scanned raw image must be processed to correct contrast deficiencies: both poor overall contrast resulting from light print and/or dark background, and variable contrast resulting from stains and bleed-through. Furthermore, the scan density must be adequate to allow legibility of print and sufficient fidelity in the pseudo-halftoned gray material. Borders or page-edge effects must be removed for both compactibility and aesthetics. Page skew must be corrected for aesthetic reasons and to enable accurate character recognition if desired. Compound images consisting of both two-toned text and gray-scale illustrations must be processed appropriately to retain the quality of each.
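Two of the corrections described above, fixing poor overall contrast and removing dark page-edge borders, can be illustrated with a minimal sketch. It is not the method of any particular commercial system: the function names, the global threshold of 128, and the 90 percent "mostly ink" rule are all assumptions chosen for illustration, and the image is represented simply as a list of rows of gray values from 0 (black) to 255 (white). Variable contrast from stains or bleed-through would call for a locally adaptive threshold, and skew correction is not shown.

```python
def stretch_contrast(gray):
    """Linearly rescale so the darkest pixel maps to 0 and the lightest to 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in gray]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in gray]


def binarize(gray, threshold=128):
    """Convert to two tones: 1 marks ink (dark), 0 marks background.

    The single global threshold is an assumed simplification.
    """
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def trim_dark_border(binary, min_ink_fraction=0.9):
    """Drop edge rows and columns that consist almost entirely of ink."""
    rows = [row[:] for row in binary]

    def mostly_ink(line):
        return sum(line) >= min_ink_fraction * len(line)

    # Strip from the top and bottom first, then from the left and right.
    while rows and mostly_ink(rows[0]):
        rows.pop(0)
    while rows and mostly_ink(rows[-1]):
        rows.pop()
    while rows and rows[0] and mostly_ink([r[0] for r in rows]):
        rows = [r[1:] for r in rows]
    while rows and rows[0] and mostly_ink([r[-1] for r in rows]):
        rows = [r[:-1] for r in rows]
    return rows


def clean_page(gray):
    """Apply the three steps in order: stretch, binarize, trim."""
    return trim_dark_border(binarize(stretch_contrast(gray)))
```

Stretching before thresholding matters for faded originals: a page whose gray values all fall in a narrow band would otherwise land entirely on one side of a fixed threshold and lose its print.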