The most important factor that distinguished the vendors under consideration was their identification with the customer. The size and internal complexity of the company also was an important factor. POB was looking at large companies that had substantial resources. In the end, the process generated for Yale two competitive proposals, with Xerox's the clear winner. WATERS then described the components of the proposal, the design principles, and some of the costs estimated for the process.
Components are essentially four: a conversion subsystem, a network-accessible storage subsystem for 10,000 books (and POB expects 200 to 600 dpi storage), browsing stations distributed on the campus network, and network access to the image printers.
Among the design principles, POB wanted conversion at the highest possible resolution. Assuming TIFF files, TIFF files with Group 4 compression, TCP/IP, and ethernet network on campus, POB wanted a client-server approach with image documents distributed to the workstations and made accessible through native workstation interfaces such as Windows. POB also insisted on a phased approach to implementation: 1) a stand-alone, single-user, low-cost entry into the business with a workstation focused on conversion and allowing POB to explore user access; 2) movement into a higher-volume conversion with network-accessible storage and multiple access stations; and 3) a high-volume conversion, full-capacity storage, and multiple browsing stations distributed throughout the campus.
The costs proposed for start-up assumed the existence of the Yale network and its two DocuTech image printers. Other start-up costs are estimated at $1 million over the three phases. At the end of the project, the annual operating costs estimated primarily for the software and hardware proposed come to about $60,000, but these exclude costs for labor needed in the conversion process, network and printer usage, and facilities management.
Finally, the selection process produced for Yale a more sophisticated view of the imaging markets: the management of complex documents in image form is not a preservation problem, not a library problem, but a general problem in a broad, general industry. Preservation materials are useful for developing that market because of the qualities of the material. For example, much of it is out of copyright. The resolution of key issues such as the quality of scanning and image browsing also will affect development of that market.
The technology is readily available but changing rapidly. In this context of rapid change, several factors affect quality and cost, to which POB intends to pay particular attention, for example, the various levels of resolution that can be achieved. POB believes it can bring resolution up to 600 dpi, but an interpolation process from 400 to 600 is more likely. The variation quality in microfilm will prove to be a highly important factor. POB may reexamine the standards used to film in the first place by looking at this process as a follow-on to microfilming.
Other important factors include: the techniques available to the operator for handling material, the ways of integrating quality control into the digitizing work flow, and a work flow that includes indexing and storage. POB's requirement was to be able to deal with quality control at the point of scanning. Thus, thanks to Xerox, POB anticipates having a mechanism which will allow it not only to scan in batch form, but to review the material as it goes through the scanner and control quality from the outset.
The standards for measuring quality and costs depend greatly on the uses of the material, including subsequent OCR, storage, printing, and browsing. But especially at issue for POB is the facility for browsing. This facility, WATERS said, is perhaps the weakest aspect of imaging technology and the most in need of development.
A variety of factors affect the usability of complex documents in image form, among them: 1) the ability of the system to handle the full range of document types, not just monographs but serials, multi-part monographs, and manuscripts; 2) the location of the database of record for bibliographic information about the image document, which POB wants to enter once and in the most useful place, the on-line catalog; 3) a document identifier for referencing the bibliographic information in one place and the images in another; 4) the technique for making the basic internal structure of the document accessible to the reader; and finally, 5) the physical presentation on the CRT of those documents. POB is ready to complete this phase now. One last decision involves deciding which material to scan.
******