The emergence of high-speed telecommunications networks as a basic feature of the scholarly workplace is driving the demand for electronic document delivery. Three distinct categories of electronic publishing/republishing are necessary to support access demands in this emerging environment:
1.) Conversion of paper or microfilm archives to electronic format
2.) Conversion of electronic files to formats tailored to electronic retrieval and display
3.) Primary electronic publishing (materials for which the electronic version is the primary format)
OCLC has experimental or product development activities in each of these areas. Among the challenges that lie ahead is the integration of these three types of information stores in coherent distributed systems.
The CORE (Chemistry Online Retrieval Experiment) Project is a model for the conversion of large text and graphics collections for which electronic typesetting files are available (category 2). The American Chemical Society has made available computer typography files dating from 1980 for its twenty journals. This collection of some 250 journal-years is being converted to an electronic format that will be accessible through several end-user applications.
The use of Standard Generalized Markup Language (SGML) offers the means to capture the structural richness of the original articles in a way that supports the variety of retrieval, navigation, and display options needed to work effectively in very large text databases.
An SGML document consists of text that is marked up with descriptive tags that specify the function of a given element within the document. As a formal language construct, an SGML document can be parsed against a document-type definition (DTD) that unambiguously defines what elements are allowed and where in the document they can (or must) occur. This formalized map of article structure allows the user interface design to be uncoupled from the underlying database system, an important step toward interoperability. Demonstration of this separability is a part of the CORE project, wherein user interface designs born of very different philosophies will access the same database.
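The idea of checking a tagged document against a DTD can be illustrated with a small sketch. The Python below is not a real SGML parser and does not reflect the DTD actually used in CORE. The element names (article, title, author, abstract, section, heading, para), the TOY_DTD table, and the validate function are all hypothetical, chosen only to show how a content model states which elements may or must occur and in what order. Content models are limited here to comma-separated sequences with the ?, +, and * occurrence indicators.

```python
import re

# Hypothetical content models in a simplified DTD-like notation.
# "a,b+,c?" means: one a, then one or more b, then an optional c.
TOY_DTD = {
    "article": "title,author+,abstract?,section+",
    "section": "heading,para+",
    "title": "#PCDATA",
    "author": "#PCDATA",
    "abstract": "#PCDATA",
    "heading": "#PCDATA",
    "para": "#PCDATA",
}


def model_to_regex(model):
    """Turn a content model into a regex over 'name name name ' strings."""
    parts = []
    for token in model.split(","):
        token = token.strip()
        name = token.rstrip("+?*")
        suffix = token[len(name):]  # occurrence indicator, if any
        parts.append("(?:" + re.escape(name) + " )" + suffix)
    return re.compile("".join(parts))


def validate(element, dtd=TOY_DTD):
    """Return a list of error messages; an empty list means valid.

    An element is a (name, children) pair. Children are either
    nested elements or plain strings (text).
    """
    name, children = element
    if name not in dtd:
        return ["undeclared element: " + name]
    model = dtd[name]
    if model == "#PCDATA":
        if all(isinstance(c, str) for c in children):
            return []
        return [name + ": only text is allowed here"]
    errors = []
    child_elements = [c for c in children if not isinstance(c, str)]
    if len(child_elements) != len(children):
        errors.append(name + ": text is not allowed directly here")
    sequence = "".join(c[0] + " " for c in child_elements)
    if not model_to_regex(model).fullmatch(sequence):
        errors.append(name + ": children do not match '" + model + "'")
    for child in child_elements:
        errors.extend(validate(child, dtd))
    return errors


section = ("section", [("heading", ["Results"]), ("para", ["Some text."])])
good = ("article", [("title", ["A Title"]), ("author", ["A. Chemist"]), section])
bad = ("article", [("author", ["A. Chemist"]), section])  # title is missing

print(validate(good))
print(validate(bad))
```

Because the structural rules live in one declarative table rather than in any display code, different user interfaces could in principle rely on the same validated structure, which is the kind of separation the paragraph above describes.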
NOTES:
(6) The CORE project is a collaboration among Cornell University's
Mann Library, Bell Communications Research (Bellcore), the American
Chemical Society (ACS), the Chemical Abstracts Service (CAS), and
OCLC.
Michael LESK The CORE Electronic Chemistry Library
A major on-line file of chemical journal literature, complete with graphics, is being developed to test the usability of fully electronic access to documents. It is a joint project of Cornell University, the American Chemical Society, the Chemical Abstracts Service, OCLC, and Bellcore (with additional support from Sun Microsystems, Springer-Verlag, Digital Equipment Corporation, Sony Corporation of America, and Apple Computers). Our file contains the American Chemical Society's on-line journals, supplemented with the graphics from the paper publication and the indexing of the articles from Chemical Abstracts. Documents are available in both image and text format, and several different interfaces can be used. Our goals are (1) to assess the effectiveness and acceptability of electronic access to primary journals as compared with paper, and (2) to identify the most desirable functions of the user interface to an electronic system of journals, including in particular a comparison of page-image display with ASCII display interfaces. Early experiments with chemistry students on a variety of tasks suggest that searching tasks are completed much faster with any electronic system than with paper, but that for reading, all versions of the articles are roughly equivalent.
Pamela ANDRE and Judith ZIDAR