Connecting and Sharing: the Emerging Role of Z39.50 in Library Networks
1. Introduction
While everyone has heard about Z39.50, there is still a lot of uncertainty about its relevance to the library community. “It’s still under development,” you may have read on lists or heard people say. “It’s too complex to implement.” “It doesn’t work.” “It’s not needed now we have the Web.” In fact, Z39.50 is a mature standard, widely implemented in the library community. It is beginning to solve real problems, not just for libraries, but also for other collecting agencies such as art galleries, museums and archives. Implementors are no longer focusing on adding new functionality but on ensuring interoperability within and across these communities. Not only is it still relevant in a Web environment, but the Web also provides opportunities for universal access to Z39.50-enabled databases.
The strategic importance of Z39.50 to the library community lies in its potential to accommodate a wide range of information exchange applications between libraries and consortia. The paper addresses the effect of Z39.50 on an activity the National Library considers important: to support the provision and maintenance of a centralised union catalogue. Following a short overview of the protocol, the distributed method of providing the functionality of a union catalogue by exploiting Z39.50 is compared with centralised models.
2. The Protocol
2.1 History
The development of Z39.50 can be traced back to the OSI (Open Systems Interconnection) model, where Z39.50 is an application layer protocol.1 The current version of the protocol was published in 1995 and is titled Information Retrieval (Z39.50): Application Service Definition and Protocol Specification. In this form it is an ANSI/NISO standard (American National Standards Institute, National Information Standards Organization). Previous versions were published in 1992 (version 2) and 1988 (version 1, now considered obsolete). It has become an international standard with ISO number 23950, and the texts of the ISO and ANSI versions are identical. The standard is maintained by the Z39.50 Maintenance Agency at the Library of Congress. The Agency can be accessed at: http://lcweb.loc.gov/z3950/agency/. At this site, information on current developments of the standard and instructions on how to join the Z39.50 Implementors’ Group (ZIG) discussion list can be found.
2.2 Basics
The core functions of Z39.50 relate to searching and retrieving information from databases stored on multiple host sites. The protocol “specifies data structures and interchange rules that allow a client machine (called an ‘origin’ in the standard) to search databases on a server machine (called a ‘target’ in the standard) and retrieve records that are identified as a result of such a search.”2
The protocol confines itself to interactions between the client and server machines, and does not address interaction between a human user and the client machine or between the target machine and its databases. The standard is designed to facilitate interoperability between computer systems. The communication described in the standard is connection-oriented and stateful: that is, the origin initiates a session with the target and the connection is maintained until the association is terminated.
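The stateful nature of an association can be pictured as a small state machine. The sketch below is illustrative only: the class and method names are hypothetical, and it models just the idea that result sets are held by the target for the life of the session, not the actual protocol messages or their encoding.

```python
class Association:
    """Toy model of a stateful origin-target association (names are hypothetical)."""

    def __init__(self):
        self.open = False
        self.result_sets = {}  # result sets exist only while the association is open

    def init(self):
        # The origin initiates the session with the target.
        self.open = True

    def search(self, name, records):
        # A search creates a named result set held on the target side.
        if not self.open:
            raise RuntimeError("no association: call init() first")
        self.result_sets[name] = list(records)
        return len(self.result_sets[name])

    def present(self, name, start, count):
        # Records are fetched later from the stored result set (1-based positions).
        if not self.open:
            raise RuntimeError("no association: call init() first")
        return self.result_sets[name][start - 1:start - 1 + count]

    def close(self):
        # Terminating the association discards the session state.
        self.open = False
        self.result_sets.clear()
```

Because the result set stays with the target, a later present call can refer to it by name without the query being sent again.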
In an implementation, the origin and target convert their local forms of messages and responses to and from Z39.50 ‘language’. This means an origin can maintain a consistent user interface for searching targets which support Z39.50, because the client machine’s searching syntax can be mapped into Z39.50 queries. In this way, the origin extends the local interface to search external targets. On the target or server side, this requires considerable conversion because the incoming Z39.50 query must be mapped to retrieval mechanisms and vice versa.

The standard does not directly support the broadcasting of searches to multiple servers, but a client can open Z39.50 sessions with multiple servers either sequentially or simultaneously. Manipulating multiple result sets to remove duplicates and ensure a uniform presentation to the user also falls outside the scope of the protocol.
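Since the protocol has no broadcast facility, any fan-out is the client's responsibility. A minimal sketch of the simultaneous case follows, assuming each target is represented by a plain dictionary and the function search_target stands in for a full Z39.50 session; both are hypothetical placeholders rather than part of any real toolkit.

```python
from concurrent.futures import ThreadPoolExecutor


def search_target(target, query):
    # Stand-in for opening a session with one target and running the search.
    # Here a "target" is just a dict with a name and a list of title strings.
    hits = [t for t in target["titles"] if query.lower() in t.lower()]
    return target["name"], hits


def search_all(targets, query):
    # One session per target, run simultaneously by the client.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = [pool.submit(search_target, t, query) for t in targets]
        return dict(f.result() for f in futures)
```

The result is one separate result set per target; merging them and removing duplicates is a further step that the client, or some intermediary, has to supply.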
Web-based search and retrieval applications need Z39.50 for the same reason as proprietary applications – to avoid the proliferation of interfaces to the target databases. The Web is a static collection of HTML documents stored on HTTP servers. Special programs using scripting languages and compiled modules are needed to deliver search and retrieval functionality. In server-based implementations, the HTTP/Z39.50 gateway resides on an HTTP server as in the diagram below. Browser-based implementations also exist which require Java applets or ActiveX controls to be downloaded to the user’s machine.

As databases differ considerably in structure and indexing methods, the protocol employs a common, abstract model for describing databases. The model requires a “schema” or abstract record structure to be defined for each database, composed of “elements” such as author, title, date last modified. Access points are also defined for each searchable element or group of elements.
However, Z39.50 should not be interpreted as a database indexing standard. In each implementation, the target’s databases must be mapped to the Z39.50 database model to enable communication between origin and target. Even so, Z39.50 profiles and implementor agreements developed for specific communities do impose a de facto database indexing standard, in that they define the minimum set of access points that the database indexes must support to ensure interoperability between target systems.
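The mapping between the abstract model and a local database can be sketched as a simple lookup. The class below is an assumption made for illustration: the field names and the index names (AU_IDX, TW_IDX) are invented, and a real target would map to whatever retrieval mechanism its database system provides.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractDatabase:
    name: str
    elements: list  # schema: the elements a record may contain
    access_points: dict = field(default_factory=dict)  # access point -> local index

    def local_index(self, access_point):
        # Translate an abstract access point into the target's own index name.
        try:
            return self.access_points[access_point]
        except KeyError:
            raise ValueError(f"unsupported access point: {access_point}")
```

A profile that lists a minimum set of access points amounts to requiring that every such lookup succeeds on every conforming target.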
2.3 Search and Retrieval Facilities
The Search and Retrieval facilities are the core functions of the standard.
A Search request can be made to one or more databases at a target system and must contain a query. The group of records retrieved as the result of a query is called a result set. Several query types are defined, including Type-1 for Reverse Polish Notation (RPN) and Type-101, which extends Type-1 for proximity searching and restricting result sets by other attributes. The protocol fully supports, and mandates the use of, the Type-1 query, which consists of a single access point clause, or several clauses linked by logical operators. For example,
In the database named “Library” find all records for which the access point ‘title word’ contains the value ‘glass’ AND the access point ‘author’ contains the value ‘white’.
The attributes used in searching belong to a particular attribute set, “whose definition is registered, that is, assigned a unique and globally recognised attribute-set-ID, an Object Identifier, which is included within the query.”3 The attribute set used by the bibliographic community is called bib-1.
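The example query can be sketched as a small tree rendered in postfix order. This is an illustration rather than the wire format, which is encoded quite differently. The bib-1 Use attribute values 4 (title) and 1003 (author) are taken from the registered attribute set; the tuple layout and the textual rendering are assumptions made here, and the 'word' qualification of the title access point is omitted for brevity.

```python
# bib-1 Use attribute values for two common access points.
BIB1_USE = {"title": 4, "author": 1003}


def term(access_point, value):
    # An operand: (attribute type 1 = Use, attribute value) plus the search term.
    return ("term", (1, BIB1_USE[access_point]), value)


def and_(left, right):
    return ("and", left, right)


def to_rpn(node):
    # Post-order walk: operands first, operator last (Reverse Polish Notation).
    if node[0] == "term":
        _, (attr_type, attr_value), value = node
        return [f"{attr_type}={attr_value}:{value}"]
    op, left, right = node
    return to_rpn(left) + to_rpn(right) + [op.upper()]


query = and_(term("title", "glass"), term("author", "white"))
print(to_rpn(query))  # ['1=4:glass', '1=1003:white', 'AND']
```

Because the operator follows its operands, the target can evaluate the query with a simple stack, whatever the depth of nesting.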
Retrieval consists of two Z39.50 services: Present and Segment. In a Present request, the origin asks for records from the result set. The origin may specify a preferred syntax, schema and element specification (e.g., brief, full, brief with holdings, title and subject only, etc.). The syntax is the envelope in which the elements are packaged for transfer between systems. Z39.50 supports a number of such syntaxes, ranging from the familiar MARC syntax to the general but very complex GRS (General Record Structure) syntax. GRS is becoming increasingly important in library applications because it enables elements from multiple schemas (e.g., bibliographic, holdings and circulation data) to be combined in one package for transfer.
If the target cannot support the number of response records requested by the origin, the target segments the response, delivering the result set in portions.
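The idea of delivering a response in portions can be sketched in a few lines. The function name and the fixed per-segment limit are assumptions for illustration; an actual target would decide segment boundaries from message size limits negotiated with the origin.

```python
def segments(records, max_per_segment):
    # Yield the requested records in consecutive portions no larger than the limit.
    if max_per_segment < 1:
        raise ValueError("max_per_segment must be at least 1")
    for start in range(0, len(records), max_per_segment):
        yield records[start:start + max_per_segment]
```

For five requested records and a limit of two, the origin would receive three portions containing two, two and one record respectively.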
2.4 Other Facilities
There are a number of services which complement the basic Search and Present functions by providing other types of messages between origin and target. These include session establishment and termination, access control, and operations on result sets, such as sorting, browsing or deleting. The Scan service, important to library implementations because it enables browsing of ordered term lists, was introduced in version 3 of the standard and is not yet widely implemented.
One particularly promising service is Explain which allows an origin to query a target about implementation details, e.g., which databases are available, the particular attribute sets and record syntaxes used. The facility is “intended to permit the development of clients that to at least some extent are dynamically self-configuring as they encounter various servers.”4 The Explain facility has not been widely implemented. However, a number of implementors have formed a group recently to test its functionality.
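What “dynamically self-configuring” might mean in practice can be sketched as follows. The dictionary stands in for information an origin has obtained through Explain; its shape and the function name are hypothetical, and only the selection logic is the point.

```python
def choose_syntax(explain_info, preferred):
    # Pick the first syntax the client prefers that the target says it supports.
    supported = set(explain_info.get("record_syntaxes", []))
    for syntax in preferred:
        if syntax in supported:
            return syntax
    return None
```

A client built this way would not need a hand-maintained configuration entry for each server it encounters.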
2.5 Extended Services
Version 3 introduced a new service called Extended Services, which permits tasks to be performed outside a Z39.50 session. These tasks are more complex than a search and retrieve operation, and may be carried out after the session initiating the task has been completed. If the task is not completed before the target responds to the Extended Services request, the target creates a task package in a special Extended Services database which can be searched by the origin using standard Z39.50 facilities. “The Extended-Services (ES) service allows an origin to create, modify, or delete a task package at the target. The target maintains task packages in a special database.”5
The Extended Services are described in Appendix 8 of the published standard. They include:
· Persistent Result Set. This enables the origin to request the target to save a search result.
· Persistent Query. In this service, the target saves a Z39.50 Query in response to a request from the origin.
· Periodic Query. The origin requests the target to save a query and run it periodically, according to a schedule specified by the origin.
· Item Order. The origin can submit a document delivery request to the target.
· Database Update. The origin requests that the target update a database by inserting new records, replacing or deleting existing records, or by modifying elements within records.
· Export Specification. The origin specifies the format, delivery mechanism and destination of records from result sets.
· Export Invocation. This service allows the origin to request delivery of records, according to an Export Specification.
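The task package mechanism behind these services can be sketched as a small store that outlives the requesting session. Everything here is a simplified assumption: the class, the parameter dictionary and the two status values are illustrative and do not reproduce the task package structure defined in the standard.

```python
import itertools


class ExtendedServicesDatabase:
    """Toy stand-in for the target's special task-package database."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.packages = {}

    def create(self, service, parameters):
        # The target records the task so it can outlive the requesting session.
        package_id = next(self._ids)
        self.packages[package_id] = {
            "service": service,
            "parameters": parameters,
            "status": "pending",
        }
        return package_id

    def complete(self, package_id):
        self.packages[package_id]["status"] = "complete"

    def search(self, status):
        # The origin can later look up task packages, here simply by status.
        return sorted(pid for pid, p in self.packages.items() if p["status"] == status)
```

An origin could, for instance, create a Periodic Query package in one session and check in a later session whether it has run.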
3. Business Applications
There are a number of potential and existing applications of this standard to libraries.
Local access to external data sources. The basic search and retrieve functions can be used to extend the number of data sources available for searching at a user workstation. Local and remote databases can be searched using the syntax provided in the local system. This has been the most common implementation of Z39.50 in libraries.
Creation of virtual or distributed union catalogues. A group of libraries can use the Search and Present services to enable access from a local origin to many targets. In this way, a user at one library can use the syntax and interface of their local system to search the catalogues of other systems in the group. With the ILL Protocol, a group of libraries could provide a virtual union catalogue and mechanisms for resource sharing between them. Issues related to this will be discussed later in the paper.
Copy cataloguing using Z39.50. A local Z39.50 origin can search an external database, specify that the records be presented in MARC syntax, and copy them into its local system for inclusion in a local catalogue. This practice is becoming more widespread.
Orders for bibliographic outputs. The Extended Services allow a variety of methods to retrieve result sets on a regular basis and have them sent in specified formats. There are a number of possibilities for use of these facilities: SDI services; new and changed records for catalogue purposes; reports for collection development purposes.
Updating databases. The Update service of the Extended Services can enable simultaneous updating of more than one target by an origin. This will be taken up further in the discussion of the Union Catalogue Profile.
4. Union Catalogues
4.1 Introduction
The business applications outlined in Section 3 have implications for centralised bibliographic services. The availability of copy cataloguing through Z39.50 can be expected to have some impact on enterprises based on supply of MARC records. In Australia, the use of ABN for catalogue record creation and supply has made a significant contribution to the development of a national union catalogue, which supports resource sharing and collection development activities in libraries across the nation. In planning for the replacement of ABN, the Library has had to take account of new and possible ways for libraries to build their catalogues: vendors supply MARC records; catalogue records can be obtained from other utilities such as OCLC and RLG; catalogue records can be copied from other servers, e.g., the Library of Congress.
While considering the future of the national union catalogue, the National Library has observed the development of sectoral consortia and regional or state-based networks. Z39.50 has the potential to change the operations of these networks through increasing implementations of distributed or virtual union catalogues. During planning for the Networked Services Project, the Library had to make a key strategic decision about continuing to provide and maintain a centralised national union catalogue. The decision was positive, because the Library believes the union catalogue will continue to play a role in resource sharing and collection development for at least another five years. Union catalogues may also have a strong role to play in access to digital collections, through collection level and item level records linked to the digital files at a remote host site. However, the Library did recognise that while the services delivered by a centralised national union catalogue were required, Z39.50 raised possibilities for new methods of creating this catalogue.
4.2 The Virtual Union Catalogue
In the past, union catalogues have been implemented as centralised systems, with a single database on a single system. This model includes: commercial services such as OCLC, RLG and WLN which developed as large scale shared cataloguing systems; pure union catalogues, such as the University of California’s MELVYL system, developed specifically as public access union catalogues; and, shared union catalogues which are part of an integrated library system shared by a group of libraries.
Z39.50 based searching, on the other hand, allows users to search multiple catalogues on multiple distributed systems with a single search. These multiple databases then constitute a ‘virtual’ union catalogue from the user’s point of view. Key research in this area is being carried out by the National Library of Canada. The goal of their virtual Canadian union catalogue (vCuc) project5 is to determine the long term feasibility of using Z39.50 for searching distributed individual library catalogues and consortial union catalogues which together would emulate the services provided by a centralised union catalogue.
One particular problem with the distributed search model is the retrieval of locations, holdings and circulation information. Recent discussion within the Z39.50 Implementors’ Group on how to provide this information in response to a Z39.50 query has not yet achieved consensus. The problem is compounded by the lack of consistency in the storage of holdings data in local and union catalogues. The USMARC format for holdings and locations seems an obvious candidate to ensure predictability and reliability in standardising these types of data. However, in Australia there has not been widespread implementation of this particular USMARC format. USMARC also allows summary information for holdings to be embedded in bibliographic records in tags 850 and 852, but there is little evidence of this practice in Australia. Generally holdings data is held in locally defined MARC fields, usually in the 9XX block.
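The practical consequence of this inconsistency is that a client has to know, target by target, where holdings may be found. A minimal sketch follows, assuming a record is represented as a list of (tag, subfields) pairs; that representation, the function name and the local tag 984 used in the example are hypothetical.

```python
def holdings_fields(record, local_tags=()):
    # record: list of (tag, subfields) pairs; returns fields that may carry holdings.
    # 850 and 852 are the USMARC tags mentioned above; local_tags covers 9XX usage.
    wanted = {"850", "852"} | set(local_tags)
    return [(tag, sub) for tag, sub in record if tag in wanted]
```

For a record from a target that stores holdings in a local field, calling holdings_fields(record) would return nothing useful unless the relevant 9XX tag was supplied, which is exactly the per-target knowledge a standard holdings format would make unnecessary.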
Another limitation of the distributed model is its inability to deal satisfactorily with duplicate records in a search result. Ideally, the separate result sets retrieved from each target should be merged and duplicate records removed. In centralised union catalogues, duplicate resolution is chiefly achieved through software. The residue of duplicates which cannot be resolved by software is sent to human review. However, duplicate resolution in the distributed environment is primarily a human task. When search results are retrieved from a Z39.50 search sent to multiple targets, a human being reviews the results to identify duplicates. With records being presented sequentially as the search results from each target arrive, and screen displays that are not designed to make record comparison easy, the searcher can have a difficult task.
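Merging by software can be sketched with a simple match key. The key used here (normalised title and author plus year) is a deliberately crude assumption chosen for illustration; the algorithms used in large central databases are considerably more elaborate, and the record layout is hypothetical.

```python
import re


def match_key(record):
    # Crude match key: lower-cased title and author with punctuation removed, plus year.
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    return (norm(record["title"]), norm(record["author"]), record.get("year"))


def merge_result_sets(result_sets):
    # result_sets: mapping of target name -> list of records (dicts).
    merged = {}
    for target, records in result_sets.items():
        for record in records:
            entry = merged.setdefault(
                match_key(record), {"record": record, "locations": []}
            )
            entry["locations"].append(target)
    return list(merged.values())
```

Each merged entry keeps one copy of the record together with the list of targets that returned it, which is the form a union catalogue display would need.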
The task of de-duplication will become critical if the shift to distributed union catalogues gathers pace. The University of Bradford is conducting some useful research in this area, using client software to merge records and deliver user-friendly displays.6 However, the National Library of Canada is uncertain of the scalability of a client-based architecture for duplicate removal if complex algorithms such as those now common in large central databases are involved.7 At the National Library of Australia, we are beginning to think that “middle-ware” solutions may deliver better performance, with the client referring Z39.50 searches and the responsibility for resolving duplicates to a server-based broker. Such an architecture might also address concerns with the intellectual property invested in MARC records by managing user authentication and logging transactions. There is an increasing trend for OPACs to present search results in a syntax other than MARC to avoid copyright issues related to the re-use of records purchased from bibliographic utilities and other copy cataloguing services.
Lynch has examined the advantages and limitations of the centralised and distributed approaches, and concluded that with current technology, centralised union catalogues have major advantages both in function (searching and indexing consistency, database quality and removal of duplicates) and in performance (particularly from the user’s point of view, availability and response time)8. One argument against virtual union catalogues is that, at present, they appear to work effectively only with a limited number of targets – up to around eight. The technical problems such as merging and de-duplication of result sets and performance may be solved sooner rather than later. However, on balance, there are technical advantages in delivering services through centralised union catalogues.
4.3 Centralised Union Catalogues
Technical factors were not the sole criteria for a decision about maintaining a centralised union catalogue. The National Library of Australia considered the value of the existing asset within the National Bibliographic Database (NBD), the cooperative environment in Australian libraries built up over many years, the high quality of service it can offer and the viability of the present service. For these reasons the National Library of Australia stated in its Request for Tender (RFT) for the Networked Services Project, which will replace the ABN system, that “for the location information on the NBD, it is the Library’s present view that the most practical solution is probably a centralised national database”9. The National Library is committed to the national union catalogue as a means of supporting resource sharing among Australian libraries and believes that a centralised system is the best way of achieving this at the present time. However, the Library is aware that there may be significant changes to this system in the next few years.
Weighed against the advantages of a centralised system is the high cost of building and maintaining a large centralised database and the consequent charges which are passed on to searchers to meet desired levels of cost recovery. In the case of the ABN database, there are six million records without holdings in addition to the seven million with location information. The records without holdings are included to meet the needs of cataloguers searching for copy: before the Internet, it was an efficient way of providing a wide range of cataloguing data for libraries. The Library has been looking for other ways to deliver these services, and suggested in the Networked Services RFT that “for the data which does not have attached location information, the Library wishes to offer ‘apparent one-stop shopping’ but has no prior preference for whether this is implemented through a distributed or centralised model”10. Lynch argues that the centralised and distributed approaches should be complementary rather than competitive11.
While the national union catalogue, that is, all bibliographic records with holdings, will probably be implemented in the replacement system in a centralised form, the provision of source data for copy cataloguing may be considered differently. The National Library is considering, for example, not migrating some older data without holdings to the new system. Access to lesser used data can be provided by Z39.50 searching of external databases through the same search interface as is provided for the proposed National Bibliographic Utility, and therefore the amount of data which the National Library has to maintain can be reduced. However, the main benefit of Z39.50 searching of external databases for users lies in the wider range of bibliographic source data which can be made available.
4.4 Strategies for delivering a centralised union catalogue
The shared cataloguing service goes hand in hand with the national union catalogue, and is a vital means of building and maintaining it. In replacing the ABN system, one of the National Library’s objectives is to make it as easy as possible for users to contribute to the national union catalogue.
Several years ago, it looked as though the best strategy for achieving this was to integrate file transfer mechanisms seamlessly into the workflows of information providers. The current ABN system has a downline loading service based on ftp which delivers records to the local system in real time. However, this service requires users to use one interface to catalogue the item and another to add detailed holdings information. In addition, keeping the national union catalogue up-to-date as holdings change depends on duplicating workflows or using a batch update process which has not been widely taken up by libraries. While the National Library recognises the need to accommodate libraries preferring to catalogue locally, the current ABN system is not equipped with duplicate removal software sufficiently sophisticated to support upline loading. Upline loading of records from local systems has been a key requirement from early stages in plans to replace ABN.
Since then, library systems have evolved from proprietary character-based applications to Z39.50-enabled client/server technologies. This new architecture provides both a threat to the existing strategies for maintaining the union catalogue and a significant opportunity to achieve seamless integration of union catalogue maintenance into user workflows.
Part of the solution is already provided by the Z39.50 Search and Present services. These enable libraries to obtain bibliographic records in MARC syntax from Z39.50-enabled servers. Many library system vendors now offer Z39.50 cataloguing client software to their users. The next generation of products will be integrated technical workstations which fully acknowledge the interrelationship between selection, acquisition, accessioning and cataloguing processes in building library catalogues.
4.5 Union Catalogue Profile
In 1996, IT/19, the joint Standards Australia and Standards New Zealand Committee dealing with library-related standards, endorsed a proposal from Jan Gatenby of Stowe Computing and the National Library of Australia to develop a cataloguing protocol that would enable all cataloguing to be done from a single client.12
Benefits of using a single client are:
· Potential for a superior work flow
· Only one interface to master
· A simple efficient configuration
· Simplified support
· A wider choice for users
· Opportunities for independent evolution of client and server products.
The cataloguing protocol needed to support all catalogue maintenance functions, including bibliographic, holdings and authority data. It needed to recognise that the update function is dependent on searching and that database searches are frequently done at several stages in the update process. Another requirement was to transfer information between systems about transactions as well as data. The protocol also needed appropriate security mechanisms and effective client-based, interactive error resolution methods where possible.
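The dependence of update on searching can be sketched as a search-then-update step. This is a toy illustration under stated assumptions: the catalogue is a plain dictionary, the ISBN is used as the match point purely for simplicity, and the function and its return values are hypothetical rather than taken from the Profile.

```python
def contribute(catalogue, record, library):
    # catalogue: dict keyed by a match identifier (here, hypothetically, the ISBN).
    existing = catalogue.get(record["isbn"])  # search step
    if existing is None:
        # No match found: insert the bibliographic record with its first holding.
        catalogue[record["isbn"]] = {**record, "holdings": [library]}
        return "inserted"
    if library not in existing["holdings"]:
        # Match found: only the holdings need updating.
        existing["holdings"].append(library)
        return "holdings added"
    return "unchanged"
```

Doing both steps from one client is what removes the need for separate interfaces for cataloguing and for adding holdings.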
A Union Catalogue Profile over Z39.50 was considered to be the most appropriate mechanism for delivering these requirements. A profile specifies how a particular standard, or group of standards, will be used to support a given application, function, community or environment. It is used both for procurement purposes and to ensure interoperability between systems. Developing a Z39.50 profile rather than a separate cataloguing protocol avoided the proliferation of protocols. Z39.50 already has all the services in place to support user authentication and search and retrieval, together with a Database Update Extended Service defined within it. These services have been widely accepted and implemented by the library community. The profile has now been reviewed by the ZIG in several versions and is about to be submitted to the International Organization for Standardization (ISO) as a Draft Internationally Registered Profile (DIRP).
The Union Catalogue Profile presents a strategy for enabling all cataloguing to be done through a single client. The National Library is interested in supporting and developing the Profile. The Networked Services Project RFT sought products which supported the Union Catalogue Profile. At the time of writing, it is not possible to disclose any more information about the tender process and solutions being considered. However, before any major implementation of the Profile, some testing will be required.
The Library encourages libraries and integrated library management system vendors to read the Profile and register interest in pilot projects. Comments on the Profile are also encouraged. It is recognised that implementors may wish to move towards conformance in stages. A set of priorities for staged implementation is included in Appendix 3 of the Profile. It may take several years before the Union Catalogue Profile is widely implemented. In the meantime, services such as the replacement ABN will have to support a range of migration strategies involving a combination of proprietary and standards-based solutions.
5. Conclusion
The National Library recognises that the resource sharing environment is becoming more open and distributed. This trend has been assisted by libraries implementing key standards: MARC, Z39.50 and the ILL Protocol are prominent now. There are many predictions about this future environment and many research projects underway. For the next five years at least, the National Library believes the national union catalogue still has an important role to play. As distributed approaches grow, the national union catalogue may end up covering gaps in the recording of collections and locations. Whatever the outcome, the National Library is committed to maintaining a centralised union catalogue, while offering a wide range of options for libraries to contribute to it. We believe it is still a very important part of the nation’s information infrastructure.
References
1. Clifford Lynch, “The Z39.50 Information Retrieval Standard. Part 1, A Strategic View of Its Past, Present and Future”. In D-Lib Magazine [online]. April, 1997 [cited 4 August 1997].
2. Lynch.
3. Z39.50 Maintenance Agency, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification (Washington, 1995), p. iii.
4. Lynch.
5. National Library of Canada, Virtual Canadian Union Catalogue Project [online]. Ottawa, 1997 [cited 20 November 1997].
6. University of Bradford, Dept. of Computing, Bradford OPAC 2 (BOPAC2) [online]. Bradford, 1997 [cited 20 November 1997].
7. Carrol Lunau, Fay Turner, Issues Related to the Use of Z39.50 to Emulate a Centralized Union Catalogue [online]. Ottawa: National Library of Canada, 1997 [cited 4 August 1997].
8. Lynch.
9. National Library of Australia, Request for Tender for National Library of Australia Networked Services (Canberra, 1997), p. 49.
10. National Library of Australia, p. 49.
11. Lynch.
12. National Library of Australia, Union Catalogue Profile [online]. Version 3, August 1997. Canberra: The Library, 1997 [cited 4 August 1997].

