Fuelled by successive waves of information transfer protocols such as tcp/ip, ftp, smtp, mtp, gopher and http, the Internet has produced an unprecedented wave of invention of new information services. Over the last seven years we have seen archie, gopher, veronica and jughead, followed by an explosion in the number and variety of Internet search engines and specific-purpose databases. On top of these standards we now see another wave of interactive techniques adding to the complexity, such as cgi scripting, Java and JavaScript.
While the library community initially saw the Internet in terms of its potential as a publishing and information delivery medium, its impact is now far wider, affecting government, commerce and, in particular, vertical links within industries. New business models in information distribution and alternatives to information supply by libraries are now emerging. The deluge of change continues with new technologies like Dublin Core, XML and RDF, which have the potential to completely alter the way information services are organised and by whom. This paper develops these themes and suggests possible new models to challenge libraries and librarians in the next few years.
I'm a patchy seer. At the 1991 VALA conference I gave an informal paper where I said that the Internet was going to be important, but it has become far more important than I expected. At the 1993 conference I gave a paper on campus-wide information systems and predicted that gopher would be displaced by the World Wide Web and that it too would create changes to publishing and libraries. Of course it has been far more significant than that. In 1996 I gave a paper where I suggested the Web would be as important to the future of computing as it would to publishing.1 Again an underestimation. While I also suggested that cgi scripts and Java would be technologies to watch, I also picked out VRML, which so far has been a bit of a non-starter. What you are about to read is therefore likely to be a curate's egg.
This paper will be library-focused, as that is my community and hence my bias. Other sectors of the information industry are involved, as one of the things the Net is doing is blurring the boundaries between the industries that support communication. Libraries become involved in publishing, publishers become involved in cataloguing and new industries such as information aggregators arise.
But there is more than the technology. There are the social and economic interactions which determine how it will be applied. Many of the networking technologies are now mature (in so far as a rapidly developing field can be) and being deployed rapidly. They are starting to affect the way we structure our organisations and what we do or outsource. I concluded then that without placing these technologies in a strategic context their relevance would be unclear. That context for this paper is the future services that the library or the library profession will deliver or help deliver. I do not necessarily assume that these services would be run by, or be part of, the library, only that library expertise would be needed to give effect to them. Certainly librarians are still needed.2
The rest of this paper addresses:
For the sake of discussion I have chosen a set of categories which will highlight the difference between the approach to digital resources and paper. The main reason for doing this is to get away from the completely amorphous concept of the digital library, which hides more than it reveals and means many different things to different people. What we are in fact moving to is the hybrid library, which provides access to information in different forms, both print and electronic. The tens of thousands of tons of paper in major libraries will not go away in the foreseeable future. The idea of the hybrid library has been ably put forward by Chris Rusbridge, the Programme Director of the Electronic Libraries Programme at the University of Warwick3 and elaborated with examples by Pinfield et al.4
As print material will not go away, and ways of integrating services based on the two media need to be developed, innovation will be required. This is as true for public libraries as it is for university libraries.5 To successfully achieve innovation the libraries will need to strengthen their central expertise in a number of areas. The IEEE ADL99 Advances in Digital Libraries Conference, 19-21 May 1999, has a forbiddingly technical list of topics:
For the purposes of discussion I consider library functions under four categories:
I will discuss these further, but first a digression on the differences between print and networked media is necessary.
The architecture of the Net as a publishing system is radically different from that involved in print, and requires a different approach to bring the work of authors to the hands, or eyes, of readers. A number of these issues have been discussed in a recent book by Rosenfeld, a librarian trained at the Schools of Information and Library Science at the University of Michigan.6
At the deepest level, the difference between networked access and print material is that documents are in the publisher's collection, not the library's collection. This is what "access versus ownership" means. At any time the publisher may change their location or conditions of access, introducing a new level of instability which must be managed. In the print world, once you have a citation for a work, after a search in the library, you may find it, or after a considerable wait the library might get it for you. In the networked world, if you have a citation you might well be able to have it on your desk in seconds, perhaps with facilitation by the library and perhaps not. The document might also have disappeared without trace. This is not to say that access will be free and easy. To make it so for the user community may be a task for the library, which may be doing the paying!
The publisher is thrust into the role of the librarian. The collection of the publisher's documents needs to be organised for items to be found. On the Net, what was the publisher's catalogue becomes a library of the publisher's products. The effectiveness of that library, and in particular the effectiveness of the cataloguing/metadata, plays a large part in determining whether those publications are read and acquired by potential readers. Within an institution, librarians, and presumably the library, should then be key players in organising access to the institution's publications. Hence we see in many institutions that library staff have editorial control of the content and organisation of the institution's Web servers and a large say in the capabilities of the software used in the Web servers and the standards used.
Another major difference is that publishing on the Net is wildly unconstrained, and the neat divisions that can be made between monograph and serial, publicity brochure and textbook, vanity publication and refereed work become indistinct. The task of selecting and filtering to ensure access to quality material becomes much more complex, requiring more interaction with the end users of the information and often the publisher. Finding and selecting quality material will be a central problem, one which I referred to in 1995 at Questnet.7
A third substantial difference relates to the different aggregates into which literature is divided and the structure of the indexes to them.
Roughly speaking, the print literature is divided as shown in Table 1.
| Item and components | Indexes |
| Monographs | Library catalogues, bibliographies |
| Chapters | |
| Pages | Book indexes |
| Serial titles | Library catalogue |
| Papers | Abstracting and indexing services, bibliographies |
| Pages | |
On the Net the taxonomy of indexes is still evolving and a simple model is difficult to construct.
A single physical server may encompass many virtual servers with different names.
The server contains pages, but:
This has created a problem in that there has been a great increase in the number of available indexes, with the consequent need to find some way of simplifying access to them.
A further change is the involvement of a wide range of new standards12 and an increasing interaction between those standards which relate to libraries, the Internet and publishing, creating new opportunities and services due to an increase in interoperability. The result has been an unprecedented rate of change in the way libraries operate and do business.
This rate of change has not yet slowed. The imminent introduction of Extensible Markup Language13 and the Resource Description Framework (RDF)14 with the associated Document Object Model15 is likely to create changes of similar magnitude to those that happened with the shift from gopher to WWW. The ability of computer programs to get inside documents and interpret what is there will enable much finer grained indexing than is possible now. The use of RDF to carry metadata in a structured way and to serve it remotely from the document it describes could provide services as varied as:
I now address the categories I introduced earlier in the context of the hybrid library with print and electronic resources and look at the problems that need to be addressed.
While the collection of print material will continue, budgetary pressures and the print serials crisis will increasingly force a shift to access as an alternative. In many areas the print serial will cease to exist. While the collecting might cease, decisions about which services to access, and under what conditions, will remain, such as:
The traditional way that the library involved users in the development of the collection will need to change as users make more use of interactive facilities on the Net and expect similar interaction to be provided by the library. They will wish to air their opinions on the utility of non-print material and the mechanisms by which it can be accessed, and to have those opinions factored into the decision. The level of interactivity provided by the online bookshops will be expected of the library and its catalogues and indexes.
With increasing numbers of services bundled via information aggregators, together with the possibility of consortium deals to access these services, there will be a need for more refined analysis to help decision making.
The growth of interactive course material, available for a fee, raises the question of the degree to which the library should be involved in the purchase of such material, as it may well provide an alternative to textbooks. This is not just a problem for academic libraries, as those undertaking distance education may find the public library a pleasant place to access remote course material.
Cataloguing has been described as the central mystery of libraries, although I have also heard it described as the international conspiracy of cataloguers. The catalogue has been the main tool by which libraries organise access to their collections. With more material becoming available on the Net, and with cataloguing records available which could be loaded giving hypertext access to the full text, the question of what should be catalogued and what not becomes significant and needs analysis. Roy Tennant thinks it's a mistake to do so, because of the costs of cataloguing.16 Norm Medeiros thinks it will open up new opportunities for cataloguers.17 Is the catalogue too fixed in its past as a tool to manage print material, or can it become a prime tool to access all information sources? Is it just a database like any other?
Certainly Web-based OPACs can have entries put in for a wide range of resources such as:
There are a number of experiments in cataloguing material on the Net, but few libraries are doing it in an active way. The most extensive effort is probably at OCLC. Specialised tools have also been developed to create records such as MARCit,18 which can create MARC records for Internet material for later loading into a cataloguing system. The bottom line, however, is that adding entries to the OPAC for Internet objects should be done very selectively due to:
Iowa State University Library has a number of innovative examples.19
The alternative approach is to treat the OPAC as just another database of interest to the clientele, and rely on Internet-based indexes to locate other material. I anticipate that libraries will increasingly shift to relying on specialised indexes of quality material generated by robots rather than on the mega search engines, which will fail, retrieving an increasing proportion of irrelevant material as the Internet expands. Libraries may well start to generate their own indexes to remote material as robot technology comes to the desktop. Maxum Computers Phantom technology20 is the kind of software that is becoming available. The NLA/DSTC Metweb project21 is a local example.
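To make the idea of a locally generated index concrete, the following is a minimal sketch in Python of what a robot-built index over a hand-picked set of pages does once the pages have been fetched. It is an illustration only, not a description of Phantom or of any project cited here: the page texts, the example.org URLs and the function names `build_index` and `search` are all assumptions made for the example, and no network access is performed.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index: each word maps to the set of URLs containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the URLs that contain every word of the query, sorted by URL."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical, already-fetched pages selected by library staff.
pages = {
    "http://example.org/rdf": "Metadata and RDF for libraries",
    "http://example.org/z3950": "Searching library catalogues with Z39.50",
}
index = build_index(pages)
print(search(index, "rdf metadata"))
```

Because the library chooses which pages go in, the index stays small and its quality is controlled at selection time rather than at search time, which is the contrast with the mega search engines drawn above.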
In the print world, the multiplicity of databases (abstracting and indexing services, bibliographies and the catalogue) needs to be searched separately. Once these services are electronic, mechanisms are being developed to give some integration of searching over multiple databases, greatly simplifying the end user's problem of finding where to start.
Such integration can be achieved at the client end. Muns22 gives a JavaScript example. Apple's new Sherlock23 technology puts a very powerful tool in the hands of end users to do this themselves. This software is also scriptable, so I expect by the time this paper appears Macintosh-based Web servers will be offering aggregated searching across a variety of databases. Where Apple leads Microsoft will follow. The increasing sophistication of end-user tools to handle information could well be a trend which will have unforeseen effects on those organisations which supply information.
Integration can also be achieved via protocol converters at the server, linked via CGI scripts. The Z39.50 protocol was developed for multiple bibliographic database searching. The DSTC ZedWEB24 development is an excellent local example. Ovid and Silverplatter both now support Z39.50.
CGI scripts which conduct searches of multiple Internet search engines and aggregate the results also provide search aggregation. The SIMS25 system at INRIA is an example. A number are available to aggregate results from Internet search engines and can be tracked at Search Engine Watch.26
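The aggregation step such scripts perform can be sketched briefly in Python. This is a simplified illustration under stated assumptions, not the method used by SIMS or any service named here: the two backends (`opac` and `web_index`) are hypothetical in-memory stand-ins for real search services, the example.org URLs are invented, and merging by rank with duplicate removal is only one of several plausible strategies.

```python
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, str]                 # (url, title)
Backend = Callable[[str], List[Hit]]  # query -> ranked list of hits


def aggregate(query: str, backends: Dict[str, Backend]) -> List[dict]:
    """Send one query to every backend and merge the hits without duplicates.

    Hits are interleaved by rank (all first hits, then all second hits, and
    so on) so that no single source dominates the merged list.
    """
    ranked = {name: backend(query) for name, backend in backends.items()}
    merged: List[dict] = []
    position: Dict[str, int] = {}  # url -> index of its entry in merged
    depth = max((len(hits) for hits in ranked.values()), default=0)
    for rank in range(depth):
        for name, hits in ranked.items():
            if rank >= len(hits):
                continue
            url, title = hits[rank]
            if url in position:
                # Already seen: just record that this source also found it.
                merged[position[url]]["sources"].append(name)
            else:
                position[url] = len(merged)
                merged.append({"url": url, "title": title, "sources": [name]})
    return merged


# Hypothetical stand-ins for a catalogue and a Web index.
def opac(query: str) -> List[Hit]:
    return [("http://example.org/a", "Title A"), ("http://example.org/b", "Title B")]


def web_index(query: str) -> List[Hit]:
    return [("http://example.org/b", "Title B"), ("http://example.org/c", "Title C")]


results = aggregate("hybrid library", {"opac": opac, "web_index": web_index})
for entry in results:
    print(entry["url"], entry["sources"])
```

In a real gateway each backend function would wrap an HTTP request or a Z39.50 session and translate the response into the common `(url, title)` form; that translation, rather than the merge, is usually where the work lies.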
This is a rapidly moving area. A useful survey conducted only a few years ago at the University of South Australia, the IDA project,27 is now somewhat dated. The commercial sector is also entering the area with aggregation services such as Blackwell's Navigator. It is likely that libraries will continue to create targeted gateways on specific subjects such as those described by Kirriemuir28 or adapt the more general approach of the ROADS project in the UK29 or Internet Scout in the US,30 as well as the others described recently by Pinfield et al.4
For those institutions which publish a wide range of material, such as research institutions, the shift to electronic publishing raises some interesting opportunities for the libraries of those institutions. Network publishing can be thought of as the publisher directly placing their publications on the shelves of a global library. The extent to which this material is read depends on whether it can be found. To assist in this, site-level indexing and organisation is needed, as is metadata to enable targeted search engines to more readily index the material. Librarians have been active in the development of Web sites for many institutions, and this is likely to continue as the librarians' skills are those needed to promote and ease access to the publications of an institution.
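One common way of attaching such metadata to a Web page is to embed Dublin Core elements as HTML `<meta>` tags in the page header. The Python sketch below shows how a simple record might be rendered that way. It is illustrative only: the record values and the function name `dublin_core_meta` are invented for the example, and the `DC.` prefix on the `name` attribute follows a widely used convention rather than anything prescribed in this paper.

```python
from html import escape
from typing import Dict, List


def dublin_core_meta(record: Dict[str, str]) -> List[str]:
    """Render a metadata record as HTML <meta> elements, one per entry.

    Keys are assumed to be Dublin Core element names such as "title",
    "creator", "subject" or "date"; values are escaped for use in an
    HTML attribute.
    """
    return [
        '<meta name="DC.%s" content="%s">' % (element, escape(value, quote=True))
        for element, value in record.items()
    ]


# Hypothetical record for an institutional publication.
record = {
    "title": "Annual Report 1998",
    "creator": "Example Research Institute",
    "subject": "libraries; networks",
    "date": "1998-12-01",
}
for tag in dublin_core_meta(record):
    print(tag)
```

A robot that understands the convention can then index the title, creator and subject as separate fields instead of guessing them from the body text.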
Cross-linking of services is another area in which we will see growth. Already we see hypertext links from:
The challenge will be to find innovative ways to cross link information which was previously held in separate databases where each can be enhanced by adding cross links.
What do you need to get access to library services? In the print world you need access to libraries and to their catalogues. Increasingly in the electronic world a workstation anywhere will get you to information sources. There are problems to be addressed which will become of increasing importance.
Problems like these are difficult to address without detailed analysis which takes time and expertise. They may require specific middleware32 to be created, which takes development time outside of normal maintenance. We can expect to see libraries continuing to invest heavily in systems and staff to maintain those systems.
With print documents you went to the library and found them, sometimes with a little assistance. Or the library could get them in for you and you could pick them up there. Delivery went as far as the library building; the rest was up to the user. The initial step in library automation allowed the user to check the catalogue remote from the library, hardly a great advance. The hybrid library will need to address desktop delivery more closely. It would be inconsistent to have network delivery across the planet and still make the end user walk to the library for the final kilometre of travel. There are a number of options for desktop delivery now. These will become increasingly important as the serials crisis bites further.
Over the last few years there has been much discussion of the role of librarians on reference desks. Analogies with other professions which do not staff their own enquiry points have been made. At the same time there has been rapid development of call centre technology, as well as online support mechanisms and helpdesk software.36 Increasingly libraries will need to shift to this kind of support as their remote clientele grows; the Internet Public Library,37 a development of the University of Michigan, only responds electronically, for instance.
Once access via the network is in place and access to information is mediated by equipment, the nature of requests for help to libraries changes. Difficulties which in the past may have required explaining:
To the end user, everything from the screen to the service is invisible. A failure in any part is a failure of the whole so that any level of assistance needs each step in the chain:
may need to be addressed in a coherent way. The nature of the help libraries need to give will require close coordination with other organisations which are part of the chain of access to the library's services.
We may see a more radical development. Library reference services were tied to a print collection which was geographically fixed. Electronic services are not fixed. Database providers may find a market for selling Internet-accessible reference services. Consortiums of libraries may consider the possibility of distributed virtual reference services where member libraries specialise in particular topics, with a membership dispersed around the globe sufficiently to provide a 24-hour service.
By their nature, predictions are guesses. I have not attempted to rate these in any sort of likelihood order. I think these will occur (if they do) within five years and often less:
This is the final paper for Day 2.