Abstract
This paper argues that the growing importance of the World Wide Web means that Web sites are key candidates for digital preservation. After a brief outline of some of the main reasons why the preservation of Web sites can be problematic, a review of selected Web archiving initiatives shows that most current initiatives are based on combinations of three main approaches: automatic harvesting, selection and deposit. The paper ends with a discussion of issues relating to collection and access policies, software, costs and preservation.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Day, M. (2003). Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives. In: Koch, T., Sølvberg, I.T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2003. Lecture Notes in Computer Science, vol 2769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45175-4_42
DOI: https://doi.org/10.1007/978-3-540-45175-4_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40726-3
Online ISBN: 978-3-540-45175-4
eBook Packages: Springer Book Archive