Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2769)

Abstract

This paper argues that the growing importance of the World Wide Web means that Web sites are key candidates for digital preservation. After a brief outline of some of the main reasons why the preservation of Web sites can be problematic, a review of selected Web archiving initiatives shows that most current initiatives are based on combinations of three main approaches: automatic harvesting, selection and deposit. The paper ends with a discussion of issues relating to collection and access policies, software, costs and preservation.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Day, M. (2003). Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives. In: Koch, T., Sølvberg, I.T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2003. Lecture Notes in Computer Science, vol 2769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45175-4_42

  • DOI: https://doi.org/10.1007/978-3-540-45175-4_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40726-3

  • Online ISBN: 978-3-540-45175-4

  • eBook Packages: Springer Book Archive
