ETH Zurich Web Archive

The ETH Zurich University Archives regularly archives the most important ETH Zurich websites. These include the homepage, Staffnet pages, various departmental websites, subject-specific pages and blogs.

What do we archive?

We permanently store ETH Zurich webpages in our Web Archive and make them available to the public. Each snapshot documents what the homepage and all sub-pages of a website looked like at the time of archiving, including PDF files, images and videos. Once a website has been archived, it can no longer be changed. Archived content is not indexed by search engines such as Google.

Current collections of ETH websites in the Web Archive:

  • ETH Web 1997–2021: data copied from the Internet Archive. These data are comprehensive but not curated.
  • ETH Web from 2017 onward: a curated and structured archive of all ETH homepages, departments, institutes, many research groups and services.
  • Covid-19 Collection: ETH websites containing information about the Covid-19 pandemic.
  • Web archive of ETH Zurich's first webmaster, Reto Ambühler.
  • A collection comprising websites of students’ associations and other organisations closely associated with ETH Zurich.

Which criteria determine what we archive?

The Web Archive of the University Archives documents the web presence of ETH Zurich as a historical source. Technical and visual monitoring ensure the high quality of the snapshots. The Web Archive ensures that links and quotations remain available, so that users can refer to the archived websites as academic sources and cite them consistently using a Digital Object Identifier (DOI).

Searching the Web Archive

The ETH Zurich Web Archive offers a full-text search function to support research using its contents. All contents of the Web Archive, including attached PDFs, are full-text indexed. The video below provides a step-by-step guide to searching the Web Archive.

Various portals, such as the Virtual Reading Room database and ETH Library @ swisscovery, also give you access to the Web Archive’s metadata, i.e. the title of a website, the date on which it was archived and other information. These links make your research easier.

Remote harvesting

The Web Archive of the University Archives uses remote harvesting for its digital archiving, relying on the Archive-It web crawlers Brozzler and Heritrix. Starting from a seed URL, the crawler collects all linked contents and writes them to WARC files in compliance with the corresponding ISO standard (ISO 28500). Log files also document the crawler settings used at the time.
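
The idea of collecting all linked contents from a starting URL can be illustrated with a minimal sketch. It is not how Heritrix or Brozzler are implemented: it assumes a small, made-up link graph held in memory (the host example.ethz.ch and all page names are hypothetical) and performs no network access and no WARC writing.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the URLs it links to.
# A real crawler fetches pages over HTTP and extracts the links itself.
LINKS = {
    "https://example.ethz.ch/": [
        "https://example.ethz.ch/a",
        "https://example.ethz.ch/b",
    ],
    "https://example.ethz.ch/a": [
        "https://example.ethz.ch/b",
        "https://example.ethz.ch/c.pdf",
    ],
    "https://example.ethz.ch/b": ["https://example.ethz.ch/"],
    "https://example.ethz.ch/c.pdf": [],
}


def crawl(seed, links):
    """Return every URL reachable from the seed, in breadth-first order."""
    seen = {seed}          # URLs already queued, so each is visited only once
    order = []             # URLs in the order they would be captured
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


captured = crawl("https://example.ethz.ch/", LINKS)
```

The `seen` set is what keeps the crawl finite when pages link back to each other, as the made-up page `b` does here.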

Requirements of web archiving

Only websites belonging to the ETH Zurich domain can be archived in the Web Archive. All pages we archive are publicly accessible without a login. The WARC files also contain materials such as PDFs and presentation slides which have been published on a website. Embedded contents from external services, e.g. YouTube videos or Google Maps, are not archived. The placeholder “Resource not in archive” is displayed in their place.
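
As an illustration of the domain requirement, the sketch below checks whether a URL falls within scope. It assumes that "the ETH Zurich domain" means ethz.ch and its subdomains; the constant and function names are hypothetical and do not reflect the actual crawler configuration.

```python
from urllib.parse import urlsplit

# Assumption: the archivable scope is ethz.ch plus all of its subdomains.
ARCHIVE_DOMAIN = "ethz.ch"


def in_scope(url, domain=ARCHIVE_DOMAIN):
    """Return True if the URL's host is the domain itself or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    # Matching on "." + domain avoids false positives such as "notethz.ch".
    return host == domain or host.endswith("." + domain)


ok = in_scope("http://www.ibws.ethz.ch")
external = in_scope("https://www.youtube.com/watch?v=abc")
```

An embedded resource for which such a check fails would correspond to the placeholder described above rather than to archived content.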

The new Archive-It platform

The Web Archive uses the Wayback viewer, a tool of the Archive-It platform, to display websites to users. Archived versions may differ slightly from the original websites. Dynamically generated contents, such as navigation menus, are particularly difficult to archive. The quality assurance process for our archived websites primarily ensures that all central contents of a website have been archived.

Each archived website is assigned a permanent Digital Object Identifier (DOI) and an archive signature. You can find this information in the archive database and on the Archive-It platform. The archiving date is also visible in the header of the viewer.

Please cite as follows:
ETH Zurich University Archives, [catalogue number], [website title plus URL], [archiving date], [DOI]

For example:
ETH Zurich University Archives, EZ-INF1.1/213, website of the Institute of Human Movement Sciences and Sport, original URL: http://www.ibws.ethz.ch, 14 July 2017, DOI: 10.7893/ethz-hsa-web-223
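
For anyone generating such citations programmatically, the pattern above can be sketched as follows. The class and function names are hypothetical; the field values are taken from the example, and the link helper relies on the general doi.org resolver rather than on anything specific to the Web Archive.

```python
from dataclasses import dataclass


@dataclass
class ArchivedSite:
    """Citation fields for one archived website (names are illustrative)."""
    catalogue_number: str
    title: str
    original_url: str
    archiving_date: str
    doi: str


def cite(site):
    """Format the fields following the citation pattern shown above."""
    return (
        f"ETH Zurich University Archives, {site.catalogue_number}, "
        f"{site.title}, original URL: {site.original_url}, "
        f"{site.archiving_date}, DOI: {site.doi}"
    )


def doi_link(doi):
    """Turn a bare DOI into a resolvable link via the doi.org resolver."""
    return f"https://doi.org/{doi}"


example = ArchivedSite(
    catalogue_number="EZ-INF1.1/213",
    title="website of the Institute of Human Movement Sciences and Sport",
    original_url="http://www.ibws.ethz.ch",
    archiving_date="14 July 2017",
    doi="10.7893/ethz-hsa-web-223",
)
citation = cite(example)
link = doi_link(example.doi)
```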

To request the archiving of an ETH website, please contact the ETH Zurich University Archives.

We also store and manage the WARC files of each crawl in the ETH Data Archive. The ETH Data Archive is based on the OAIS (Open Archival Information System) reference model and uses the internationally established METS and PREMIS metadata standards.


Contact

ETH Zurich University Archives
  • +41 44 632 07 04