ETH Library Services in the field of AI
Generative artificial intelligence technologies are increasingly used in study and research. The ETH Library provides a wide range of information and services on the responsible use of AI, literature searching, and academic writing with AI-supported tools, as well as data packages for AI-based applications.
Please also follow any guidelines issued by your department. Students must agree on the use of AI tools with their supervisors. In the case of academic publications, the guidelines of the respective publisher or journal must be followed.
In addition, the ETH Library provides guidance on ethical and legal aspects of AI use, including:
Guidance for the research, writing and publication process
How should the use of AI tools be declared? What are the implications for authorship? Why must AI-generated content be checked for plagiarism? What does the use of AI tools mean for citation practices?
You can find answers to these and further questions on our webpage Plagiarism and generative Artificial Intelligence.
Guidance on the use of licensed full texts in generative AI tools
Are licensed e-books and e-journals permitted to be used in generative AI tools? What do existing licence agreements with publishers allow – and where are the limits of use?
On our webpage Artificial Intelligence – Use of Licensed Full Texts in Generative AI Tools, you will find information on legal frameworks, licensing issues and the responsible use of copyrighted content.
Literature Search
AI-supported research tools have become a complement to traditional searching in specialised databases. The added value of AI lies in querying in natural language (as opposed to search syntax using Boolean operators), in structuring results, and in generating summaries.
Unlike general chatbots, these tools draw on closed, quality-controlled academic databases. This reduces the risk of hallucinations and allows every answer to be traced back to a scholarly publication.
Important: Critical evaluation remains essential when working with AI-generated content. Always verify statements against the original sources.
Tools licensed by the ETH Library
One group of research tools enables systematic searching in natural language. From a wide range of such products, the ETH Library licenses the following for members of ETH Zurich:
A second group of research tools enables exploratory searches using visual network representations. Search results—typically starting from an existing source—are displayed as graphs. Within such network graphs, it is possible to iteratively discover related articles by navigating from individual papers. A wide range of such tools exists. The ETH Library provides the following for members of ETH Zurich:
- Connected Papers starts from one or more articles and identifies related works based on citation patterns, even if different terminology is used. The tool uses Semantic Scholar as its data source.
- Open Knowledge Maps helps users gain an overview of a new topic and understand how it is structured into subtopics. It groups literature thematically based on textual similarity. Its primary data sources are BASE and PubMed.
Course: In our Toolbox literature research, we present best practices for AI-supported systematic and exploratory searching.
Advice: For specific queries, we are happy to provide individual support.
Academic writing
The use of generative AI and AI-based tools in the writing process raises a number of questions for many students and other members of ETH Zurich: Which tools may be used? For what purposes are they permitted? How should their use be declared? What are the legal and ethical implications of using them? How does their use affect authorship?
You can find further information on our webpage Plagiarism and generative Artificial Intelligence, in the self-paced course AI-based Tools for Scientific Writing and Research on the Moodle learning platform, or in our in-person course Mastering scientific writing with AI-based tools.
The Scientific writing webpage also provides checklists for working with AI-based tools and for correct citation.
Information on available data sources and conditions of use:
- Use of licensed full texts in generative AI tools
- Text and Data Mining (TDM): resources for data-driven and computational research
Please also refer to the general guidelines of ETH Zurich mentioned above, in particular the Guidelines for Users of AI Tools (PDF, 162 KB).
The ETH Library provides access to its own datasets via a developer portal, where you can also find further information on the available APIs.
In addition, the collections and archives of the ETH Library support the use of their holdings for digital research (“Collections as Data”). To this end, a range of services has been developed, including:
- CollaDa: Exploring ORD practices in the context of Collections as Data
- Introduction to Python: A six-day course offering a hands-on introduction to Python programming and the analysis of textual data
- Datastories: A series of Jupyter notebooks for researchers in the humanities with limited programming experience
The contact point for projects in this area is the Digital Scholarship Services unit.
AI Workshop Series
In 2026, the ETH Library launched the workshop series “AI in Swiss Academic Libraries”. The series provides a platform to discuss the development and potential of artificial intelligence and its implications for academic libraries.
- Further information on the AI workshop series.
The ETH Library is engaged at the national level in the following bodies on AI in libraries:
- Bibliosuisse Commission on AI in Libraries
- SLiNER Working Group on AI (AKAI)
Projects
The ETH Library is currently developing prototypes for automated subject indexing in order to enhance digital library services beyond simple keyword searching.
The aim is to automatically assign thematic descriptors to library resources that reflect their core content. These subject terms enable new forms of semantic search, where user queries are matched not only by identical terms but also by conceptual similarity to documents. This creates a consistent thematic structure across different data collections.
Automated subject indexing forms an important foundation for modern discovery services. It focuses on the verbal description of content and supports thematic exploration.
In the long term, such approaches can help to make large digital collections more accessible and enable new forms of thematic exploration within library catalogues.
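The idea of matching queries to documents by conceptual similarity rather than identical terms can be illustrated with a minimal sketch. The three-dimensional "embedding" vectors, document titles, and similarity scores below are invented for the example; a real system would obtain such vectors from an embedding model over the assigned subject terms and document text.

```python
import math

# Toy document "embeddings": hypothetical vectors standing in for the
# output of a real embedding model. Titles and values are illustrative.
doc_vectors = {
    "Intro to neural networks": [0.9, 0.1, 0.0],
    "Deep learning for images": [0.8, 0.2, 0.1],
    "Alpine plant ecology":     [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def semantic_search(query_vec, top_k=2):
    # Rank documents by conceptual closeness, not by exact term overlap.
    ranked = sorted(doc_vectors.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# A query embedded near the "machine learning" region of the vector space
# retrieves the first two documents even if its wording shares no terms
# with their titles.
print(semantic_search([0.85, 0.15, 0.05]))
```

A keyword search would miss a document whose title uses different vocabulary; ranking by vector similarity is what creates the consistent thematic structure across collections described above.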
Between 2021 and 2022, the ETH Library’s Image Archive used computer vision and artificial intelligence to classify images automatically in the E-Pics Image Archive and E-Pics Animals, Plants, Biotopes databases, a process also known as autotagging. Autotagging was conceived not as a replacement for manual tagging, but as a supplement to it.

Following a first run with the ‘General’ computer vision model from the company Clarifai, over a million images had been tagged with keywords. However, the time and computing power required during ongoing operations were considerable. To improve the quality of the AI-supported classification, and because no in-house ground-truth data was available, the Image Archive’s data pool of over a million tagged images was subjected to a qualitative visual analysis: all 4,600 keywords were checked individually to assess the AI’s recognition quality and to validate each keyword. The analysis revealed that around half of the keywords were incorrect or problematic in terms of content and had to be deleted.

In the first half of 2026, an assessment will be made of which AI tools should be deployed in the ‘Sharedien’ backend, which has been in use since 2024.
Project duration: 01/01/2026–31/12/2026
Together with the School Board minutes, the School Board files are the most important historical source for the history of ETH Zurich up to 1969. In particular, research into the files from 1854 to 1931 has so far been extremely time-consuming, as the only access is via handwritten ‘business control books’, some of which are written in cursive script.
Until now, the sheer volume of the records – 27 linear metres, or 83,096 documents – has stood in the way of more in-depth cataloguing. The aim of the project is to catalogue the records archivally by automatically extracting the tabular information from the business control books using machine learning (ML) methods. The extracted data is then processed with scripts into metadata for the archive information system. Finally, after manual quality control and data optimisation (human-in-the-loop), the metadata is imported into the archive information system and made publicly accessible via the Virtual Reading Room.
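The post-extraction step of such a pipeline can be sketched as follows. This is a hypothetical illustration only: the field names, reference codes, confidence scores, and the review threshold are all invented, and the actual project may structure its data differently.

```python
# Hypothetical sketch of the human-in-the-loop step: table rows extracted
# by an ML model from the business control books are mapped to metadata
# records, and rows the model was unsure about are routed to a reviewer
# before import. All field names and values below are illustrative.

REVIEW_THRESHOLD = 0.90  # assumed cut-off for automatic acceptance

def to_metadata(rows):
    """Split extracted rows into import-ready records and review cases."""
    records, needs_review = [], []
    for row in rows:
        record = {
            "title": row["subject"].strip(),
            "date": row["date"],
            "reference_code": row["file_no"],
        }
        if row["confidence"] < REVIEW_THRESHOLD:
            needs_review.append(record)  # manual quality control first
        else:
            records.append(record)       # ready for import
    return records, needs_review

rows = [
    {"subject": " Appointment of a professor ", "date": "1854-11-03",
     "file_no": "SR2:1854/17", "confidence": 0.97},
    {"subject": "Partly illegible entry", "date": "1861-05-12",
     "file_no": "SR2:1861/04", "confidence": 0.61},
]
imported, review = to_metadata(rows)
```

The design point is that low-confidence extractions never reach the archive information system unreviewed; only after the human check are they merged back into the import stream.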