Department of Computational Linguistics Digital Linguistics

Synthesizing eye-tracking data for natural language generation and evaluation - EyeNLG

Recent advancements in the field of natural language processing (NLP) make it clear that language-related AI will play a fundamental role in shaping the digital transformation of society. However, as impressive as the performance of large language models such as BERT (Devlin et al., 2019), LLaMA (Touvron et al., 2023) or ChatGPT (OpenAI, 2022) might be, detailed analyses of these models have revealed that, in some aspects, they differ from humans in essential ways. While the linguistic representations of the models are surprisingly predictive of neural activation and behavioral patterns (e.g., Schrimpf et al., 2020), they are still fundamentally different from the implicit linguistic knowledge of humans (Sinha et al., 2021; Srivastava et al., 2022; Schuster and Linzen, 2022). Understanding in what (undesired) ways the linguistic behavior of large language models differs from that of humans is crucial for i) directing future research to improve model performance and our understanding thereof and ii) allowing society to adequately cope with the limitations of these models to avoid socially harmful effects.

The MultiplEYE COST Action (CA21131) aims to leverage human eye movement data to enhance and evaluate neural language models, with a special focus on multilingual approaches from which low-resource languages may benefit. Working Groups (WGs) 1 (“Enabling eye-tracking data collection”) and 2 (“Eye-tracking methodology”) collect a large-scale multilingual eye-tracking-while-reading corpus that provides Working Group 4 (“Natural language processing applications leveraging eye-tracking data”) with the data for the above-mentioned purposes. This project will play a central role in achieving the goals of the Action's WG 4.

The project is funded by the Swiss National Science Foundation (SNSF) and is associated with the EU COST Action MultiplEYE (CA21131).

Project duration: 01.06.2024 – 31.05.2028

Principal Investigator:
Prof. Dr. Lena A. Jäger

Collaborators:
Emmanuele Chersoni, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong
Yu-Yin Hsu, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong
Noam Siegelman, The Hebrew University of Jerusalem, Israel
Lonneke van der Plas, Idiap Research Institute, Switzerland

PhD students and postdocs:
David Reich, Department of Computational Linguistics, University of Zurich, Switzerland

Student assistants:
Isabelle Cretton