Text Technology/Digital Linguistics colloquium HS 2024-25

Time & Location: every 2-3 weeks on Tuesdays from 10:15 am to 12:00 pm in room BIN-2-A.10.
Please note that the room has changed from the previous semester.

Online participation is also possible via the MS Teams team "CL Colloquium".

Responsible: Sina Ahmadi

Colloquium Schedule

17.09.2024	Dr. Gail Weiss (EPFL)
01.10.2024	Dr. Yingqiang Gao	Dr. Sina Ahmadi
15.10.2024	Cui Ding	Dr. Jannis Vamvas
29.10.2024	Patrick Haller	Dr. Reto Gubelmann & Ghassen Karray
12.11.2024	Andrianos Michail	Sant Muniesa
26.11.2024	Jan Brasser	Lucas Möller (Universität Stuttgart)
10.12.2024	Prof. Dr. Sarah Ebling & Lukas Fischer, Dr. Yingqiang Gao, Anne Göhring	Michelle Wastl

17 Sept 2024

Gail Weiss: Thinking Like Transformers - A Practical Session

With the help of the RASP programming language, we can better imagine how transformers, the powerful attention-based sequence processing architecture, solve certain tasks. Some tasks, such as simply repeating or reversing an input sequence, have reasonably straightforward solutions, but many others are more difficult. To unlock a fuller intuition of what can and cannot be achieved with transformers, we must understand not just the RASP operations but also how to use them effectively. In this session, I would like to discuss some useful tricks with you in more detail. How is the powerful selector_width operation derived from the base RASP operations? How can a fixed-depth RASP program perform arbitrary-length long addition, despite the equally large number of potential carry operations such a computation entails? How might a transformer perform in-context reasoning? And are any of these solutions reasonable, i.e., realisable in practice? I will begin with a brief introduction to the base RASP operations to ground our discussion, and then walk us through several interesting task solutions. Following this, and armed with this deeper intuition of how transformers solve several tasks, we will conclude with a discussion of what this implies for how knowledge and computations must spread out across transformer layers and embeddings in practice.

1 Oct 2024

Dr. Yingqiang Gao: Mining Arguments in Scientific Documents

In today's talk, we show the critical role of well-structured scientific texts in writing arguments, which enhances clarity and reduces misinformation while promoting knowledge dissemination. We identified the challenges researchers face in maintaining coherence and factual accuracy during the writing process, highlighting the need for automation through AI-driven tools that integrate text retrieval and generation. Despite advancements in Natural Language Processing and Large Language Models, effective scientific writing assistants face hurdles, particularly in automatic text alignment and the reliability of generated content. To address these issues, we investigated empirical unsupervised methods for retrieving, aligning, and generating arguments in scientific documents, culminating in the development of a web application that applies these argument mining techniques.

Dr. Sina Ahmadi: Tracking Borrowed Words: A Multilingual Contrastive Dataset for Loanword Evaluation

Lexical borrowing, the adoption of words from one language into another, is a ubiquitous linguistic phenomenon influenced by geopolitical, societal, and technological factors. This talk explores lexical borrowing from a computational linguistics perspective. I present our effort to create a novel contrastive dataset comprising sentences with and without loanwords, designed to evaluate the impact of borrowings. Using this dataset, the performance of state-of-the-art machine translation and pretrained language models is assessed, quantifying their behavior and robustness in the presence and absence of loanwords. Our findings provide valuable insights into the challenges lexical borrowing poses for computational models and offer extensive analysis in multilingual contexts.

15 Oct 2024

Cui Ding: Measurement reliability of individual differences in sentence processing

Psycholinguistic theories traditionally assume similar cognitive mechanisms across different speakers. However, researchers have recently begun to recognize the need to account for individual differences when explaining human cognition. To address this issue, an increasing body of work is investigating how individual differences interact with human sentence processing. Implicitly, these studies assume that individual effects are replicable across experimental sessions and that methods of assessment (e.g., eye-tracking vs. self-paced reading) are interchangeable. However, as noted in the reliability paradox (Hedge et al., 2018), this assumption is unwarranted. A crucial first step for a principled investigation of individual differences in sentence processing is establishing their measurement reliability, that is, the correlation of individual-level effects across multiple experimental sessions and methodological contexts. In this talk, I present the first German naturalistic reading corpus with four experimental sessions from each participant (two eye-tracking and two self-paced reading sessions), including a comprehensive assessment of participants' cognitive capacities and reading skills. I deploy a two-task Bayesian hierarchical model to assess the measurement reliability of individual differences among a range of effects in response to predictors of sentence processing difficulty that are well-established at the population level.

Dr. Jannis Vamvas: Towards Vector Representations of Textual Difference

I am introducing a new research project called «InvestigaDiff», which aims to enable synchronization of documents across different languages. Inspired by how programmers use diff tools to highlight changes in code, we are exploring whether similar concepts can be applied to natural language texts, even when they are in different languages. One research direction involves representation learning at the token level. I will present an idea for an approach that uses soft prompts to guide an LLM in rewriting one text into the other, with these soft prompts serving as the vector representations of textual difference.

29 Oct 2024

Patrick Haller: Leveraging large-scale paraphrasing and in-context learning for stability and bias assessment in LLMs

A growing body of work has been querying LLMs with questionnaires developed for human respondents to evaluate their potential biases, such as political or cultural biases. In this talk, I will present two projects aimed at studying the stability and reliability of these evaluations in the context of political questions. The first project investigates response stability in language models by probing LMs using 500 paraphrases per question to assess variability and structural biases. The second project introduces Questionnaire Modeling, a new probing task that incorporates human survey data as in-context examples to improve the stability of bias evaluation.

Dr. Reto Gubelmann & Ghassen Karray: Probing for Political Bias and Brittleness in LLMs’ Judgments on Formal and Material Inferences

We present research examining the political bias and the brittleness of LLMs in natural language inference (NLI). We first distinguish a strict, apolitical notion of formal validity from notions of material and informal validity that are inherently perspectival. We then assess state-of-the-art LLMs regarding their political bias and brittleness in judging the validity or quality of such inferences. We run all experiments in English with sample arguments from American politics as well as in German with sample arguments from Swiss politics. Our results show that the models exhibit bias in English, which can be mitigated with few-shot prompts, as well as substantial brittleness, which, in contrast, increases with few-shot prompting.

12 Nov 2024

Andrianos Michail: PARAPHRASUS: A Comprehensive Benchmark for Evaluating Paraphrase Detection Models

The task of determining whether two texts are paraphrases has long been a challenge in NLP. However, the prevailing notion of paraphrase is often quite simplistic, offering only a limited view of the vast spectrum of paraphrase phenomena. Indeed, we find that evaluating models on a single paraphrase dataset can leave uncertainty about their true semantic understanding. To alleviate this, we release PARAPHRASUS, a benchmark designed for multi-dimensional assessment of paraphrase detection models and finer model selection. We find that paraphrase detection models under a fine-grained evaluation lens exhibit trade-offs that cannot be captured through a single classification dataset.

Sant Muniesa: Leveraging LLM for multilingual sign language translation

Sign language translation is a crucial challenge in facilitating inclusive communication between deaf and hearing communities. In this presentation, I will share the progress made in my master's thesis, where I developed a model that uses advanced natural language processing techniques to translate sign language to text. Based on this experience, I will share a new approach aimed at improving alignment between different modalities (visual and textual) using Siamese networks together with Optimal Transport and CTC methods. This method, inspired by recent research in speech-to-text translation, seeks to efficiently align latent representations of different modalities at the encoder.

26 Nov 2024

Jan Brasser: Predicting Reading Abilities from Eye Movements on a Non-Reading Task

Early detection of reading disorders is critical for ensuring educational success and equity, as it enables timely support for children at risk. However, current risk assessments for atypical reading development are time-consuming, expensive, and require trained specialists. Additionally, these assessments often assume basic reading skills, limiting their application to children who have already begun learning to read. A system capable of accurately predicting a child’s reading abilities through non-reading tasks could assist early detection of developmental reading disorders. I will present the first steps toward developing such a system using machine learning techniques and Bayesian statistical modeling. Specifically, I (i) investigate whether children’s reading comprehension abilities—both at the time of eye-tracking data collection and one year later—can be predicted from eye movements recorded during a visual search task and (ii) explore which gaze features are most strongly predictive of reading comprehension abilities.

Lucas Möller: Explaining text similarities in Siamese transformers with feature-pair attributions

Siamese encoder models such as sentence transformers (SBERT) learn similarities between two inputs. They have proven to generate highly generalizable embeddings for tasks including semantic textual similarity, information retrieval, classification and clustering. However, little is known about how they actually compare inputs. A barrier is that similarities depend on feature interactions rather than individual features alone. Therefore, common feature attribution methods do not work for this model class. To address this gap, in our recent work we have derived a local attribution method specifically for Siamese encoders. The output takes the form of a token–token matrix and points out which token pairs from the two inputs are important for an individual prediction. Applying it to SBERT models, we gain insights into which parts of speech and syntactic roles these models attend to, confirm that they mostly ignore negation, explore how they judge semantically opposite adjectives, and find that they exhibit lexical bias. In a collaboration with UZH, we are now looking into multilingual models, and first results indicate that these models can learn strong cross-lingual alignment abilities despite their simple contrastive training objective.

10 Dec 2024

Prof. Dr. Sarah Ebling & Lukas Fischer, Dr. Yingqiang Gao, Anne Göhring: Flagship „Inclusive Information and Communication Technologies“ (IICT)

This talk will discuss progress in the ongoing Innosuisse flagship project „Inclusive Information and Communication Technologies“ (IICT). An emphasis will be on work that is not the subject of doctoral theses regularly presented in this colloquium; as such, contributions covered range from automatic text simplification to automatic translation into sign language and automatic translation of audio descriptions.

Michelle Wastl: A Cross-lingual, Document-level Dataset of Related News Articles

The two focal points of the «InvestigaDiff» project, introduced earlier this semester, are automatic recognition and synchronization of differences in text documents across languages. Developing and evaluating systems capable of detecting such differences requires datasets that accurately capture them. However, many existing datasets provide information on differences only as a byproduct and are not explicitly designed for this purpose. While it is possible to extend these corpora synthetically to fulfill the needs of our project, our aim is to base our systems on organic data to ensure that they can deal with human-authored text with all its complexities and variability. To the best of our knowledge, an organic, cross-lingual, document-level dataset of textual differences does not yet exist. We propose to fill this gap by collecting related article pairs from the Swiss news outlet 20 Minuten, which releases numerous articles in both French and German. While the main content of an article is mostly the same in both versions, it has to undergo several transformations to adhere to the cultural expectations of a news article in either language. The resulting pairs therefore capture a plethora of changes between the two documents and provide a rich basis for an organic, cross-lingual, document-level difference dataset.
