Ninth session of the seminar "Humanités Numériques Tourangelles", organized by Elena Pierazzo, Professor of Digital Humanities (CESR, Université de Tours), as part of the ERC PRIMA project
Sarah Oberbichler - Digital Humanities Lab, Leibniz Institute of European History
Argument Mining automatically extracts arguments and their explicit or implicit components from unstructured text. While the method is well established, recent AI advances have significantly expanded possibilities for computational argument analysis (Gorur et al., 2024; Cabessa et al., 2025). These developments have made argument mining increasingly viable for news media, where arguments are often communicated implicitly rather than through explicit logical structures.
Media arguments shape how events are perceived in public discourse. Through argumentative framing, media actors can construct specific interpretations of reality. For instance, when a disaster response is framed by praising a government's swift action and effective coordination, this can construct an intended argumentative narrative of institutional competence and strength. Systematically identifying and analyzing these intended narratives is crucial for understanding how and with what intention media reality is created.
In this talk, I will present findings from applying large language models to argument mining in historical news media. I will discuss challenges that closed-source models pose in this research context: their heavy alignment for general use introduces biases, while their sensitivity to prompts and parameters creates further opportunities for "LLM hacking" (Baumann et al., 2025). This means that researchers can manipulate model configurations to produce desired results, whether intentionally or accidentally. This challenge is particularly relevant for argument mining, which is an interpretative task where results need to be transparent and verifiable. When models are difficult to inspect and reproduce, validating research findings becomes impossible.
To address these challenges, I present a specialized dataset used to fine-tune and post-train a small open-weight model, supporting transparency and reproducibility. I introduce an approach that reduces semantic ambiguity in prompting, mitigates configuration-dependent biases, and produces verifiable results while keeping the researcher in the loop. This approach treats LLMs as measurement instruments requiring validation rather than as autonomous solutions.
► Sarah Oberbichler is a Research Associate at the Digital Humanities Lab of the Leibniz Institute of European History in Mainz, Germany, and will start a new position as Assistant Professor in AI and History at the C²DH at the University of Luxembourg. Her work spans historical migration, environmental and media studies, natural language processing, and critical AI studies. In her current project, she explores transnational news flows across languages and borders, with a focus on natural disasters and migration.