Prof. Dr. Sabine Schulte im Walde

In Natural Language Processing (NLP), combinations of words are considered multi-word expressions (MWEs) if they are semantically idiosyncratic to some degree, i.e., the meaning of the combination is not entirely (or even not at all) predictable from the meanings of the constituents. MWEs subsume multiple morpho-syntactic types, including noun compounds (such as flea market) and particle verbs (such as give up). They have been explored extensively and across research disciplines from synchronic perspectives, but state-of-the-art research still lacks large-scale empirical approaches to diachronic models of MWE meaning.

Our project SemChangeMWE goes beyond the restricted synchronic concept of MWE meaning and provides a novel perspective on MWE emergence, MWE meaning changes and MWE compositionality (i.e., meaning transparency) by computationally modelling their diachronic properties and how these properties change. The project brings together our expertise in (a) computational models of MWE compositionality and meaning analogy, (b) computational models of diachronic meaning changes and meaning divergences in language variation, and (c) datasets of meaning components and meaning relatedness, in order to address the lack of computational diachronic models of MWE meaning.

The project SemChangeMWE is a SemRel project. It is funded by the DFG (Deutsche Forschungsgemeinschaft, the German Research Foundation) under research grant SCHU 2580/5-1.

Postdoctoral and Doctoral Researchers

Chris Jenkins (Doctoral Researcher)
Filip Miletić (Postdoctoral Researcher)

Student Researchers

Rebecca Fiestas Cueto
Samin Mahdizadeh
Maximilian Maurer
Emma Raimundo Schulz
Malak Rassem
Momo Takamatsu
Nina Vikhrova
James White

Publications

Fritz Günther, Sabine Schulte im Walde
Compositionality Estimates for Morphologically Complex Words. [doi]
In: Reference Module in Social Sciences. Elsevier, Online, 2024.

Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
To Split or Not to Split: Composing Compounds in Contextual Vector Spaces [pdf/poster]
In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP). Singapore, December 2023.

Samin Mahdizadeh Sani, Malak Rassem, Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
What Can Diachronic Contexts and Topics Tell Us About the Present-Day Compositionality of English Noun Compounds? [pdf/poster]
In: Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING). Torino, Italy, May 2024.

Maximilian Maurer, Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
Classifying Noun Compounds for Present-Day Compositionality: Contributions of Diachronic Frequency and Productivity Patterns [pdf]
In: Proceedings of the 19th Conference on Natural Language Processing (KONVENS). Ingolstadt, Germany, September 2023.

Filip Miletić, Sabine Schulte im Walde
A Systematic Search for Compound Semantics in Pretrained BERT Architectures [pdf/video/poster]
In: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Dubrovnik, Croatia, May 2023.

Filip Miletić, Sabine Schulte im Walde
Semantics of Multiword Expressions in Transformer-Based Models: A Survey [pdf]
Transactions of the Association for Computational Linguistics (TACL) 12:593-612, 2024.

Malak Rassem, Myrto Tsigkouli, Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
Visualising Changes in Semantic Neighbourhoods of English Noun Compounds over Time [pdf/poster/resource]
In: Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities (NLP4DH). Miami, Florida, November 2024.

Dominik Schlechtweg, Pierluigi Cassotti, Bill Noble, David Alfter, Sabine Schulte im Walde, Nina Tahmasebi
More DWUGs: Extending and Evaluating Word Usage Graph Datasets in Multiple Languages [pdf]
In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP). Miami, Florida, November 2024.

Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde
The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change [pdf/tool/demo video]
In: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL): System Demonstrations. St. Julians, Malta, March 2024.

Sabine Schulte im Walde
Collecting and Investigating Features of Compositionality Ratings [book url/preprint pdf/resource]
In: Voula Giouli / Verginica Barbu Mititelu (eds), Multiword Expressions in Lexical Resources. Linguistic, Lexicographic and Computational Perspectives. Berlin: Language Science Press, "Phraseology and Multiword Expressions", 2024.

Talks + Posters with Abstracts

Prajit Dhar, Sabine Schulte im Walde, Milena Rabovsky
How do Large Language Models Interpret Noun-Noun Compounds?
30th Conference on Architectures and Mechanisms of Language Processing (AMLaP)
Edinburgh, Scotland, September 5-7, 2024

Chris Jenkins
Composing Noun Compounds in Vector Spaces [abstract]
DGfS-CL Poster Session 2023 at the Annual Meeting of the DGfS
Universität Köln, March 8-10, 2023

Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
Identification of Shifts in Metaphorical Usage of Compound Nouns over Time [abstract]
Workshop on Computational Approaches to Metaphor and Figurative Language at the Annual Meeting of the DGfS
Universität Bochum, February 28-March 1, 2024

Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
Breakaway Compounds: Diachronic Change of Noun Compounds Sharing a Head Constituent [abstract]
Workshop on Diachronic Variation and Change at the Annual Meeting of the DGfS
Universität Mainz, March 6-7, 2025

Maximilian Maurer, Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
Quantifying Changes in English Noun Compound Productivity and Meaning [abstract]
Workshop on Computational Models of Diachronic Language Change at the 26th International Conference on Historical Linguistics
Universität Heidelberg, September 4-8, 2023

Sabine Schulte im Walde
Feature-based Compositionality Ratings for Noun Compounds [abstract/poster]
DGfS-CL Poster Session 2023 at the Annual Meeting of the DGfS
Universität Köln, March 8-10, 2023

Invited Talks

Filip Miletić
University of Exeter, Neurocognition, Language and Vision Processing Group
Modeling the Compositionality of Noun Compounds: Challenges in Models and Data
May 17, 2024

Sabine Schulte im Walde
Berlin-Brandenburgische Akademie der Wissenschaften, DH Colloquium
Collecting and Investigating Features of Human Semantic Ratings and Resources
February 27, 2023

Sabine Schulte im Walde
Workshop on Multiword Expressions (MWE)
Figurative Language in Noun Compound Models across Target Properties, Domains and Time
Marseille, France, June 25, 2022

Sabine Schulte im Walde
Universität Trier, Kolloquium des Forschungsverbunds Patterns
Synchronic and Diachronic Distributional Models of Compound-Constituent Meaning Interactions
March 14, 2022

Resources and Tools

Dia-Sem-NN
A collection of diachronic in-context human ratings on meaning relatedness and changes for English and German noun-noun compounds

Dia-Neighbour-NN
A collection of diachronic nearest neighbours plus visualisation tool for noun-noun compounds and their constituents

DURel Annotation Tool and Word Usage Graphs (WUGs)
Human and computational measurement of semantic proximity, sense clusters and semantic change

Feature-Comp-NN
A feature-based collection of compositionality ratings for German noun-noun compounds

Events

SemChangeMWE Workshop 2024
Guests: Fritz Günther, Stefania Degaetano-Ortlieb, Prajit Dhar, Janis Pagel, Dominik Schlechtweg, Elke Teich, Lonneke van der Plas, Aline Villavicencio
September 26-27

SemChangeMWE: Computational Models of the Emergence and Diachronic Change of Multi-Word Expression Meanings