The distinction between abstract and concrete words (such as dream in contrast to banana) is a semantic categorisation that is highly relevant for Natural Language Processing (NLP). Our project MUDCAT investigates the notion of abstractness from a data-driven and application-oriented point of view. While the longest-standing discussions about abstractness have taken place in the cognitive sciences, we address critical issues in existing definitions, data collections and characterisations, and broaden the perspective towards effective exploitation in NLP approaches.

To date, definitions, collections and applications of abstractness have mostly been carried out on a word-type basis without contextualisation. In contrast, MUDCAT will develop, exploit and apply empirical dimensions of abstractness while paying attention to a token-based, sense-related perspective across word classes (nouns, verbs, adjectives), across modalities (text, associations, features, images) and across languages (English, German, Italian). To this end, we will collect novel human-generated norms on abstractness and exploit cross-lingual transfer to advance semi-automatic algorithms for norm generation. A major effort at the empirical layer will identify and induce word-class-dependent salient dimensions of abstractness from large-scale corpora, taking into account contextual conditions in the form of syntactic constellations (such as subcategorisation and modification). Considering that abstractness is conceptually distinguished from concreteness on multimodal grounds, we will go beyond the textual dimension and collect and explore multimodal facets of abstractness in free word associations, feature-property generation and images. Class-based and cross-lingual clustering approaches will investigate semantic and language generalisations of the multimodal characteristics. Finally, the multimodal cross-lingual empirical knowledge of abstractness will be applied to NLP tasks whose performance is known or expected to profit from abstractness knowledge. Accordingly, we will develop generic computational approaches to apply our enhanced abstractness information to two semantic challenges: figurative language identification as a concrete–abstract mapping task, and hypernymy detection as a semantic generality task. Overall, MUDCAT will investigate the cross-lingual transferability of definitions and applications of abstract and concrete words for English, German and Italian, while taking ambiguity of targets and contexts into account.

The project MUDCAT is a SemRel project and part of the ongoing collaboration KATER between Jun.-Prof. Diego Frassinelli (University of Konstanz) and Prof. Sabine Schulte im Walde (University of Stuttgart). It is funded by the DFG (Deutsche Forschungsgemeinschaft, the German Research Foundation) under research grant SCHU 2580/4-1.

Publications



Annerose Eichel, Sabine Schulte im Walde
A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement [pdf/poster/resource]
In: Proceedings of the 17th Linguistic Annotation Workshop (LAW). Toronto, Canada, July 2023.

Anna Hülsing, Sabine Schulte im Walde
Cross-Lingual Metaphor Detection for Low-Resource Languages [pdf/resource]
In: Proceedings of the 4th Workshop on Figurative Language Processing. Mexico City, Mexico, June 2024.

Mohammed Abdul Khaliq, Diego Frassinelli, Sabine Schulte im Walde
Comparison of Image Generation Models for Abstract and Concrete Event Descriptions [pdf/resource]
In: Proceedings of the 4th Workshop on Figurative Language Processing. Mexico City, Mexico, June 2024.

Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde
Investigating the Nature of Disagreements on Mid-Scale Ratings: A Case Study on the Abstractness-Concreteness Continuum [pdf/poster/supplement]
In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning (CoNLL). Singapore, December 2023.

Prisca Piccirilli, Sabine Schulte im Walde
Features of Perceived Metaphoricity on the Discourse Level: Abstractness and Emotionality [pdf/poster/resource/bib]
In: Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC). Marseille, France, June 2022.

Prisca Piccirilli, Sabine Schulte im Walde
What Drives the Use of Metaphorical Language? Negative Insights from Abstractness, Affect, Discourse Coherence and Contextualized Word Representations [pdf/bib]
In: Proceedings of the 11th Joint Conference on Lexical and Computational Semantics (*SEM), and non-archival paper at the NAACL 2022 Student Research Workshop (NAACL-SRW). Seattle, Washington, July 2022.

Sabine Schulte im Walde, Diego Frassinelli
Distributional Measures of Semantic Abstraction [doi/supplement/preprint pdf/bib]
Frontiers in Artificial Intelligence: Language and Computation 4:796756, 2022. Special Issue on Perspectives for Natural Language Processing between AI, Linguistics and Cognitive Science, edited by Alessandro Lenci and Sebastian Padó.

Tarun Tater, Diego Frassinelli, Sabine Schulte im Walde
Concreteness vs. Abstractness: A Selectional Preference Perspective [pdf/bib]
In: Proceedings of the AACL-IJCNLP 2022 Student Research Workshop (AACL-IJCNLP-SRW). Taipei, Taiwan, November 2022.

Tarun Tater, Sabine Schulte im Walde, Diego Frassinelli
Evaluating Semantic Relations in Predicting Textual Labels for Images of Abstract and Concrete Concepts
In: Proceedings of the Cognitive Modeling and Computational Linguistics Workshop (CMCL). Bangkok, Thailand, August 2024. To appear.

Talks + Posters with Abstracts

Annerose Eichel, Sabine Schulte im Walde
A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement [abstract/poster]
29th Conference on Architectures and Mechanisms of Language Processing (AMLaP)
San Sebastian, Spain, August 31-September 2, 2023

Anna Hülsing, Sabine Schulte im Walde
Cross-Lingual Metaphor Detection for Low-Resource Languages [abstract]
Workshop on Computational Approaches to Metaphor and Figurative Language at the Annual Meeting of the DGfS
Universität Bochum, February 28-March 1, 2024

Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde
Investigating the Nature of Disagreements on Mid-Scale Ratings: A Case Study on the Abstractness-Concreteness Continuum [abstract/poster]
DGfS-CL Poster Session 2024 at the Annual Meeting of the DGfS
Universität Bochum, February 28-March 1, 2024

Prisca Piccirilli, Sabine Schulte im Walde
Conditions for Perceived Metaphoricity in Discourses: Two Crowdsourcing Studies [abstract]
6th International Conference on Figurative Thought and Language (FTL)
Poznan, Poland, April 20-24, 2022

Invited Talks

Sabine Schulte im Walde
Workshop on Computational Approaches to Metaphor and Figurative Language
Annual Meeting of the DGfS, Universität Bochum
Interactions of Figurative Language, Abstractness and Plausibility in Verb-Object Event Descriptions
February 28-March 1, 2024

Sabine Schulte im Walde
Universität Göttingen, LinG/RTG Colloquium of the Research Training Group 2636 Form-Meaning Mismatches
Computational Models of Figurative Language and the Role of Abstractness
July 6, 2023

Sabine Schulte im Walde
Universität Düsseldorf, Computational Linguistics Research Colloquium
Computational Models of Figurative Language and the Role of Abstractness
February 4, 2022


Synonymous Literal and Metaphorical Expressions in Discourse

A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement


Introductory Course at the 33rd European Summer School in Logic, Language and Information (ESSLLI)
Cognitive and Computational Models of Abstractness
Diego Frassinelli, Sabine Schulte im Walde
National University of Ireland Galway, August 8-12, 2022

MUDCAT Workshop
Guests: Alessandro Lenci (Università di Pisa), Gabriella Vigliocco (University College London)
May 20-21, 2022