Project: Distributional Approaches to Semantic Relatedness
The overall goal of the project is to explore the potential and the limits of distributional approaches to lexical semantics. While it is clear that distributional knowledge does not cover all the cognitive knowledge humans possess with respect to word meaning, distributional models are very attractive, as the underlying parameters are accessible from even low-level annotated corpus data. We are thus interested in maximising the benefit of distributional information for lexical semantics.
More specifically, this proposal addresses distributional approaches with respect to semantic relatedness. We distinguish three types of semantic relatedness, which we argue will shed light on distributional modelling from different perspectives. The work is performed within an interdisciplinary framework, which allows us to explore distributional approaches through complementary evidence.
Interdisciplinarity: Theoretical linguistics provides the formal definitions of the semantic relatedness phenomena we are interested in, and cognitive linguistics tells us how humans perceive and express semantic relatedness. From both the linguistic and the cognitive perspective, we expect guidance in selecting and implementing theoretically and cognitively adequate distributional attributes to model word meaning, as well as gold standards that serve as seeds for computational algorithms and as a basis for intrinsic evaluations of the distributional models. As regards the cognitive perspective, we expect not only cognitive evidence for the potential of distributional knowledge, but also clear evidence for its limits, as human judgements naturally comprise both distributional and world knowledge. Altogether, linguistic and cognitive feedback should help us to define simple, straightforward computational methods to assess information about distributional meaning. Furthermore, the computational perspective explores the applicability of our distributional semantic knowledge to statistical machine translation as an extrinsic evaluation.
Challenges: Within our interdisciplinary approach, we address two major challenges. Firstly, we are interested in a theoretically and cognitively adequate selection of features to model word meaning and word relatedness. In this respect, our proposal differs from approaches that are not interested in the actual meaning of their features but only in optimising a complex computational machinery that makes use of them. In contrast, our goal is to explore the meaning and the potential of comparatively simple distributional models. Secondly, our work aims to model word meaning with respect to word senses, thus addressing ambiguity. Even though ambiguity is a frequent target of computational models in general, it has largely been ignored in distributional approaches.
Semantic relatedness: In the following, we present our classification of three types of semantic relatedness, where we aim to bring together key aspects of the classes within our interdisciplinary distributional approach to lexical semantics. Our target language is German.
- Explicit semantic relations between word senses: Two words are semantically related, and the relation is explicitly specified, such as heiß/kalt `hot/cold' (antonymy), Amsel/Vogel `blackbird/bird' (hypernymy), and zuschließen/schließen `close' (near-synonymy).
- Underspecified semantic relatedness across a set of words: A set of words is semantically related without necessarily specifying the semantic relations between the members; the common class implicitly refers to common properties of its members; examples are the adjectives toll/spitze/mies/verbesserungswürdig referring to a degree of appreciation, the prepositions gemäß/laut/nach which are near-synonymous but differ in their subcategorisation, and the verbs kaufen/verkaufen/kosten/bezahlen referring to a commercial situation.
- Degree of semantic relatedness between multi-words and their parts: A multi-word can be more or less compositional with respect to its parts; compare the rather compositional particle verb zuschließen `close' with the non-compositional particle verb anfangen `begin', or the rather compositional compound noun Brotmesser `bread knife' with the non-compositional compound noun Klatschmohn `corn poppy'.
- The interaction of distributional approaches in modelling paradigmatic relations: We intend to work on paradigmatic relations as one group of explicit semantic relations that is still notoriously difficult to identify and distinguish, i.e., we are interested in distinguishing synonymy, antonymy, hypernymy, hyponymy, and co-hyponymy. Standard distributional models have difficulties distinguishing between paradigmatic relations, because the distributions of the related words in text tend to be very similar, cf. `The boy/girl/person loves/hates his cat', which illustrates that the (co-)hyponyms boy, girl, and person, as well as the antonyms love and hate, can each occur in identical contexts. Our aim is to enhance computational work on paradigmatic semantic relations.
- The definition, induction and application of preposition senses: Prepositions are notoriously ambiguous, cf. the various senses of the German preposition nach in `nach drei Stunden/Berlin/Meinung', which carry a temporal, a directional, and an accordance meaning, respectively. We address the lack of empirical semantic work on German preposition senses as an instance of underspecified semantic relatedness across a set of words, and aim to find semantic classes of prepositions that abstract over the commonalities of similar preposition senses, cf. nach/vor vs. bis/nach/von/zu vs. gemäß/laut/nach.
- Modelling the compositionality of German multi-word expressions: Determining the compositionality of multi-word expressions (MWEs) is crucial for lexicography and NLP applications, which need to know whether an expression should be treated as a whole or through its parts, and what the expression means. We approach the degree of compositionality through the degree of semantic relatedness between the parts and the whole. The core idea is to explore distributional models of the multi-word parts in order to predict the degree of compositionality of the whole, concentrating on two types of MWEs: German noun compounds and German particle verbs.
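
The difficulty that standard distributional models have with paradigmatic relations can be illustrated with a minimal sketch. It builds window-based co-occurrence vectors from a toy corpus derived from the boy/girl example sentence and compares them with cosine similarity. The corpus, the window size of 2, and the helper names `context_vectors` and `cosine` are assumptions made purely for illustration; they do not represent the models or data used in the project.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for every token, the tokens occurring within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Context = neighbours to the left and right, excluding the target itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(target, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Toy corpus (invented for illustration) based on the example sentence pattern.
corpus = [
    "the boy loves his cat",
    "the boy hates his cat",
    "the girl loves his cat",
    "the girl hates his cat",
]

vectors = context_vectors(corpus)

# Antonyms and co-hyponyms both occur in identical contexts here.
antonym_similarity = cosine(vectors["loves"], vectors["hates"])
cohyponym_similarity = cosine(vectors["boy"], vectors["girl"])

print(f"loves/hates: {antonym_similarity:.3f}")
print(f"boy/girl:    {cohyponym_similarity:.3f}")
```

In this toy setting both word pairs receive a similarity of approximately 1, so the similarity score alone cannot tell whether a pair is antonymous or co-hyponymous. This is the kind of ambiguity that motivates looking for additional, more discriminating distributional features.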