In Natural Language Processing (NLP), combinations of words are considered multi-word expressions (MWEs) if they are semantically idiosyncratic to some degree, i.e., the meaning of the combination is not entirely (or even not at all) predictable from the meanings of its constituents. MWEs subsume multiple morpho-syntactic types, including noun compounds (such as flea market) and particle verbs (such as give up). They have been explored extensively across research disciplines from synchronic perspectives, but state-of-the-art studies lack large-scale empirical approaches to diachronic models of MWE meaning.

Our project SemChangeMWE goes beyond the restricted synchronic concept of MWE meaning and provides a novel perspective on MWE emergence, MWE meaning change, and MWE compositionality (i.e., meaning transparency) by computationally modelling their diachronic properties and how these properties change. The project brings together our expertise in (a) computational models of MWE compositionality and meaning analogy, (b) computational models of diachronic meaning changes and meaning divergences in language variation, and (c) datasets of meaning components and meaning relatedness, in order to address the lack of computational diachronic models of MWE meaning.

The project SemChangeMWE is funded by the DFG (Deutsche Forschungsgemeinschaft, the German Research Foundation) under research grant SCHU 2580/5-1.