Distributional Characterization of Derivation
Principal Investigator: Sebastian Padó
Researchers: Gabriella Lapesa, Max Kisselew
Former Researchers: Aurélie Herbelot, Alexis Palmer
Derivational morphology is an important process of word formation. Work in computational linguistics has usually focused on the orthographic level, modeling derivation as a string transformation. The semantic level, where orthographic derivation patterns such as -er and -ung correspond to a variety of semantic shifts, has received less attention in the field.
The goal of this project is to model the semantics of derivational patterns using distributional methods. We will work in the recently developed framework of compositional distributional semantic models (CDSMs), which assumes that derivation is essentially a compositional process in which derivational patterns act as functors (represented as linear maps) that are applied to base terms (represented as vectors). We can then predict the meaning of a term w2 derived from another term w1 through a derivational pattern M algebraically, by applying the linear map for M to the vector for w1. We assume that there are two main types of semantic shifts underlying derivation that differ in their distributional character, and that the two can also combine:
- Topic shifts change the lexical context between the source and the derived term but leave argument structure constant; examples are diminutive (-chen) and gender (-in) suffixation. These shifts are best represented in word-based vector spaces.
- Syntactic shifts lead to argument structure changes between the source and the target term but leave the lexical context untouched; examples are agent and event nominalizations (-er, -ung). Representing such shifts requires syntax-based vector spaces.
- Combined shifts involve changes on both levels.
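The functor view described above can be illustrated with a minimal sketch. It assumes that the linear map for a pattern is estimated by ordinary least squares from pairs of base and derived vectors, which is one common choice and not necessarily the estimation method used in the project. The function names (`fit_pattern_map`, `predict_derived`, `cosine`) are hypothetical, and the vectors are randomly generated stand-ins for real distributional vectors, so the numbers carry no empirical meaning.

```python
import numpy as np


def fit_pattern_map(base_vecs, derived_vecs):
    """Estimate a matrix M with derived ≈ M @ base for each training pair.

    base_vecs, derived_vecs: arrays of shape (n_pairs, dim), one row per word.
    Assumption: plain least squares, with no regularization.
    """
    # Solve base_vecs @ X ≈ derived_vecs for X, then transpose so that
    # M acts on a single column vector as M @ base_vec.
    X, *_ = np.linalg.lstsq(base_vecs, derived_vecs, rcond=None)
    return X.T


def predict_derived(M, base_vec):
    """Apply the pattern's linear map to a base-term vector."""
    return M @ base_vec


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic stand-in data (NOT real corpus vectors): a hidden map plus noise.
rng = np.random.default_rng(0)
dim, n_pairs = 5, 40
true_M = rng.normal(size=(dim, dim))
bases = rng.normal(size=(n_pairs, dim))
derived = bases @ true_M.T + 0.01 * rng.normal(size=(n_pairs, dim))

# Fit on the first 30 pairs, evaluate on the 10 held-out pairs.
M = fit_pattern_map(bases[:30], derived[:30])
sims = [cosine(predict_derived(M, b), d)
        for b, d in zip(bases[30:], derived[30:])]
mean_sim = float(np.mean(sims))
print(f"mean cosine on held-out pairs: {mean_sim:.3f}")
```

In this sketch, the choice of vector space (word-based or syntax-based) only changes how `bases` and `derived` are built; the estimation step stays the same.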
A major challenge is that most orthographic patterns are ambiguous, corresponding to more than one semantic shift (e.g., the event/state ambiguity of nominalizations). To test the feasibility of our modeling approach, we will start by concentrating on derivational patterns where one orthographic pattern corresponds to one semantic relation. We will subsequently generalize these results to ambiguous patterns, addressing ambiguity and idiosyncrasy with clustering methods.
The results of the project will have an impact on both linguistics and computational linguistics. The benefit for computational linguistics will lie in improvements to distributional modeling through linguistic constraints. The improved models will be less sparse and more linguistically plausible, and will thus provide better modeling of meaning in NLP applications. The benefit for linguistics will lie in establishing a methodological bridge between theoretical and distributional analysis, with the goal of verifying and refining theoretically motivated classifications through the inclusion of corpus evidence.