Exemplar-Based Speech Representation
PIs: Jonas Kuhn & Sebastian Padó (former PIs: Bernd Möbius and Hinrich Schütze; Grzegorz Dogil †)
Researchers: Jagoda Bruni, Jörg Mayer, Michael Walsh (former members: Daniel Duran, Travis Wade)
Within the overarching topic of the SFB, Incremental Specification in Context, Exemplar Theory is a formal model of contextual perception and production. Thus, we use Exemplar Theory as a model of context that explicates how linguistic units are incrementally specified in production and to what degree the fully specified speech signal undergoes incremental processes of underspecification in perception.
The first phase of this project yielded two computational models that have facilitated the pursuit of the research agenda set out in the original A2 proposal. The first, the Context Sequence Model, models speech perception by representing memory as a single ordered collection of acoustic cues from previously heard speech, encoded so as to preserve temporal patterns. Newly encountered speech sounds are categorized by comparing the sounds, together with their neighbouring contexts, with similar sequences in memory. The second is the Multi-Level Exemplar Model, whose key innovation is the explicit formalisation of the relationship between exemplars on the constituent level and exemplars on what is referred to as the unit level. Constituents are segments (for example, consonants and vowels) in phonetics, and words in syntax. Units are syllables in phonetics, and phrases or sentences in syntax. Both models have been successful in accounting for a number of phenomena in phonetics and syntax.
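The context-based categorization described for the Context Sequence Model can be illustrated with a minimal sketch. This is not the project's implementation: the one-dimensional cue values, the Euclidean distance, the exponential similarity function, the fixed context width `k`, and all function names are assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def window(frames, i, k):
    """Return frames i-k .. i+k, or None if the window leaves the sequence."""
    if i - k < 0 or i + k >= len(frames):
        return None
    return frames[i - k : i + k + 1]


def similarity(a, b, sharpness=1.0):
    """Exponentially decaying similarity of two equal-length windows
    (assumed form; the distance is Euclidean over all cue values)."""
    dist = math.sqrt(
        sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    )
    return math.exp(-sharpness * dist)


def categorize(memory_frames, memory_labels, new_frames, position, k=1):
    """Label the frame at `position` by comparing it, plus k frames of context
    on each side, with every equally sized window in the ordered memory."""
    probe = window(new_frames, position, k)
    if probe is None:
        raise ValueError("position needs k frames of context on both sides")
    scores = defaultdict(float)
    for i, label in enumerate(memory_labels):
        stored = window(memory_frames, i, k)
        if stored is not None:
            # Each stored window votes for its centre label, weighted by similarity.
            scores[label] += similarity(probe, stored)
    if not scores:
        raise ValueError("memory is too short for the requested context width")
    return max(scores, key=scores.get)


# Hypothetical memory: one acoustic cue per frame, one label per frame.
memory_frames = [(0.0,), (1.0,), (0.0,), (5.0,), (6.0,), (5.0,)]
memory_labels = ["x", "a", "x", "y", "i", "y"]

new_frames = [(0.1,), (0.9,), (0.2,)]
label = categorize(memory_frames, memory_labels, new_frames, position=1)
```

In this toy setting the memory is never segmented into stored categories; the label emerges from whichever stretches of the ordered sequence best match the new sound in its context.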
In the second phase we have investigated how exemplars dynamically emerge from the speech stream (particularly from the perspective of learning the correspondence between acoustics and articulation). Moreover, we have examined how well the models developed in the first phase capture linguistic abstraction. Additionally, we have examined the nature of the relationship between exemplar storage and prosody, and we have established, at fine levels of phonetic detail, the impact of acquired phonetics on second language learning (in particular with respect to measuring effects of phonetic transfer and interference). The outcomes of these investigations can be summarized under the research topics listed below:
- exemplar constitution and multi-modal representations in the Context Sequence Model
- Exemplar Theory in prosody: temporal and tonal
- Exemplar Theory in language learning
- exemplar-theoretic language modelling
In the third phase we have continued our work on exemplar modelling, in particular by developing KaMoso, which combines social modelling principles and self-organising exemplar-theoretic dynamics in a single multi-agent framework. Using this framework we are currently exploring exemplar-theoretic influences and social pressures on competing phonetic forms in Tswana. In addition to KaMoso, we are investigating the impact of exemplar dynamics on socio-phonetic phenomena, e.g. phonetic convergence. In one study, in collaboration with A4, we hypothesised that exemplar frequency should have an impact on speech rate convergence, but only for infrequent syllables. The reasoning is as follows: syllables recently perceived in a dialogue should have a higher activation level, but high-frequency syllables already have a high resting activation, so a recently perceived token should have little additional influence on them. Infrequent syllables have no such high resting activation, so a recent token should be influential. Linear regression analyses of the GECO corpus confirmed our hypothesis. In a further exploration of the GECO corpus, again in conjunction with A4, we examined convergence and divergence by calculating the distance between the pitch accent realisations of conversational partners. We found that the extent to which these realisations converged or diverged was related to how much the partners liked each other and to whether or not they could see each other.
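The frequency argument behind the speech rate convergence hypothesis above can be made concrete with a toy calculation. This sketch is not the model or the regression analysis used in the study: the logarithmic resting activation, the fixed recency boost, and both function names are hypothetical choices made only to show why a recent token should matter more for infrequent syllables.

```python
import math


def resting_activation(frequency):
    """Assumed resting activation: grows with the log of syllable frequency."""
    return math.log1p(frequency)


def relative_recency_influence(frequency, recency_boost=1.0):
    """Share of total activation contributed by one recently perceived token
    (assumed additive boost on top of the resting activation)."""
    rest = resting_activation(frequency)
    return recency_boost / (rest + recency_boost)


# Hypothetical frequency counts for a rare and a common syllable.
rare = relative_recency_influence(5)
common = relative_recency_influence(5000)
```

Under these assumptions the recent token accounts for a larger share of the activation of the rare syllable than of the common one, which is the asymmetry the hypothesis relies on.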