“When a new speaker shapes predictions: EEG evidence of the role of talker variability in the modulation of predictive processing.”
GUERRINI, REBECCA
2024/2025
Abstract
Regardless of how busy our lives can look, every day we need to interact with many different people. We listen and talk to colleagues at work, to friends over a cup of coffee, or even to strangers who ask us for directions. Every one of them has a different way of speaking - different accents, tones, and rhythms - but we seem to understand them effortlessly and switch between them with apparent ease. One reason for this extraordinary ability is that, during speech listening, our brain continuously generates predictions about upcoming words based on context and statistical likelihood. Some studies have investigated our response to changing speakers: this literature on talker variability has highlighted a cognitive cost linked to the talker switch, pointing to a difference in processing between single- and multi-talker conditions. However, these studies did not explore whether this difference in processing also extends to other aspects of speech comprehension, such as the predictive process. The study described in this thesis aims to bridge this gap and to understand how our brain processes a new speaker’s voice information and, more importantly, whether and how subsequent adaptation to that information modulates the predictive process. To investigate this interaction, the research used continuous speech rather than the short, less naturalistic stimuli embedded in more controlled experimental designs, which have dominated the literature so far. We recorded the electroencephalographic (EEG) signals of 30 participants while they listened to two pre-recorded travel stories. One was narrated by a single speaker (Single condition), whereas the other was narrated by 9 different speakers (Multi condition).
Then, we measured the linear mapping between the EEG signals and the features of interest: the speech envelope (the slow temporal variation of the speech signal) as a proxy of speech perception, and semantic surprisal (the negative logarithm of the conditional probability of the next word given the preceding context) as an index of lexical prediction. Preliminary analyses showed that in the Multi condition listeners rely more on the speakers’ acoustic information to process speech than when they attend to the Single condition. Moreover, the analysis of semantic surprisal suggests that in the Multi condition participants engage in a stronger predictive process. Both phenomena can be associated with greater attention allocated to the speech to cope with the increased uncertainty introduced by talker variability.
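The two stimulus features defined above are standard in this literature, and a minimal sketch can make the definitions concrete. Assuming the envelope is extracted as the low-pass-filtered magnitude of the analytic signal (the 8 Hz cutoff and filter order are illustrative choices, not the thesis's stated parameters), and that surprisal is computed from a language model's conditional probability of each word:

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def speech_envelope(audio, fs, cutoff=8.0):
    """Slow temporal variation of the speech signal: magnitude of the
    analytic (Hilbert) signal, low-pass filtered. Cutoff is an assumed
    value for illustration."""
    env = np.abs(hilbert(audio))
    b, a = butter(3, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def surprisal(p_next_word):
    """Semantic surprisal: negative log of the conditional probability
    of the next word given the preceding context."""
    return -np.log2(p_next_word)
```

For example, a word assigned probability 0.5 by the context model carries a surprisal of exactly 1 bit, while improbable words yield large surprisal values that index a stronger prediction violation.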
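The "linear mapping between EEG signals and the features of interest" is typically estimated with a temporal response function, i.e. a regularised regression from time-lagged stimulus features to the neural signal. The following is a minimal sketch of that idea, not the thesis's actual pipeline; the lag range, regularisation strength, and single-channel setup are simplifying assumptions:

```python
import numpy as np

def trf_ridge(stimulus, eeg, lags, lam=1.0):
    """Forward-model sketch: ridge regression mapping a time-lagged
    stimulus feature (e.g. envelope samples or word-onset surprisal
    impulses) to one EEG channel. Returns one weight per lag."""
    # Design matrix: one column per lag, with roll wrap-around zeroed out
    X = np.column_stack([np.roll(stimulus, L) for L in lags])
    for i, L in enumerate(lags):
        if L > 0:
            X[:L, i] = 0
        elif L < 0:
            X[L:, i] = 0
    # Ridge solution: (X'X + lam*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
    return w
```

Comparing the resulting weights (or the model's prediction accuracy) between the Single and Multi conditions is what quantifies how strongly listeners track each feature in each condition.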
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/96303