Modelling the interaction between binaural and temporal speech processing - HAL open archive
Conference paper Year: 2020

Modelling the interaction between binaural and temporal speech processing

Saskia Röttges
  • Role: Author
Christopher Hauth
  • Role: Author
Jan Rennies
  • Role: Author

Abstract

When listening to speech in reverberant conditions, listeners profit from early speech reflections because these can be integrated with the direct speech sound. In contrast, late reflections are typically detrimental because they cannot be integrated with the target speech. Rennies et al. (2019) measured speech reception thresholds (SRTs) in stationary noise in 86 conditions with different numbers and delay times of speech reflections. In some conditions, different interaural phase differences (IPDs) were introduced for the noise, the direct sound and the reflections to enable listeners to use binaural unmasking. By analyzing the binaural room impulse response (BRIR) with the binaural speech intelligibility model (BSIM), Rennies et al. (2019) found that listeners used a temporal window with a length of 100 ms to integrate useful information. Speech reflections outside this window were detrimental. Interestingly, this "useful" window does not necessarily have to be an "early" window, as the direct speech sound need not be included. Rather, it is important that the window includes the maximum number of useful speech reflections. In this study, we use the BSIM blindly, that is, without knowledge of the BRIR and without knowledge of the clean speech signal. This is achieved by maximizing the speech-like modulations in the binaural front-end of the model, which applies an equalization-cancellation (EC) model. In this way, the useful speech information is maximized and the detrimental information is minimized blindly. As this model works bottom-up, it can be combined with arbitrary speech intelligibility measures, for instance the speech intelligibility index (SII) or the speech transmission index (STI).
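
The idea of a 100 ms window that need not contain the direct sound can be illustrated with a toy calculation. The following is only a minimal sketch and not the BSIM or the authors' procedure: it assumes that the speech components are given as hypothetical (delay, amplitude) pairs, that "useful" is measured simply as summed energy inside the window, and the function name `best_useful_window` is invented for this illustration.

```python
from typing import List, Tuple


def best_useful_window(
    components: List[Tuple[float, float]],
    window_ms: float = 100.0,
) -> Tuple[float, float, float]:
    """Find the window position that captures the most speech energy.

    components: (delay_ms, amplitude) pairs; the direct sound is simply
    the entry with the smallest delay (an assumption of this sketch).
    Returns (window_start_ms, useful_energy, detrimental_energy).
    """
    if not components:
        raise ValueError("need at least one sound component")

    best_start, best_useful = None, -1.0
    # A window [start, start + window_ms) can always be shifted to begin at
    # a component without losing energy, so only those starts are tried.
    for start, _ in sorted(components):
        useful = sum(
            amp * amp
            for delay, amp in components
            if start <= delay < start + window_ms
        )
        # Strict comparison keeps the earliest window when energies tie.
        if useful > best_useful:
            best_start, best_useful = start, useful

    total = sum(amp * amp for _, amp in components)
    # Everything outside the chosen window is counted as detrimental.
    return best_start, best_useful, total - best_useful
```

For example, with a direct sound at 0 ms (amplitude 1.0) and four reflections at 150, 180, 200 and 230 ms (amplitude 0.8 each), the energy-maximizing window starts at 150 ms and excludes the direct sound, which then counts as detrimental in this simplified picture.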
No file deposited

Dates and versions

hal-03242415 , version 1 (31-05-2021)

Identifiers

HAL Id: hal-03242415
DOI: 10.48465/fa.2020.0206

Cite

Saskia Röttges, Christopher Hauth, Jan Rennies, Thomas Brand. Modelling the interaction between binaural and temporal speech processing. Forum Acusticum, Dec 2020, Lyon, France. pp.2765-2765, ⟨10.48465/fa.2020.0206⟩. ⟨hal-03242415⟩