Conference papers

Neural Representations of Dialogical History for Improving Upcoming Turn Acoustic Parameters Prediction

Abstract : Predicting the acoustic and linguistic parameters of an upcoming conversational turn is important for dialogue systems aiming to include low-level adaptation to the user. It is known that during an interaction speakers can influence each other's speech production. However, the precise dynamics of this phenomenon are not well established, especially in the context of natural conversations. We developed a model based on an RNN architecture that predicts speech variables (Energy, F0 range and Speech Rate) of the upcoming turn using a representation vector describing speech information of previous turns. We compare the prediction performance when using a dialogical history (from both participants) vs. a monological history (from the upcoming turn's speaker only). We found that the information contained in previous turns produced by both the speaker and the interlocutor reduces the error in predicting the current acoustic target variable. In addition, the prediction error decreases as the number of previous turns taken into account increases.
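The difference between the two history conditions compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Turn` class, the `build_history` function, the speaker labels and the numeric values are all hypothetical, and the sketch only shows how the input sequence for a predictor might be assembled, not the RNN itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One conversational turn (hypothetical container, not from the paper)."""
    speaker: str        # assumed speaker label, e.g. "A" or "B"
    energy: float
    f0_range: float
    speech_rate: float


def features(turn: Turn) -> Tuple[float, float, float]:
    """Return the three speech variables named in the abstract."""
    return (turn.energy, turn.f0_range, turn.speech_rate)


def build_history(turns: List[Turn], target_index: int,
                  n_previous: int, dialogical: bool) -> List[Tuple[float, float, float]]:
    """Collect feature vectors of up to n_previous turns before the target turn.

    dialogical=True  keeps turns from both participants.
    dialogical=False keeps only turns by the target turn's speaker (monological).
    """
    target_speaker = turns[target_index].speaker
    candidates = turns[:target_index]
    if not dialogical:
        candidates = [t for t in candidates if t.speaker == target_speaker]
    # Guard n_previous == 0 explicitly, since candidates[-0:] would return everything.
    selected = candidates[-n_previous:] if n_previous > 0 else []
    return [features(t) for t in selected]


# Toy conversation with made-up values, alternating speakers A and B.
demo = [
    Turn("A", 1.0, 2.0, 3.0),
    Turn("B", 4.0, 5.0, 6.0),
    Turn("A", 7.0, 8.0, 9.0),
    Turn("B", 10.0, 11.0, 12.0),
]

# History for predicting the last turn (index 3, speaker B) from 2 previous turns.
dialogical_history = build_history(demo, 3, 2, dialogical=True)
monological_history = build_history(demo, 3, 2, dialogical=False)
```

In this toy case the dialogical history holds the two immediately preceding turns (one per speaker), while the monological history holds only the single earlier turn by speaker B. A sequence model such as an RNN would then consume either list to predict the target turn's variables.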
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-03224194
Contributor : Laurent Prévot
Submitted on : Wednesday, May 12, 2021 - 2:16:26 PM
Last modification on : Thursday, May 13, 2021 - 3:44:08 AM

File

2785.pdf
Publisher files allowed on an open archive

Citation

Simone Fuscone, Benoit Favre, Laurent Prevot. Neural Representations of Dialogical History for Improving Upcoming Turn Acoustic Parameters Prediction. Interspeech 2020, Oct 2020, Virtual (Shanghai), China. pp.4203-4207, ⟨10.21437/interspeech.2020-2785⟩. ⟨hal-03224194⟩
