Conference paper. Year: 2013

Tale following: Real-time speech recognition applied to live performance

Abstract

This paper describes a system for tale following, that is, speaker-independent but text-dependent speech recognition followed by automatic alignment. The aim of the system is to follow, in real time, the progress of actors reading a text in order to trigger audio events automatically. The speech recognition engine is the well-known Sphinx from CMU. We used the real-time implementation pocketsphinx, based on Sphinx II, with the French acoustic models developed at LIUM. Extensive testing with 21 speakers from the PFC corpus (excerpts in "standard French") shows that the system achieves decent performance: around 30% Word Error Rate (WER). However, testing on a recording made during rehearsals shows that performance is somewhat worse in real conditions: the WER is 40%. The strategy we devised for the final application therefore includes a constrained automatic alignment algorithm. The aligner is derived from an algorithm used to analyse biological DNA sequences. With the whole system, the experiments report that events are triggered with an average delay of 9 s (± 8 s). The system is integrated into Max/MSP, a widely used real-time sound processing software package, which is used here to trigger audio events but could also trigger other kinds of events such as lights or videos.
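
To make the alignment step in the abstract concrete, here is a minimal sketch of how noisy recognizer output could be aligned against a known script to estimate the reader's position and fire cues. The abstract does not name the DNA-derived algorithm, so this sketch assumes a Needleman-Wunsch-style dynamic program with a free end in the script. The function names (`align_position`, `due_cues`), the scoring values, and the cue table are hypothetical and are not taken from the paper.

```python
def align_position(script, hypothesis, match=2, mismatch=-1, gap=-1):
    """Estimate how many script words have been read.

    Aligns the whole recognized word list against a prefix of the script
    (Needleman-Wunsch-style scoring, assumed here) and returns the prefix
    length with the best score.
    """
    m = len(script)
    # prev[j]: best score aligning the hypothesis words seen so far with script[:j]
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(hypothesis) + 1):
        cur = [i * gap] + [0] * m
        for j in range(1, m + 1):
            sub = match if hypothesis[i - 1] == script[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub,  # match or substitution
                         prev[j] + gap,      # extra recognized word
                         cur[j - 1] + gap)   # skipped script word
        prev = cur
    # Free end: the reader may have stopped anywhere in the script.
    return max(range(m + 1), key=lambda j: prev[j])


def due_cues(cues, position, fired):
    """Return cue names whose script index has been reached and not yet fired."""
    new = [name for idx, name in sorted(cues.items())
           if idx <= position and name not in fired]
    fired.update(new)
    return new


script = "once upon a time there was a king who lived in a castle".split()
hypothesis = "once upon the time there was".split()  # one recognition error
cues = {4: "thunder", 10: "bells"}  # hypothetical cue table: word index -> event

position = align_position(script, hypothesis)
print(position, due_cues(cues, position, set()))
```

Because the score tolerates substitutions and gaps, a single misrecognized word does not derail the position estimate, which is the property that matters when the WER is in the 30 to 40% range. A real-time version would presumably restrict the search to a window around the last known position; that constraint is omitted here for brevity.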
Main file
fluxus_smc2013_revised.pdf (137.17 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00868248, version 1 (01-10-2013)

Identifiers

  • HAL Id: hal-00868248, version 1

Cite

Jean-Luc Rouas, Boris Mansencal, Joseph Larralde. Tale following: Real-time speech recognition applied to live performance. SMC Sound and Music Computing, Jul 2013, Stockholm, Sweden. pp.389-394. ⟨hal-00868248⟩

Collections

CNRS
