Combining Acoustic Name Spotting and Continuous Context Models to improve Spoken Person Name Recognition in Speech

Abstract: Retrieving pronounced person names in spoken documents is a critical problem in the context of audiovisual content indexing. In this paper, we present a cascading strategy that combines two methods dedicated to spoken name recognition in speech. The first is an acoustic name spotting method applied to phoneme confusion networks; it relies on a phonetic edit distance criterion weighted by the phoneme probabilities held in the confusion networks. The second is a continuous context modelling approach applied to the 1-best transcription output; it relies on a probabilistic model of name-to-context dependencies. We assume that combining these methods, which draw on different types of information, may improve spoken name recognition performance. This assumption is studied through experiments on a set of audiovisual documents from the development set of the REPERE challenge. Results show that combining the acoustic and linguistic methods yields an absolute gain of 3% in F-measure over the best system taken alone.
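The abstract's first method scores a name's phoneme sequence against a phoneme confusion network with a probability-weighted edit distance. The sketch below illustrates one plausible form of such a criterion; the cost function, the slot-based network representation, and the example phoneme inventory are assumptions for illustration, not the paper's exact formulation.

```python
def confusion_network_distance(name_phonemes, network):
    """Probability-weighted edit distance between a name and a confusion network.

    name_phonemes: list of phoneme symbols for the target name.
    network: list of slots; each slot maps candidate phonemes to probabilities.
    Matching a phoneme against a slot costs 1 - P(phoneme | slot), so highly
    probable phonemes are nearly free; insertions and deletions cost 1.
    (Illustrative costs, not the paper's exact criterion.)
    """
    n, m = len(name_phonemes), len(network)
    # dp[i][j] = minimal cost of aligning the first i phonemes with the first j slots
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = float(i)          # all name phonemes deleted
    for j in range(1, m + 1):
        dp[0][j] = float(j)          # all network slots skipped
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match_cost = 1.0 - network[j - 1].get(name_phonemes[i - 1], 0.0)
            dp[i][j] = min(
                dp[i - 1][j - 1] + match_cost,  # (mis)match against the slot
                dp[i - 1][j] + 1.0,             # deletion of a name phoneme
                dp[i][j - 1] + 1.0,             # insertion of a network slot
            )
    return dp[n][m]

# Toy 3-slot confusion network (hypothetical phonemes and probabilities):
network = [{"d": 0.9, "t": 0.1}, {"y": 0.7, "i": 0.3}, {"f": 0.8, "v": 0.2}]
score = confusion_network_distance(["d", "y", "f"], network)  # low score = good match
```

A detection decision would then compare this score against a threshold tuned on development data, which is the usual way such spotting criteria are calibrated.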
Document type: Conference papers

Cited literature: 30 references

https://hal.archives-ouvertes.fr/hal-02102829
Contributor: Corinne Fredouille
Submitted on: Friday, April 26, 2019 - 1:26:10 PM
Last modification on: Saturday, April 27, 2019 - 1:33:15 AM

File

i13_2539.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-02102829, version 1

Citation

Benjamin Bigot, Gregory Senay, Georges Linarès, Corinne Fredouille, Richard Dufour. Combining Acoustic Name Spotting and Continuous Context Models to improve Spoken Person Name Recognition in Speech. Interspeech 2013, Aug 2013, Lyon, France. ⟨hal-02102829⟩

Metrics

Record views: 20
Files downloads: 9