Text island spotting in large speech databases
Abstract
This paper addresses the problem of using journalist prompts or closed captions to build corpora for training speech recognition systems. Such text documents are generally imperfect transcripts, and they lack timestamps. We propose a method that combines a driven decoding algorithm with a fast-match process to spot text segments in the audio. The method is evaluated both on the French ESTER corpus [4] and on a large database of recordings from the Radio Television Belge Francophone (RTBF) associated with real prompts. Results show very good spotting performance: we observed an F-measure of about 98% when spotting the real text islands provided by the RTBF corpus. Moreover, decoding driven by the imperfect transcript islands significantly outperforms the baseline system.
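As a reminder of how the spotting score quoted above is usually computed, here is a minimal sketch of precision, recall, and F-measure over sets of spotted segments. The function name `f_measure`, the segment identifiers, and the toy data are assumptions made for illustration. They are not taken from the paper, and the paper's exact scoring protocol may differ.

```python
def f_measure(reference, spotted):
    """Return (precision, recall, F-measure) for two collections of segment ids.

    `reference` holds the segments that really contain a text island;
    `spotted` holds the segments the system claims to have found.
    Both are hypothetical inputs used only for this sketch.
    """
    reference, spotted = set(reference), set(spotted)
    true_positives = len(reference & spotted)
    precision = true_positives / len(spotted) if spotted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data (illustrative only, not results from the paper).
reference_islands = {"seg1", "seg2", "seg3", "seg4"}
spotted_islands = {"seg1", "seg2", "seg3", "seg5"}

precision, recall, f1 = f_measure(reference_islands, spotted_islands)
print(f"precision={precision:.2f} recall={recall:.2f} F={f1:.2f}")
```

Because the F-measure is a harmonic mean, a high value requires both few false alarms and few missed islands.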
Domains
Computer Science [cs]
Main file
Text_island_spotting_in_large_speech_databases.pdf (246.92 KB)
Origin: Files produced by the author(s)