Peut-on utiliser les étiqueteurs morphosyntaxiques pour améliorer la transcription automatique ? [Can part-of-speech taggers be used to improve automatic transcription?]

Stéphane Huet (1), Guillaume Gravier (1), Pascale Sébillot (1)
(1) TEXMEX - Multimedia content-based indexing, IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: This paper studies whether part-of-speech (POS) tagging can improve speech recognition. We first estimate the proportion of misrecognized words that could be corrected using POS information; an analysis of a short extract shows that an absolute decrease in word error rate of 1.1% can be expected. We also show quantitatively that traditional POS taggers are reliable when applied to spoken corpora, including automatic transcriptions. This new result lets us use POS tag knowledge in a postprocessing stage to improve transcription quality, in particular by correcting agreement errors.
Document type: Conference paper

https://hal.archives-ouvertes.fr/hal-02021377
Contributor: Stéphane Huet
Submitted on: Friday, February 15, 2019 - 6:27:57 PM
Last modification on: Wednesday, February 20, 2019 - 1:22:56 AM
Long-term archiving on: Friday, May 17, 2019 - 9:44:33 AM

File

JEP06.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-02021377, version 1

Citation

Stéphane Huet, Guillaume Gravier, Pascale Sébillot. Peut-on utiliser les étiqueteurs morphosyntaxiques pour améliorer la transcription automatique ? 26èmes journées d'étude sur la parole (JEP), 2006, Dinard, France. ⟨hal-02021377⟩
