Document Level Semantic Context for Retrieving OOV Proper Names

Imran Sheikh 1 Irina Illina 1 Dominique Fohr 1 Georges Linares 2
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Recognition of Proper Names (PNs) in speech is important for content-based indexing and browsing of audio-video data. However, many PNs are Out-Of-Vocabulary (OOV) words for the LVCSR systems used in these applications, owing to the diachronic nature of the data. By exploiting the semantic context of the audio, relevant OOV PNs can be retrieved and the target PNs can then be recovered. To retrieve OOV PNs, we propose to represent their context with document-level semantic vectors, and we show that this approach can handle OOV PNs that occur infrequently in the training data. We study different representations, including Random Projections, LSA, LDA, Skip-gram, CBOW and GloVe. A further evaluation of the recovery of target OOV PNs using a phonetic search shows that document-level semantic context is reliable for recovering OOV PNs.
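The retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an LSA representation computed with a plain SVD, assumes each OOV PN is represented by the mean of the vectors of the training documents that mention it, and ranks candidates by cosine similarity. The toy counts, the names "Alice" and "Bob", and all function names are hypothetical.

```python
import numpy as np

def lsa_fit(term_doc, k):
    """Fit a rank-k LSA model on a (n_terms, n_docs) count matrix."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    u_k = u[:, :k]                # term-to-topic projection
    doc_vecs = vt[:k].T * s[:k]   # one k-dimensional vector per training document
    return u_k, doc_vecs

def fold_in(term_counts, u_k):
    """Project an unseen document's term counts into the LSA space."""
    return term_counts @ u_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_oov_pns(test_vec, pn_vecs):
    """Return candidate PNs sorted by decreasing similarity to the test document."""
    return sorted(pn_vecs, key=lambda name: cosine(test_vec, pn_vecs[name]),
                  reverse=True)

# Toy training corpus (assumption): rows are terms, columns are documents.
# Terms: election, vote, match, goal.
term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
], dtype=float)

u_k, doc_vecs = lsa_fit(term_doc, k=2)

# Hypothetical PNs and the training documents in which each occurs.
pn_docs = {"Alice": [0, 1], "Bob": [2, 3]}
# Assumption: a PN's context vector is the mean of its documents' vectors.
pn_vecs = {name: doc_vecs[idx].mean(axis=0) for name, idx in pn_docs.items()}

# A test document (e.g. an LVCSR transcript) about elections and votes.
test_counts = np.array([1, 1, 0, 0], dtype=float)
ranking = rank_oov_pns(fold_in(test_counts, u_k), pn_vecs)
print(ranking)
```

In a pipeline of the kind the abstract describes, the top-ranked names would then be passed to a phonetic search to recover the target PN; that step is not sketched here.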
Document type:
Conference paper
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , Mar 2016, Shanghai, China. IEEE, pp.6050-6054, Proceeding of IEEE ICASSP 2016. 〈10.1109/ICASSP.2016.7472839〉

https://hal.archives-ouvertes.fr/hal-01331716
Contributor: Dominique Fohr
Submitted on: Thursday, 20 October 2016 - 09:58:51
Last modified on: Tuesday, 18 December 2018 - 16:38:02

File

draft-16Jan16 (1).pdf
Files produced by the author(s)

Citation

Imran Sheikh, Irina Illina, Dominique Fohr, Georges Linares. Document Level Semantic Context for Retrieving OOV Proper Names. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , Mar 2016, Shanghai, China. IEEE, pp.6050-6054, Proceeding of IEEE ICASSP 2016. 〈10.1109/ICASSP.2016.7472839〉. 〈hal-01331716〉

Metrics

Record views

427

File downloads

189