Neural Networks Revisited for Proper Name Retrieval from Diachronic Documents

Irina Illina 1, Dominique Fohr 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Developing high-quality transcription systems for very large vocabulary corpora is a challenging task. Proper names are usually key to understanding the information contained in a document. Increasing the vocabulary coverage requires a very large amount of text data. In this paper, we extend previously proposed neural network models for word embedding: the word vector representation proposed by Mikolov is enriched with an additional non-linear transformation. This model better captures lexical and semantic relationships between words. In the context of broadcast news transcription, experimental results show that, in terms of recall, the proposed model is well able to select relevant new proper names.
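The idea of enriching a Mikolov-style word vector with an additional non-linear transformation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the skip-gram-style layout, the choice of tanh, the placement of the extra layer between the embedding lookup and the output layer, the toy dimensions, and all function and variable names.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random matrix stored as a list of rows (illustrative initialisation)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_context(word_id, embeddings, hidden, output):
    """Skip-gram-style forward pass with one extra tanh layer.

    embeddings: vocab x dim  word vectors (the linear projection)
    hidden:     dim x dim    assumed additional non-linear layer
    output:     vocab x dim  output layer scoring each vocabulary word
    """
    vector = embeddings[word_id]  # embedding lookup
    transformed = [math.tanh(v) for v in matvec(hidden, vector)]  # added non-linearity
    return softmax(matvec(output, transformed))  # distribution over context words


# Toy setup: 6-word vocabulary, 4-dimensional vectors (hypothetical sizes).
rng = random.Random(0)
vocab_size, dim = 6, 4
embeddings = make_matrix(vocab_size, dim, rng)
hidden = make_matrix(dim, dim, rng)
output = make_matrix(vocab_size, dim, rng)

probs = predict_context(2, embeddings, hidden, output)
```

In this sketch, `probs` is a probability distribution over the toy vocabulary. Training, and the way such a model would be used to rank candidate proper names, are left out.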
Document type:
Conference paper
LTC Language & Technology Conference, Nov 2015, Poznan, Poland. proceedings of LTC2015, pp.120-124
Cited literature: 19 references

https://hal.archives-ouvertes.fr/hal-01240480
Contributor: Dominique Fohr
Submitted on: Thursday, December 10, 2015 - 09:51:49
Last modified on: Tuesday, December 18, 2018 - 16:38:02
Document(s) archived on: Saturday, April 29, 2017 - 08:13:56

File

ltc-009-illina.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01240480, version 1

Citation

Irina Illina, Dominique Fohr. Neural Networks Revisited for Proper Name Retrieval from Diachronic Documents. LTC Language & Technology Conference, Nov 2015, Poznan, Poland. proceedings of LTC2015, pp.120-124. 〈hal-01240480〉
