Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks

Abstract: In this paper, we propose a novel approach to automatically induce a Part-Of-Speech (POS) tagger for resource-poor languages (languages that have no labeled training data). This approach is based on cross-language projection of linguistic annotations from parallel corpora without the use of word alignment information. Our approach does not assume any knowledge about foreign languages, making it applicable to a wide range of resource-poor languages. We use Recurrent Neural Networks (RNNs) as a multilingual analysis tool. Combined with a basic cross-lingual projection method (using word alignment information), our approach achieves results comparable to the state of the art. We also use our approach in a weakly supervised context, where it shows excellent potential for very low-resource settings (fewer than 1k training utterances).
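The basic cross-lingual projection method mentioned in the abstract relies on word alignment information. As a rough illustration only, and not the authors' RNN-based method, the sketch below copies source-side POS tags onto the target tokens they are aligned with. The function name `project_tags`, the `(source_index, target_index)` alignment format, the fallback tag "X", and the toy sentence pair are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def project_tags(source_tags, alignment, target_length, unknown="X"):
    """Project POS tags from a source sentence onto a target sentence.

    source_tags   -- list of tags, one per source token
    alignment     -- list of (source_index, target_index) pairs (assumed format)
    target_length -- number of tokens in the target sentence
    unknown       -- hypothetical fallback tag for unaligned target tokens
    """
    # Collect one vote per alignment link for each target position.
    votes = defaultdict(Counter)
    for src_i, tgt_j in alignment:
        votes[tgt_j][source_tags[src_i]] += 1

    # Keep the most frequent projected tag; unaligned tokens get the fallback.
    return [
        votes[j].most_common(1)[0][0] if j in votes else unknown
        for j in range(target_length)
    ]


# Toy English-French pair; the tags and alignment are made up for illustration.
source_tags = ["DET", "NOUN", "VERB"]   # the cat sleeps
alignment = [(0, 0), (1, 1), (2, 2)]    # le chat dort
projected = project_tags(source_tags, alignment, target_length=3)
```

Target tokens left with the fallback tag show where alignment-based projection gives no signal, which is one motivation for approaches that do not depend on word alignments.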
Document type:
Conference paper
29th Pacific Asia Conference on Language, Information and Computation (PACLIC), Oct 2015, Shanghai, China.

Cited literature: 38 references

https://hal.archives-ouvertes.fr/hal-01350113
Contributor: Laurent Besacier
Submitted on: Sunday, 21 August 2016 - 09:15:00
Last modified on: Thursday, 11 October 2018 - 08:48:03
Document(s) archived on: Tuesday, 22 November 2016 - 10:17:23

File

PACLIC29-1016.18.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-01350113, version 1

Citation

Othman Zennaki, Nasredine Semmar, Laurent Besacier. Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks. 29th Pacific Asia Conference on Language, Information and Computation (PACLIC), Oct 2015, Shanghai, China. The 29th Pacific Asia Conference on Language, Information and Computation. 〈hal-01350113〉

Metrics

Record views: 384

File downloads: 88