Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks

Othman Zennaki; Nasredine Semmar; Laurent Besacier

Conference paper. Year: 2015

Othman Zennaki

Role: Author

Laboratoire d'Intégration des Systèmes et des Technologies

Nasredine Semmar

Role: Author

Laboratoire d'Intégration des Systèmes et des Technologies

Laurent Besacier

Role: Corresponding author
PersonId: 1521
IdHAL: laurent-besacier
ORCID: 0000-0001-7411-9125
IdRef: 079377017

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Institut universitaire de France

Abstract

In this paper, we propose a novel approach to automatically induce a Part-Of-Speech (POS) tagger for resource-poor languages (languages that have no labeled training data). This approach is based on cross-language projection of linguistic annotations from parallel corpora without the use of word alignment information. Our approach does not assume any knowledge about foreign languages, making it applicable to a wide range of resource-poor languages. We use Recurrent Neural Networks (RNNs) as a multilingual analysis tool. Our approach, combined with a basic cross-lingual projection method (using word alignment information), achieves results comparable to the state of the art. We also use our approach in a weakly supervised context, where it shows excellent potential for very low-resource settings (fewer than 1k training utterances).
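The abstract mentions a basic cross-lingual projection method that relies on word alignment information. As a rough illustration of that general idea only (not the authors' implementation, and not their RNN-based approach), the sketch below copies POS tags from a tagged source sentence onto a target sentence through alignment links. The function name `project_tags`, the majority-vote rule for targets with several links, and the fallback tag `"X"` for unaligned target tokens are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def project_tags(source_tags, alignment, target_len, default="X"):
    """Project POS tags from source tokens to target tokens.

    source_tags: list of tags, one per source token.
    alignment:   list of (source_index, target_index) links.
    target_len:  number of tokens in the target sentence.
    default:     tag given to target tokens with no link (assumed here).
    """
    # Collect, for every target position, the tags of the source
    # tokens linked to it.
    votes = defaultdict(Counter)
    for src_idx, tgt_idx in alignment:
        votes[tgt_idx][source_tags[src_idx]] += 1

    # Keep the most frequent projected tag per target position;
    # unaligned positions fall back to the default tag.
    return [
        votes[j].most_common(1)[0][0] if j in votes else default
        for j in range(target_len)
    ]


# English "cats sleep" -> French "les chats dorment":
# "les" has no aligned source token, so it receives the fallback tag.
projected = project_tags(
    source_tags=["NOUN", "VERB"],
    alignment=[(0, 1), (1, 2)],
    target_len=3,
)
print(projected)  # ['X', 'NOUN', 'VERB']
```

The unaligned determiner in the example shows one reason projection through alignment links alone can leave gaps in the target annotation.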

Domains

Computation and Language [cs.CL]

Main file

PACLIC29-1016.18.pdf (285.95 KB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-01350113

Submitted on: Sunday, 21 August 2016, 09:15:00

Last modified on: Monday, 15 April 2024, 11:25:23

Long-term archiving on: Tuesday, 22 November 2016, 10:17:23

Dates and versions

hal-01350113, version 1 (21-08-2016)

Identifiers

HAL Id: hal-01350113, version 1

Cite

Othman Zennaki, Nasredine Semmar, Laurent Besacier. Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks. 29th Pacific Asia Conference on Language, Information and Computation (PACLIC), Oct 2015, Shanghai, China. ⟨hal-01350113⟩

Collections

CEA UGA CNRS LIG LIG_TDCGE_GETALP DRT LIST POLYTECH-GRENOBLE GS-ENGINEERING GS-COMPUTER-SCIENCE GS-SPORT-HUMAN-MOVEMENT LIG_SIDCH

307 Views

223 Downloads
