Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks

Abstract : In this paper, we propose a novel approach to automatically induce a Part-Of-Speech (POS) tagger for resource-poor languages (languages that have no labeled training data). The approach is based on cross-language projection of linguistic annotations from parallel corpora, without the use of word alignment information. It does not assume any knowledge about the foreign languages, which makes it applicable to a wide range of resource-poor languages. We use Recurrent Neural Networks (RNNs) as a multilingual analysis tool. Combined with a basic cross-lingual projection method that uses word alignment information, our approach achieves results comparable to the state of the art. We also apply our approach in a weakly supervised context, where it shows excellent potential for very low-resource settings (fewer than 1k training utterances).
Document type : Conference papers

Cited literature [38 references]

https://hal.archives-ouvertes.fr/hal-01350113
Contributor : Laurent Besacier
Submitted on : Sunday, August 21, 2016 - 9:15:00 AM
Last modification on : Thursday, April 4, 2019 - 10:18:05 AM
Document(s) archived on : Tuesday, November 22, 2016 - 10:17:23 AM

File

PACLIC29-1016.18.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01350113, version 1

Citation

Othman Zennaki, Nasredine Semmar, Laurent Besacier. Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks. 29th Pacific Asia Conference on Language, Information and Computation (PACLIC), Oct 2015, Shanghai, China. The 29th Pacific Asia Conference on Language, Information and Computation. 〈hal-01350113〉
