The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
Conference paper, Year: 2017

Abstract

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.
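
The projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_tags`, the placeholder tag labels, the toy sentence pair, and the rule that the first alignment link wins are all assumptions made for illustration. It only shows how token-level annotations on an English sentence might be copied to a translation through word alignments.

```python
from typing import List, Tuple


def project_tags(
    source_tags: List[str],
    alignment: List[Tuple[int, int]],
    target_length: int,
    unaligned: str = "UNALIGNED",
) -> List[str]:
    """Copy each source token's tag to the target token it is aligned with.

    `alignment` holds (source_index, target_index) pairs. Target tokens
    with no alignment link keep the `unaligned` placeholder.
    """
    projected = [unaligned] * target_length
    for src_idx, tgt_idx in alignment:
        # Assumption: the first link to a target token wins; later links are ignored.
        if projected[tgt_idx] == unaligned:
            projected[tgt_idx] = source_tags[src_idx]
    return projected


# Hypothetical toy data: the tag labels are placeholders, not the corpus tagset.
english_tokens = ["She", "sleeps"]
english_tags = ["PRONOUN", "EVENT"]
dutch_tokens = ["Zij", "slaapt"]
word_alignment = [(0, 0), (1, 1)]

dutch_tags = project_tags(english_tags, word_alignment, len(dutch_tokens))
```

The sketch relies on the same assumption the abstract states, namely that the translation preserves meaning, so a tag that is valid for an English token is taken to be valid for its aligned counterpart.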
Main file
eacl2017.pdf (370.15 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01630960, version 1 (08-11-2017)

Identifiers

  • HAL Id: hal-01630960, version 1

Cite

Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, et al. The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations. 15th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2017, Valencia, Spain. pp. 242-247. ⟨hal-01630960⟩