Conference paper, Year: 2012

Hierarchical Sub-sentential Alignment with Anymalign

Abstract

We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. The algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that the method obtains state-of-the-art results, with gains of more than 4 BLEU points in the best case over previous work. The method is simple, independent of the size of the corpus to be aligned, and directly computes symmetric alignments. This work also provides new insights regarding the use of "heuristic" alignment scores in statistical machine translation.
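
To make the idea of recursive binary segmentation over association scores concrete, here is a minimal Python sketch. It is not taken from the paper: the splitting cost (a normalized-cut-style criterion borrowed from the document clustering literature), the function names `block_sum`, `split_cost` and `align`, and the toy score matrix are all assumptions chosen for illustration. The input is a matrix of association scores between the words of a source sentence (rows) and a target sentence (columns); each block is split into two sub-blocks, kept in the same order or inverted, until a block spans a single word on one side.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Block = Tuple[Tuple[int, int], Tuple[int, int]]  # ((src_start, src_end), (tgt_start, tgt_end))


def block_sum(w: Matrix, i0: int, i1: int, j0: int, j1: int) -> float:
    """Sum of the scores in rows [i0, i1) and columns [j0, j1)."""
    return sum(w[i][j] for i in range(i0, i1) for j in range(j0, j1))


def split_cost(w: Matrix, i0: int, i1: int, j0: int, j1: int,
               i: int, j: int, inverted: bool) -> float:
    """Normalized-cut-style cost of splitting a block at (i, j).

    Assumption: this particular criterion is illustrative only.
    """
    a = block_sum(w, i0, i, j0, j)   # top-left quadrant
    b = block_sum(w, i0, i, j, j1)   # top-right quadrant
    c = block_sum(w, i, i1, j0, j)   # bottom-left quadrant
    d = block_sum(w, i, i1, j, j1)   # bottom-right quadrant
    kept1, kept2, cut = (b, c, a + d) if inverted else (a, d, b + c)
    cost = 0.0
    for kept in (kept1, kept2):
        denom = cut + 2.0 * kept
        if denom == 0.0:
            return float("inf")      # empty block: no evidence for this split
        cost += cut / denom
    return cost


def align(w: Matrix, i0: int, i1: int, j0: int, j1: int) -> List[Block]:
    """Recursively segment the block; return the list of leaf blocks."""
    if i1 - i0 == 1 or j1 - j0 == 1:
        return [((i0, i1), (j0, j1))]
    best = None
    for i in range(i0 + 1, i1):
        for j in range(j0 + 1, j1):
            for inverted in (False, True):
                cost = split_cost(w, i0, i1, j0, j1, i, j, inverted)
                if best is None or cost < best[0]:
                    best = (cost, i, j, inverted)
    _, i, j, inverted = best
    if inverted:
        return align(w, i0, i, j, j1) + align(w, i, i1, j0, j)
    return align(w, i0, i, j0, j) + align(w, i, i1, j, j1)


# Toy scores (made up): a 2-word source sentence whose words swap places.
scores = [[0.1, 0.9],
          [0.8, 0.1]]
print(align(scores, 0, 2, 0, 2))
# [((0, 1), (1, 2)), ((1, 2), (0, 1))]
```

Because each leaf block pairs a source span with a target span, the output is symmetric by construction: no separate source-to-target and target-to-source passes need to be merged. The sketch recomputes block sums naively; a summed-area table would make each cost evaluation constant-time.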
Main file
LardilleuxYvonLepage_EAMT12.pdf (164.24 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00747385, version 1 (31-10-2012)

Identifiers

  • HAL Id: hal-00747385, version 1

Cite

Adrien Lardilleux, François Yvon, Yves Lepage. Hierarchical Sub-sentential Alignment with Anymalign. 16th annual conference of the European Association for Machine Translation (EAMT 2012), May 2012, Trento, Italy. pp.279-286. ⟨hal-00747385⟩
