Optimising Multiple Metrics with MERT

Abstract : The main metric used for the evaluation and optimisation of SMT systems is the BLEU score, but its correlation with human evaluation has been questioned. Other metrics exist, but none of them agrees perfectly with human evaluation. At the same time, most evaluations report multiple metrics (BLEU, TER, METEOR, etc.). Systems can be optimised toward metrics other than BLEU, but doing so tends to decrease the BLEU score. As Machine Translation evaluations still use BLEU as the main metric, it is important to minimise this decrease. We propose to optimise toward a combination of metrics, such as BLEU-TER. This proposal includes two new open-source scorers for MERT, the SMT optimisation tool. The first is a TER scorer, which allows optimisation toward TER; the second is a combination scorer, which enables two or more metrics to be combined in the optimisation process. This paper also presents experiments on MERT optimisation in the statistical machine translation system Moses with the TER and BLEU metrics and some metric combinations.
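The abstract does not say how the combination scorer merges its metrics. As a minimal sketch only, assuming a weighted average in which an error metric such as TER is flipped to 1 - TER so that higher is always better, the idea of a BLEU-TER objective could look like the following. The names `combine_scores` and `ERROR_METRICS` and the weighting scheme are hypothetical illustrations, not the scorers' actual interface in MERT or Moses.

```python
from typing import Dict

# Assumption: metrics listed here are error rates (lower is better),
# so they are inverted before being combined.
ERROR_METRICS = {"TER"}


def combine_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return a weighted average of metric scores, oriented so higher is better.

    `scores` maps metric names to values in [0, 1]; `weights` maps the same
    names to non-negative weights. Hypothetical helper for illustration.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = 0.0
    for name, weight in weights.items():
        value = scores[name]
        if name in ERROR_METRICS:
            value = 1.0 - value  # turn an error rate into a quality score
        combined += weight * value
    return combined / total_weight


# Illustrative values only, not results from the paper.
example = combine_scores({"BLEU": 0.30, "TER": 0.55}, {"BLEU": 1.0, "TER": 1.0})
```

Under this assumption an optimiser would maximise the combined value in place of BLEU alone, so that a gain in TER cannot be bought with an arbitrarily large loss in BLEU.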
Document type :
Journal articles


Contributor : Christophe Servan
Submitted on : Friday, May 29, 2015 - 10:21:33 AM
Last modification on : Friday, December 2, 2016 - 4:50:56 PM
Long-term archiving on : Tuesday, September 15, 2015 - 8:11:38 AM


Publication funded by an institution




  • HAL Id : hal-01157949, version 1




Christophe Servan, Holger Schwenk. Optimising Multiple Metrics with MERT. The Prague Bulletin of Mathematical Linguistics, 2011, pp.109. ⟨hal-01157949⟩


