Improving machine translation via triangulation and transliteration

Nadir Durrani, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Citations (Scopus)

Abstract

In this paper we improve Urdu→Hindi⇄English machine translation through triangulation and transliteration. First, we build an Urdu→Hindi SMT system by inducing triangulated and transliterated phrase tables from Urdu-English and Hindi-English phrase translation models. We then use it to translate the Urdu side of the Urdu-English parallel data into Hindi, thereby creating artificial Hindi-English parallel data. Our phrase-translation strategies give an improvement of up to +3.35 BLEU points over a baseline Urdu→Hindi system. The synthesized data improve the Hindi→English system by +0.35 and the English→Hindi system by +1.0 BLEU points.
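
As a rough illustration of the triangulation step mentioned in the abstract, the sketch below induces an Urdu→Hindi phrase table by pivoting through English. It assumes the commonly used formulation in which the induced probability is the sum, over shared English pivot phrases, of p(English | Urdu) × p(Hindi | English); the scoring actually used in the paper may differ. The function name `triangulate`, the nested-dictionary layout, and the romanized toy entries are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Induce p(tgt | src) by summing over shared pivot phrases.

    Both inputs map a phrase to a dict of {translation: probability}.
    This is an assumed, simplified formulation, not the paper's code.
    """
    table = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                table[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in table.items()}


# Hypothetical toy phrase tables (romanized, invented probabilities).
urdu_to_english = {"kitab": {"book": 0.75, "the book": 0.25}}
english_to_hindi = {
    "book": {"pustak": 0.5, "kitaab": 0.5},
    "the book": {"kitaab": 1.0},
}

induced = triangulate(urdu_to_english, english_to_hindi)
print(induced)
```

In this toy case both English pivots contribute to the same Hindi candidate, so its probability mass accumulates across pivots. A real system would typically also need to handle the reverse direction, lexical weights, and pruning, none of which is shown here.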

Original language: English
Title of host publication: Proceedings of the 17th Annual Conference of the European Association for Machine Translation, EAMT 2014
Editors: Marko Tadic, Philipp Koehn, Andy Way, Johann Roturier
Publisher: European Association for Machine Translation
Pages: 71-78
Number of pages: 8
ISBN (Electronic): 9789535537533
Publication status: Published - 2014
Externally published: Yes
Event: 17th Annual Conference of the European Association for Machine Translation, EAMT 2014 - Dubrovnik, Croatia
Duration: 16 Jun 2014 to 18 Jun 2014

Publication series

Name: Proceedings of the 17th Annual Conference of the European Association for Machine Translation, EAMT 2014

Conference

Conference: 17th Annual Conference of the European Association for Machine Translation, EAMT 2014
Country/Territory: Croatia
City: Dubrovnik
Period: 16/06/14 to 18/06/14
