Integrating an Unsupervised Transliteration Model into Statistical Machine Translation

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

73 Citations (Scopus)

Abstract

We investigate three methods for integrating an unsupervised transliteration model into an end-to-end SMT system. We induce a transliteration model from parallel data and use it to translate out-of-vocabulary (OOV) words. Our approach is fully unsupervised and language independent. With the three integration methods, we observed improvements of 0.23 to 0.75 (∆ 0.41) BLEU points across 7 language pairs. We also show that our mined transliteration corpora provide better rule coverage and translation quality than the gold-standard transliteration corpora.

Original language: English
Title of host publication: EACL 2014 - 14th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 148-153
Number of pages: 6
ISBN (Electronic): 9781937284992
Publication status: Published - 2014
Externally published: Yes
Event: 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014 - Gothenburg, Sweden
Duration: 26 Apr 2014 to 30 Apr 2014

Publication series

Name: EACL 2014 - 14th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014
Country/Territory: Sweden
City: Gothenburg
Period: 26/04/14 to 30/04/14