Understanding and Improving Morphological Learning in the Neural Machine Translation Decoder

Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov*, Stephan Vogel

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

42 Citations (Scopus)

Abstract

End-to-end training makes the neural machine translation (NMT) architecture simpler and more elegant than traditional statistical machine translation (SMT). However, little is known about the linguistic patterns of morphology, syntax and semantics learned during the training of NMT systems and, more importantly, which parts of the architecture are responsible for learning each of these phenomena. In this paper we i) analyze how much morphology an NMT decoder learns, and ii) investigate whether injecting target morphology into the decoder helps it produce better translations. To this end we present three methods: i) joint generation, ii) joint-data learning, and iii) multi-task learning. Our results show that explicit morphological information helps the decoder learn target-language morphology and improves translation quality by 0.2–0.6 BLEU points.

Original language: English
Title of host publication: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Publisher: Association for Computational Linguistics (ACL)
Pages: 142-151
Number of pages: 10
ISBN (Electronic): 9781948087025
Publication status: Published - 2017
Event: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017 - Taipei, Taiwan, Province of China
Duration: 27 Nov 2017 to 1 Dec 2017

Publication series

Name: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Volume: 1

Conference

Conference: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 27/11/17 to 1/12/17
