Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

Yonatan Belinkov, Lluís Màrquez, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

107 Citations (Scopus)

Abstract

While neural machine translation (NMT) models provide improved translation quality in an elegant framework, it is less clear what they learn about language. Recent work has started evaluating the quality of vector representations learned by NMT models on morphological and syntactic tasks. In this paper, we investigate the representations learned at different layers of NMT encoders. We train NMT systems on parallel data and use the models to extract features for training a classifier on two tasks: part-of-speech and semantic tagging. We then measure the performance of the classifier as a proxy for the quality of the original NMT model for the given task. Our quantitative analysis yields interesting insights regarding representation learning in NMT models. For instance, we find that higher layers are better at learning semantics while lower layers tend to be better for part-of-speech tagging.
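
The evaluation recipe described in the abstract (extract per-layer features, train a classifier on them, read its accuracy as a proxy for representation quality) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not specify the classifier, so a softmax linear probe stands in for it; the "layers" are synthetic vectors rather than real NMT encoder activations; and the names `train_probe`, `make_layer`, `layer_a`, and `layer_b` are hypothetical. The synthetic data is built so that one layer encodes the tag and the other does not, purely to show how the comparison works. It does not reproduce the paper's results.

```python
import math
import random


def scores_for(weights, bias, x):
    """Linear class scores w_c . x + b_c for one feature vector."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, bias)]


def train_probe(features, labels, num_classes, epochs=30, lr=0.5):
    """Fit a softmax linear probe with plain SGD (assumed classifier)."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    bias = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            s = scores_for(weights, bias, x)
            top = max(s)  # subtract the max for numerical stability
            exps = [math.exp(v - top) for v in s]
            total = sum(exps)
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the score of class c.
                grad = exps[c] / total - (1.0 if c == y else 0.0)
                bias[c] -= lr * grad
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
    return weights, bias


def accuracy(weights, bias, features, labels):
    """Fraction of examples whose highest-scoring class is the gold tag."""
    correct = 0
    for x, y in zip(features, labels):
        s = scores_for(weights, bias, x)
        if s.index(max(s)) == y:
            correct += 1
    return correct / len(labels)


def make_layer(rng, labels, dim, signal):
    """Synthetic stand-in for one encoder layer's token representations.

    Each vector is Gaussian noise; `signal` is added to the dimension
    indexed by the tag, so signal=0.0 yields features unrelated to the tag.
    """
    feats = []
    for y in labels:
        vec = [rng.gauss(0.0, 0.1) for _ in range(dim)]
        vec[y] += signal
        feats.append(vec)
    return feats


rng = random.Random(0)
NUM_TAGS, DIM = 3, 4
train_tags = [rng.randrange(NUM_TAGS) for _ in range(150)]
test_tags = [rng.randrange(NUM_TAGS) for _ in range(60)]

# Hypothetical layers: layer_a encodes the tag, layer_b does not.
layer_accuracy = {}
for name, signal in [("layer_a", 1.0), ("layer_b", 0.0)]:
    train_x = make_layer(rng, train_tags, DIM, signal)
    test_x = make_layer(rng, test_tags, DIM, signal)
    w, b = train_probe(train_x, train_tags, NUM_TAGS)
    layer_accuracy[name] = accuracy(w, b, test_x, test_tags)

print(layer_accuracy)
```

With real data, `make_layer` would be replaced by activations extracted from a given encoder layer of the trained NMT model, and the tags by gold part-of-speech or semantic tags. Comparing the held-out accuracies across layers is then the per-layer comparison the abstract refers to.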

Original language: English
Title of host publication: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-10
Number of pages: 10
ISBN (Electronic): 9781948087025
Publication status: Published - 2017
Event: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017 - Taipei, Taiwan, Province of China
Duration: 27 Nov 2017 to 1 Dec 2017

Publication series

Name: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Volume: 1

Conference

Conference: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 27/11/17 to 1/12/17
