Effect of Post-processing on Contextualized Word Representations

Research output: Contribution to journal › Conference article › peer-review

6 Citations (Scopus)

Abstract

Post-processing of static embeddings has been shown to improve their performance on both lexical and sequence-level tasks. However, post-processing for contextualized embeddings is an under-studied problem. In this work, we question the usefulness of post-processing for contextualized embeddings obtained from different layers of pre-trained language models. More specifically, we standardize individual neuron activations using z-score and min-max normalization, and remove top principal components using the all-but-the-top method. Additionally, we apply unit-length normalization to word representations. On a diverse set of pre-trained models, we show that post-processing unwraps vital information present in the representations for both lexical tasks (such as word similarity and analogy) and sequence classification tasks. Our findings raise interesting points in relation to research studies that use contextualized representations, and suggest z-score normalization as an essential step to consider when using them in an application.
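The post-processing operations named in the abstract can be illustrated with a short sketch. The following is a minimal example, assuming the contextualized representations from one layer are stacked into a NumPy matrix of shape (num_words, hidden_dim); the function names, the number of removed principal components, and the random stand-in data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def z_score(X):
    # Standardize each neuron (dimension) to zero mean and unit variance.
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def min_max(X):
    # Rescale each neuron to the [0, 1] range.
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn + 1e-8)

def all_but_the_top(X, d=3):
    # Subtract the mean, then remove projections onto the top-d
    # principal components (all-but-the-top, Mu & Viswanath 2018).
    X_c = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_c, full_matrices=False)
    top = Vt[:d]                       # top-d principal directions
    return X_c - X_c @ top.T @ top

def unit_length(X):
    # Normalize every word representation to unit L2 norm.
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)

# Example: post-process a random stand-in for layer activations.
X = np.random.randn(1000, 768)         # (num_words, hidden_dim)
X_post = unit_length(all_but_the_top(z_score(X), d=3))
```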

Original language: English
Pages (from-to): 3127-3142
Number of pages: 16
Journal: Proceedings - International Conference on Computational Linguistics, COLING
Volume: 29
Issue number: 1
Publication status: Published - 2022
Event: 29th International Conference on Computational Linguistics, COLING 2022 - Hybrid, Gyeongju, Korea, Republic of
Duration: 12 Oct 2022 – 17 Oct 2022
