LU-BZU at SemEval-2021 Task 2: Word2Vec and Lemma2Vec performance in Arabic Word-in-Context disambiguation

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

15 Citations (Scopus)

Abstract

This paper presents a set of experiments that evaluate and compare the performance of CBOW Word2Vec and Lemma2Vec models for Arabic Word-in-Context (WiC) disambiguation without using sense inventories or sense embeddings. As part of the SemEval-2021 Shared Task 2 on WiC disambiguation, we used the dev.ar-ar dataset (2k sentence pairs) to decide whether two words in a given sentence pair carry the same meaning. We used two Word2Vec models: Wiki-CBOW, a model pre-trained on Arabic Wikipedia, and another model that we trained on large Arabic corpora of about 3 billion tokens. Two Lemma2Vec models were also constructed based on the two Word2Vec models. Each of the four models was then used in the WiC disambiguation task and evaluated on the SemEval-2021 test.ar-ar dataset. Finally, we reported the performance of the different models and compared the lemma-based and word-based models.
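
The abstract does not state the decision rule used to compare the two target words. As a minimal sketch only, and not the authors' method, one common way to use static embeddings for WiC is to average the vectors of the words surrounding each target, compare the two context vectors by cosine similarity, and apply a threshold. The vectors, the English toy tokens, the window size, the threshold, and the function names below are all hypothetical; a real setup would look vectors up in a trained Word2Vec model, or lemmatize the tokens first for a Lemma2Vec variant.

```python
import math

# Hypothetical toy embeddings standing in for a trained Word2Vec or
# Lemma2Vec model; the values are invented for illustration.
TOY_VECTORS = {
    "river": [0.9, 0.1, 0.0],
    "water": [0.8, 0.2, 0.1],
    "shore": [0.85, 0.15, 0.05],
    "money": [0.1, 0.9, 0.2],
    "loan": [0.0, 0.95, 0.1],
}


def context_vector(tokens, target_index, vectors, window=2):
    """Average the vectors of known tokens within `window` positions of the target.

    The target token itself is excluded. Returns None if no neighbour is known.
    """
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    neighbours = [
        vectors[token]
        for i, token in enumerate(tokens[start:end], start)
        if i != target_index and token in vectors
    ]
    if not neighbours:
        return None
    dim = len(neighbours[0])
    return [sum(v[d] for v in neighbours) / len(neighbours) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def same_meaning(tokens_a, idx_a, tokens_b, idx_b, vectors, threshold=0.8, window=2):
    """Guess whether two target words share a sense by comparing their contexts.

    The threshold is an assumed value; in practice it would be tuned on a
    development set.
    """
    vec_a = context_vector(tokens_a, idx_a, vectors, window)
    vec_b = context_vector(tokens_b, idx_b, vectors, window)
    if vec_a is None or vec_b is None:
        return False
    return cosine(vec_a, vec_b) >= threshold


if __name__ == "__main__":
    pair_same = (["river", "bank", "water"], 1, ["shore", "bank", "water"], 1)
    pair_diff = (["river", "bank", "water"], 1, ["loan", "bank", "money"], 1)
    print(same_meaning(*pair_same, TOY_VECTORS))
    print(same_meaning(*pair_diff, TOY_VECTORS))
```

Because the rule depends only on a vector lookup, switching between word-based and lemma-based models in this sketch would amount to changing the `vectors` argument and normalizing the tokens accordingly.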

Original language: English
Title of host publication: SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors: Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Publisher: Association for Computational Linguistics (ACL)
Pages: 748-755
Number of pages: 8
ISBN (Electronic): 9781954085701
Publication status: Published - 2021
Externally published: Yes
Event: 15th International Workshop on Semantic Evaluation, SemEval 2021, co-located with The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online, Thailand
Duration: 5 Aug 2021 to 6 Aug 2021

Publication series

Name: SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Conference

Conference: 15th International Workshop on Semantic Evaluation, SemEval 2021, co-located with The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
Country/Territory: Thailand
City: Virtual, Online
Period: 5/08/21 to 6/08/21
