Event-Arguments Extraction Corpus and Modeling using BERT for Arabic

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

4 Citations (Scopus)

Abstract

Event-argument extraction is a challenging task, particularly in Arabic, where linguistic resources are sparse. To fill this gap, we introduce the WojoodHadath corpus (550k tokens), an extension of Wojood enriched with event-argument annotations. We use three types of event arguments (agent, location, and date), which we annotate as relation types. Our inter-annotator agreement evaluation yielded a Kappa score of 82.23% and an F1-score of 87.2%. Additionally, we propose a novel method for event relation extraction using BERT, in which we treat the task as textual entailment. This method achieves an F1-score of 94.01%. To further evaluate how well the proposed method generalizes, we collected and annotated a second, out-of-domain corpus (about 80k tokens) called WojoodOutOfDomain and used it as a second test set, on which our approach achieved an F1-score of 83.59%. Finally, we propose an end-to-end system for event-argument extraction. This system is implemented as part of SinaTools, and both corpora are publicly available at https://sina.birzeit.edu/wojood.
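The entailment framing mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the hypothesis wording or the input format used in the paper, so the templates, the `EntailmentPair` type, and the `build_pairs` helper below are all hypothetical; English text is used only for readability. The sketch shows one plausible way to turn a sentence, its event mentions, and its candidate entities into premise/hypothesis pairs, one per argument type (agent, location, date).

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical templates, one per argument type named in the abstract.
# The paper's actual hypothesis wording is not given in this record.
TEMPLATES = {
    "agent": "{arg} is the agent of {event}.",
    "location": "{event} took place in {arg}.",
    "date": "{event} happened on {arg}.",
}


@dataclass(frozen=True)
class EntailmentPair:
    premise: str      # the original sentence
    hypothesis: str   # a templated statement about one (event, argument, role)
    role: str
    event: str
    argument: str


def build_pairs(sentence, events, candidates):
    """Build one premise/hypothesis pair per (event, candidate, role) triple."""
    pairs = []
    for event, arg in product(events, candidates):
        for role, template in TEMPLATES.items():
            hypothesis = template.format(arg=arg, event=event)
            pairs.append(EntailmentPair(sentence, hypothesis, role, event, arg))
    return pairs


# Illustrative input (not taken from the corpus).
pairs = build_pairs(
    "The summit was held in Amman on Monday.",
    events=["summit"],
    candidates=["Amman", "Monday"],
)
for p in pairs:
    print(p.role, "|", p.hypothesis)
```

Under this framing, each pair would be passed to a BERT sentence-pair classifier that predicts whether the premise entails the hypothesis. Pairs predicted as entailed give the extracted (event, role, argument) relations. The classifier itself is omitted here because the record does not describe its configuration.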

Original language: English
Title of host publication: ArabicNLP 2024 - 2nd Arabic Natural Language Processing Conference, Proceedings of the Conference
Editors: Nizar Habash, Houda Bouamor, Ramy Eskander, Nadi Tomeh, Ibrahim Abu Farha, Ahmed Abdelali, Samia Touileb, Injy Hamed, Yaser Onaizan, Bashar Alhafni, Wissam Antoun, Salam Khalifa, Hatem Haddad, Imed Zitouni, Badr AlKhamissi, Rawan Almatham, Khalil Mrini
Publisher: Association for Computational Linguistics (ACL)
Pages: 309-319
Number of pages: 11
ISBN (Electronic): 9798891761322
Publication status: Published - 2024
Externally published: Yes
Event: 2nd Arabic Natural Language Processing Conference, ArabicNLP 2024 - Bangkok, Thailand
Duration: 16 Aug 2024 → …

Publication series

Name: ArabicNLP 2024 - 2nd Arabic Natural Language Processing Conference, Proceedings of the Conference

Conference

Conference: 2nd Arabic Natural Language Processing Conference, ArabicNLP 2024
Country/Territory: Thailand
City: Bangkok
Period: 16/08/24 → …
