Nabra: Syrian Arabic Dialects with Morphological Annotations

  • Amal Nayouf
  • Tymaa Hasanain Hammouda
  • Mustafa Jarrar
  • Fadi A. Zaraket
  • Mohamad Bassam Kurdy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Citations (Scopus)

Abstract

This paper presents Nâbr̄a, corpora of Syrian Arabic dialects with morphological annotations. To build Nâbr̄a, a team of native Syrian speakers collected more than 6K sentences containing about 60K words from several sources, including social media posts, scripts of movies and series, song lyrics, and local proverbs. Nâbr̄a covers several local Syrian dialects, including those of Aleppo, Damascus, Deir-ezzur, Hama, Homs, Huran, Latakia, Mardin, Raqqah, and Suwayda. A team of nine annotators annotated the 60K tokens with full morphological annotations across sentence contexts. We trained the annotators to follow methodological annotation guidelines to ensure unique morpheme annotations, and we normalized the annotations. F1 and κ agreement scores ranged between 74% and 98% across features, indicating the high quality of the Nâbr̄a annotations. Our corpora are open-source and publicly available as part of the Currasat portal at https://sina.birzeit.edu/currasat.

Original language: English
Title of host publication: ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings
Editors: Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Ahmed Abdelali, Khalil Mrini, Rawan Almatham
Publisher: Association for Computational Linguistics (ACL)
Pages: 12-23
Number of pages: 12
ISBN (Electronic): 9781959429272
Publication status: Published - 2023
Externally published: Yes
Event: 1st Arabic Natural Language Processing Conference, ArabicNLP 2023 - Hybrid, Singapore, Singapore
Duration: 7 Dec 2023 → …

Publication series

Name: ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings

Conference

Conference: 1st Arabic Natural Language Processing Conference, ArabicNLP 2023
Country/Territory: Singapore
City: Hybrid, Singapore
Period: 7/12/23 → …
