Neural Classification of Argument Elements and Styles in Arabic Competitive Debates

Al Zawqari Ali*, Mohamed Ahmed, Abdul Gabbar Al-Sharafi, Mohammad M. Khader, Ali Safa, Gerd Vandersteen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Argument mining for spoken Modern Standard Arabic (MSA) remains almost untouched, largely because no sizeable, well-annotated resource exists. We fill that gap with an 80-hour, 110-debate corpus drawn from 29 countries and labelled by experts with a hybrid scheme that blends Aristotle’s Logos-Ethos-Pathos triad with an extended 13-element Toulmin model. Using this dataset, we define two tasks: (1) three-way rhetorical-style classification and (2) full 13-label element detection. Zero- and few-shot large language models (DeepSeek-7B, Falcon3-Mamba-7B) plateau at 46% accuracy and a 30% F1-score, illustrating the corpus’s complexity. After fine-tuning, Arabic pre-trained encoder-based models lift first-task performance to 81% accuracy and an 80% F1-score, with the compact CAMeLBERT-quarter outperforming larger variants thanks to reduced over-fitting. On the second task, all models drop to a 50% F1-score, yet performance rebounds to 81% accuracy and an 80% F1-score when predictions are post-processed back to Logos, Pathos, and Ethos, revealing that most second-task errors occur within persuasion modes rather than across them. A long-context state-space model (Mamba-130M) nears encoder scores once labels are post-processed (75% accuracy) but lags on the 13 argumentation elements, showing that Arabic pre-training matters more than sheer context length. Because many debating applications sit in educational settings, we also report fairness metrics: demographic-parity and equalized-odds gaps stay below 0.06 for both gender and native-language groups, indicating that higher accuracy does not come at the expense of equity. Our annotated corpus and baseline models establish the first rigorous benchmark for argument mining in Arabic competitive debates and serve as a blueprint for applying these methods to other low-resource languages.
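The fairness figures reported above rest on two standard group-fairness measures. As a point of reference only (this is not the authors' code, and the function names here are illustrative), a minimal sketch of how a demographic-parity gap and an equalized-odds gap can be computed for a binary prediction task and a two-group sensitive attribute:

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rate between the two groups."""
    p = np.asarray(y_pred)
    g = np.asarray(group, dtype=bool)
    return abs(p[g].mean() - p[~g].mean())

def equalized_odds_gap(y_true, y_pred, group):
    """Largest absolute gap between the two groups in TPR and FPR."""
    y, p = np.asarray(y_true), np.asarray(y_pred)
    g = np.asarray(group, dtype=bool)

    def positive_rate(label, in_group):
        # Mean prediction among examples with true label `label`
        # restricted to one group: TPR when label==1, FPR when label==0.
        sel = (y == label) & (g == in_group)
        return p[sel].mean() if sel.any() else 0.0

    tpr_gap = abs(positive_rate(1, True) - positive_rate(1, False))
    fpr_gap = abs(positive_rate(0, True) - positive_rate(0, False))
    return max(tpr_gap, fpr_gap)
```

A gap of 0 means the classifier treats the groups identically under that criterion; the paper reports gaps below 0.06 for gender and native-language groups. For the multi-class label sets used in the paper, these measures would typically be computed per class (one-vs-rest) and aggregated.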

Original language: English
Pages (from-to): 115944-115959
Number of pages: 16
Journal: IEEE Access
Volume: 13
DOIs
Publication status: Published - 2 Jul 2025

Keywords

  • Accuracy
  • Adaptation models
  • Analytical models
  • Annotations
  • Competitive debate
  • Complexity theory
  • Corpus annotation
  • Cultural differences
  • Data mining
  • Data models
  • Multilingual
  • Natural language processing
  • Neural argument mining
  • Standards

