Large scale Arabic error annotation: Guidelines and framework

  • Wajdi Zaghouani
  • Behrang Mohit
  • Nizar Habash
  • Ossama Obeid
  • Nadi Tomeh
  • Alla Rozovskaya
  • Noura Farra
  • Sarah Alkuhlani
  • Kemal Oflazer

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

79 Citations (Scopus)

Abstract

We present annotation guidelines and a web-based annotation framework developed as part of an effort to create a manually annotated Arabic corpus of errors and corrections for various text types. Such a corpus will be invaluable for developing Arabic error correction tools, both for training models and as a gold standard for evaluating error correction algorithms. We summarize the guidelines we created. We also describe issues encountered during the training of the annotators, as well as problems that are specific to the Arabic language that arose during the annotation process. Finally, we present the annotation tool that was developed as part of this project, the annotation pipeline, and the quality of the resulting annotations.

Original language: English
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Editors: Nicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
Publisher: European Language Resources Association (ELRA)
Pages: 2362-2369
Number of pages: 8
ISBN (Electronic): 9782951740884
Publication status: Published - 2014
Externally published: Yes
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: 26 May 2014 to 31 May 2014

Publication series

Name: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

Conference

Conference: 9th International Conference on Language Resources and Evaluation, LREC 2014
Country/Territory: Iceland
City: Reykjavik
Period: 26/05/14 to 31/05/14

Keywords

  • Arabic
  • Error annotation
  • Guidelines
