SawtArabi: A Benchmark Corpus for Arabic TTS. Standard, Dialectal and Code-Switching

Vasista Sai Lodagala, Lamya Alkanhal, Daniel Izham, Shivam Mehta, Shammur Chowdhury, Aqeelah Makki, Hamdy S. Hussein, Gustav Eje Henter, Ahmed Ali

Research output: Contribution to journal > Conference article > peer-review

Abstract

Curating Text-to-Speech (TTS) datasets is a strenuous task given the quality considerations. While it is hard to find high-quality TTS datasets in languages other than English, it is rare to come across code-switching (CS) datasets. As a part of this work, we curate a 4-hour Arabic-English TTS corpus consisting of code-switched Egyptian-English, monolingual Modern Standard Arabic (MSA), Egyptian, and English, all recorded by the same voice talent. We demonstrate the importance of vowelization and the need for better phonemization of Arabic text. To this effect, we present the modified espeak-ng phonemizer that handles various irregularities of espeak-ng over Arabic text. Upon training baseline TTS systems over this benchmark, we demonstrate its efficacy through extensive subjective evaluations.
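
To illustrate why vowelization matters for Arabic TTS front ends, the following minimal Python sketch shows how two differently pronounced words collapse to the same string once their short-vowel diacritics are removed. It is not the authors' pipeline and does not involve espeak-ng. The function names `strip_tashkeel` and `is_vowelized` and the chosen example words are assumptions made for illustration only.

```python
# Illustrative sketch only: NOT the paper's method or the modified espeak-ng phonemizer.
# Assumption: "vowelization" is approximated here by the Arabic diacritic marks
# in the Unicode range U+064B..U+0652 (fathatan through sukun).
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_tashkeel(text: str) -> str:
    """Return text with Arabic short-vowel diacritics removed (hypothetical helper)."""
    return "".join(ch for ch in text if ch not in TASHKEEL)


def is_vowelized(text: str) -> bool:
    """Return True if the text carries at least one diacritic (hypothetical helper)."""
    return any(ch in TASHKEEL for ch in text)


# kaf + fatha, ta + fatha, ba + fatha: "kataba" (he wrote)
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"
# kaf + damma, ta + damma, ba + dammatan: "kutubun" (books)
KUTUBUN = "\u0643\u064F\u062A\u064F\u0628\u064C"
# kaf, ta, ba with no diacritics
BARE = "\u0643\u062A\u0628"

if __name__ == "__main__":
    # Both vowelized words reduce to the same bare consonant skeleton,
    # so a phonemizer given only the bare form must guess the pronunciation.
    print(strip_tashkeel(KATABA) == strip_tashkeel(KUTUBUN))
    print(is_vowelized(KATABA), is_vowelized(BARE))
```

Because the unvowelized form is ambiguous, a grapheme-to-phoneme step that receives only the bare text has no way to recover which of the two pronunciations was intended.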

Original language: English
Pages (from-to): 4793-4797
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2025
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 to 21 Aug 2025

Keywords

  • Code-switching
  • Dialectal Speech
  • Multilingual
  • Phonemization
  • Text-to-Speech Synthesis
