
Recent advances in ASR Applied to an Arabic transcription system for Al-Jazeera

  • Patrick Cardinal*
  • Ahmed Ali
  • Najim Dehak
  • Yu Zhang
  • Tuka Al Hanai
  • Yifan Zhang
  • James Glass
  • Stephan Vogel
  • *Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-review

Abstract

This paper describes a detailed comparison of several state-of-the-art speech recognition techniques applied to a limited Arabic broadcast news dataset. The different approaches were all trained on 50 hours of transcribed audio from the Al-Jazeera news channel. The best results were obtained using i-vector-based speaker adaptation in a training scenario using the Minimum Phone Error (MPE) criterion combined with sequential Deep Neural Network (DNN) training. We report results for two different types of test data: broadcast news reports, with a best word error rate (WER) of 17.86%, and broadcast conversations, with a best WER of 29.85%. The overall WER on the full test set is 25.6%.
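
For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the scoring tool used in the paper, the function name word_error_rate is hypothetical, and it assumes plain whitespace tokenization with none of the text normalization a real Arabic evaluation would likely apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split (an assumption).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference word count, a WER pooled over several test conditions is weighted by how many reference words each condition contributes, so it need not equal the simple average of the per-condition figures.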

Original language: English
Pages (from-to): 2088-2092
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 Sept 2014 to 18 Sept 2014

Keywords

  • ASR system
  • Arabic
  • Kaldi
