DEVELOPMENT OF ANNOTATED BANGLA SPEECH CORPORA

Firoj Alam, S. M. Murtoza Habib, Dil Afroza Sultana, Mumit Khan

Research output: Contribution to conference › Paper › peer-review

11 Citations (Scopus)

Abstract

This paper describes the development procedure of three different Bangla read speech corpora, which can be used for phonetic research and for developing speech applications. Several criteria were maintained in the corpus development process, including consideration of phonetic and prosodic features during text selection. A specification was also maintained in the recording phase, since speaking style is a vital part of speech applications. We also concentrated on proper text normalization, pronunciation, alignment, and labeling. The labeling was done manually: in the present work, sentence-level labeling (annotation) was completed according to a specification so that it can be extended in the future.

Original language: English
Pages: 35-41
Number of pages: 7
Publication status: Published - 2010
Externally published: Yes
Event: 2nd Workshop on Spoken Language Technologies for Under-Resourced Languages, SLTU 2010 - Penang, Malaysia
Duration: 3 May 2010 to 5 May 2010

Conference

Conference: 2nd Workshop on Spoken Language Technologies for Under-Resourced Languages, SLTU 2010
Country/Territory: Malaysia
City: Penang
Period: 3/05/10 to 5/05/10

Keywords

  • Phonetic research
  • Speech corpora
  • Speech processing
