Speech quality assessment using 2D neurogram orthogonal moments

Research output: Contribution to journal, Article, peer-review

6 Citations (Scopus)

Abstract

This study proposes a new objective speech quality measure using the responses of a physiologically based computational model of the auditory nerve (AN). The population response of the model AN fibers to a speech signal is represented by a 2D neurogram, and features of the neurogram are extracted by orthogonal moments. A special type of orthogonal moment, the orthogonal Tchebichef-Krawtchouk moment, is used in this study. The proposed measure is compared to the subjective scores from two standard databases: NOIZEUS and Supplement 23 to the P series (P.Sup23) of ITU-T Recommendations. The NOIZEUS database is used in the assessment of 11 speech enhancement algorithms, whereas the P.Sup23 database is used in the ITU-T 8 kbit/s codec (Recommendation G.729) characterization test. The performance of the proposed speech quality measure is also compared to the results from several traditional objective quality measures. In general, the proposed neural-response-based metric yielded better results than most of the traditional acoustic-property-based quality measures. The proposed metric can be applied to evaluate the performance of various speech enhancement algorithms and compression systems.
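The feature-extraction step described in the abstract (projecting a 2D neurogram onto an orthogonal polynomial basis) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only a discrete Tchebichef-type basis (orthonormal polynomials under uniform weight, obtained here by QR decomposition of a Vandermonde matrix, so each polynomial is determined only up to sign) and does not reproduce the hybrid Tchebichef-Krawtchouk moment used in the paper. The function names and the random matrix standing in for a neurogram are hypothetical.

```python
import numpy as np

def tchebichef_like_basis(n_points, max_order):
    """Orthonormal discrete polynomial basis (uniform weight), one row per order.

    Assumption: the basis comes from QR on a Vandermonde matrix, which yields
    the discrete Tchebichef polynomials up to sign. Suitable for small sizes only.
    """
    if max_order >= n_points:
        raise ValueError("max_order must be smaller than n_points")
    x = np.linspace(-1.0, 1.0, n_points)          # rescaled sample positions
    vander = np.vander(x, max_order + 1, increasing=True)  # columns 1, x, x^2, ...
    q, _ = np.linalg.qr(vander)                   # orthonormal columns
    return q.T                                    # shape (max_order + 1, n_points)

def moments_2d(image, order_rows, order_cols):
    """Project a 2D array onto the separable basis: M = B_r @ image @ B_c^T."""
    rows, cols = image.shape
    b_r = tchebichef_like_basis(rows, order_rows)
    b_c = tchebichef_like_basis(cols, order_cols)
    return b_r @ image @ b_c.T

def reconstruct(moments, shape):
    """Invert the projection; exact only when all orders were kept."""
    b_r = tchebichef_like_basis(shape[0], moments.shape[0] - 1)
    b_c = tchebichef_like_basis(shape[1], moments.shape[1] - 1)
    return b_r.T @ moments @ b_c

# Toy stand-in for a neurogram (e.g. characteristic frequency x time bins).
rng = np.random.default_rng(0)
toy_neurogram = rng.random((8, 6))

# Low-order moments act as a compact feature vector.
features = moments_2d(toy_neurogram, order_rows=3, order_cols=3).ravel()

# Keeping every order makes the transform invertible.
full = moments_2d(toy_neurogram, order_rows=7, order_cols=5)
recovered = reconstruct(full, toy_neurogram.shape)
```

In a quality measure of this kind, the moment features of a clean and a degraded signal would then be compared; how the paper performs that comparison is not stated in the abstract, so it is left out here.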

Original language: English
Pages (from-to): 34-48
Number of pages: 15
Journal: Speech Communication
Volume: 80
Publication status: Published - 1 Jun 2016
Externally published: Yes

Keywords

  • Auditory-nerve model
  • Discrete Tchebichef-Krawtchouk Transform (DTKT)
  • Neurogram
  • Orthogonal moments
  • PESQ
  • POLQA
  • Speech quality assessment
