Whispered speech recognition based on gammatone filterbank cepstral coefficients
- Authors: Marković B., Galić J., Grozdić Ð., Jovičić S.T., Mijić M.
- Affiliation: Telecommunication Department, School of Electrical Engineering
- Issue: Vol 62, No 11 (2017)
- Pages: 1255-1261
- Section: Theory and Methods of Signal Processing
- URL: https://journals.rcsi.science/1064-2269/article/view/198953
- DOI: https://doi.org/10.1134/S1064226917110134
- ID: 198953
Abstract
This paper presents results on whispered speech recognition using gammatone filterbank cepstral coefficients in speaker-dependent mode. The isolated words used in the experiments are taken from the Whi-Spe database. Recognition is based on two methods: dynamic time warping and hidden Markov models. The experiments cover the following train/test scenarios: normal speech, whispered speech, and their combinations (normal/whispered and whispered/normal). The results demonstrate a marked improvement in recognition after applying cepstral mean subtraction, especially in the mixed train/test scenarios.
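As a rough illustration of two techniques named in the abstract, the sketch below applies cepstral mean subtraction to each utterance and then picks the closest word template by dynamic time warping. It is not the authors' implementation: the function names, the Euclidean local cost, the symmetric step pattern, and the two-dimensional toy "cepstral" frames are all assumptions made for the example, and real gammatone cepstral features would be computed from audio.

```python
import math
from typing import Dict, List, Sequence

Frame = Sequence[float]


def cepstral_mean_subtraction(frames: Sequence[Frame]) -> List[List[float]]:
    """Subtract the per-utterance mean of every coefficient (hypothetical helper)."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Accumulated DTW cost with Euclidean local distance (assumed step pattern:
    vertical, horizontal, and diagonal moves, all with unit weight)."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def recognize(test: Sequence[Frame], templates: Dict[str, Sequence[Frame]]) -> str:
    """Return the label of the template nearest to the test utterance after CMS."""
    test_n = cepstral_mean_subtraction(test)
    return min(
        templates,
        key=lambda w: dtw_distance(test_n, cepstral_mean_subtraction(templates[w])),
    )


# Toy data (invented numbers, not taken from the Whi-Spe database).
templates = {
    "yes": [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]],
    "no": [[0.0, 3.0], [0.0, 2.0], [0.0, 1.0]],
}
# The "yes" pattern with a constant offset of 5 on both coefficients and one
# repeated frame, standing in for a channel or speaking-mode mismatch.
test_utterance = [[6.0, 5.0], [7.0, 6.0], [7.0, 6.0], [8.0, 5.0]]

print(recognize(test_utterance, templates))  # prints "yes"
```

In this toy case the constant offset disappears after mean subtraction, and the warping absorbs the repeated frame, so the shifted utterance still matches its own template. This illustrates in general terms why mean normalization can help when training and test conditions differ.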
About the authors
B. Marković
Telecommunication Department, School of Electrical Engineering
Author for correspondence.
Email: brankomarko@yahoo.com
Serbia, Belgrade, 11000
J. Galić
Telecommunication Department, School of Electrical Engineering
Serbia, Belgrade, 11000
Ð. Grozdić
Telecommunication Department, School of Electrical Engineering
Serbia, Belgrade, 11000
S. T. Jovičić
Telecommunication Department, School of Electrical Engineering
Serbia, Belgrade, 11000
M. Mijić
Telecommunication Department, School of Electrical Engineering
Serbia, Belgrade, 11000