Segmentation of the period of the fundamental tone of a voice source
- Authors: Sorokin V.N.
- Affiliations:
- Institute for Information Transmission Problems
- Issue: Vol. 62, No. 2 (2016)
- Pages: 244-254
- Section: Acoustic Signal Processing. Computer Simulation
- URL: https://journals.rcsi.science/1063-7710/article/view/185652
- DOI: https://doi.org/10.1134/S1063771016020135
- ID: 185652
Abstract
The extrema of the logarithmic derivative of the mean energy of a voice signal in the frequency range of 1000–3000 Hz are used to determine the instants of opening and closure of the glottis. The analysis error is estimated with the CMU ARCTIC database, which contains synchronous recordings of speech signals and electroglottograms. The estimates of the glottal opening and closure instants found by the developed algorithm are compared with the instants of the maximum and minimum of the derivative of the electroglottogram signal, which are taken as the “true” instants. The mean square deviation of the glottal opening instant from the extrema of the derivative of the electroglottogram signal, for different speakers, is in the range of 1.03–1.64 ms. The rate of false estimates of the glottal opening instant is from 0.01 to 0.14%, and the rate of omissions is from 0.42 to 2.38%. An error-detection algorithm is developed. The mean square deviation of the error in detecting the glottal opening instant, relative to the period of the fundamental tone, is in the range of 13–18%, with the most probable error between 0 and +5%.
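The core quantity named in the abstract, the logarithmic derivative of the mean energy in the 1000–3000 Hz band, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration rather than the author's implementation: the second-order band-pass filter (standard biquad form), the trailing averaging window, the first-difference derivative, and the function names are all hypothetical. The abstract does not say which extremum marks opening and which marks closure, nor how one candidate per period is selected, so the sketch stops at listing candidate extrema.

```python
import math


def bandpass_biquad(x, fs, f_lo=1000.0, f_hi=3000.0):
    """Second-order band-pass centred on the 1000-3000 Hz band.

    Assumption: a single biquad is only a rough stand-in for whatever
    band-limiting the paper actually uses.
    """
    f0 = math.sqrt(f_lo * f_hi)            # geometric centre frequency
    q = f0 / (f_hi - f_lo)                 # quality factor from bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0       # b1 is zero for this form
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def mean_energy(x, win):
    """Mean of x[n]^2 over a trailing window of at most `win` samples."""
    out, acc = [], 0.0
    for n, xn in enumerate(x):
        acc += xn * xn
        if n >= win:
            acc -= x[n - win] ** 2         # drop the sample leaving the window
        out.append(acc / min(n + 1, win))
    return out


def log_derivative(e, fs, eps=1e-12):
    """Approximate d/dt ln E(t) by a first difference of the log energy."""
    loge = [math.log(max(v, eps)) for v in e]   # eps guards against log(0)
    return [0.0] + [(loge[n] - loge[n - 1]) * fs for n in range(1, len(loge))]


def local_extrema(d):
    """Indices of strict local maxima and strict local minima of d."""
    inner = range(1, len(d) - 1)
    maxima = [n for n in inner if d[n - 1] < d[n] > d[n + 1]]
    minima = [n for n in inner if d[n - 1] > d[n] < d[n + 1]]
    return maxima, minima
```

For a signal `x` sampled at `fs` Hz, candidate instants in seconds would be `n / fs` for each index `n` returned by `local_extrema(log_derivative(mean_energy(bandpass_biquad(x, fs), win), fs))`, where the window length `win` is a free parameter of this sketch.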
About the authors
V. Sorokin
Institute for Information Transmission Problems
Corresponding author.
Email: vns@iitp.ru
Russia, Bol’shoi Karetnyi per. 19, Moscow, 101447