Speaker Modeling Using Emotional Speech for More Robust Speaker Identification


How to cite this article

Full text:


Abstract

Automatic identity recognition in a fast, reliable and non-intrusive way is one of the most challenging topics in today's digital world. A possible approach to identity recognition is identification by voice. The characteristics of speech relevant for automatic speaker recognition can be affected by external factors such as noise and channel distortions, but also by speaker-specific conditions such as emotional or health states. This paper addresses the improvement of a speaker recognition system through different model training strategies, with the aim of obtaining the best system performance from only a limited amount of neutral and emotional speech data. The models adopted are a Gaussian Mixture Model and i-vectors, with Mel Frequency Cepstral Coefficients as inputs, and the experiments were conducted on the Russian Language Affective speech database. The results show that the appropriate use of emotional speech in speaker model training improves the robustness of a speaker recognition system, both when tested on neutral and on emotional speech.
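The GMM-based recognition scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: each speaker is represented by a Gaussian Mixture Model fitted to frame-level feature vectors (MFCCs in the paper; synthetic Gaussian clusters stand in for them here), and a test utterance is assigned to the speaker model with the highest average log-likelihood. The speaker names and feature parameters are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def make_features(center, n=500, dim=13):
    # Stand-in for MFCC frames extracted from one speaker's speech
    # (13 coefficients per frame is a common MFCC configuration).
    return center + rng.normal(scale=0.5, size=(n, dim))

# Hypothetical enrollment data: one cluster center per speaker.
centers = {"spk_a": np.full(13, 0.0), "spk_b": np.full(13, 3.0)}

# Train one GMM per speaker on that speaker's training frames.
models = {}
for spk, c in centers.items():
    gmm = GaussianMixture(n_components=4, covariance_type="diag",
                          random_state=0)
    gmm.fit(make_features(c))
    models[spk] = gmm

def identify(frames):
    # Score the test frames against every speaker model and return
    # the speaker whose GMM gives the highest mean log-likelihood.
    scores = {spk: m.score(frames) for spk, m in models.items()}
    return max(scores, key=scores.get)

test_frames = make_features(centers["spk_b"], n=200)
print(identify(test_frames))
```

In the paper's setting, the training frames for each speaker model would come from neutral and/or emotional recordings, which is exactly the variable studied by the different training strategies.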

About the authors

M. Milošević

School of Electrical Engineering, University of Belgrade

Author for correspondence.
Email: mm125026p@student.etf.bg.ac.rs
Serbia, Belgrade, 11000

Ž. Nedeljković

School of Electrical Engineering, University of Belgrade

Email: mm125026p@student.etf.bg.ac.rs
Serbia, Belgrade, 11000

U. Glavitsch

GlavitschEggler Software

Email: mm125026p@student.etf.bg.ac.rs
Switzerland, Baden, 5400

Ž. Đurović

School of Electrical Engineering, University of Belgrade

Email: mm125026p@student.etf.bg.ac.rs
Serbia, Belgrade, 11000


Copyright © Pleiades Publishing, Inc., 2019