Detection of HMM Synthesized Speech by Wavelet Logarithmic Spectrum


Abstract

Automatic speaker verification systems have achieved strong performance and are widely adopted in security applications. An important requirement for such a system is resilience to spoofing attacks such as impersonation, replay, speech synthesis, and voice conversion. Among these attacks, speech synthesis poses a high risk to verification systems. This paper proposes a novel method for detecting computer-generated speech, in particular HMM-synthesized speech. The wavelet coefficients at specific positions are found to differ markedly between synthetic and natural speech. Logarithmic spectrum features are extracted from these wavelet coefficients, and a support vector machine is used as the classifier to evaluate the proposed algorithm. Experimental results on the SAS corpus show that the proposed algorithm achieves high detection accuracy and a low equal error rate.
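As a rough illustration of the feature pipeline described in the abstract (wavelet decomposition followed by a logarithmic spectrum), the sketch below may help. It is not the authors' implementation: the abstract does not name the wavelet family, the decomposition level, or which coefficient positions are used, so the Haar wavelet, the `level=2` default, the 64-sample frame, and all function names here are assumptions chosen for illustration. The classifier stage (a support vector machine in the paper) is left out.

```python
import cmath
import math


def haar_dwt(signal):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_detail(signal, level):
    """Detail coefficients at the given decomposition level (1 = finest scale)."""
    approx = list(signal)
    detail = []
    for _ in range(level):
        approx, detail = haar_dwt(approx)
    return detail


def log_spectrum(coeffs, eps=1e-10):
    """Log magnitude of a naive DFT over the non-negative frequency bins."""
    n = len(coeffs)
    feats = []
    for k in range(n // 2 + 1):
        acc = sum(c * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, c in enumerate(coeffs))
        feats.append(math.log(abs(acc) + eps))  # eps avoids log(0)
    return feats


def extract_features(frame, level=2):
    """Assumed pipeline: wavelet detail coefficients, then their log spectrum."""
    return log_spectrum(wavelet_detail(frame, level))


# Hypothetical 64-sample frame standing in for a windowed speech segment.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = extract_features(frame, level=2)
```

In a full system, one such feature vector per frame (or per utterance, after pooling) would be passed to the classifier, with synthetic and natural speech as the two classes.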

About the authors

Diqun Yan

College of Information Science and Engineering, Ningbo University; Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security

Author for correspondence.
Email: yandiqun@nbu.edu.cn
China, Ningbo, 315211; Shenzhen, 518060

Li Xiang

College of Information Science and Engineering, Ningbo University

Email: yandiqun@nbu.edu.cn
China, Ningbo, 315211

Zhifeng Wang

College of Information Science and Engineering, Ningbo University

Email: yandiqun@nbu.edu.cn
China, Ningbo, 315211

Rangding Wang

College of Information Science and Engineering, Ningbo University

Email: yandiqun@nbu.edu.cn
China, Ningbo, 315211

Copyright (c) 2019 Allerton Press, Inc.