Performance Optimization of Speech Recognition System with Deep Neural Network Model

Wei Guan

doi:10.3103/S1060992X18040094

Authors: Wei Guan ¹
Affiliations:
1. College of Modern Science and Technology, China Jiliang University
Issue: Vol. 27, No. 4 (2018)
Pages: 272-282
Section: Article
URL: https://journals.rcsi.science/1060-992X/article/view/195142
DOI: https://doi.org/10.3103/S1060992X18040094
ID: 195142

Abstract

With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition has become an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. A feedforward fully connected deep neural network, a time-delay neural network, a convolutional neural network and a feedforward sequence memory neural network were examined, and their speech recognition performance was evaluated by comparing their acoustic models. Moreover, the recognition performance of the models was tested after adding human voice features of different dimensions. The results showed that the performance of the speech recognition system could be improved effectively by using a deep neural network model. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after the addition of vocal characteristics, and the vocal characteristic dimension differed from model to model.

Keywords

deep neural network, acoustic model, speech recognition, discriminative training, performance optimization

About the authors

Wei Guan

College of Modern Science and Technology, China Jiliang University

Corresponding author.
Email: gwcjlu@163.com
China, Hangzhou, Zhejiang
