Constructing a speech audio–video corpus by aligning long segments of speech and text

I. A. Karpukhin; A. S. Konushin

doi:10.3103/S0278641917020030

Authors: Karpukhin I.A.¹, Konushin A.S.¹
Affiliations:
1. Department of Computational Mathematics and Cybernetics
Issue: Volume 41, No. 2 (2017)
Pages: 97-103
Section: Article
URL: https://journals.rcsi.science/0278-6419/article/view/176185
DOI: https://doi.org/10.3103/S0278641917020030
ID: 176185

Abstract

A new algorithm is proposed for aligning text with speech audio signals up to several hours long. The algorithm allows the quality of its output to be evaluated effectively, and it places only modest requirements on the acoustic model. It can be used to design an audio–video course for learning the Russian language.

Keywords

aligning speech and text, audio–visual speech recognition

About the authors

I. Karpukhin

Department of Computational Mathematics and Cybernetics

Corresponding author
Email: karpuhini@yandex.ru
Russia, Moscow, 119991

A. Konushin

Department of Computational Mathematics and Cybernetics

Email: karpuhini@yandex.ru
Russia, Moscow, 119991
