Investigation of features for extraction of named entities from texts in Russian

V. A. Mozharova; N. V. Lukashevich

doi:10.3103/S0005105517030049

Authors: Mozharova V.A.¹, Lukashevich N.V.²
Affiliations:
1. Department of Computational Mathematics and Cybernetics
2. Scientific Research Computational Center
Issue: Volume 51, No. 3 (2017)
Pages: 127-134
Section: Text Processing Automation
URL: https://journals.rcsi.science/0005-1055/article/view/150171
DOI: https://doi.org/10.3103/S0005105517030049
ID: 150171

Abstract

This paper examines features used in machine learning approaches to extracting named entities from Russian texts: features of the token itself (lexeme), as well as vocabulary, contextual, cluster, and two-stage features. The contribution of each feature type to the quality of named entity extraction is studied. A CRF classifier serves as the machine learning method in the experiments described. Feature contributions are compared on two open collections using the F-measure.
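
To make the feature groups concrete, the following is a minimal Python sketch, not the authors' implementation. It builds token-level, vocabulary, and contextual features for one token in the dictionary form that CRF toolkits commonly accept, and computes the F-measure from raw counts. The names `GAZETTEER`, `token_features`, and `f_measure`, the particular features chosen, and the tiny word lists are all assumptions made for illustration; cluster and two-stage features are omitted because they depend on resources the abstract does not describe.

```python
from typing import Dict, List, Set

# Hypothetical mini-vocabulary (assumption): the paper's actual word lists
# are not given in the abstract. Entries are lowercase base forms.
GAZETTEER: Dict[str, Set[str]] = {
    "first_name": {"иван", "анна"},
    "city": {"москва", "казань"},
}


def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, object]:
    """Return a feature dict for tokens[i] (illustrative feature set only)."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        # Features of the token itself
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(ch.isdigit() for ch in tok),
        "suffix3": tok[-3:].lower(),
    }
    # Vocabulary features: membership of the lowercased token in each list.
    # A real system for Russian would likely look up the lemma instead.
    for name, words in GAZETTEER.items():
        feats[f"in_{name}"] = tok.lower() in words
    # Contextual features: copy basic features of neighbors within the window.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        if 0 <= j < len(tokens):
            feats[f"{off:+d}:lower"] = tokens[j].lower()
            feats[f"{off:+d}:is_capitalized"] = tokens[j][:1].isupper()
        else:
            # Mark positions that fall outside the sentence
            feats[f"{off:+d}:pad"] = True
    return feats


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Calling `token_features(["Иван", "любит", "Казань"], 2)` yields a dictionary in which `in_city` is true and the left neighbor appears under the key `-1:lower`. A list of such dictionaries per sentence could then be passed to a CRF toolkit of the reader's choice.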

Keywords

named entity, information extraction, machine learning

About the authors

V. Mozharova

Department of Computational Mathematics and Cybernetics

Corresponding author
Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991

N. Lukashevich

Scientific Research Computational Center

Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991
