Investigation of features for extraction of named entities from texts in Russian

V. A. Mozharova; N. V. Lukashevich

doi:10.3103/S0005105517030049

Authors: Mozharova V.A.¹, Lukashevich N.V.²
Affiliations:
1. Department of Computational Mathematics and Cybernetics
2. Scientific Research Computational Center
Issue: Vol 51, No 3 (2017)
Pages: 127-134
Section: Text Processing Automation
URL: https://journals.rcsi.science/0005-1055/article/view/150171
DOI: https://doi.org/10.3103/S0005105517030049
ID: 150171

Abstract

This paper examines features used in machine-learning approaches to extracting named entities from Russian texts: features of the token itself (lexeme), as well as vocabulary, contextual, cluster, and two-stage features. The contribution of each feature to the quality of named entity extraction is studied. The experiments described in the paper use a CRF classifier as the machine-learning method. Feature contributions are compared on two open collections using the F-measure.
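
To make the feature categories named above more concrete, the following is a minimal sketch, not the authors' implementation. The function names, feature names, the toy gazetteer, and the cluster map are all hypothetical; two-stage features and the CRF classifier itself are omitted. It shows how token, vocabulary, contextual, and cluster features might be collected into one dictionary per token, and how an entity-level F-measure can be computed from sets of gold and predicted entities.

```python
from typing import Dict, List, Set, Tuple


def token_features(
    tokens: List[str],
    i: int,
    gazetteer: Set[str],       # hypothetical vocabulary of known entity words
    clusters: Dict[str, str],  # hypothetical word -> cluster id map
    window: int = 1,
) -> Dict[str, object]:
    """Build an illustrative feature dictionary for tokens[i]."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        # Features of the token itself
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(ch.isdigit() for ch in tok),
        # Vocabulary feature: membership in a word list
        "in_gazetteer": tok.lower() in gazetteer,
        # Cluster feature: id of the word's cluster, if known
        "cluster": clusters.get(tok.lower(), "NONE"),
    }
    # Contextual features: neighbouring tokens within the window
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"ctx[{off:+d}]"] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feats


Entity = Tuple[int, int, str]  # (start, end, type)


def f_measure(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Balanced F-measure (harmonic mean of precision and recall) over exact matches."""
    tp = len(gold & predicted)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


tokens = ["Анна", "живёт", "в", "Москве"]
feats = token_features(tokens, 3, gazetteer={"москве"}, clusters={"анна": "c17"})
score = f_measure({(0, 1, "PER"), (3, 4, "LOC")}, {(3, 4, "LOC")})
```

In a setup like this, one such dictionary per token would be passed to a CRF toolkit as that token's observation, and the contribution of a feature group could be estimated by comparing F-measure with and without it.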

Keywords

named entity, information extraction, machine learning

About the authors

V. Mozharova

Department of Computational Mathematics and Cybernetics

Corresponding author.
Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991

N. Lukashevich

Scientific Research Computational Center

Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991

