🔧На сайте запланированы технические работы
25.12.2025 в промежутке с 18:00 до 21:00 по Московскому времени (GMT+3) на сайте будут проводиться плановые технические работы. Возможны перебои с доступом к сайту. Приносим извинения за временные неудобства. Благодарим за понимание!
🔧Site maintenance is scheduled.
Scheduled maintenance will be performed on the site from 6:00 PM to 9:00 PM Moscow time (GMT+3) on December 25, 2025. Site access may be interrupted. We apologize for the inconvenience. Thank you for your understanding!

 

Investigation of features for extraction of named entities from texts in Russian


Дәйексөз келтіру

Толық мәтін

Ашық рұқсат Ашық рұқсат
Рұқсат жабық Рұқсат берілді
Рұқсат жабық Тек жазылушылар үшін

Аннотация

This paper considers various features for extracting named entities from texts in Russian, which are used within the approaches based on machine learning, including the features of a token itself (lexeme), as well as vocabulary, contextual, cluster, and two-stage features. The contribution of each feature to improving the quality of extraction of named entities is studied. The CRF-classifier is used as a method of machine learning in the experiments that are described in this paper. The contribution of features is compared based on two open collections using the F-measure.

Негізгі сөздер

Авторлар туралы

V. Mozharova

Department of Computational Mathematics and Cybernetics

Хат алмасуға жауапты Автор.
Email: valerie.mozharova@gmail.com
Ресей, Moscow, 119991

N. Lukashevich

Scientific Research Computational Center

Email: valerie.mozharova@gmail.com
Ресей, Moscow, 119991

Қосымша файлдар

Қосымша файлдар
Әрекет
1. JATS XML

© Allerton Press, Inc., 2017