Classification by compression: Application of information-theory methods for the identification of themes of scientific texts


Abstract

A method for the automatic classification of scientific texts based on data compression is proposed. The method is implemented and evaluated on data from the arXiv.org archive of scientific texts and from the CyberLeninka scientific electronic library (CyberLeninka.ru). In experiments, the method correctly identified the themes of scientific texts in 75–95% of cases; its accuracy depends on the quality of the source data.
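The abstract does not say which compressor or decision rule the authors used, so the following is only a minimal sketch of the general idea of classification by compression, under stated assumptions. It assumes one reference corpus per theme, uses zlib as a stand-in compressor, and assigns a text to the theme whose corpus grows least in compressed size when the text is appended. The names `compressed_size`, `classify`, and `TOY_CORPORA` are hypothetical and do not come from the article.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (zlib is an assumed stand-in)."""
    return len(zlib.compress(data, 9))


def classify(text: str, corpora: Mapping[str, str]) -> str:
    """Return the label whose corpus 'absorbs' the text most cheaply.

    For each theme, measure how many extra compressed bytes are needed
    when the text is appended to that theme's reference corpus, and
    pick the theme with the smallest increase.
    """
    sample = text.encode("utf-8")

    def extra_bytes(label: str) -> int:
        corpus = corpora[label].encode("utf-8")
        return compressed_size(corpus + b"\n" + sample) - compressed_size(corpus)

    return min(corpora, key=extra_bytes)


# Toy reference corpora, invented for illustration only (not data from the article).
TOY_CORPORA = {
    "physics": (
        "quantum field theory describes particle interactions and "
        "gauge symmetry in the standard model of particle physics "
    ),
    "biology": (
        "gene expression in the cell is regulated by proteins that "
        "bind to dna and control transcription of messenger rna "
    ),
}

if __name__ == "__main__":
    print(classify("gauge symmetry in the standard model of particle physics", TOY_CORPORA))
```

In this sketch the text shares long substrings with the corpus of its own theme, so the compressor can encode it with back-references and the size increase is small. Real corpora would be far larger, and zlib's 32 KB window would then limit how much of a corpus the compressor can exploit, so a different compressor might be needed in practice.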

About the authors

I. Selivanova

The State Public Scientific Technological Library, Siberian Branch

Author for correspondence.
Email: selivanova@ict.sbras.ru
Russian Federation, Novosibirsk, 123298

B. Ryabko

Novosibirsk State University; Institute of Computational Technologies, Siberian Branch

Email: selivanova@ict.sbras.ru
Russian Federation, Novosibirsk, 630090; Novosibirsk, 630090

A. Guskov

Novosibirsk State University; Institute of Computational Technologies, Siberian Branch

Email: selivanova@ict.sbras.ru
Russian Federation, Novosibirsk, 630090; Novosibirsk, 630090

Copyright © Allerton Press, Inc., 2017