A probabilistically entropic mechanism of topical clusterisation along with thematic annotation for evolution analysis of meaningful social information of internet sources
- Authors: Gydovskikh D.V. [1], Moloshnikov I.A. [1], Naumov A.V. [1,2], Rybka R.B. [1,3], Sboev A.G. [1,2,3,4], Selivanov A.A. [1]
- Affiliations:
  - [1] National Research Center Kurchatov Institute
  - [2] National Research Nuclear University MEPhI
  - [3] Moscow Technological University (MIREA)
  - [4] Plekhanov Russian University of Economics
- Issue: Vol 38, No 5 (2017)
- Pages: 910-913
- Section: Article
- URL: https://journals.rcsi.science/1995-0802/article/view/200035
- DOI: https://doi.org/10.1134/S1995080217050134
- ID: 200035
Abstract
An approach is presented for monitoring the temporal evolution of thematic clusters and evaluating the relations between them on the basis of probabilistic and entropy methods. It produces a temporal map of nested topics, each with a short annotation, related to a predetermined main theme. Methods of semantic text analysis are used to generate the topics and to find the most emotive ones, which reflect social significance. The word2vec technology is used to determine the relations between topics and to evaluate their proximity to the main theme.
To improve usability, the nested topics are visualized through a web interface. The proposed approach complements popular software for analyzing large volumes of data, such as Elasticsearch (search for thematically similar documents). Results are presented for a case study analyzing the theme “AEROFLOT” on a news corpus of 3 million messages.
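As an illustration of how the proximity of topics to a main theme might be scored with word embeddings, the following minimal sketch ranks topics by cosine similarity between averaged word vectors. It is not the authors' implementation: the vectors in `TOY_VECTORS` are invented three-dimensional values standing in for the output of a trained word2vec model, and the function names `mean_vector`, `cosine`, and `rank_topics` as well as the averaging scheme are assumptions made for the example.

```python
import math

# Hypothetical toy "embeddings". In practice these would come from a
# trained word2vec model; the numbers here are made up for illustration.
TOY_VECTORS = {
    "aeroflot": [0.90, 0.10, 0.00],
    "flight":   [0.80, 0.20, 0.10],
    "airline":  [0.85, 0.15, 0.05],
    "football": [0.00, 0.90, 0.30],
    "match":    [0.10, 0.80, 0.40],
}


def mean_vector(words, vectors):
    """Average the vectors of all words that have an embedding."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        raise ValueError("none of the words has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_topics(theme_words, topics, vectors):
    """Sort topics by similarity to the main theme, most similar first."""
    theme = mean_vector(theme_words, vectors)
    scores = {
        name: cosine(theme, mean_vector(words, vectors))
        for name, words in topics.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Assumed example topics, each described by a few keywords.
topics = {
    "air travel": ["flight", "airline"],
    "sport": ["football", "match"],
}
ranking = rank_topics(["aeroflot"], topics, TOY_VECTORS)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy values the "air travel" topic ranks above "sport" for the theme word "aeroflot". Averaging keyword vectors is only one simple way to represent a topic; the abstract does not specify which representation the paper uses.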
About the authors
D. V. Gydovskikh
National Research Center Kurchatov Institute
Author for correspondence.
Email: dmitrygagus@gmail.com
Russian Federation, Moscow
I. A. Moloshnikov
National Research Center Kurchatov Institute
Email: dmitrygagus@gmail.com
Russian Federation, Moscow
A. V. Naumov
National Research Center Kurchatov Institute; National Research Nuclear University MEPhI
Email: dmitrygagus@gmail.com
Russian Federation, Moscow; Moscow
R. B. Rybka
National Research Center Kurchatov Institute; Moscow Technological University (MIREA)
Email: dmitrygagus@gmail.com
Russian Federation, Moscow; Moscow
A. G. Sboev
National Research Center Kurchatov Institute; National Research Nuclear University MEPhI; Moscow Technological University (MIREA); Plekhanov Russian University of Economics
Email: dmitrygagus@gmail.com
Russian Federation, Moscow; Moscow; Moscow; Moscow
A. A. Selivanov
National Research Center Kurchatov Institute
Email: dmitrygagus@gmail.com
Russian Federation, Moscow