Machine learning methods for analyzing user behavior when accessing text data in information security problems
- Authors: Mashechkin I.V. (1), Petrovskii M.I. (1), Tsarev D.V. (1)
- Affiliations:
  1. Department of Computational Mathematics and Cybernetics
- Issue: Vol 40, No 4 (2016)
- Pages: 179-184
- Section: Article
- URL: https://journals.rcsi.science/0278-6419/article/view/176155
- DOI: https://doi.org/10.3103/S0278641916040051
- ID: 176155
Abstract
A new method is proposed for detecting user access to irrelevant documents. It estimates how strongly a document's text belongs to the subject areas typical of the analyzed user. These typical subject areas are formed by subject area (topic) modeling implemented via orthonormal nonnegative matrix factorization. An experimental study on real corporate correspondence drawn from the Enron data set demonstrates the high classification accuracy of the proposed method compared with traditional approaches.
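The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain multiplicative-update nonnegative matrix factorization stands in for the orthonormal variant, the scoring rule (cosine between a document and its least-squares projection onto the topic space, scaled by in-vocabulary coverage) is an assumption, and the function names and toy documents are hypothetical rather than taken from the Enron data set.

```python
import numpy as np

def term_doc_matrix(docs, vocab):
    """Term-by-document count matrix restricted to `vocab`."""
    index = {term: i for i, term in enumerate(vocab)}
    V = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in index:
                V[index[token], j] += 1.0
    return V

def nmf(V, k, iters=300, seed=0, eps=1e-9):
    """Plain NMF, V ~= W @ H, via multiplicative updates.
    Assumption: used here in place of orthonormal NMF."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relevance(doc, vocab, W, eps=1e-12):
    """Hypothetical score in [0, 1]: how well the user's topics explain `doc`."""
    tokens = doc.lower().split()
    d = term_doc_matrix([doc], vocab)[:, 0]
    if not tokens or not d.any():
        return 0.0  # nothing in the document matches the user's vocabulary
    # Least-squares projection of the document onto the span of the topics.
    coef, *_ = np.linalg.lstsq(W, d, rcond=None)
    projection = W @ coef
    cosine = projection @ d / (np.linalg.norm(projection) * np.linalg.norm(d) + eps)
    coverage = d.sum() / len(tokens)  # share of tokens known to the user model
    return float(cosine * coverage)

# Toy stand-in for one user's typical correspondence (invented text).
user_docs = [
    "gas pipeline contract pricing",
    "pipeline capacity contract renewal",
    "power trading desk schedule",
    "trading desk power pricing",
]
vocab = sorted({t for doc in user_docs for t in doc.lower().split()})
W, H = nmf(term_doc_matrix(user_docs, vocab), k=2)

on_topic = relevance("pipeline contract pricing", vocab, W)
off_topic = relevance("football match tickets", vocab, W)
print(on_topic, off_topic)
```

A document whose score falls below a chosen threshold would be flagged as irrelevant for that user; how the threshold is set is left open in this sketch.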
About the authors
I. V. Mashechkin
Department of Computational Mathematics and Cybernetics
Author for correspondence.
Email: mash@cs.msu.su
Russian Federation, Moscow, 119991
M. I. Petrovskii
Department of Computational Mathematics and Cybernetics
Email: mash@cs.msu.su
Russian Federation, Moscow, 119991
D. V. Tsarev
Department of Computational Mathematics and Cybernetics
Email: mash@cs.msu.su
Russian Federation, Moscow, 119991
