Machine learning methods for analyzing user behavior when accessing text data in information security problems


Abstract

A new method is proposed for detecting user access to irrelevant documents. It estimates the degree to which a document's text belongs to the subject areas typical of the analyzed user. These typical subject areas are formed using subject area modeling implemented via orthonormal nonnegative matrix factorization. An experimental study on real corporate correspondence drawn from the Enron data set demonstrates the high classification accuracy of the proposed method compared to traditional approaches.
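
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions: plain nonnegative matrix factorization with Lee-Seung multiplicative updates stands in for the orthonormal variant, the membership score is taken to be the share of a document vector's norm that lies in the span of the learned topic basis, and the toy documents and all function names (`build_term_matrix`, `nmf`, `membership`) are hypothetical.

```python
import numpy as np

def build_term_matrix(docs, vocab):
    """Term-by-document count matrix (rows: terms, columns: documents)."""
    index = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in index:
                A[index[token], j] += 1.0
    return A

def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF, A ~ W @ H, via Lee-Seung multiplicative updates.
    Assumption: stands in for the orthonormal NMF named in the abstract."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k)) + eps
    H = rng.random((k, A.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def membership(W, x, eps=1e-9):
    """Fraction of the norm of x that lies in the span of the topic basis W.
    Assumption: an illustrative score, not the estimator used in the paper."""
    coef, *_ = np.linalg.lstsq(W, x, rcond=None)
    return float(np.linalg.norm(W @ coef) / (np.linalg.norm(x) + eps))

# Hypothetical toy data: documents one user typically works with.
user_docs = [
    "gas pipeline contract price",
    "pipeline contract gas delivery",
    "meeting schedule agenda monday",
    "schedule meeting agenda room",
]
relevant_doc = "gas pipeline contract"
irrelevant_doc = "football game tickets"

vocab = sorted({t for d in user_docs + [relevant_doc, irrelevant_doc]
                for t in d.split()})

# Learn the user's typical subject areas from the user's own documents only.
W, _ = nmf(build_term_matrix(user_docs, vocab), k=2)

relevant_score = membership(W, build_term_matrix([relevant_doc], vocab)[:, 0])
irrelevant_score = membership(W, build_term_matrix([irrelevant_doc], vocab)[:, 0])
```

A document whose score falls below a chosen threshold would be flagged as irrelevant for that user. How to pick the threshold is left open here.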

About the authors

I. V. Mashechkin

Department of Computational Mathematics and Cybernetics

Author for correspondence.
Email: mash@cs.msu.su
Russian Federation, Moscow, 119991

M. I. Petrovskii

Department of Computational Mathematics and Cybernetics

Email: mash@cs.msu.su
Russian Federation, Moscow, 119991

D. V. Tsarev

Department of Computational Mathematics and Cybernetics

Email: mash@cs.msu.su
Russian Federation, Moscow, 119991

Copyright (c) 2016 Allerton Press, Inc.