Unsupervised Graph Anomaly Detection Algorithms Implemented in Apache Spark
- Authors: Semenov A. (1), Mazeev A. (1), Doropheev D. (2), Yusubaliev T. (3)
- Affiliations:
  - (1) Scientific Research Centre for Electronic Computer Technology (NICEVT) JSC
  - (2) Moscow Institute of Physics and Technology (State University)
  - (3) Quality Software Solutions Ltd.
- Issue: Vol 39, No 9 (2018)
- Pages: 1262-1269
- Section: Part 1. Special issue “High Performance Data Intensive Computing” Editors: V. V. Voevodin, A. S. Simonov, and A. V. Lapin
- URL: https://journals.rcsi.science/1995-0802/article/view/203172
- DOI: https://doi.org/10.1134/S1995080218090184
- ID: 203172
Abstract
The graph anomaly detection problem arises in many application areas. It can be solved by spotting outliers in unstructured collections of multi-dimensional data points, which in turn can be obtained by graph analysis algorithms. We implement an algorithm for small community analysis and an approximate Local Outlier Factor (LOF) algorithm based on Locality-Sensitive Hashing, apply both algorithms to a real-world graph, and evaluate their scalability. The implementation uses Apache Spark, one of the most popular Big Data frameworks.
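To make the outlier-scoring idea concrete, the following is a minimal sketch of the exact LOF score computed by brute-force nearest-neighbour search in plain Python. It is not the authors' implementation: the paper's version is approximate, uses Locality-Sensitive Hashing to find neighbours, and runs on Apache Spark. The function name `lof_scores`, the parameter `k`, and the toy data are assumptions chosen for illustration. The sketch also assumes all points are distinct and takes exactly `k` neighbours per point, ignoring distance ties.

```python
import math


def lof_scores(points, k):
    """Exact Local Outlier Factor for a small list of distinct points.

    Illustrative only: O(n^2 log n) brute force, no LSH, no Spark.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # For every point, find its k nearest neighbours and its k-distance
    # (the distance to the k-th nearest neighbour).
    neighbours = []
    k_dist = []
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        nearest = ranked[:k]
        neighbours.append([j for _, j in nearest])
        k_dist.append(nearest[-1][0])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_dist(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [
            max(k_dist[j], math.dist(points[i], points[j]))
            for j in neighbours[i]
        ]
        lrd.append(k / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]


# Toy data (hypothetical): a 3x3 grid of inliers plus one distant point.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
points = grid + [(10.0, 10.0)]
scores = lof_scores(points, k=3)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
print(points[most_anomalous], scores[most_anomalous])
```

Scores close to 1 indicate a point whose local density matches that of its neighbours, while scores well above 1 indicate a point in a much sparser region. The quadratic neighbour search is the step that an LSH-based approximation is meant to avoid on large inputs.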
About the authors
A. Semenov
Scientific Research Centre for Electronic Computer Technology (NICEVT) JSC
Author for correspondence.
Email: semenov@nicevt.ru
Russian Federation, Varshavskoe sh. 125, Moscow, 117587
A. Mazeev
Scientific Research Centre for Electronic Computer Technology (NICEVT) JSC
Email: semenov@nicevt.ru
Russian Federation, Varshavskoe sh. 125, Moscow, 117587
D. Doropheev
Moscow Institute of Physics and Technology (State University)
Email: semenov@nicevt.ru
Russian Federation, Institutskii per. 9, Dolgoprudny, Moscow oblast, 141701
T. Yusubaliev
Quality Software Solutions Ltd.
Email: semenov@nicevt.ru
Russian Federation, Moscow