Approximation polynomial algorithm for the data editing and data cleaning problem


如何引用文章

全文:

开放存取 开放存取
受限制的访问 ##reader.subscriptionAccessGranted##
受限制的访问 订阅存取

详细

The work considers the mathematical aspects of one of the most fundamental problems of data analysis: search (choice) among a collection of objects for a subset of similar ones. In particular, the problem appears in connection with data editing and cleaning (removal of irrelevant (not similar) elements). We consider the model of this problem, i.e., the problem of searching for a subset of maximal cardinality in a finite set of points of the Euclidean space for which quadratic variation of points with respect to its unknown centroid does not exceed a given fraction of the quadratic variation of points of the input set with respect to its centroid. It is proved that the problem is strongly NP-hard. A polynomial 1/2-approximation algorithm is proposed. The results of the numerical simulation demonstrating the effectiveness of the algorithm are presented.

作者简介

A. Ageev

Sobolev Institute of Mathematics

Email: kelm@math.nsc.ru
俄罗斯联邦, Novosibirsk

A. Kel’manov

Sobolev Institute of Mathematics; Novosibirsk State University

编辑信件的主要联系方式.
Email: kelm@math.nsc.ru
俄罗斯联邦, Novosibirsk; Novosibirsk

A. Pyatkin

Sobolev Institute of Mathematics; Novosibirsk State University

Email: kelm@math.nsc.ru
俄罗斯联邦, Novosibirsk; Novosibirsk

S. Khamidullin

Novosibirsk State University

Email: kelm@math.nsc.ru
俄罗斯联邦, Novosibirsk

V. Shenmaier

Sobolev Institute of Mathematics

Email: kelm@math.nsc.ru
俄罗斯联邦, Novosibirsk

补充文件

附件文件
动作
1. JATS XML

版权所有 © Pleiades Publishing, Ltd., 2017