APPLICATION OF COMPUTER SIMULATION TO THE ANONYMIZATION OF PERSONAL DATA: STATE-OF-THE-ART AND KEY POINTS
- Authors: BORISOV A.1, BOSOV A.1, IVANOV A.1
- Affiliations:
- Federal Research Center “Informatics and Management” RAS
- Issue: No. 4 (2023)
- Pages: 58-74
- Section: DATA ANALYSIS
- URL: https://journals.rcsi.science/0132-3474/article/view/137641
- DOI: https://doi.org/10.31857/S0132347423040040
- EDN: https://elibrary.ru/RDBXOS
- ID: 137641
Abstract
A new version of GInv (Gröbner Involutive) for computing involutive Gröbner bases is presented as a library in C++11. GInv uses object-oriented memory reallocation for dynamic data structures, such as lists, red-black trees, binary trees, and GMP libraries for arbitrary-precision integer calculations. The interface of the package is designed as a Python3 module.
About the authors
A. BORISOV
Federal Research Center “Informatics and Management” RAS
Email: aborisov@ipiran.ru
Moscow, Russia
A. BOSOV
Federal Research Center “Informatics and Management” RAS
Email: avbosov@ipiran.ru
Moscow, Russia
A. IVANOV
Federal Research Center “Informatics and Management” RAS
Corresponding author.
Email: aivanov@ipiran.ru
Moscow, Russia
References
- Aggarwal C.C., Yu P.S. A General Survey of Privacy-Preserving Data Mining Models and Algorithms. In: Aggarwal C.C., Yu P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems. 2008. V. 34. Springer, Boston, MA.
- Domingo-Ferrer J., Farràs O., Ribes-González J., Sánchez D. Privacy-preserving cloud computing on sensitive data: A survey of methods, products and challenges // Computer Communications. 2019. V. 140–141. P. 38–60.
- Sahi M.A. et al. Privacy Preservation in e-Healthcare Environments: State of the Art and Future Directions // IEEE Access. 2018. V. 6. P. 464–478. https://doi.org/10.1109/ACCESS.2017.2767561
- Spiekermann S., Cranor L.F. Engineering Privacy // IEEE Transactions on Software Engineering. 2009. V. 35. № 1. P. 67–82. https://doi.org/10.1109/TSE.2008.88
- Verykios V.S., Bertino E., Fovino I.N., Provenza L.P., Saygin Y., Theodoridis Y. State-of-the-art in privacy preserving data mining // ACM SIGMOD Record. 2004. V. 33. № 1.
- Guide to Basic Data Anonymization Technique. Personal Data Protection Commission, Singapore. 2018.
- Newton E., Sweeney L., Malin B. Preserving Privacy by De-identifying Facial Images // IEEE Transactions on Knowledge and Data Engineering. 2005.
- Sweeney L. Privacy-Preserving Bio-terrorism Surveillance // AAAI Spring Symposium, AI Technologies for Homeland Security. 2005.
- Sweeney L. AI Technologies to Defeat Identity Theft Vulnerabilities // AAAI Spring Symposium, AI Technologies for Homeland Security. 2005.
- Sweeney L., Gross R. Mining Images in Publicly-Available Cameras for Homeland Security // AAAI Spring Symposium, AI Technologies for Homeland Security. 2005.
- Agrawal R., Srikant R. Privacy-Preserving Data Mining // Proceedings of the ACM SIGMOD Conference. 2000.
- Agrawal D., Aggarwal C.C. On the Design and Quantification of Privacy-Preserving Data Mining Algorithms // ACM PODS Conference. 2002.
- Aggarwal G., Feder T., Kenthapadi K., Motwani R., Panigrahy R., Thomas D., Zhu A. Approximation Algorithms for k-anonymity. Journal of Privacy Technology. 2005. № 20051120001.
- Aggarwal C.C. On k-anonymity and the curse of dimensionality // VLDB Conference. 2005.
- LeFevre K., DeWitt D., Ramakrishnan R. Incognito: Full Domain K-Anonymity // ACM SIGMOD Conference. 2005.
- Meyerson A., Williams R. On the complexity of optimal k-anonymity // ACM PODS Conference. 2004.
- Machanavajjhala A., Gehrke J., Kifer D., Venkitasubramaniam M. L-Diversity: Privacy Beyond k-Anonymity // ICDE Conference. 2006.
- Li N., Li T., Venkatasubramanian S. t-Closeness: Privacy beyond k-anonymity and l-diversity // ICDE Conference. 2007.
- Dwork C., Nissim K. Privacy-Preserving Data Mining on Vertically Partitioned Databases // CRYPTO. 2004.
- Vaidya J., Clifton C. Privacy-Preserving Decision Trees over vertically partitioned data // Lecture Notes in Computer Science. 2005. V. 3654.
- Yu H., Vaidya J., Jiang X. Privacy-Preserving SVM Classification on Vertically Partitioned Data // PAKDD Conference. 2006.
- Verykios V.S., Elmagarmid A., Bertino E., Saygin Y., Dasseni E. Association Rule Hiding // IEEE Transactions on Knowledge and Data Engineering. 2004. V. 16. № 4.
- Moskowitz I., Chang L. A decision theoretic system for information downgrading // Joint Conference on Information Sciences. 2000.
- Adam N., Wortmann J.C. Security-Control Methods for Statistical Databases: A Comparison Study // ACM Computing Surveys. 1989. V. 21. № 4.
- Liew C.K., Choi U.J., Liew C.J. A data distortion by probability distribution // ACM TODS. 1985. V. 10. № 3. P. 395–411.
- Warner S.L. Randomized Response: A survey technique for eliminating evasive answer bias // Journal of American Statistical Association. 1965. V. 60. № 309. P. 63–69.
- Silverman B.W. Density Estimation for Statistics and Data Analysis. Chapman and Hall. 1986.
- Aggarwal C.C. On Randomization, Public Information and the Curse of Dimensionality // ICDE Conference. 2007.
- Gambs S., Kegl B., Aimeur E. Privacy-Preserving Boosting // Knowledge Discovery and Data Mining Journal. 2007. V. 14. № 1. P. 131–170.
- Zhang P., Tong Y., Tang S., Yang D. Privacy-Preserving Naive Bayes Classifier // Lecture Notes in Computer Science. 2005. V. 3584.
- Evfimievski A., Srikant R., Agrawal R., Gehrke J. Privacy-Preserving Mining of Association Rules // ACM KDD Conference. 2002.
- Rizvi S., Haritsa J. Maintaining Data Privacy in Association Rule Mining // VLDB Conference. 2002.
- Agrawal R., Srikant R., Thomas D. Privacy-Preserving OLAP // Proceedings of the ACM SIGMOD Conference. 2005.
- Polat H., Du W. SVD-based collaborative filtering with privacy // ACM SAC Symposium. 2005.
- Bertino E., Fovino I., Provenza L. A Framework for Evaluating Privacy-Preserving Data Mining Algorithms // Data Mining and Knowledge Discovery Journal. 2005. V. 11. P. 121–154.
- Evfimievski A., Gehrke J., Srikant R. Limiting Privacy Breaches in Privacy Preserving Data Mining // ACM PODS Conference. 2003.
- Huang Z., Du W., Chen B. Deriving Private Information from Randomized Data // ACM SIGMOD Conference. 2005. P. 37–48.
- Kargupta H., Datta S., Wang Q., Sivakumar K. On the Privacy Preserving Properties of Random Data Perturbation Techniques // ICDM Conference. 2003. P. 99–106.
- Johnson W., Lindenstrauss J. Extensions of Lipschitz Mappings into a Hilbert Space // Contemporary Math. 1984. V. 26. P. 189–206.
- Oliveira S.R.M., Zaiane O. Privacy Preserving Clustering by Data Transformation // Proc. 18th Brazilian Symp. Databases. 2003. P. 304–318.
- Oliveira S.R.M., Zaiane O. Data Perturbation by Rotation for Privacy-Preserving Clustering // Technical Report TR04-17, Department of Computing Science, University of Alberta, Edmonton, AB, Canada. 2004.
- Chen K., Liu L. Privacy-preserving data classification with rotation perturbation // ICDM Conference. 2005.
- Liu K., Kargupta H., Ryan J. Random Projection Based Multiplicative Data Perturbation for Privacy Preserving Distributed Data Mining // IEEE Transactions on Knowledge and Data Engineering. 2006. V. 18. № 1.
- Kim J., Winkler W. Multiplicative Noise for Masking Continuous Data // Technical Report Statistics 2003-01, Statistical Research Division, US Bureau of the Census, Washington D.C. 2003.
- Mukherjee S., Chen Z., Gangopadhyay S. A privacy-preserving technique for Euclidean distance-based mining algorithms using Fourier based transforms // VLDB Journal. 2006.
- Liu K., Giannella C., Kargupta H. An Attacker’s View of Distance Preserving Maps for Privacy-Preserving Data Mining // PKDD Conference. 2006.
- Fienberg S., McIntyre J. Data Swapping: Variations on a Theme by Dalenius and Reiss // Technical Report, National Institute of Statistical Sciences. 2003.
- Samarati P. Protecting Respondents’ Identities in Microdata Release // IEEE Trans. Knowl. Data Eng. 2001. V. 13. № 6. P. 1010–1027.
- Bayardo R.J., Agrawal R. Data Privacy through Optimal k-Anonymization // Proceedings of the ICDE Conference. 2005. P. 217–228.
- Fung B., Wang K., Yu P. Top-Down Specialization for Information and Privacy Preservation // ICDE Conference. 2005.
- Wang K., Yu P., Chakraborty S. Bottom-Up Generalization: A Data Mining Solution to Privacy Protection // ICDM Conference. 2004.
- Domingo-Ferrer J., Mateo-Sanz J. Practical data-oriented micro-aggregation for statistical disclosure control // IEEE TKDE. 2002. V. 14. № 1.
- Aggarwal G., Feder T., Kenthapadi K., Khuller S., Motwani R., Panigrahy R., Thomas D., Zhu A. Achieving Anonymity via Clustering // ACM PODS Conference. 2006.
- Aggarwal C.C., Yu P.S. A Condensation approach to privacy preserving data mining // EDBT Conference. 2004.
- Winkler W. Using simulated annealing for k-anonymity // Technical Report 7, US Census Bureau, Washington D.C. 20233. 2002.
- Iyengar V.S. Transforming Data to Satisfy Privacy Constraints // KDD Conference. 2002.
- Lakshmanan L., Ng R., Ramesh G. To Do or Not To Do: The Dilemma of Disclosing Anonymized Data // ACM SIGMOD Conference. 2005.
- Aggarwal C.C., Yu P.S. On Variable Constraints in Privacy-Preserving Data Mining // SIAM Conference. 2005.
- Xiao X., Tao Y. Personalized Privacy Preservation // ACM SIGMOD Conference. 2006.
- Wang K., Fung B.C.M. Anonymization for Sequential Releases // ACM KDD Conference. 2006.
- Pei J., Xu J., Wang Z., Wang W., Wang K. Maintaining k-Anonymity against Incremental Updates // Symposium on Scientific and Statistical Database Management. 2007.
- Aggarwal C.C., Yu P.S. On Privacy-Preservation of Text and Sparse Binary Data with Sketches // SIAM Conference on Data Mining. 2007.
- Aggarwal C.C., Yu P.S. On Anonymization of String Data // SIAM Conference on Data Mining. 2007.
- Martin D., Kifer D., Machanavajjhala A., Gehrke J., Halpern J. Worst-Case Background Knowledge // ICDE Conference. 2007.
- Pinkas B. Cryptographic Techniques for Privacy-Preserving Data Mining // ACM SIGKDD Explorations. 2002. V. 4. № 2.
- Even S., Goldreich O., Lempel A. A Randomized Protocol for Signing Contracts // Communications of the ACM. 1985. V. 28.
- Rabin M.O. How to exchange secrets by oblivious transfer // Technical Report TR-81, Aiken Computation Laboratory. 1981.
- Naor M., Pinkas B. Efficient Oblivious Transfer Protocols // SODA Conference. 2001.
- Yao A.C. How to Generate and Exchange Secrets // FOCS Conference. 1986.
- Chaum D., Crepeau C., Damgard I. Multiparty unconditionally secure protocols // ACM STOC Conference. 1988.
- Ioannidis I., Grama A., Atallah M. A secure protocol for computing dot-products in clustered and distributed environments // International Conference on Parallel Processing. 2002.
- Du W., Atallah M. Secure Multi-party Computation: A Review and Open Problems // CERIAS Technical Report 2001-51, Purdue University. 2001.
- Clifton C., Kantarcioglu M., Lin X., Zhu M. Tools for privacy preserving distributed data mining // ACM SIGKDD Explorations. 2002. V. 4. № 2.
- Lindell Y., Pinkas B. Privacy-Preserving Data Mining // CRYPTO. 2000.
- Kantarcioglu M., Vaidya J. Privacy-Preserving Naive Bayes Classifier for Horizontally Partitioned Data // IEEE Workshop on Privacy-Preserving Data Mining. 2003.
- Yu H., Jiang X., Vaidya J. Privacy-Preserving SVM using nonlinear Kernels on Horizontally Partitioned Data // SAC Conference. 2006.
- Yang Z., Zhong S., Wright R. Privacy-Preserving Classification of Customer Data without Loss of Accuracy // SDM Conference. 2006.
- Kantarcioglu M., Clifton C. Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data // IEEE TKDE Journal. 2004. V. 16. № 9.
- Inan A., Saygin Y., Savas E., Hintoglu A., Levi A. Privacy-Preserving Clustering on Horizontally Partitioned Data // Data Engineering Workshops. 2006.
- Jagannathan G., Wright R. Privacy-Preserving Distributed k-means clustering over arbitrarily partitioned data // ACM KDD Conference. 2005.
- Jagannathan G., Pillaipakkamnatt K., Wright R. A New Privacy-Preserving Distributed k-Clustering Algorithm // SIAM Conference on Data Mining. 2006.
- Polat H., Du W. Privacy-Preserving Top-N Recommendations on Horizontally Partitioned Data // Web Intelligence. 2005.
- Bawa M., Bayardo R.J., Agrawal R. Privacy-Preserving Indexing of Documents on the Network // VLDB Conference. 2003.
- Vaidya J., Clifton C. Privacy-Preserving Association Rule Mining in Vertically Partitioned Databases // ACM KDD Conference. 2002.
- Vaidya J., Clifton C. Privacy-Preserving Naive Bayes Classifier over vertically partitioned data // SIAM Conference. 2004.
- Vaidya J., Clifton C. Privacy-Preserving k-means clustering over vertically partitioned Data // ACM KDD Conference. 2003.
- Jiang W., Clifton C. Privacy-preserving distributed k-Anonymity // Proceedings of the IFIP 11.3 Working Conference on Data and Applications Security. 2005.
- Wang K., Fung B.C.M., Dong G. Integrating Private Databases for Data Analysis // Lecture Notes in Computer Science. 2005. V. 3495.
- Zhong S., Yang Z., Wright R. Privacy-enhancing k-anonymization of customer data // Proc. of the ACM SIGMOD-SIGACT-SIGART Principles of Database Systems, Baltimore, MD. 2005.
- Bettini C., Wang X.S., Jajodia S. Protecting Privacy against Location Based Personal Identification // Proc. of Secure Data Management Workshop, Trondheim, Norway. 2005.
- Gedik B., Liu L. A customizable k-anonymity model for protecting location privacy // ICDCS Conference. 2005.
- Mimoto T., Kiyomoto Sh., Miyaji A. Secure Data Management Technology // In Security Infrastructure Technology for Integrated Utilization of Big Data (T. Mimoto and A. Miyaji eds.), Singapore, Springer Open. 2020.
- Oliveira S.R.M., Zaiane O., Saygin Y. Secure Association-Rule Sharing // PAKDD Conference. 2004.
- Saygin Y., Verykios V., Clifton C. Using Unknowns to prevent discovery of Association Rules // ACM SIGMOD Record. 2001. V. 30. № 4.
- Atallah M., Elmagarmid A., Ibrahim M., Bertino E., Verykios V. Disclosure limitation of sensitive rules // Workshop on Knowledge and Data Engineering Exchange. 1999.
- Dasseni E., Verykios V., Elmagarmid A., Bertino E. Hiding Association Rules using Confidence and Support // 4th Information Hiding Workshop. 2001.
- Chang L., Moskowitz I. An integrated framework for database inference and privacy protection. Data and Applications Security. Kluwer. 2000.
- Saygin Y., Verykios V., Elmagarmid A. Privacy-Preserving Association Rule Mining // 12th International Workshop on Research Issues in Data Engineering. 2002.
- Wu Y.-H., Chiang C.-M., Chen A.L.P. Hiding Sensitive Association Rules with Limited Side Effects // IEEE Transactions on Knowledge and Data Engineering. 2007. V. 19. № 1.
- Aggarwal C., Pei J., Zhang B. A Framework for Privacy Preservation against Adversarial Data Mining // ACM KDD Conference. 2006.
- Chang L., Moskowitz I. Parsimonious downgrading and decision trees applied to the inference problem // New Security Paradigms Workshop. 1998.
- Natwichai J., Li X., Orlowska M. A Reconstruction-based Algorithm for Classification Rules Hiding // Australasian Database Conference. 2006.
- Kenthapadi K., Mishra N., Nissim K. Simulatable Auditing // ACM PODS Conference. 2005.
- Nabar S., Marthi B., Kenthapadi K., Mishra N., Motwani R. Towards Robustness in Query Auditing // VLDB Conference. 2006.
- Chawla S., Dwork C., McSherry F., Smith A., Wee H. Towards Privacy in Public Databases // TCC. 2005.
- Mishra N., Sandler M. Privacy vs Pseudorandom Sketches // ACM PODS Conference. 2006.
- Blum A., Dwork C., McSherry F., Nissim K. Practical Privacy: The SuLQ Framework // ACM PODS Conference. 2005.
- Dinur I., Nissim K. Revealing Information while preserving privacy // ACM PODS Conference. 2003.
- Dwork C., Kenthapadi K., McSherry F., Mironov I., Naor M. Our Data, Ourselves: Privacy via Distributed Noise Generation // EUROCRYPT. 2006.
- Dwork C., McSherry F., Nissim K., Smith A. Calibrating Noise to Sensitivity in Private Data Analysis // TCC. 2006.
- Wang K., Fung B.C.M., Yu P. Template based Privacy-Preservation in classification problems // ICDM Conference. 2005.
- Kifer D., Gehrke J. Injecting utility into anonymized datasets // SIGMOD Conference. 2006. P. 217–228.
- Xu J., Wang W., Pei J., Wang X., Shi B., Fu A.W.C. Utility Based Anonymization using Local Recoding // ACM KDD Conference. 2006.
- LeFevre K., DeWitt D., Ramakrishnan R. Workload Aware Anonymization // KDD Conference. 2006.
- Koudas N., Srivastava D., Yu T., Zhang Q. Aggregate Query Answering on Anonymized Tables // ICDE Conference. 2007.
- Malin B., Sweeney L. Re-identification of DNA through an automated linkage process // Proc. AMIA Symp. 2001. P. 423–427.
- Malin B. Why methods for genomic data privacy fail and what we can do to fix it // AAAS Annual Meeting, Seattle, WA. 2004.
- ARTICLE 29 DATA PROTECTION WORKING PARTY. Opinion 05/2014 on Anonymisation Techniques. Adopted on 10 April 2014.
- Sweeney L. Replacing Personally Identifiable Information in Medical Records, the Scrub System // Proc. AMIA Annu Fall Symp. 1996. P. 333–337.
- Sweeney L. Guaranteeing Anonymity while Sharing Data, the Datafly System // Proc. AMIA Annu Fall Symp. 1997. P. 51–55.
- Sweeney L. Privacy Technologies for Homeland Security // Testimony before the Privacy and Integrity Advisory Committee of the Department of Homeland Security, Boston, MA, June 15. 2005.
- Malin B., Sweeney L. Determining the identifiability of DNA database entries // Proc. AMIA Symp. 2000. P. 537–541.
- Malin B. Protecting DNA Sequence Anonymity with Generalization Lattices // Methods of Information in Medicine. 2005. V. 44. № 5. P. 687–692.
- Hodson H. Revealed: Google AI has access to huge haul of NHS patient data // New Scientist, 29 Apr 2016.
- Cadwalladr C., Graham-Harrison E. Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach // The Guardian, 17 Mar 2018.
- Harmon A. Indian tribe wins fight to limit research of its DNA // New York Times. 2010, April, 22.
- Meyer M. Law, Ethics & Science of Re-identification Demonstrations // Bill of Health: Examining the Intersection of Health Law, Biotechnology and Bioethics, Petrie Flom Center at Harvard University. 2021.
- Ohm P. Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization // UCLA Law Review. 2010. V. 57. P. 1700–1777.
- de Montjoye Y.-A., Radaelli L., Singh V.K., Pentland A. Unique in the shopping mall: on the reidentifiability of credit card metadata // Science. 2015. V. 347. P. 536–539.
- Golle P. Revisiting the uniqueness of simple demographics in the U.S. population // Workshop on privacy in the electronic society, New York, Association for Computing Machinery. 2006.
- Rocher L., Hendrickx J.M., de Montjoye Y.-A. Estimating the success of re-identifications in incomplete datasets using generative models // Nat. Commun. 2019. V. 10. № 1 (3069).
- Culnane C., Rubinstein B.I.P., Teague V. Health data in an open world // Preprint at: https://arxiv.org/abs/1712.05627. 2017.
- Siddle J. I know where you were last summer: London’s public bike data is telling everyone where you’ve been // vartree.blogspot.com. 2014.
- Lavrenovs A., Podins K. Privacy violations in Riga open data public transport system // 2016 IEEE 4th Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE), Vilnius, Lithuania. 2016. P. 1–6.
- Narayanan A., Shmatikov V. Robust De-anonymization of Large Sparse Datasets // IEEE Symposium on Security and Privacy. 2008. P. 111–125.