Reverse Engineering of Software Using the Smart Brute Force Method: Step-by-Step Scheme
- Authors: Izrailov K.E. (1), Buinevich M.V. (2)
- Affiliations:
  1. Saint-Petersburg University of State Fire Service of EMERCOM of Russia
  2. MIREA – Russian Technological University
- Issue: Vol 11, No 4 (2025)
- Pages: 129-142
- Section: INFORMATION TECHNOLOGIES AND TELECOMMUNICATION
- URL: https://journals.rcsi.science/1813-324X/article/view/309040
- DOI: https://doi.org/10.31854/1813-324X-2025-11-4-129-142
- EDN: https://elibrary.ru/UOKLHB
- ID: 309040
Abstract
Introduction: software vulnerabilities are one of the leading causes of threats to information security. Such vulnerabilities can be countered by searching for them directly in the program code and correcting it. This requires converting the executable code to a higher-level representation that is better suited to finding and fixing them; however, for a number of reasons, existing solutions cannot be considered satisfactory. One of these solutions, an exhaustive search over all possible source code variants that compile to a given machine code, is extremely costly in every respect.
Purpose: to develop a less costly and more efficient method of exhaustive search through source code variants.
Methods: quantitative and qualitative comparison of different source code generators, and formalization of the method by writing it in analytical form.
Results: a 7-step scheme is proposed for selecting an instance of source code that corresponds to a given machine code; the authors call the method "smart" because it combines the syntactic constructs of the programming language optimally. Code generation is based on iterating through paths in the graph of syntactic rules that represents the formal syntax of a programming language in a given space. The syntax is supplied as a parameter, which makes the steps of the scheme completely independent of the programming language of the source code. After multiple instances of the source code are generated, they are compiled into machine code and compared with the specified instance; if they match, the task of decompilation by smart exhaustive search is considered solved.
Practical significance: despite the time cost of exhaustive search in such tasks, the smart brute force method has shown expert-assessed efficiency in a number of application scenarios; thus, it can be applied directly to reverse engineering.
Discussion: the "smart" exhaustive search can be significantly improved by qualitative optimization using genetic algorithms.
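The generate, compile, and compare loop described in the Results can be illustrated with a minimal sketch. It is not the authors' 7-step scheme: the toy grammar, the function names (`expand`, `fingerprint`, `smart_brute_force`), and the depth limit are all hypothetical, and Python's built-in `compile()` with its bytecode stands in for a real compiler and real machine code. The grammar is passed around as plain data, which mirrors the idea of treating the syntax as a parameter.

```python
from itertools import product

# Hypothetical toy grammar (an assumption, not taken from the paper):
# each nonterminal maps to a list of alternatives, each a sequence of symbols.
GRAMMAR = {
    "expr": [["term"], ["term", "op", "term"]],
    "term": [["x"], ["y"], ["1"], ["2"]],
    "op": [["+"], ["*"], ["-"]],
}


def expand(symbol, depth, grammar=GRAMMAR):
    """Yield every source string derivable from `symbol` by following
    paths through the rule graph, using at most `depth` nested rules."""
    if symbol not in grammar:
        yield symbol  # terminal token: emit as-is
        return
    if depth == 0:
        return  # depth budget exhausted on a nonterminal
    for alternative in grammar[symbol]:
        parts = [list(expand(s, depth - 1, grammar)) for s in alternative]
        for combo in product(*parts):
            yield " ".join(combo)


def fingerprint(source):
    """Compile a candidate and return what is compared against the target.
    Assumption: Python bytecode plays the role of machine code here."""
    code = compile(source, "<candidate>", "eval")
    return (code.co_code, code.co_consts, code.co_names)


def smart_brute_force(target, start="expr", max_depth=3, grammar=GRAMMAR):
    """Return the first generated source whose compiled form equals `target`,
    or None if no candidate within the depth limit matches."""
    for candidate in expand(start, max_depth, grammar):
        if fingerprint(candidate) == target:
            return candidate
    return None


if __name__ == "__main__":
    target = fingerprint("x * 2")  # pretend only the compiled form is known
    print(smart_brute_force(target))
```

In this sketch the search stops at the first match, so when several source variants compile to the same code, only one of them is recovered. The depth limit is what keeps the enumeration finite; a real grammar would need a much stronger pruning strategy than the plain enumeration used here.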
About the authors
K. E. Izrailov
Saint-Petersburg University of State Fire Service of EMERCOM of Russia
Email: konstantin.izrailov@mail.ru
M. V. Buinevich
MIREA – Russian Technological University
Email: bmv1958@yandex.ru