On the Representation of Results of Binary Code Reverse Engineering

V. A. Padaryan; I. N. Ledovskikh

doi:10.1134/S0361768818030064

Authors: Padaryan V.A.¹, Ledovskikh I.N.¹
Affiliations:
1. Institute for System Programming
Issue: Vol 44, No 3 (2018)
Pages: 200-206
Section: Article
URL: https://journals.rcsi.science/0361-7688/article/view/176615
DOI: https://doi.org/10.1134/S0361768818030064
ID: 176615

Abstract

The representation of algorithms extracted from binary code by reverse engineering is discussed. Both intermediate representations designed for automatic analysis and final representations passed to the end user are considered. The two main tasks of reverse engineering are analyzed: automatic detection of exploitable vulnerabilities and discovery of undocumented features. The basic scheme of a system that automatically detects exploitable vulnerabilities is presented, together with the key properties of the intermediate representation designed for this problem, which allows a system of equations for an SMT solver to be generated efficiently. The workflow for discovering undocumented features is described. It consists of three steps: localizing the algorithm, representing it in a form convenient for analysis, and investigating its properties. To automate the first step, a combined static and dynamic representation is constructed that includes OS-level events and calls to library functions; these serve as anchor points that the analyst uses to localize the algorithm. Localization is further supported by code slicing and navigation algorithms. Once the algorithm is localized, work proceeds in two directions: interactive construction of a compact annotated flowchart of the algorithm, and automated investigation of the algorithm's properties aimed at determining declared and undeclared data flows. The flowchart representation is based on constructing simplified models of functions that take input and output buffers into account and on automatically detecting data dependences between the buffers of different function calls. The overall scenario of the analyst's work with such a flowchart when discovering undocumented features is described; it is based on annotating the declared data flows and automatically detecting the undeclared ones. In conclusion, an example of the resulting representation is discussed and directions for further research are outlined.
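
The idea of simplified function models with input and output buffers can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Buffer`, `CallModel`, and `buffer_dependences`, the example trace, and the byte-granular "last writer" rule are all assumptions made for illustration. Each call is reduced to the buffers it reads and writes, a dependence is recorded whenever a call reads bytes last written by an earlier call, and any dependence the analyst has not annotated as declared is reported as undeclared.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Buffer:
    start: int  # address of the first byte
    size: int   # length in bytes


@dataclass
class CallModel:
    # Hypothetical simplified model of one function call in a trace.
    name: str
    inputs: List[Buffer] = field(default_factory=list)
    outputs: List[Buffer] = field(default_factory=list)


def buffer_dependences(trace: List[CallModel]) -> List[Tuple[int, int]]:
    """Return (producer, consumer) index pairs for calls linked by buffers."""
    last_writer: Dict[int, int] = {}  # byte address -> index of last writing call
    edges: List[Tuple[int, int]] = []
    for i, call in enumerate(trace):
        # Inputs are read before this call's own outputs are recorded.
        for buf in call.inputs:
            for addr in range(buf.start, buf.start + buf.size):
                j = last_writer.get(addr)
                if j is not None and (j, i) not in edges:
                    edges.append((j, i))
        for buf in call.outputs:
            for addr in range(buf.start, buf.start + buf.size):
                last_writer[addr] = i
    return edges


# Illustrative trace (addresses and function names are made up).
trace = [
    CallModel("recv", outputs=[Buffer(0x1000, 16)]),
    CallModel("decrypt", inputs=[Buffer(0x1000, 16)], outputs=[Buffer(0x2000, 16)]),
    CallModel("send", inputs=[Buffer(0x2000, 16)]),
    CallModel("log_write", inputs=[Buffer(0x1008, 4)]),
]

edges = buffer_dependences(trace)

# Data flows the analyst has annotated as declared: recv -> decrypt -> send.
declared = {(0, 1), (1, 2)}
undeclared = [e for e in edges if e not in declared]

for producer, consumer in undeclared:
    print(f"undeclared flow: {trace[producer].name} -> {trace[consumer].name}")
```

In this toy trace the only flow outside the annotated set is the one from `recv` to `log_write`, which is the kind of edge an analyst would inspect first when looking for undocumented behavior.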

Keywords

binary code, combined analysis, intermediate representation

About the authors

V. Padaryan

Institute for System Programming

Corresponding author.
Email: vartan@ispras.ru
Russia, Moscow, 109004

I. Ledovskikh

Institute for System Programming

Email: vartan@ispras.ru
Russia, Moscow, 109004
