How to Use C Compiler for Reverse Engineering
Member 13737597 - 27/Jul/2020
[SHOWTOGROUPS=4,20,22]
This article shows how to bring the power of a real C compiler to reverse engineering.
Make your reverse engineering effort successful. Switch to real C, and say no to pseudocode.
Introduction
We can use IDA to examine assembly code and translate the disassembly into pseudo-C code by hand. We can even use the Hex-Rays decompiler. However, no matter how good the pseudocode is, it is still pseudocode: it cannot be compiled or tested. The chance of errors and ambiguities grows with the amount of pseudocode, and eventually we can get lost.
I thought about another method of reverse engineering. We still use IDA to inspect the disassembly, but we translate it into real C code function by function and force the inspected program to use our code instead. We accumulate our code in a separate DLL, which plays the same role for the inspected executable as IDA's database does. This gives us the solid ground of C (a type system, clear names, function prototypes). We can write something that actually builds, and we can run the program to see if we got it right. If we made errors, the program will probably crash.
As we advance through the reverse engineering process, we can translate more functions and rename struct fields, functions, and variables in our C code as their purpose becomes clear. We can also update IDA's database to keep the two in sync. I think this increases our chances of achieving our reverse engineering goals, especially for large programs that contain thousands of functions.
Let's See the Idea
I think it's preferable to run the program as usual, without tricks such as CreateProcess with the suspended flag, DLL injection, etc. The idea is to create a dlc DLL for each image (EXE or DLL) we are interested in. The dlc DLL is loaded together with the image (right after it) and replaces the original functions with jumps to our own functions inside the dlc DLL. How can we force the dlc DLL to load? That's simple: we modify the image's import descriptor (in real-world programs, an image always imports functions from at least one module). We also need to make sure that the image's code section has write permission.
Let's demonstrate with a simple 32-bit example: an empty console application that just returns a number.
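To make "replace original functions with jumps" concrete, here is a minimal sketch of how such a jump could be encoded on 32-bit x86. It assumes the common 5-byte near jump (opcode E9 followed by a 32-bit displacement); the helper name encode_jmp_rel32 is hypothetical and is not taken from the article's code. The sketch only builds the bytes. Writing them over the original function is what requires the code section to be writable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A near relative jump on x86 is opcode E9 plus a 32-bit displacement. */
#define JMP_REL32_SIZE 5

/*
 * Hypothetical helper: encode "jmp target" as it would appear at address
 * `from` in a 32-bit process. The displacement is measured from the address
 * of the next instruction, that is, from + 5.
 * `out` must have room for JMP_REL32_SIZE bytes.
 */
static void encode_jmp_rel32(uint8_t *out, uint32_t from, uint32_t target)
{
    /* Unsigned arithmetic wraps modulo 2^32, which also covers backward jumps. */
    uint32_t rel = target - (from + JMP_REL32_SIZE);

    assert(out != NULL);
    out[0] = 0xE9;                          /* jmp rel32 opcode */
    out[1] = (uint8_t)(rel & 0xFF);         /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

For example, a jump placed at 0x00401000 that should land at 0x10001000 gets the displacement 0x10001000 - 0x00401005 = 0x0FBFFFFB, so the encoded bytes are E9 FB FF BF 0F.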
main.c:
int main(int argc, char *argv[])
{
    return 0x10203040; // to find function in IDA disassembly
}
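The distinctive return value makes main easy to spot in the disassembly. As a rough sketch of the same idea done programmatically, the helper below searches a buffer of code bytes for "mov eax, imm32" (opcode B8 followed by the little-endian constant). That the compiler emits exactly this instruction for the return statement is an assumption, and the function name find_mov_eax_imm32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the offset of the first "mov eax, imm32"
 * (B8 followed by the 32-bit immediate in little-endian order) inside
 * `code`, or -1 if the pattern does not occur.
 */
static long find_mov_eax_imm32(const uint8_t *code, size_t size, uint32_t imm)
{
    uint8_t pattern[5];
    size_t i;

    assert(code != NULL);
    pattern[0] = 0xB8;                          /* mov eax, imm32 opcode */
    pattern[1] = (uint8_t)(imm & 0xFF);
    pattern[2] = (uint8_t)((imm >> 8) & 0xFF);
    pattern[3] = (uint8_t)((imm >> 16) & 0xFF);
    pattern[4] = (uint8_t)((imm >> 24) & 0xFF);

    /* Plain linear scan; stops before the pattern would run past the end. */
    for (i = 0; i + sizeof pattern <= size; i++) {
        if (memcmp(code + i, pattern, sizeof pattern) == 0)
            return (long)i;
    }
    return -1;
}
```

For 0x10203040 the byte sequence searched for is B8 40 30 20 10.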
Compile the EXE image and inspect its imports:
dumpbin /imports main.exe
[/SHOWTOGROUPS]