In the next chapter, Breaking ELF Software Protection, we will discuss the ins and outs of software encryption and packing with ELF executables. Viruses and malware are very often encrypted or packed with some type of protection mechanism, which can also include anti-debugging techniques that make the binary very difficult to analyze. Without giving a complete exegesis on the subject, here are some anti-debugging measures that ELF binary protectors commonly take when they wrap malware.
The PTRACE_TRACEME technique takes advantage of the fact that a program can only be traced by one process at a time. Almost all debuggers, including GDB, use ptrace. The idea is that a program can trace itself so that no other debugger can attach.
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <unistd.h>

void anti_debug_check(void)
{
    /* PTRACE_TRACEME fails if a tracer is already attached. */
    if (ptrace(PTRACE_TRACEME, 0, 0, 0) < 0) {
        printf("A debugger is attached, but not for long!\n");
        kill(getpid(), SIGKILL);
        exit(0);
    }
}

The function in Illustration 4.9 kills the program (itself) if a debugger is attached to it; the program knows because its attempt to trace itself fails. Otherwise, the attempt succeeds, and no other tracer is allowed to attach, which keeps debuggers out.
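The one-tracer rule that this check relies on can be observed directly. The following is a minimal sketch under stated assumptions, not code from a real protector: the helper name second_traceme_fails is hypothetical, and the experiment runs in a forked child so that the calling process itself is never marked as traced. The child requests tracing twice; on Linux the second request is expected to be refused because the tracer slot is already taken.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a second PTRACE_TRACEME is refused,
 * 0 if it is accepted, and -1 if the experiment could not be run. */
static int second_traceme_fails(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the first request makes the parent our tracer. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(2);
        /* The process is already traced, so this request should fail. */
        _exit(ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0 ? 1 : 0);
    }

    /* Parent: collect the child's verdict from its exit status. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

A debugger that tries to attach to a process in this state is refused for the same reason that the second request is.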
While debugging, we often set breakpoints, and when a breakpoint is hit, it generates a SIGTRAP signal, which is caught by our debugger's signal handler; the program halts and we can inspect it. With this technique, the program sets up a signal handler to catch SIGTRAP signals and then deliberately issues a breakpoint instruction. When the program's SIGTRAP handler catches it, it will increment a global variable from 0 to 1.
The program can then check whether the global variable is set to 1. If it is, our program caught the breakpoint and there is no debugger present; otherwise, if it is 0, the signal must have been caught by a debugger. At this point, the program can choose to kill itself or exit in order to prevent debugging:
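#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t caught = 0;

/* Runs only if no debugger intercepts the SIGTRAP first. */
void sighandle(int sig)
{
    (void)sig;
    caught++;
}

int detect_debugger(void)
{
    signal(SIGTRAP, sighandle);
    __asm__ volatile("int3");
    if (!caught) {
        printf("There is a debugger attached!\n");
        return 1;
    }
    return 0;
}

The /proc/self/status file is a dynamic file that exists for every process and includes a lot of information, including whether or not the process is currently being traced.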
An example of the layout of /proc/self/status, which can be parsed to detect tracers/debuggers, is as follows:
ryan@elfmaster:~$ head /proc/self/status
Name:   head
State:  R (running)
Tgid:   19813
Ngid:   0
Pid:    19813
PPid:   17364
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    31337   31337   31337   31337
FDSize: 256
In the preceding output, TracerPid: 0 means that the process is not being traced. All that a program must do to see whether it is being traced is open /proc/self/status and check whether the TracerPid value is 0. If it is not, the program knows it is being traced and can kill itself or exit.
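The check described above can be sketched in a few lines of C. This is a minimal illustration, not code from any particular protector: the function names parse_tracer_pid and is_being_traced are hypothetical, and the 4096-byte buffer is an assumption that is comfortably large enough to reach the TracerPid line near the top of the file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the TracerPid value found in status-formatted text,
 * or -1 if no TracerPid line is present. */
static long parse_tracer_pid(const char *text)
{
    const char *p = text;

    while (p != NULL && *p != '\0') {
        /* Only match the key at the start of a line. */
        if (strncmp(p, "TracerPid:", 10) == 0)
            return strtol(p + 10, NULL, 10);
        p = strchr(p, '\n');
        if (p != NULL)
            p++;
    }
    return -1;
}

/* Return 1 if a tracer is attached, 0 if not, -1 if unknown. */
static int is_being_traced(void)
{
    char buf[4096];
    size_t n;
    long pid;
    FILE *fp = fopen("/proc/self/status", "r");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    pid = parse_tracer_pid(buf);
    if (pid < 0)
        return -1;
    return pid != 0;
}
```

A protected program would typically call is_being_traced() early on and exit when it returns 1. Keeping the parsing separate from the file access makes the logic easy to exercise on sample text.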
Code obfuscation (also known as code transformation) is a technique where assembly-level code is modified to include opaque branch instructions or misaligned instructions that throw off the disassembler's ability to read the machine code correctly. Consider the following example:
jmp antidebug + 1
antidebug:
.short 0xe9          # first byte of a jmp instruction
mov $0x31337, %eax
When the preceding code is compiled and viewed with the objdump disassembler, it looks like this:
   4:   eb 01                   jmp    7 <antidebug+0x1>
<antidebug:>
   6:   e9 00 b8 37 13          jmpq   1337b80b
   b:   03 00                   add    (%rax),%eax
The code is actually doing a mov $0x31337, %eax operation, and functionally, it performs that correctly, but because a single 0xe9 comes before it, the disassembler perceives it as a jmp instruction (0xe9 is the opcode of a near jmp).
So, code transformation doesn't change the way the code functions, only how it looks. A smart disassembler such as IDA wouldn't be fooled by the preceding code snippet, because it uses control flow analysis when generating the disassembly.
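The difference between the two disassembly strategies can be sketched with a toy decoder. This is an illustration only, not how objdump or IDA are implemented: it understands just three opcodes (0xeb, 0xe9, and 0xb8), all function names are hypothetical, and it assumes the junk is a single 0xe9 byte (as `.byte 0xe9` would emit) so that the jump lands exactly on the mov.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static const uint8_t obf_code[] = {
    0xeb, 0x01,                   /* jmp +1 (skips the junk byte) */
    0xe9,                         /* junk: looks like the start of jmp rel32 */
    0xb8, 0x37, 0x13, 0x03, 0x00  /* mov $0x31337, %eax */
};

/* Toy decoder: returns a mnemonic and sets *ilen, or NULL if the byte
 * is unknown or the instruction would run past the buffer. */
static const char *toy_decode(const uint8_t *buf, size_t len, size_t off,
                              size_t *ilen)
{
    if (off >= len)
        return NULL;
    switch (buf[off]) {
    case 0xeb: *ilen = 2; break;
    case 0xe9:
    case 0xb8: *ilen = 5; break;
    default:   return NULL;
    }
    if (off + *ilen > len)
        return NULL;
    return buf[off] == 0xeb ? "jmp" : buf[off] == 0xe9 ? "jmpq" : "mov";
}

static void append(char *out, size_t outsz, const char *word)
{
    size_t used = strlen(out);
    snprintf(out + used, outsz - used, "%s%s", used ? " " : "", word);
}

/* Linear sweep: decode bytes strictly in address order. */
static void linear_sweep(const uint8_t *buf, size_t len, char *out, size_t outsz)
{
    size_t off = 0, ilen;

    out[0] = '\0';
    while (off < len) {
        const char *m = toy_decode(buf, len, off, &ilen);
        if (m == NULL) {
            append(out, outsz, "(bad)");
            off += 1;
            continue;
        }
        append(out, outsz, m);
        off += ilen;
    }
}

/* Control flow following: continue decoding at the target of a short jmp. */
static void follow_flow(const uint8_t *buf, size_t len, char *out, size_t outsz)
{
    size_t off = 0, ilen;
    int steps = 0;

    out[0] = '\0';
    while (steps++ < 16) {   /* bound the walk in case of jump loops */
        const char *m = toy_decode(buf, len, off, &ilen);
        if (m == NULL)
            break;
        append(out, outsz, m);
        if (buf[off] == 0xeb)
            off = (size_t)((ptrdiff_t)off + 2 + (int8_t)buf[off + 1]);
        else if (buf[off] == 0xe9)
            break;           /* rel32 targets are not followed in this toy */
        else
            off += ilen;
    }
}
```

Run on obf_code, linear_sweep swallows the first four bytes of the mov as the operand of a bogus jmpq, while follow_flow steps over the junk byte and decodes the mov.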
The next technique, which scrambles ELF string tables, is one that I conceived in 2008 and have not seen used widely, but I would be surprised if it hasn't been used somewhere. The idea uses the knowledge we have gained about the ELF string tables for symbol names and section headers. Tools such as objdump and gdb (often used in reverse engineering) rely on the string table to learn the names of functions and sections within an ELF file. This technique scrambles the order of the symbol and section names. The result is that section headers will be all mixed up (or appear to be) and so will the names of functions and symbols.
This technique can be very misleading to a reverse engineer; for instance, they might think they are looking at a function called check_serial_number(), when really they are looking at safe_strcpy(). I have implemented this in a tool called elfscure, available at http://www.bitlackeys.org/projects/elfscure.c.
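The core idea can be illustrated on an in-memory symbol table. The following is a simplified sketch and not how elfscure itself is implemented: the helper names are hypothetical, and rotating the st_name offsets by one position is just one possible way to make each symbol resolve to a different name while leaving the string table bytes untouched.

```c
#include <elf.h>
#include <stddef.h>

/* Resolve a symbol's name the way a tool would: strtab base + st_name. */
static const char *sym_name(const char *strtab, const Elf64_Sym *sym)
{
    return strtab + sym->st_name;
}

/* Rotate the st_name offsets by one entry so that every symbol ends up
 * pointing at its neighbour's name. */
static void rotate_symbol_names(Elf64_Sym *syms, size_t count)
{
    size_t i;
    Elf64_Word first;

    if (count < 2)
        return;
    first = syms[0].st_name;
    for (i = 0; i + 1 < count; i++)
        syms[i].st_name = syms[i + 1].st_name;
    syms[count - 1].st_name = first;
}
```

With two symbols named check_serial_number and safe_strcpy, a single rotation makes each one display the other's name, which is the kind of confusion described above. A real tool would also have to leave the reserved symbol at index 0 alone and apply the same treatment to the sh_name fields of the section headers.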