Identifying protected binaries

Identifying a protected binary is the first step in reverse engineering it. We discussed the common anatomy of protected ELF executables in Chapter 5, Linux Binary Protection. Recall that a protected binary is actually two executables merged together: the stub executable (the decryptor program) and the target executable.

One program is responsible for decrypting the other, and it is typically this program that acts as the wrapper, containing the encrypted binary within it as a payload of sorts. Identifying this outer program, which we call a stub, is usually easy because of the blatant oddities you will see in its program header table.

Let's take a look at a 64-bit ELF binary that is protected using a protector I wrote in 2009 called elfcrypt:

$ readelf -l test.elfcrypt

Elf file type is EXEC (Executable file)
Entry point 0xa01136
There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000a00000 0x0000000000a00000
                 0x0000000000002470 0x0000000000002470  R E    1000
  LOAD           0x0000000000003000 0x0000000000c03000 0x0000000000c03000
                 0x000000000003a23f 0x000000000003b4df  RW     1000

So what are we seeing here? Or rather, what are we not seeing?

This almost looks like a statically linked executable because there is no PT_DYNAMIC segment and no PT_INTERP segment. However, if we run this binary and check /proc/$pid/maps, we see that it is not statically linked but is in fact dynamically linked.

The following is the output from /proc/$pid/maps for the running protected binary:

7fa7e5d44000-7fa7e9d43000 rwxp 00000000 00:00 0
7fa7e9d43000-7fa7ea146000 rw-p 00000000 00:00 0
7fa7ea146000-7fa7ea301000 r-xp 00000000 08:01 11406096  /lib/x86_64-linux-gnu/libc-2.19.so
7fa7ea301000-7fa7ea500000 ---p 001bb000 08:01 11406096  /lib/x86_64-linux-gnu/libc-2.19.so
7fa7ea500000-7fa7ea504000 r--p 001ba000 08:01 11406096  /lib/x86_64-linux-gnu/libc-2.19.so
7fa7ea504000-7fa7ea506000 rw-p 001be000 08:01 11406096  /lib/x86_64-linux-gnu/libc-2.19.so
7fa7ea506000-7fa7ea50b000 rw-p 00000000 00:00 0
7fa7ea530000-7fa7ea534000 rw-p 00000000 00:00 0
7fa7ea535000-7fa7ea634000 rwxp 00000000 00:00 0                          [stack:8176]
7fa7ea634000-7fa7ea657000 rwxp 00000000 00:00 0
7fa7ea657000-7fa7ea6a1000 r--p 00000000 08:01 11406093  /lib/x86_64-linux-gnu/ld-2.19.so
7fa7ea6a1000-7fa7ea6a5000 rw-p 00000000 00:00 0
7fa7ea856000-7fa7ea857000 r--p 00000000 00:00 0

We can clearly see that the dynamic linker is mapped into the process address space, and so is libc. As discussed in Chapter 5, Linux Binary Protection, this is because the protection stub becomes responsible for loading the dynamic linker and setting up the auxiliary vector.
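The runtime check described above can be automated. The following is a minimal Python sketch, not a definitive tool: the function names mapped_files and looks_dynamically_linked are hypothetical, and matching the dynamic linker by a filename that starts with "ld-" or "ld.so" is an assumption about how distributions name it.

```python
def mapped_files(maps_text):
    """Return the set of file paths that back mappings in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].strip())
    return paths


def looks_dynamically_linked(maps_text):
    """True if something that looks like the dynamic linker is mapped.

    Assumption: the linker's filename starts with "ld-" or "ld.so".
    """
    for path in mapped_files(maps_text):
        name = path.rsplit("/", 1)[-1]
        if name.startswith("ld-") or name.startswith("ld.so"):
            return True
    return False


# Example use against a live process (pid is a placeholder):
#   with open("/proc/%d/maps" % pid) as f:
#       print(looks_dynamically_linked(f.read()))
```

A binary whose program headers show no PT_INTERP segment, but whose process maps the dynamic linker anyway, matches the behavior of the stub described here.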

From the program header output, we can also see that the text segment address is 0xa00000, which is unusual. The default linker script used for compiling executables in x86_64 Linux defines the text address as 0x400000, and on 32-bit systems it is 0x8048000. Having a text address other than the default does not, on its own, suggest anything malicious, but should immediately raise suspicion. In the case of a binary protector, the stub must have a virtual address that does not conflict with the virtual address of the self-embedded executable it is protecting.
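The two static indicators discussed so far (missing PT_INTERP and PT_DYNAMIC segments, and an unusual base address) can be checked programmatically. The following Python sketch is only an illustration under stated assumptions: it handles little-endian ELF files only, the names read_phdrs and stub_indicators are hypothetical, and it compares the lowest PT_LOAD address against the defaults quoted above. The address check is applied only to ET_EXEC files, because position-independent executables (ET_DYN), which many modern toolchains produce by default, do not use a fixed base at all.

```python
import struct

PT_LOAD, PT_DYNAMIC, PT_INTERP = 1, 2, 3
ET_EXEC = 2
# Default ET_EXEC base addresses cited in the text, keyed by EI_CLASS.
DEFAULT_BASE = {1: 0x8048000, 2: 0x400000}


def read_phdrs(data):
    """Return (ei_class, e_type, [(p_type, p_vaddr), ...]) for a little-endian ELF."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]
    e_type, = struct.unpack_from("<H", data, 0x10)
    if ei_class == 2:  # ELFCLASS64
        phoff, = struct.unpack_from("<Q", data, 0x20)
        phentsize, phnum = struct.unpack_from("<HH", data, 0x36)
        vaddr_fmt, vaddr_off = "<Q", 16  # after p_type, p_flags, p_offset
    else:  # ELFCLASS32
        phoff, = struct.unpack_from("<I", data, 0x1C)
        phentsize, phnum = struct.unpack_from("<HH", data, 0x2A)
        vaddr_fmt, vaddr_off = "<I", 8  # after p_type, p_offset
    phdrs = []
    for i in range(phnum):
        base = phoff + i * phentsize
        p_type, = struct.unpack_from("<I", data, base)
        p_vaddr, = struct.unpack_from(vaddr_fmt, data, base + vaddr_off)
        phdrs.append((p_type, p_vaddr))
    return ei_class, e_type, phdrs


def stub_indicators(data):
    """List which of the oddities described in the text this ELF image shows."""
    ei_class, e_type, phdrs = read_phdrs(data)
    types = {t for t, _ in phdrs}
    notes = []
    if PT_INTERP not in types and PT_DYNAMIC not in types:
        notes.append("no PT_INTERP or PT_DYNAMIC segment")
    load_addrs = [v for t, v in phdrs if t == PT_LOAD]
    if e_type == ET_EXEC and load_addrs:
        base = min(load_addrs) & ~0xFFF  # round down to a page boundary
        if base != DEFAULT_BASE.get(ei_class):
            notes.append("lowest PT_LOAD at %#x instead of the default base" % base)
    return notes


# Example use:
#   with open("test.elfcrypt", "rb") as f:
#       for note in stub_indicators(f.read()):
#           print(note)
```

Neither indicator is proof of protection. A genuinely static binary also lacks PT_INTERP and PT_DYNAMIC, and a custom linker script can move the base address, so the result is best treated as a prompt for the runtime check rather than a verdict.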

Analyzing a protected binary

Protection schemes that are truly well designed are not easy to circumvent, but more often than not a moderate reverse-engineering effort is enough to get past the encryption layer. The stub is responsible for decrypting the self-embedded executable within it, which can therefore be extracted from memory. The trick is to allow the stub to run long enough to map the encrypted executable into memory and decrypt it.

A very general algorithm can be used that tends to work on simple protectors, especially if they do not incorporate any anti-debugging techniques.

1. Determine the approximate number of instructions in the stub's text segment, represented by N.
2. Trace the program for N instructions.
3. Dump the memory from the expected location of the text segment (for example, 0x400000) and locate its data segment by using the program headers from the newly found text segment.
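The final step of this algorithm can be sketched in Python. This is a hedged illustration rather than the approach of any particular tool: dump_range and find_data_segment are hypothetical names, the code assumes a little-endian ELF, and it assumes the traced process's memory is readable through /proc/<pid>/mem, which generally requires ptrace-level access to that process. Tracing the stub for N instructions is not shown.

```python
import struct

PT_LOAD = 1
PF_W = 2  # segment is writable


def dump_range(mem_path, start, size):
    """Read up to `size` bytes at offset `start` from a memory file
    such as /proc/<pid>/mem."""
    with open(mem_path, "rb", buffering=0) as mem:
        mem.seek(start)
        return mem.read(size)


def find_data_segment(text_image):
    """Return (vaddr, memsz) of the first writable PT_LOAD described by the
    program headers at the start of a dumped text segment, or None."""
    if text_image[:4] != b"\x7fELF":
        return None  # nothing decrypted at this address yet
    is64 = text_image[4] == 2  # EI_CLASS == ELFCLASS64
    if is64:
        phoff, = struct.unpack_from("<Q", text_image, 0x20)
        phentsize, phnum = struct.unpack_from("<HH", text_image, 0x36)
    else:
        phoff, = struct.unpack_from("<I", text_image, 0x1C)
        phentsize, phnum = struct.unpack_from("<HH", text_image, 0x2A)
    for i in range(phnum):
        off = phoff + i * phentsize
        if is64:
            p_type, p_flags, _, vaddr, _, _, memsz, _ = struct.unpack_from(
                "<IIQQQQQQ", text_image, off)
        else:
            p_type, _, vaddr, _, _, memsz, p_flags, _ = struct.unpack_from(
                "<IIIIIIII", text_image, off)
        if p_type == PT_LOAD and p_flags & PF_W:
            return vaddr, memsz
    return None


# Example use once the stub has run long enough (pid is a placeholder):
#   text = dump_range("/proc/%d/mem" % pid, 0x400000, 0x1000)
#   segment = find_data_segment(text)
#   if segment:
#       data = dump_range("/proc/%d/mem" % pid, segment[0], segment[1])
```

The check for the ELF magic doubles as a crude progress test: if the expected text address does not yet begin with a valid ELF header, the stub has probably not finished its work and needs to be traced further.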

A good example of this simple technique can be demonstrated with Quenya, the 32-bit ELF manipulation software that I coded in 2008.

Note

UPX uses no anti-debugging techniques and is therefore relatively straightforward to unpack.

The following are the program headers of a UPX-packed executable:

$ readelf -l test.packed

Elf file type is EXEC (Executable file)
Entry point 0xc0c500
There are 2 program headers, starting at offset 52

Program Headers:
  Type          Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD          0x000000 0x00c01000 0x00c01000 0x0bd03 0x0bd03 R E 0x1000
  LOAD          0x000f94 0x08063f94 0x08063f94 0x00000 0x00000 RW  0x1000

We can see that the stub begins at 0xc01000, and Quenya will presume that the real text segment is at the expected address for a 32-bit ELF executable: 0x8048000.

Here is Quenya using its unpack feature to decompress test.packed:

$ quenya

Welcome to Quenya v0.1 -- the ELF modification and analysis tool
Designed and maintained by ElfMaster

Type 'help' for a list of commands
[Quenya v0.1@workshop] unpack test.packed test.unpacked
Text segment size: 48387 bytes
[+] Beginning analysis for executable reconstruction of process image (pid: 2751)
[+] Getting Loadable segment info...
[+] Found loadable segments: text segment, data segment
[+] text_vaddr: 0x8048000 text_offset: 0x0
[+] data_vaddr: 0x8062ef8 data_offset: 0x19ef8
[+] Dynamic segment location successful
[+] PLT/GOT Location: Failed
[+] Could not locate PLT/GOT within dynamic segment; attempting to skip PLT patches...
Opening output file: test.unpacked
Successfully created executable

As we can see, Quenya reports that it has unpacked the UPX-packed executable. We can verify this by looking at the program headers of the unpacked executable:

$ readelf -l test.unpacked

Elf file type is EXEC (Executable file)
Entry point 0x804c041
There are 9 program headers, starting at offset 52

Program Headers:
  Type          Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR          0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
  INTERP        0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD          0x000000 0x08048000 0x08048000 0x19b80 0x19b80 R E 0x1000
  LOAD          0x019ef8 0x08062ef8 0x08062ef8 0x00448 0x0109c RW  0x1000
  DYNAMIC       0x019f04 0x08062f04 0x08062f04 0x000f8 0x000f8 RW  0x4
  NOTE          0x000168 0x08048168 0x08048168 0x00044 0x00044 R   0x4
  GNU_EH_FRAME  0x016508 0x0805e508 0x0805e508 0x00744 0x00744 R   0x4
  GNU_STACK     0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO     0x019ef8 0x08062ef8 0x08062ef8 0x00108 0x00108 R   0x1

Notice that the program headers are completely different from the ones we looked at previously, when the executable was still packed. This is because we are no longer looking at the stub executable; we are looking at the executable that was compressed inside the stub. The unpacking technique we used is very generic and not very effective against more complicated protection schemes, but it helps beginners gain an understanding of the process of reversing protected binaries.
