Learning Linux Binary Analysis by Ryan "elfmaster" O'Neill, published by Packt Publishing, 2016

Examining an infected process using ECFS

Before we show the effectiveness of ECFS with a real-world example, it helps to have some background, from a hacker's perspective, on the infection method that we will use. It is often very useful for a hacker to incorporate anti-forensic techniques into their workflow on compromised systems so that their programs, especially the ones that serve as backdoors, can remain hidden from the untrained eye.

One such technique is to perform process cloaking. This is the act of running a program inside of an existing process, ideally inside of a process that is known to be benign but persistent, such as ftpd or sshd. The Saruman anti-forensics exec (http://www.bitlackeys.org/#saruman) allows an attacker to inject a complete, dynamically linked PIE executable into an existing process address space and run it.

It uses a thread injection technique so that the injected program can run simultaneously with the host program. This particular hacker technique was something that I came up with and designed in 2013, but I have no doubt that other such tools have existed for much longer than this in the underground scene. Typically, this type of anti-forensic technique would go unnoticed and would be very difficult to detect.

Let's see what type of efficiency and accuracy we can achieve by analyzing such a process with ECFS technology.

Infecting the host process

The host process is a benign process, and typically it would be something like sshd or ftpd, as already mentioned. For the sake of our example, we will use a simple and persistent program called host; it simply runs in an infinite loop, printing a message on the screen. We will then inject a remote server backdoor into the process using the Saruman anti-forensics exec launcher program.

In terminal 1, run the host program:

$ ./host
I am the host
I am the host
I am the host

In terminal 2, inject the backdoor into the process:

$ ./launcher `pidof host` ./server
[+] Thread injection succeeded, tid: 16187
[+] Saruman successfully injected program: ./server
[+] PT_DETACHED -> 16186
$

Capturing and analyzing an ECFS snapshot

Now, if we capture a snapshot of the process, either by using the ecfs_snapshot utility or by signaling the process to dump core, we can begin our examination.

The symbol table analysis

Let's look at the symbol table analysis of the host.16186 snapshot:

$ readelf -s host.16186

Symbol table '.dynsym' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 00007fba3811e000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00007fba3818de30     0 FUNC    GLOBAL DEFAULT  UND puts
     2: 00007fba38209860     0 FUNC    GLOBAL DEFAULT  UND write
     3: 00007fba3813fdd0     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main
     4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     5: 00007fba3818c4e0     0 FUNC    GLOBAL DEFAULT  UND fopen

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000400470    96 FUNC    GLOBAL DEFAULT   10 sub_400470
     1: 00000000004004d0    42 FUNC    GLOBAL DEFAULT   10 sub_4004d0
     2: 00000000004005bd    50 FUNC    GLOBAL DEFAULT   10 sub_4005bd
     3: 00000000004005ef    69 FUNC    GLOBAL DEFAULT   10 sub_4005ef
     4: 0000000000400640   101 FUNC    GLOBAL DEFAULT   10 sub_400640
     5: 00000000004006b0     2 FUNC    GLOBAL DEFAULT   10 sub_4006b0

The readelf command allows us to view the symbol tables. Notice that a symbol table exists for both the dynamic symbols in .dynsym and the symbols for local functions, which are stored in the .symtab symbol table. ECFS is able to reconstruct the dynamic symbol table by accessing the dynamic segment and finding DT_SYMTAB.
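The lookup of DT_SYMTAB in the dynamic segment can be sketched as follows. This is only an illustration of walking a dynamic array with the standard <elf.h> types, not the ECFS implementation; the function name dyn_lookup_ptr is hypothetical.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper (not part of ECFS): walk a dynamic array that is
 * terminated by a DT_NULL entry and return the d_ptr value of the first
 * entry whose tag matches. Returns 0 when the tag is not present. */
Elf64_Addr dyn_lookup_ptr(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag)
            return dyn->d_un.d_ptr;
    }
    return 0;
}
```

Calling dyn_lookup_ptr(dyn, DT_SYMTAB) on the dynamic array of a process image would yield the address of the dynamic symbol table, and DT_STRTAB would give the matching string table.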

Note

The .symtab symbol table is a bit trickier but extremely valuable. ECFS uses a special method of parsing the PT_GNU_EH_FRAME segment, which contains frame description entries in DWARF format; these are used for exception handling. This information is useful for gathering the location and size of every function defined within the binary.

When functions are obfuscated, tools such as IDA can fail to identify every function defined within a binary or core file, whereas the ECFS technology will still succeed. This is one of the major impacts that ECFS makes on the reverse engineering world: a near-foolproof method of locating and sizing every function and producing a symbol table. In the host.16186 file, the symbol table is fully reconstructed. This is useful because it can help us detect whether any PLT/GOT hooks are being used to redirect shared library functions and, if so, identify the actual names of the functions that have been hijacked.
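Once every function has a location and a size, any code address can be mapped back to the function that contains it. The following is a minimal sketch of that lookup over an array of Elf64_Sym entries such as the reconstructed .symtab; the helper name find_func_by_addr is an assumption made for illustration and is not a libecfs function.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the STT_FUNC symbol whose
 * range [st_value, st_value + st_size) contains addr, or -1 if no
 * function symbol covers that address. */
long find_func_by_addr(const Elf64_Sym *symtab, size_t count, Elf64_Addr addr)
{
    for (size_t i = 0; i < count; i++) {
        if (ELF64_ST_TYPE(symtab[i].st_info) != STT_FUNC)
            continue;
        /* Subtract before comparing so the check cannot overflow. */
        if (addr >= symtab[i].st_value &&
            addr - symtab[i].st_value < symtab[i].st_size)
            return (long)i;
    }
    return -1;
}
```

A linear scan is enough for a sketch; a tool that resolves many addresses would probably sort the symbols by st_value and use a binary search instead.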

The section header analysis

Now, let's look at the section header analysis of the host.16186 snapshot.

My version of readelf has been slightly modified so that it recognizes the following custom types: SHT_INJECTED and SHT_PRELOADED. Without this modification to readelf, it will simply show the numerical values associated with those definitions. Check out include/ecfs.h for the definitions, and add them to the readelf source code if you like:

$ readelf -S host.16186
There are 46 section headers, starting at offset 0x255464:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00002238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note             NOTE             0000000000000000  000005f0
       000000000000133c  0000000000000000   A       0     0     4
  [ 3] .hash             GNU_HASH         0000000000400298  00002298
       000000000000001c  0000000000000000   A       0     0     4
  [ 4] .dynsym           DYNSYM           00000000004002b8  000022b8
       0000000000000090  0000000000000018   A       5     0     8
  [ 5] .dynstr           STRTAB           0000000000400348  00002348
       0000000000000049  0000000000000018   A       0     0     1
  [ 6] .rela.dyn         RELA             00000000004003c0  000023c0
       0000000000000018  0000000000000018   A       4     0     8
  [ 7] .rela.plt         RELA             00000000004003d8  000023d8
       0000000000000078  0000000000000018   A       4     0     8
  [ 8] .init             PROGBITS         0000000000400450  00002450
       000000000000001a  0000000000000000  AX       0     0     8
  [ 9] .plt              PROGBITS         0000000000400470  00002470
       0000000000000060  0000000000000010  AX       0     0     16
  [10] ._TEXT            PROGBITS         0000000000400000  00002000
       0000000000001000  0000000000000000  AX       0     0     16
  [11] .text             PROGBITS         00000000004004d0  000024d0
       00000000000001e2  0000000000000000           0     0     16
  [12] .fini             PROGBITS         00000000004006b4  000026b4
       0000000000000009  0000000000000000  AX       0     0     16
  [13] .eh_frame_hdr     PROGBITS         00000000004006e8  000026e8
       000000000000003c  0000000000000000  AX       0     0     4
  [14] .eh_frame         PROGBITS         0000000000400724  00002728
       0000000000000114  0000000000000000  AX       0     0     8
  [15] .ctors            PROGBITS         0000000000600e10  00003e10
       0000000000000008  0000000000000008   A       0     0     8
  [16] .dtors            PROGBITS         0000000000600e18  00003e18
       0000000000000008  0000000000000008   A       0     0     8
  [17] .dynamic          DYNAMIC          0000000000600e28  00003e28
       00000000000001d0  0000000000000010  WA       0     0     8
  [18] .got.plt          PROGBITS         0000000000601000  00004000
       0000000000000048  0000000000000008  WA       0     0     8
  [19] ._DATA            PROGBITS         0000000000600000  00003000
       0000000000001000  0000000000000000  WA       0     0     8
  [20] .data             PROGBITS         0000000000601040  00004040
       0000000000000010  0000000000000000  WA       0     0     8
  [21] .bss              PROGBITS         0000000000601050  00004050
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .heap             PROGBITS         0000000000e9c000  00006000
       0000000000021000  0000000000000000  WA       0     0     8
  [23] .elf.dyn.0        INJECTED         00007fba37f1b000  00038000
       0000000000001000  0000000000000000  AX       0     0     8
  [24] libc-2.19.so.text SHLIB            00007fba3811e000  0003b000
       00000000001bb000  0000000000000000   A       0     0     8
  [25] libc-2.19.so.unde SHLIB            00007fba382d9000  001f6000
       00000000001ff000  0000000000000000   A       0     0     8
  [26] libc-2.19.so.relr SHLIB            00007fba384d8000  001f6000
       0000000000004000  0000000000000000   A       0     0     8
  [27] libc-2.19.so.data SHLIB            00007fba384dc000  001fa000
       0000000000002000  0000000000000000   A       0     0     8
  [28] ld-2.19.so.text   SHLIB            00007fba384e3000  00201000
       0000000000023000  0000000000000000   A       0     0     8
  [29] ld-2.19.so.relro  SHLIB            00007fba38705000  0022a000
       0000000000001000  0000000000000000   A       0     0     8
  [30] ld-2.19.so.data   SHLIB            00007fba38706000  0022b000
       0000000000001000  0000000000000000   A       0     0     8
  [31] .procfs.tgz       LOUSER+0         0000000000000000  00254388
       00000000000010dc  0000000000000001           0     0     8
  [32] .prstatus         PROGBITS         0000000000000000  00253000
       00000000000002a0  0000000000000150           0     0     8
  [33] .fdinfo           PROGBITS         0000000000000000  002532a0
       0000000000000ac8  0000000000000228           0     0     4
  [34] .siginfo          PROGBITS         0000000000000000  00253d68
       0000000000000080  0000000000000080           0     0     4
  [35] .auxvector        PROGBITS         0000000000000000  00253de8
       0000000000000130  0000000000000008           0     0     8
  [36] .exepath          PROGBITS         0000000000000000  00253f18
       000000000000001c  0000000000000008           0     0     1
  [37] .personality      PROGBITS         0000000000000000  00253f34
       0000000000000004  0000000000000004           0     0     1
  [38] .arglist          PROGBITS         0000000000000000  00253f38
       0000000000000050  0000000000000001           0     0     1
  [39] .fpregset         PROGBITS         0000000000000000  00253f88
       0000000000000400  0000000000000200           0     0     8
  [40] .stack            PROGBITS         00007fff4447c000  0022d000
       0000000000021000  0000000000000000  WA       0     0     8
  [41] .vdso             PROGBITS         00007fff444a9000  0024f000
       0000000000002000  0000000000000000  WA       0     0     8
  [42] .vsyscall         PROGBITS         ffffffffff600000  00251000
       0000000000001000  0000000000000000  WA       0     0     8
  [43] .symtab           SYMTAB           0000000000000000  0025619d
       0000000000000090  0000000000000018          44     0     4
  [44] .strtab           STRTAB           0000000000000000  0025622d
       0000000000000042  0000000000000000           0     0     1
  [45] .shstrtab         STRTAB           0000000000000000  00255fe4
       00000000000001b9  0000000000000000           0     0     1

Section 23 is of particular interest to us; it has been marked as a suspicious ELF object with the injected denotation:

  [23] .elf.dyn.0        INJECTED         00007fba37f1b000  00038000
       0000000000001000  0000000000000000  AX       0     0     8 

When the ECFS heuristics detect an ELF object as suspicious and cannot find that particular object in the list of mapped shared libraries, the section is named in the following format:

.elf.<type>.<count>

The type can be one of four:

  • ET_NONE
  • ET_EXEC
  • ET_DYN
  • ET_REL

In our example, it is obviously ET_DYN, represented as dyn. The count is simply the index of injected objects that have been found. In this case, the index is 0 as it is the first and only injected ELF object that was found in this particular process.
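The naming scheme can be illustrated with a short sketch that builds such a name from an ELF type and a count. Only the dyn spelling is confirmed by the output above; the other three spellings (none, exec, rel) and both function names are assumptions made for this example.

```c
#include <elf.h>
#include <stdio.h>

/* Map an ELF e_type value to the short tag used in the section name.
 * Only "dyn" appears in the text; the other spellings are assumed. */
const char *elf_type_tag(unsigned type)
{
    switch (type) {
    case ET_NONE: return "none";
    case ET_EXEC: return "exec";
    case ET_DYN:  return "dyn";
    case ET_REL:  return "rel";
    default:      return NULL;
    }
}

/* Write ".elf.<type>.<count>" into buf. Returns 0 on success, or -1 if
 * the type is unknown or the buffer is too small for the full name. */
int injected_section_name(char *buf, size_t len, unsigned type, unsigned count)
{
    const char *tag = elf_type_tag(type);
    if (tag == NULL)
        return -1;
    int n = snprintf(buf, len, ".elf.%s.%u", tag, count);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the first injected ET_DYN object, injected_section_name(buf, sizeof buf, ET_DYN, 0) fills buf with .elf.dyn.0, which matches the section seen in host.16186.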

The type INJECTED obviously denotes that the section contains an ELF object that was determined suspicious or injected through unnatural means. In this particular case, the process was infected with Saruman (described earlier), which injects a Position-Independent Executable (PIE). A PIE executable is of type ET_DYN, similar to shared libraries, which is why ECFS has marked it as such.
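Because a PIE executable and a shared library share the ET_DYN type, a tool that wants to tell them apart needs an extra hint. One common heuristic, shown here as a sketch and not as the check that ECFS performs, is to look for a PT_INTERP program header: PIE executables request a program interpreter, while most shared libraries do not (some, such as libc itself, do, so this is only a hint). The function name is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Heuristic sketch: true when the object is ET_DYN and carries a
 * PT_INTERP program header, which suggests a PIE executable rather
 * than an ordinary shared library. phdr must point to e_phnum entries. */
bool is_et_dyn_with_interp(const Elf64_Ehdr *ehdr, const Elf64_Phdr *phdr)
{
    if (ehdr->e_type != ET_DYN)
        return false;
    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_INTERP)
            return true;
    }
    return false;
}
```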

Extracting parasite code with readecfs

We have spotted a section in the ECFS core file that relates to parasitic code, which in this case is an injected PIE executable. The next step is to investigate the code itself. This can be done in one of two ways: we can use the objdump utility or a more advanced disassembler such as IDA Pro to navigate to the section called .elf.dyn.0, or we can first use the readecfs utility to extract the parasitic code from the ECFS core file:

$ readecfs -O host.16186 .elf.dyn.0 parasite_code.exe

- readecfs output for file host.16186
- Executable path (.exepath): /home/ryan/git/saruman/host
- Command line: ./host                                                                          

[+] Copying section data from '.elf.dyn.0' into output file 'parasite_code.exe'

We now have a standalone copy of the parasite code that has been extracted from the process image, thanks to ECFS. Identifying this particular malware and then extracting it would be extremely tedious without ECFS. Now we can examine parasite_code.exe as a separate file, open it up in IDA, and so on:

root@elfmaster:~/ecfs/cores# readelf -l parasite_code.exe
readelf: Error: Unable to read in 0x40 bytes of section headers
readelf: Error: Unable to read in 0x780 bytes of section headers

Elf file type is DYN (Shared object file)
Entry point 0xdb0
There are 9 program headers, starting at offset 64

Program Headers:
 Type        Offset             VirtAddr           PhysAddr
              FileSiz            MemSiz              Flags  Align
 PHDR         0x0000000000000040 0x0000000000000040 0x0000000000000040
              0x00000000000001f8 0x00000000000001f8  R E    8
 INTERP       0x0000000000000238 0x0000000000000238 0x0000000000000238
              0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
 LOAD         0x0000000000000000 0x0000000000000000 0x0000000000000000
              0x0000000000001934 0x0000000000001934  R E    200000
 LOAD         0x0000000000001df0 0x0000000000201df0 0x0000000000201df0
              0x0000000000000328 0x0000000000000330  RW     200000
 DYNAMIC      0x0000000000001e08 0x0000000000201e08 0x0000000000201e08
              0x00000000000001d0 0x00000000000001d0  RW     8
 NOTE         0x0000000000000254 0x0000000000000254 0x0000000000000254
              0x0000000000000044 0x0000000000000044  R      4
 GNU_EH_FRAME 0x00000000000017e0 0x00000000000017e0 0x00000000000017e0
              0x000000000000003c 0x000000000000003c  R      4
  GNU_STACK   0x0000000000000000 0x0000000000000000 0x0000000000000000
              0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO   0x0000000000001df0 0x0000000000201df0 0x0000000000201df0
              0x0000000000000210 0x0000000000000210  R      1
readelf: Error: Unable to read in 0x1d0 bytes of dynamic section

Notice that readelf complains in the preceding output. This is because the parasite that we extracted does not have a section header table of its own. In the future, the readecfs utility will be able to reconstruct a minimal section header table for mapped ELF objects that are extracted from the overall ECFS core file.
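A tool that processes extracted objects like this one can guard against the missing table before trying to read it. The sketch below assumes that the ELF header of the extracted file still describes a section header table (through e_shoff, e_shnum, and e_shentsize) that lies beyond the end of the file; the function name is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch: report whether the section header table described by the ELF
 * header lies entirely inside a file of file_size bytes. A false result
 * means section headers cannot be read and only the program headers
 * should be trusted. */
bool shdr_table_fits(const Elf64_Ehdr *ehdr, uint64_t file_size)
{
    if (ehdr->e_shoff == 0 || ehdr->e_shnum == 0)
        return false;
    uint64_t table_size = (uint64_t)ehdr->e_shnum * ehdr->e_shentsize;
    /* Compare against the remaining bytes to avoid overflow. */
    return ehdr->e_shoff <= file_size &&
           table_size <= file_size - ehdr->e_shoff;
}
```

When the check fails, an analysis tool would fall back to the program headers, which is what readelf -l still manages to print in the preceding output.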

Analyzing the Azazel userland rootkit

As mentioned in Chapter 7, Process Memory Forensics, Azazel is a userland rootkit that infects a process by means of LD_PRELOAD: the Azazel shared library is linked into the process and hijacks various libc functions. In Chapter 7, we used GDB and readelf to inspect a process for this particular rootkit infection. Now let's try the ECFS method for this type of process introspection. The following is an ECFS snapshot of a process from the executable host2 that has been infected with the Azazel rootkit.

The symbol table of the host2 process reconstructed

This is the reconstructed symbol table of the host2 process:

$ readelf -s host2.7254

Symbol table '.dynsym' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00007f0a0d0ed070     0 FUNC    GLOBAL DEFAULT  UND unlink
     2: 00007f0a0d06fe30     0 FUNC    GLOBAL DEFAULT  UND puts
     3: 00007f0a0d0bcef0     0 FUNC    GLOBAL DEFAULT  UND opendir
     4: 00007f0a0d021dd0     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main
     5: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fopen
 
Symbol table '.symtab' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 00000000004004b0   112 FUNC    GLOBAL DEFAULT   10 sub_4004b0
     1: 0000000000400520    42 FUNC    GLOBAL DEFAULT   10 sub_400520
     2: 000000000040060d    68 FUNC    GLOBAL DEFAULT   10 sub_40060d
     3: 0000000000400660   101 FUNC    GLOBAL DEFAULT   10 sub_400660
     4: 00000000004006d0     2 FUNC    GLOBAL DEFAULT   10 sub_4006d0

We can see from the preceding symbol table that host2 is a simple program and has only a few shared library calls (this is shown in the .dynsym symbol table): unlink, puts, opendir, and fopen.

The section header table of the host2 process reconstructed

Let's see what the section header table of host2 looks like with process reconstruction:

$ readelf -S host2.7254

There are 65 section headers, starting at offset 0x27e1ee:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00002238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note             NOTE             0000000000000000  00000900
       000000000000105c  0000000000000000   A       0     0     4
  [ 3] .hash             GNU_HASH         0000000000400298  00002298
       000000000000001c  0000000000000000   A       0     0     4
  [ 4] .dynsym           DYNSYM           00000000004002b8  000022b8
       00000000000000a8  0000000000000018   A       5     0     8
  [ 5] .dynstr           STRTAB           0000000000400360  00002360
       0000000000000052  0000000000000018   A       0     0     1
  [ 6] .rela.dyn         RELA             00000000004003e0  000023e0
       0000000000000018  0000000000000018   A       4     0     8
  [ 7] .rela.plt         RELA             00000000004003f8  000023f8
       0000000000000090  0000000000000018   A       4     0     8
  [ 8] .init             PROGBITS         0000000000400488  00002488
       000000000000001a  0000000000000000  AX       0     0     8
  [ 9] .plt              PROGBITS         00000000004004b0  000024b0
       0000000000000070  0000000000000010  AX       0     0     16
  [10] ._TEXT            PROGBITS         0000000000400000  00002000
       0000000000001000  0000000000000000  AX       0     0     16
  [11] .text             PROGBITS         0000000000400520  00002520
       00000000000001b2  0000000000000000           0     0     16
  [12] .fini             PROGBITS         00000000004006d4  000026d4
       0000000000000009  0000000000000000  AX       0     0     16
  [13] .eh_frame_hdr     PROGBITS         0000000000400708  00002708
       0000000000000034  0000000000000000  AX       0     0     4
  [14] .eh_frame         PROGBITS         000000000040073c  00002740
       00000000000000f4  0000000000000000  AX       0     0     8
  [15] .ctors            PROGBITS         0000000000600e10  00003e10
       0000000000000008  0000000000000008   A       0     0     8
  [16] .dtors            PROGBITS         0000000000600e18  00003e18
       0000000000000008  0000000000000008   A       0     0     8
  [17] .dynamic          DYNAMIC          0000000000600e28  00003e28
       00000000000001d0  0000000000000010  WA       0     0     8
  [18] .got.plt          PROGBITS         0000000000601000  00004000
       0000000000000050  0000000000000008  WA       0     0     8
  [19] ._DATA            PROGBITS         0000000000600000  00003000
       0000000000001000  0000000000000000  WA       0     0     8
  [20] .data             PROGBITS         0000000000601048  00004048
       0000000000000010  0000000000000000  WA       0     0     8
  [21] .bss              PROGBITS         0000000000601058  00004058
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .heap             PROGBITS         0000000000602000  00005000
       0000000000021000  0000000000000000  WA       0     0     8
  [23] libaudit.so.1.0.0 SHLIB            0000003001000000  00026000
       0000000000019000  0000000000000000   A       0     0     8
  [24] libaudit.so.1.0.0 SHLIB            0000003001019000  0003f000
       00000000001ff000  0000000000000000   A       0     0     8
  [25] libaudit.so.1.0.0 SHLIB            0000003001218000  0003f000
       0000000000001000  0000000000000000   A       0     0     8
  [26] libaudit.so.1.0.0 SHLIB            0000003001219000  00040000
       0000000000001000  0000000000000000   A       0     0     8
  [27] libpam.so.0.83.1. SHLIB            0000003003400000  00041000
       000000000000d000  0000000000000000   A       0     0     8
  [28] libpam.so.0.83.1. SHLIB            000000300340d000  0004e000
       00000000001ff000  0000000000000000   A       0     0     8
  [29] libpam.so.0.83.1. SHLIB            000000300360c000  0004e000
       0000000000001000  0000000000000000   A       0     0     8
  [30] libpam.so.0.83.1. SHLIB            000000300360d000  0004f000
       0000000000001000  0000000000000000   A       0     0     8
  [31] libutil-2.19.so.t SHLIB            00007f0a0cbf9000  00050000
       0000000000002000  0000000000000000   A       0     0     8
  [32] libutil-2.19.so.u SHLIB            00007f0a0cbfb000  00052000
       00000000001ff000  0000000000000000   A       0     0     8
  [33] libutil-2.19.so.r SHLIB            00007f0a0cdfa000  00052000
       0000000000001000  0000000000000000   A       0     0     8
  [34] libutil-2.19.so.d SHLIB            00007f0a0cdfb000  00053000
       0000000000001000  0000000000000000   A       0     0     8
  [35] libdl-2.19.so.tex SHLIB            00007f0a0cdfc000  00054000
       0000000000003000  0000000000000000   A       0     0     8
  [36] libdl-2.19.so.und SHLIB            00007f0a0cdff000  00057000
       00000000001ff000  0000000000000000   A       0     0     8
  [37] libdl-2.19.so.rel SHLIB            00007f0a0cffe000  00057000
       0000000000001000  0000000000000000   A       0     0     8
  [38] libdl-2.19.so.dat SHLIB            00007f0a0cfff000  00058000
       0000000000001000  0000000000000000   A       0     0     8
  [39] libc-2.19.so.text SHLIB            00007f0a0d000000  00059000
       00000000001bb000  0000000000000000   A       0     0     8
  [40] libc-2.19.so.unde SHLIB            00007f0a0d1bb000  00214000
       00000000001ff000  0000000000000000   A       0     0     8
  [41] libc-2.19.so.relr SHLIB            00007f0a0d3ba000  00214000
       0000000000004000  0000000000000000   A       0     0     8
  [42] libc-2.19.so.data SHLIB            00007f0a0d3be000  00218000
       0000000000002000  0000000000000000   A       0     0     8
  [43] azazel.so.text    PRELOADED        00007f0a0d3c5000  0021f000
       0000000000008000  0000000000000000   A       0     0     8
  [44] azazel.so.undef   PRELOADED        00007f0a0d3cd000  00227000
       00000000001ff000  0000000000000000   A       0     0     8
  [45] azazel.so.relro   PRELOADED        00007f0a0d5cc000  00227000
       0000000000001000  0000000000000000   A       0     0     8
  [46] azazel.so.data    PRELOADED        00007f0a0d5cd000  00228000
       0000000000001000  0000000000000000   A       0     0     8
  [47] ld-2.19.so.text   SHLIB            00007f0a0d5ce000  00229000
       0000000000023000  0000000000000000   A       0     0     8
  [48] ld-2.19.so.relro  SHLIB            00007f0a0d7f0000  00254000
       0000000000001000  0000000000000000   A       0     0     8
  [49] ld-2.19.so.data   SHLIB            00007f0a0d7f1000  00255000
       0000000000001000  0000000000000000   A       0     0     8
  [50] .procfs.tgz       LOUSER+0         0000000000000000  0027d038
       00000000000011b6  0000000000000001           0     0     8
  [51] .prstatus         PROGBITS         0000000000000000  0027c000
       0000000000000150  0000000000000150           0     0     8
  [52] .fdinfo           PROGBITS         0000000000000000  0027c150
       0000000000000ac8  0000000000000228           0     0     4
  [53] .siginfo          PROGBITS         0000000000000000  0027cc18
       0000000000000080  0000000000000080           0     0     4
  [54] .auxvector        PROGBITS         0000000000000000  0027cc98
       0000000000000130  0000000000000008           0     0     8
  [55] .exepath          PROGBITS         0000000000000000  0027cdc8
       000000000000001c  0000000000000008           0     0     1
  [56] .personality      PROGBITS         0000000000000000  0027cde4
       0000000000000004  0000000000000004           0     0     1
  [57] .arglist          PROGBITS         0000000000000000  0027cde8
       0000000000000050  0000000000000001           0     0     1
  [58] .fpregset         PROGBITS         0000000000000000  0027ce38
       0000000000000200  0000000000000200           0     0     8
  [59] .stack            PROGBITS         00007ffdb9161000  00257000
       0000000000021000  0000000000000000  WA       0     0     8
  [60] .vdso             PROGBITS         00007ffdb918f000  00279000
       0000000000002000  0000000000000000  WA       0     0     8
  [61] .vsyscall         PROGBITS         ffffffffff600000  0027b000
       0000000000001000  0000000000000000  WA       0     0     8
  [62] .symtab           SYMTAB           0000000000000000  0027f576
       0000000000000078  0000000000000018          63     0     4
  [63] .strtab           STRTAB           0000000000000000  0027f5ee
       0000000000000037  0000000000000000           0     0     1
  [64] .shstrtab         STRTAB           0000000000000000  0027f22e
       0000000000000348  0000000000000000           0     0     1

The ELF sections 43 through 46 are all immediately suspicious because they are marked with the PRELOADED section type, which indicates that they are mappings from a shared library that was preloaded with the LD_PRELOAD environment variable:

  [43] azazel.so.text    PRELOADED        00007f0a0d3c5000  0021f000
       0000000000008000  0000000000000000   A       0     0     8
  [44] azazel.so.undef   PRELOADED        00007f0a0d3cd000  00227000
       00000000001ff000  0000000000000000   A       0     0     8
  [45] azazel.so.relro   PRELOADED        00007f0a0d5cc000  00227000
       0000000000001000  0000000000000000   A       0     0     8
  [46] azazel.so.data    PRELOADED        00007f0a0d5cd000  00228000
       0000000000001000  0000000000000000   A       0     0     8

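Various userland rootkits, such as Azazel, use LD_PRELOAD as their means of injection. The next step is to look at the PLT/GOT (procedure linkage table/global offset table) and check whether it contains any pointers to functions outside of the expected boundaries, that is, outside the PLT of the executable or the shared library that should provide the function.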

You might recall from previous chapters that the GOT is a table of pointer values, each of which should point to one of the following:

  • The PLT stub in the corresponding PLT entry, if the symbol has not yet been resolved (remember the lazy linking concepts from Chapter 2, The ELF Binary Format)
  • The shared library function denoted by the corresponding relocation entry from the .rela.plt section of the executable, if the GOT entry has already been resolved by the dynamic linker (through lazy or strict linking)
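
The two legitimate cases above, plus the illegitimate third one, can be sketched as a small classifier. This is only an illustration of the rule, not libecfs code: the names got_check, got_state, classify_got, and count_hooked are hypothetical and introduced purely for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for this sketch; none of these come from libecfs. */
enum got_state { GOT_UNRESOLVED, GOT_RESOLVED, GOT_HOOKED };

struct got_check {
    unsigned long got_value;  /* pointer currently stored in the GOT slot */
    unsigned long plt_stub;   /* address of the matching PLT stub */
    unsigned long shlib_func; /* address of the expected library function */
};

/* Apply the rule: a GOT slot may hold only the PLT stub or the real function. */
enum got_state classify_got(const struct got_check *e)
{
    if (e->got_value == e->plt_stub)
        return GOT_UNRESOLVED;   /* lazy binding has not happened yet */
    if (e->got_value == e->shlib_func)
        return GOT_RESOLVED;     /* the dynamic linker filled it in */
    return GOT_HOOKED;           /* points somewhere it should not */
}

/* Count the entries that match neither legitimate target. */
size_t count_hooked(const struct got_check *entries, size_t n)
{
    assert(entries != NULL || n == 0);
    size_t hooked = 0;
    for (size_t i = 0; i < n; i++)
        if (classify_got(&entries[i]) == GOT_HOOKED)
            hooked++;
    return hooked;
}
```

A real tool also has to obtain the expected addresses from the relocation entries and the mapped libraries, which is the tedious part that this sketch leaves out.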

Validating the PLT/GOT with ECFS

Understanding and systematically validating the integrity of the PLT/GOT is tedious by hand. Fortunately, there is a very easy way to do this with ECFS. If you prefer to write your own tool, then you should use the libecfs function that is designed specifically for this purpose:

ssize_t get_pltgot_info(ecfs_elf_t *desc, pltgot_info_t **pginfo)

This function allocates an array of structs, each element pertaining to a single PLT/GOT entry.

The C struct named pltgot_info_t has the following format:

typedef struct pltgotinfo {
   unsigned long got_site; // addr of the GOT entry itself
   unsigned long got_entry_va; // pointer value stored in the GOT entry
   unsigned long plt_entry_va; // the expected PLT address
   unsigned long shl_entry_va; // the expected shared lib function addr
} pltgot_info_t;

An example of using this function can be found in ecfs/libecfs/main/detect_plt_hooks.c. This is a simple demonstrative tool for detecting shared library injection and PLT/GOT hooks, which is shown and commented for clarity later in this chapter. The readecfs utility also demonstrates the use of the get_pltgot_info() function when passed the -g flag.

The readecfs output for PLT/GOT validation

- readecfs output for file host2.7254
- Executable path (.exepath): /home/user/git/azazel/host2
- Command line: ./host2
- Printing out GOT/PLT characteristics (pltgot_info_t):
gotsite    gotvalue       gotshlib          pltval         symbol
0x601018   0x7f0a0d3c8c81  0x7f0a0d0ed070   0x4004c6      unlink
0x601020   0x7f0a0d06fe30  0x7f0a0d06fe30   0x4004d6      puts
0x601028   0x7f0a0d3c8d77  0x7f0a0d0bcef0   0x4004e6      opendir
0x601030   0x7f0a0d021dd0  0x7f0a0d021dd0   0x4004f6      __libc_start_main

The preceding output is easy to parse. The gotvalue should hold an address that matches either gotshlib or pltval. We can see, however, that the very first entry, which is for the symbol unlink, holds the address 0x7f0a0d3c8c81. This matches neither the expected shared library function nor the PLT value. The entry for opendir shows the same kind of mismatch.

Further investigation would show that both mismatched addresses point to functions within azazel.so. From the preceding output, we can see that the only two functions that have not been tampered with are puts and __libc_start_main. For greater insight into the detection process, let's take a look at the source code for a tool that does automatic PLT/GOT validation as part of its detection capabilities. This tool is called detect_plt_hooks and is written in C. It uses the libecfs API to load and parse ECFS snapshots.
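
To see why a suspicious GOT value can be attributed to azazel.so, it helps to check it against the address ranges of the PRELOADED sections listed earlier. The following is a minimal sketch of that range check with the section addresses and sizes copied by hand from the section header output; the names mapped_section and section_containing are hypothetical and not part of libecfs, and the code assumes a 64-bit unsigned long, as on x86-64 Linux.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type for this sketch; not a libecfs structure. */
struct mapped_section {
    const char *name;
    unsigned long addr;  /* sh_addr from the section header */
    unsigned long size;  /* sh_size from the section header */
};

/* Values copied from section headers 43 through 46 shown earlier. */
const struct mapped_section preloaded[] = {
    {"azazel.so.text",  0x00007f0a0d3c5000UL, 0x8000UL},
    {"azazel.so.undef", 0x00007f0a0d3cd000UL, 0x1ff000UL},
    {"azazel.so.relro", 0x00007f0a0d5cc000UL, 0x1000UL},
    {"azazel.so.data",  0x00007f0a0d5cd000UL, 0x1000UL},
};

/* Return the name of the section whose range contains addr, or NULL. */
const char *section_containing(const struct mapped_section *secs,
                               size_t n, unsigned long addr)
{
    assert(secs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (addr >= secs[i].addr && addr - secs[i].addr < secs[i].size)
            return secs[i].name;
    return NULL;
}
```

Passing the gotvalue for unlink (0x7f0a0d3c8c81) or opendir (0x7f0a0d3c8d77) returns azazel.so.text, while the gotvalue for puts falls outside all four ranges and returns NULL.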

Note that the following program is only about 50 lines of source code, which is quite remarkable. Without ECFS or libecfs, it would take approximately 3,000 lines of C code to accurately analyze a process image for shared library injection and PLT/GOT hooks. I know this because I have done it, and using libecfs is by far the most painless way to go about coding such tools.

Here is the source code of detect_plt_hooks.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "../include/libecfs.h"

int main(int argc, char **argv)
{
    ecfs_elf_t *desc;
    ecfs_sym_t *dsyms;
    char *progname;
    int i;
    char *libname;
    long evil_addr = 0;

    if (argc < 2) {
        printf("Usage: %s <ecfs_file>\n", argv[0]);
        exit(0);
    }
   
    /*
     * Load the ECFS file and create a descriptor
     */
    desc = load_ecfs_file(argv[1]);
    if (desc == NULL) {
        printf("Unable to load ECFS file: %s\n", argv[1]);
        exit(1);
    }
    /*
     * Get the original program name
     */
    progname = get_exe_path(desc);
   
    printf("Performing analysis on '%s' which corresponds to executable: %s\n", argv[1], progname);

    /*
     * Look for any sections that are marked as INJECTED
     * or PRELOADED, indicating shared library injection
     * or ELF object injection.
     */
    for (i = 0; i < desc->ehdr->e_shnum; i++) {
        if (desc->shdr[i].sh_type == SHT_INJECTED) {
            libname = strdup(&desc->shstrtab[desc->shdr[i].sh_name]);
            printf("[!] Found maliciously injected ET_DYN (Dynamic ELF): %s - base: %lx\n", libname, desc->shdr[i].sh_addr);
        } else if (desc->shdr[i].sh_type == SHT_PRELOADED) {
            libname = strdup(&desc->shstrtab[desc->shdr[i].sh_name]);
            printf("[!] Found a preloaded shared library (LD_PRELOAD): %s - base: %lx\n", libname, desc->shdr[i].sh_addr);
        }
    }
    /*
     * Load and validate the PLT/GOT to make sure that each
     * GOT entry points to its proper respective location
     * in either the PLT, or the correct shared lib function.
     */
    pltgot_info_t *pltgot;
    int gotcount = get_pltgot_info(desc, &pltgot);
    for (i = 0; i < gotcount; i++) {
        if (pltgot[i].got_entry_va != pltgot[i].shl_entry_va &&
            pltgot[i].got_entry_va != pltgot[i].plt_entry_va &&
            pltgot[i].shl_entry_va != 0) {
            evil_addr = pltgot[i].shl_entry_va;
            printf("[!] Found PLT/GOT hook: A function is pointing at %lx instead of %lx\n",
                pltgot[i].got_entry_va, evil_addr);
            /*
             * Load the dynamic symbol table to print the
             * hijacked function by name. A separate index (j)
             * is used so that the outer loop counter (i) is
             * not overwritten.
             */
            int symcount = get_dynamic_symbols(desc, &dsyms);
            for (int j = 0; j < symcount; j++) {
                if (dsyms[j].symval == evil_addr) {
                    printf("[!] %lx corresponds to hijacked function: %s\n", dsyms[j].symval, &dsyms[j].strtab[dsyms[j].nameoffset]);
                    break;
                }
            }
        }
    }
    return 0;
}