Learning Linux Binary Analysis by Ryan "elfmaster" O'Neill. Published by Packt Publishing, 2016.

Linux ELF core files

In most UNIX-flavored operating systems, a process can be sent a signal that causes it to dump a core file. A core file is essentially a snapshot of the process and its state at the moment it cored (crashed or dumped). It is an ELF file (type ET_CORE) made up primarily of program headers and the memory segments they describe. It also contains a number of notes in the PT_NOTE segment that describe file mappings, shared library paths, and other information.
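As a minimal sketch of what "a type of ELF file" means in practice, the following Python function checks whether a buffer starts with an ELF header whose e_type field is ET_CORE. The function name is_elf_core is hypothetical; the field offset and the ET_CORE value come from the ELF specification.

```python
import struct

ET_CORE = 4  # e_type value that marks a core file in the ELF specification


def is_elf_core(header: bytes) -> bool:
    """Return True if the buffer begins with an ELF header of type ET_CORE."""
    # An ELF file starts with the 4-byte magic; e_type needs bytes 16..17.
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field that follows the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type == ET_CORE
```

To try it on a real file, read the first 64 bytes of the core in binary mode and pass them to is_elf_core.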

A core file by itself is not especially useful for process memory forensics, but it may yield some results to the more astute analyst.

Note

This is actually where ECFS comes into the picture; it is an extension of the regular Linux ELF core format and provides features that are specifically for forensic analysis.

Analysis of the core file – the Azazel rootkit

Here, we will infect a process with the Azazel rootkit using the LD_PRELOAD environment variable, and then deliver an abort signal to the process so that we can capture a core dump for analysis.

Starting up an Azazel infected process and getting a core dump

$ LD_PRELOAD=./libselinux.so ./host &
[1] 9325
$ kill -ABRT `pidof host`
[1]+  Segmentation fault      (core dumped) LD_PRELOAD=./libselinux.so ./host

Core file program headers

A core file contains many program headers, and all but one of them are of the PT_LOAD type. There is a PT_LOAD program header for every memory segment in the process, with the exception of special device mappings (for example, /dev/mem). Everything from shared libraries and anonymous mappings to the stack, the heap, and the text and data segments is represented by a program header.

Then, there is one program header of the PT_NOTE type; it contains the most useful and descriptive information in the entire core file.
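The layout described above can be checked with a short sketch that walks the program header table and counts the entries by type. It assumes a 64-bit little-endian core file such as the one in this section; the helper names program_headers and count_types are hypothetical.

```python
import struct
from collections import Counter

PT_LOAD, PT_NOTE = 1, 4
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian (56 bytes)


def program_headers(image: bytes):
    """Yield (p_type, p_offset, p_vaddr, p_filesz, p_memsz) for each entry."""
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    # Limitation: if e_phnum is 0xffff (PN_XNUM), the real count lives in
    # sh_info of section header 0. This sketch does not handle that case.
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz, _align = \
            PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        yield p_type, p_offset, p_vaddr, p_filesz, p_memsz


def count_types(image: bytes) -> Counter:
    """Count program headers by p_type."""
    return Counter(entry[0] for entry in program_headers(image))
```

For a typical core file, count_types(data)[PT_NOTE] should be 1, with the remaining entries being PT_LOAD.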

The PT_NOTE segment

The eu-readelf -n output shown next parses the notes segment of the core file. We use eu-readelf (the elfutils version) rather than the regular readelf because it takes the time to parse each entry in the notes segment, whereas the more commonly used readelf (the binutils version) only shows the NT_FILE entry:

$ eu-readelf -n core

Note segment of 4200 bytes at offset 0x900:
  Owner          Data size  Type
  CORE                 336  PRSTATUS
    info.si_signo: 11, info.si_code: 0, info.si_errno: 0, cursig: 11
    sigpend: <>
    sighold: <>
    pid: 9875, ppid: 7669, pgrp: 9875, sid: 5781
    utime: 5.292000, stime: 0.004000, cutime: 0.000000, cstime: 0.000000
    orig_rax: -1, fpvalid: 1
    r15:                       0  r14:                       0
    r13:         140736185205120  r12:                 4195616
    rbp:      0x00007fffb25380a0  rbx:                       0
    r11:                     582  r10:         140736185204304
    r9:                 15699984  r8:               1886848000
    rax:                      -1  rcx:                    -160
    rdx:         140674792738928  rsi:              4294967295
    rdi:                 4196093  rip:      0x000000000040064f
    rflags:   0x0000000000000286  rsp:      0x00007fffb2538090
    fs.base:   0x00007ff1677a1740  gs.base:   0x0000000000000000
    cs: 0x0033  ss: 0x002b  ds: 0x0000  es: 0x0000  fs: 0x0000  gs: 0x0000
  CORE                 136  PRPSINFO
    state: 0, sname: R, zomb: 0, nice: 0, flag: 0x0000000000406600
    uid: 0, gid: 0, pid: 9875, ppid: 7669, pgrp: 9875, sid: 5781
    fname: host, psargs: ./host
  CORE                 128  SIGINFO
    si_signo: 11, si_errno: 0, si_code: 0
    sender PID: 7669, sender UID: 0
  CORE                 304  AUXV
    SYSINFO_EHDR: 0x7fffb254a000
    HWCAP: 0xbfebfbff  <fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe>
    PAGESZ: 4096
    CLKTCK: 100
    PHDR: 0x400040
    PHENT: 56
    PHNUM: 9
    BASE: 0x7ff1675ae000
    FLAGS: 0
    ENTRY: 0x400520
    UID: 0
    EUID: 0
    GID: 0
    EGID: 0
    SECURE: 0
    RANDOM: 0x7fffb2538399
    EXECFN: 0x7fffb2538ff1
    PLATFORM: 0x7fffb25383a9
    NULL
  CORE                1812  FILE
    30 files:
   00400000-00401000 00000000 4096        /home/user/git/azazel/host
   00600000-00601000 00000000 4096        /home/user/git/azazel/host
   00601000-00602000 00001000 4096        /home/user/git/azazel/host
   3001000000-3001019000 00000000 102400  /lib/x86_64-linux-gnu/libaudit.so.1.0.0
   3001019000-3001218000 00019000 2093056 /lib/x86_64-linux-gnu/libaudit.so.1.0.0
   3001218000-3001219000 00018000 4096    /lib/x86_64-linux-gnu/libaudit.so.1.0.0
   3001219000-300121a000 00019000 4096    /lib/x86_64-linux-gnu/libaudit.so.1.0.0
   3003400000-300340d000 00000000 53248   /lib/x86_64-linux-gnu/libpam.so.0.83.1
   300340d000-300360c000 0000d000 2093056 /lib/x86_64-linux-gnu/libpam.so.0.83.1
   300360c000-300360d000 0000c000 4096    /lib/x86_64-linux-gnu/libpam.so.0.83.1
   300360d000-300360e000 0000d000 4096    /lib/x86_64-linux-gnu/libpam.so.0.83.1
  7ff166bd9000-7ff166bdb000 00000000 8192    /lib/x86_64-linux-gnu/libutil-2.19.so
  7ff166bdb000-7ff166dda000 00002000 2093056 /lib/x86_64-linux-gnu/libutil-2.19.so
  7ff166dda000-7ff166ddb000 00001000 4096    /lib/x86_64-linux-gnu/libutil-2.19.so
  7ff166ddb000-7ff166ddc000 00002000 4096    /lib/x86_64-linux-gnu/libutil-2.19.so
  7ff166ddc000-7ff166ddf000 00000000 12288   /lib/x86_64-linux-gnu/libdl-2.19.so
  7ff166ddf000-7ff166fde000 00003000 2093056 /lib/x86_64-linux-gnu/libdl-2.19.so
  7ff166fde000-7ff166fdf000 00002000 4096    /lib/x86_64-linux-gnu/libdl-2.19.so
  7ff166fdf000-7ff166fe0000 00003000 4096    /lib/x86_64-linux-gnu/libdl-2.19.so
  7ff166fe0000-7ff16719b000 00000000 1814528 /lib/x86_64-linux-gnu/libc-2.19.so
  7ff16719b000-7ff16739a000 001bb000 2093056 /lib/x86_64-linux-gnu/libc-2.19.so
  7ff16739a000-7ff16739e000 001ba000 16384   /lib/x86_64-linux-gnu/libc-2.19.so
  7ff16739e000-7ff1673a0000 001be000 8192    /lib/x86_64-linux-gnu/libc-2.19.so
  7ff1673a5000-7ff1673ad000 00000000 32768   /home/user/git/azazel/libselinux.so
  7ff1673ad000-7ff1675ac000 00008000 2093056 /home/user/git/azazel/libselinux.so
  7ff1675ac000-7ff1675ad000 00007000 4096    /home/user/git/azazel/libselinux.so
  7ff1675ad000-7ff1675ae000 00008000 4096    /home/user/git/azazel/libselinux.so
  7ff1675ae000-7ff1675d1000 00000000 143360 /lib/x86_64-linux-gnu/ld-2.19.so
  7ff1677d0000-7ff1677d1000 00022000 4096   /lib/x86_64-linux-gnu/ld-2.19.so
  7ff1677d1000-7ff1677d2000 00023000 4096   /lib/x86_64-linux-gnu/ld-2.19.so

Being able to view the register state, auxiliary vector, signal information, and file mappings is certainly useful, but this information is not enough by itself to analyze a process for malware infection.
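To make the structure of the notes segment more concrete, here is a rough sketch of a note parser. Each note is a 12-byte header (namesz, descsz, type) followed by the owner name and the descriptor, each padded to a 4-byte boundary. The NT_FILE descriptor layout used here (a count, a page size, then start/end/page-offset triples, then NUL-terminated paths) is the one written by the Linux kernel for 64-bit processes. The function names are hypothetical, and little-endian data is assumed.

```python
import struct

NT_FILE = 0x46494C45  # note type for the file-mapping table ("FILE")


def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def iter_notes(segment: bytes):
    """Yield (owner, type, descriptor_bytes) for each note in a PT_NOTE segment."""
    off = 0
    while off + 12 <= len(segment):
        namesz, descsz, ntype = struct.unpack_from("<III", segment, off)
        off += 12
        owner = segment[off:off + namesz].rstrip(b"\0").decode()
        off += align4(namesz)
        desc = segment[off:off + descsz]
        off += align4(descsz)
        yield owner, ntype, desc


def parse_nt_file(desc: bytes):
    """Return a list of (start, end, file_offset_in_bytes, path) tuples."""
    count, page_size = struct.unpack_from("<QQ", desc, 0)
    # The path strings follow the table of count * 3 eight-byte values.
    names = desc[16 + 24 * count:].split(b"\0")
    entries = []
    for i in range(count):
        start, end, pgoff = struct.unpack_from("<QQQ", desc, 16 + 24 * i)
        # The raw offset is stored in pages, so convert it to bytes.
        entries.append((start, end, pgoff * page_size, names[i].decode()))
    return entries
```

Feeding the bytes of the PT_NOTE segment to iter_notes and passing the NT_FILE descriptor to parse_nt_file should reproduce the mapping table that eu-readelf prints.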

PT_LOAD segments and the downfalls of core files for forensics purposes

Each memory segment has a corresponding program header that describes the offset, address, and size of the segment. This would almost suggest that you can access every part of a process image through the program segments, but this is only partially true. For the text image of the executable and of every shared library that is mapped into the process, only the first 4,096 bytes are dumped into the segment.

This is done to save space, and because the Linux kernel developers figured that the text segment will not be modified in memory. It therefore suffices to reference the original executable file and shared libraries when accessing the text areas from a debugger. If a core file were to contain the complete text segment of every shared library, then for a large program such as Wireshark or Firefox, the resulting core dump files would be enormous.

So for debugging reasons, it is usually okay to assume that the text segments have not changed in memory, and to just reference the executable and shared library files themselves to get the text. But what about runtime malware analysis and process memory forensics? In many cases, the text segments have been marked as writeable and contain polymorphic engines for code mutation, and in these instances, core files may be useless for viewing the code segments.

Also, what if the core file is the only artifact available for analysis and the original executable and shared libraries are no longer accessible? This further demonstrates why core files are not particularly good for process memory forensics; nor were they ever meant to be.
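One way to see this limitation in a given core file is to compare the p_filesz and p_memsz fields of each PT_LOAD program header: a segment whose file size is smaller than its memory size was not dumped in full. The sketch below assumes the program headers have already been parsed into (p_vaddr, p_filesz, p_memsz) tuples; the function name is hypothetical, and the sample values are illustrative (the libc text mapping size is taken from the NT_FILE listing, with a 4,096-byte dump as described above).

```python
def truncated_segments(phdrs):
    """Return (vaddr, filesz, memsz, missing_bytes) for partially dumped segments."""
    return [
        (vaddr, filesz, memsz, memsz - filesz)
        for vaddr, filesz, memsz in phdrs
        if filesz < memsz
    ]


# Illustrative values only: a libc text mapping with one page dumped,
# and a fully dumped writable data mapping.
sample = [
    (0x7ff166fe0000, 4096, 1814528),
    (0x601000, 4096, 4096),
]

for vaddr, filesz, memsz, missing in truncated_segments(sample):
    print(f"{vaddr:#x}: {filesz} of {memsz} bytes dumped, {missing} missing")
```

Any segment this reports has to be recovered from the original file on disk, which is exactly what is not possible when the core file is the only artifact.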

Note

In the next chapter, we will see how ECFS addresses many of the weaknesses that limit the usefulness of core files as an artifact for forensic purposes.

Using a core file with GDB for forensics

Combined with the original executable file, and assuming that no code modifications were made to the text segment, we can still use core files to some avail for malware analysis. In this particular case, we are looking at a core file for the Azazel rootkit, which, as we demonstrated earlier in this chapter, has PLT/GOT hooks:

$ readelf -S host | grep got.plt
  [23] .got.plt          PROGBITS         0000000000601000  00001000
$ readelf -r host
Relocation section '.rela.plt' at offset 0x3f8 contains 6 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 unlink + 0
000000601020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0
000000601028  000300000007 R_X86_64_JUMP_SLO 0000000000000000 opendir + 0
000000601030  000400000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main+0
000000601038  000500000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
000000601040  000600000007 R_X86_64_JUMP_SLO 0000000000000000 fopen + 0

So, let's take a look at a function that we already know is hijacked by Azazel. The fopen function is one of the four shared library functions that the infected program calls directly (unlink, puts, opendir, and fopen), and as we can see from the preceding output, it has a GOT entry at 0x601040:

$ gdb -q ./host core
Reading symbols from ./host...(no debugging symbols found)...done.
[New LWP 9875]
Core was generated by `./host'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000040064f in main ()
(gdb) x/gx 0x601040
0x601040 <fopen@got.plt>:  0x00007ff1673a8609
(gdb)

If we look again at the NT_FILE entry in the PT_NOTE segment (readelf -n core), we can see the address range at which the libc-2.19.so file is mapped into memory, and check whether or not the GOT entry for fopen points into libc-2.19.so as it should:

$ readelf -n core
<snippet>
 0x00007ff166fe0000  0x00007ff16719b000  0x0000000000000000
        /lib/x86_64-linux-gnu/libc-2.19.so
</snippet>

The fopen@got.plt entry points to 0x7ff1673a8609. This is outside of the libc-2.19.so text segment range displayed previously, which is 0x7ff166fe0000 to 0x7ff16719b000. According to the NT_FILE listing shown earlier, the address instead falls within the 0x7ff1673a5000 to 0x7ff1673ad000 range, which belongs to the preloaded libselinux.so. Examining a core file with GDB is very similar to examining a live process with GDB, and you can use the same method shown next to locate the environment variables and check whether LD_PRELOAD has been set.
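The GOT range check described above is easy to automate. The following sketch looks up which mapped file contains a given address, using three of the text mappings copied from the NT_FILE listing earlier in this section. The function name owner_of is hypothetical.

```python
# (start, end, path) for three text mappings from the NT_FILE note above.
MAPPINGS = [
    (0x7ff166fe0000, 0x7ff16719b000, "/lib/x86_64-linux-gnu/libc-2.19.so"),
    (0x7ff1673a5000, 0x7ff1673ad000, "/home/user/git/azazel/libselinux.so"),
    (0x7ff1675ae000, 0x7ff1675d1000, "/lib/x86_64-linux-gnu/ld-2.19.so"),
]


def owner_of(addr, mappings=MAPPINGS):
    """Return the path of the mapping that contains addr, or None."""
    for start, end, path in mappings:
        if start <= addr < end:
            return path
    return None


# The value GDB reported for fopen@got.plt.
print(owner_of(0x00007ff1673a8609))  # /home/user/git/azazel/libselinux.so
```

A resolved GOT entry for a libc function that lands in any file other than libc is a strong hint that the symbol has been hooked or interposed.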

Here's an example of locating environment variables in a core file:

(gdb) x/4096s $rsp

… scroll down a few pages …

0x7fffb25388db:  "./host"
0x7fffb25388e2:  "LD_PRELOAD=./libselinux.so"
0x7fffb25388fd:  "SHELL=/bin/bash"
0x7fffb253890d:  "TERM=xterm"
0x7fffb2538918:  "OLDPWD=/home/ryan"
0x7fffb253892a:  "USER=root"
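Scrolling through pages of strings by hand can also be replaced with a small search. The sketch below scans a buffer of raw stack bytes for NUL-terminated strings that begin with a given prefix and reports the address of each hit. It assumes the stack bytes have already been extracted from the PT_LOAD segment that contains the stack, and that base is the virtual address of the first byte in the buffer; the function name find_env is hypothetical.

```python
def find_env(stack: bytes, base: int, prefix: bytes = b"LD_PRELOAD="):
    """Return (address, string) pairs for strings that start with prefix."""
    hits = []
    offset = 0
    for chunk in stack.split(b"\0"):
        if chunk.startswith(prefix):
            hits.append((base + offset, chunk.decode(errors="replace")))
        # Advance past the string and its terminating NUL byte.
        offset += len(chunk) + 1
    return hits


# A few of the strings from the GDB output above, starting at "./host".
data = b"./host\0LD_PRELOAD=./libselinux.so\0SHELL=/bin/bash\0"
for addr, value in find_env(data, 0x7fffb25388db):
    print(f"{addr:#x}: {value}")
```

With these bytes the search reports the LD_PRELOAD string at 0x7fffb25388e2, matching the address shown by GDB.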