Learning Linux Binary Analysis by Ryan "elfmaster" O'Neill. Published by Packt Publishing, 2016.

Process image reconstruction – from the memory to the executable

One neat exercise to test our abilities with both the ELF format and ptrace is to design software that can reconstruct a process image back into a working executable. This is especially useful in forensic work, where we may find a suspicious program running on the system. Extended Core File Snapshot (ECFS) technology is capable of this and extends the functionality into an innovative forensics and debugging format that is backward compatible with the traditional Linux core file format. It is available at https://github.com/elfmaster/ecfs and is further documented in Chapter 8, ECFS – Extended Core File Snapshot Technology. Quenya also has this feature and is available for download at http://www.bitlackeys.org/projects/quenya_32bit.tgz.

Challenges for process-executable reconstruction

Before reconstructing a process back into an executable, we must consider the challenges involved, and there are many. One class of data is beyond our control: the global variables in the initialized data. Their values may have been changed at runtime by the code, and we have no way of knowing what they were initialized to before the program ran. We may not even be able to find this out by static code analysis.

The following are the goals for executable reconstruction:

  • Take a process ID as an argument and reconstruct that process image back into its executable file state
  • Construct a minimal set of section headers so that the program can be analyzed more accurately by tools such as objdump and gdb

Challenges for executable reconstruction

Full executable reconstruction is possible, but it comes with some challenges, especially when reconstructing a dynamically linked executable. Here, we will go over what the primary challenges are and what the general solution is for each one.

PLT/GOT integrity

At runtime, the global offset table (GOT) is filled in with the resolved addresses of the corresponding shared library functions. This was, of course, done by the dynamic linker, and so we must replace these addresses with the original PLT stub addresses. We do this so that when the shared library functions are called for the first time, they trigger the dynamic linker properly through the PLT instruction that pushes the GOT offset onto the stack. Refer to the ELF dynamic linking section of Chapter 2, The ELF Binary Format.

The following diagram demonstrates how GOT entries must be restored:

[Figure: PLT/GOT integrity]
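
As a rough illustration of the restoration described above, the following sketch rewrites an in-memory copy of the GOT so that each function slot points back at the push instruction of its PLT stub. It assumes the conventional lazy-binding x86-64 layout (a 16-byte PLT0 followed by 16-byte stubs, the push located 6 bytes into each stub, and three reserved GOT entries). Binaries built with other layouts will need different constants. The name restore_got() and its parameters are hypothetical and are not taken from Quenya or ECFS.

```c
#include <stddef.h>
#include <stdint.h>

#define PLT_ENTRY_SIZE   16  /* assumed size of PLT0 and of each PLT stub */
#define PLT_PUSH_OFFSET   6  /* assumed offset of the push inside a stub */
#define GOT_RESERVED      3  /* GOT[0..2] are reserved for the dynamic linker */

/*
 * Rewrite a copy of the GOT so that every function slot points at the
 * push instruction of its own PLT stub, as it does in the file on disk.
 *   got      - copy of the GOT read out of the process
 *   nentries - total number of GOT slots, reserved ones included
 *   plt0     - virtual address of the first (resolver) PLT entry
 */
void restore_got(uint64_t *got, size_t nentries, uint64_t plt0)
{
    size_t i;

    /* GOT[1] and GOT[2] are filled in by the dynamic linker at load
       time and are zero in the on-disk image. GOT[0] is left alone. */
    if (nentries > 1) got[1] = 0;
    if (nentries > 2) got[2] = 0;

    for (i = GOT_RESERVED; i < nentries; i++) {
        size_t stub = i - GOT_RESERVED;  /* index of the matching PLT stub */
        got[i] = plt0 + PLT_ENTRY_SIZE * (stub + 1) + PLT_PUSH_OFFSET;
    }
}
```

With PLT0 at 0x4003d0, for example, the first function slot is set to 0x4003e6 and the second to 0x4003f6.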

Adding a section header table

Remember that a program's section header table is not loaded into memory at runtime, because it is not needed there. When reconstructing a process image back into an executable, it is desirable (although not necessary) to add a section header table. Recreating every section header entry that was in the original executable is possible, but a good ELF hacker can generate at least the basics.

So try to create a section header for the following sections: .interp, .note, .text, .dynamic, .got.plt, .data, .bss, .shstrtab, .dynsym, and .dynstr.

Note

If the executable that you are reconstructing is statically linked, then you won't have the .dynamic, .got.plt, .dynsym, or .dynstr sections.
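
To make the idea concrete, here is a minimal sketch of two building blocks for a reconstructed section header table: appending names to a .shstrtab buffer and filling in an Elf64_Shdr for .text. The helper names shstrtab_add() and make_text_shdr() are hypothetical, and the flags and alignment chosen for .text are assumptions that suit a typical executable rather than values recovered from any particular binary.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Append a section name to a .shstrtab buffer and return the offset at
 * which it was stored (the value that goes into sh_name), or (size_t)-1
 * if the buffer is too small. *used must start at 1 with tab[0] == '\0',
 * because offset 0 is the empty name by convention.
 */
size_t shstrtab_add(char *tab, size_t cap, size_t *used, const char *name)
{
    size_t len = strlen(name) + 1;   /* include the terminating NUL */
    size_t off = *used;

    if (off > cap || len > cap - off)
        return (size_t)-1;
    memcpy(tab + off, name, len);
    *used += len;
    return off;
}

/* Describe a range of the text segment as a .text section. */
Elf64_Shdr make_text_shdr(size_t name_off, Elf64_Addr vaddr,
                          Elf64_Off offset, Elf64_Xword size)
{
    Elf64_Shdr sh;

    memset(&sh, 0, sizeof(sh));
    sh.sh_name      = (Elf64_Word)name_off;
    sh.sh_type      = SHT_PROGBITS;
    sh.sh_flags     = SHF_ALLOC | SHF_EXECINSTR;
    sh.sh_addr      = vaddr;
    sh.sh_offset    = offset;
    sh.sh_size      = size;
    sh.sh_addralign = 16;            /* assumed alignment */
    return sh;
}
```

The remaining sections follow the same pattern with different types and flags, for example SHT_DYNAMIC for .dynamic and SHT_NOBITS for .bss.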

The algorithm for the process

Let's look at executable reconstruction:

  1. Locate the base address of the executable (text segment). This can be done by parsing /proc/<pid>/maps:
    [First line of output from /proc/<pid>/maps file for program 'evil']
    
    00400000-401000 r-xp /home/ryan/evil
    

    Tip

    Use the PTRACE_PEEKTEXT request with ptrace to read in the entire text segment. You can see in a line from the preceding maps output that the address range for the text segment (marked r-xp) is 0x400000 to 0x401000, which is 4096 bytes. So, this is how large your buffer should be for the text segment. Since we have not covered how to use PTRACE_PEEKTEXT to read more than a long-sized word at a time, I have written a function called pid_read() that demonstrates a good way to do this.

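    [Source code for pid_read() function]
    #include <errno.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>

    /* Copy len bytes from address src in process pid into dst, one word at a time */
    int pid_read(int pid, void *dst, const void *src, size_t len)
    {
      size_t sz = len / sizeof(long);  /* whole words only; len should be word-aligned */
      unsigned char *s = (unsigned char *)src;
      unsigned char *d = (unsigned char *)dst;
      long word;
      while (sz-- != 0) {
        errno = 0;
        word = ptrace(PTRACE_PEEKTEXT, pid, s, NULL);
        /* -1 can be valid data, so errno tells us whether it is an error */
        if (word == -1 && errno != 0)
          return -1;
        *(long *)d = word;
        s += sizeof(long);
        d += sizeof(long);
      }
      return 0;
    }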
  2. Parse the ELF file header (for example, Elf64_Ehdr) to locate the program header table:
    /* Where buffer is the buffer holding the text segment */
    Elf64_Ehdr *ehdr = (Elf64_Ehdr *)buffer;
    Elf64_Phdr *phdr = (Elf64_Phdr *)&buffer[ehdr->e_phoff];
  3. Then parse the program header table to find the data segment:
    for (c = 0; c < ehdr->e_phnum; c++) {
      /* The data segment is the PT_LOAD entry with a nonzero file offset */
      if (phdr[c].p_type == PT_LOAD && phdr[c].p_offset) {
        dataVaddr = phdr[c].p_vaddr;
        dataSize = phdr[c].p_memsz;
        break;
      }
    }
    pid_read(pid, databuff, (void *)dataVaddr, dataSize);
  4. Read the data segment into a buffer, and locate the dynamic segment within it and then the GOT. Use d_tag from the dynamic segment to locate the GOT:

    Note

    We discussed the dynamic segment and its tag values in the Dynamic linking section of Chapter 2, The ELF Binary Format.

    Elf64_Dyn *dyn = NULL;
    for (c = 0; c < ehdr->e_phnum; c++) {
      if (phdr[c].p_type == PT_DYNAMIC) {
        /* The dynamic segment lies inside the data segment read earlier */
        dyn = (Elf64_Dyn *)&databuff[phdr[c].p_vaddr - dataVaddr];
        break;
      }
    }
    if (dyn) {
      for (c = 0; dyn[c].d_tag != DT_NULL; c++) {
        switch (dyn[c].d_tag) {
          case DT_PLTGOT:
            gotAddr = dyn[c].d_un.d_ptr;
            break;
          case DT_STRTAB:
            /* Get .dynstr info */
            break;
          case DT_SYMTAB:
            /* Get .dynsym info */
            break;
        }
      }
    }
  5. Once the GOT has been located, it must be restored to its state prior to runtime. The part that matters the most is restoring the original PLT stub addresses in each GOT entry so that lazy linking works at program runtime. See the ELF dynamic linking section of Chapter 2, The ELF Binary Format:
    00000000004003e0 <puts@plt>:
    4003e0: ff 25 32 0c 20 00 jmpq *0x200c32(%rip) # 601018 
    4003e6: 68 00 00 00 00 pushq $0x0
    4003eb: e9 e0 ff ff ff jmpq 4003d0 <_init+0x28>
    
  6. The GOT entry that is reserved for puts() should be patched to point back to the PLT stub code that pushes the GOT offset onto the stack for that entry. The address of that instruction, 0x4003e6, is shown in the preceding disassembly. The method for determining the GOT-to-PLT entry relationship is left as an exercise for the reader.
  7. Optionally reconstruct a section header table. Then write the text and data segment (and the section header table) to the disk.
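
Step 1 of the algorithm above can be sketched as follows. The code parses lines of /proc/<pid>/maps and looks for the first mapping marked r-xp, which matches the layout of the 'evil' example. This is an assumption: on newer toolchains the first mapping of an executable may be a read-only one that precedes the executable mapping. The function names parse_maps_line() and find_text_mapping() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/<pid>/maps.
 * On success, stores the bounds of the mapping and its 4-character
 * permission string and returns 0; returns -1 on malformed input.
 */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    if (sscanf(line, "%lx-%lx %4s", start, end, perms) != 3)
        return -1;
    return 0;
}

/*
 * Scan an already-open maps stream for the first mapping with
 * permissions "r-xp". Returns 0 and fills start/end on success,
 * or -1 if no such mapping is found.
 */
int find_text_mapping(FILE *maps, unsigned long *start, unsigned long *end)
{
    char line[512], perms[5];

    while (fgets(line, sizeof(line), maps) != NULL) {
        if (parse_maps_line(line, start, end, perms) == 0 &&
            strcmp(perms, "r-xp") == 0)
            return 0;
    }
    return -1;
}
```

A caller would build the path with snprintf(path, sizeof(path), "/proc/%d/maps", pid), open it with fopen(), and use end - start as the size of the buffer passed to pid_read().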

Process reconstruction with Quenya on a 32-bit test environment

A 32-bit ELF executable named dumpme simply prints the string "You can Dump my segments!" and then pauses, giving us time to reconstruct it.

The following session shows Quenya reconstructing the process image into an executable:

[Quenya v0.1@ELFWorkshop] rebuild 2497 dumpme.out
[+] Beginning analysis for executable reconstruction of process image (pid: 2497)
[+] Getting Loadable segment info...
[+] Found loadable segments: text segment, data segment
Located PLT GOT Vaddr 0x804a000
Relevant GOT entries begin at 0x804a00c
[+] Resolved PLT: 0x8048336
PLT Entries: 5
Patch #1 [0xb75f7040] changed to [0x8048346]
Patch #2 [0xb75a7190] changed to [0x8048356]
Patch #3 [0x8048366] changed to [0x8048366]
Patch #4 [0xb755a990] changed to [0x8048376]
[+] Patched GOT with PLT stubs
Successfully rebuilt ELF object from memory
Output executable location: dumpme.out
[Quenya v0.1@ELFWorkshop] quit

Here, we are demonstrating that the output executable runs correctly:

hacker@ELFWorkshop:~/workshop/labs/exercise_9$ ./dumpme.out
You can Dump my segments!

Quenya has created a minimal section header table for the executable as well:

hacker@ELFWorkshop:~/workshop/labs/exercise_9$ readelf -S dumpme.out

There are seven section headers, starting at the offset 0x1118, as shown here:

[Figure: Process reconstruction with Quenya on a 32-bit test environment]
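
The section header count and table offset that readelf reports come straight from the ELF file header, so a rebuilt file can also be checked programmatically. The sketch below is one possible way to do that for a 32-bit image already loaded into memory. elf32_shdr_summary() is a hypothetical helper, and it assumes the image uses the same byte order as the host.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Read e_shnum and e_shoff from a 32-bit ELF image held in memory.
 * Returns 0 on success, or -1 if the buffer is too small, lacks the
 * ELF magic, or is not a 32-bit object.
 */
int elf32_shdr_summary(const unsigned char *image, size_t len,
                       unsigned *shnum, unsigned long *shoff)
{
    Elf32_Ehdr ehdr;

    if (len < sizeof(ehdr))
        return -1;
    memcpy(&ehdr, image, sizeof(ehdr));   /* avoids unaligned access */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;
    *shnum = ehdr.e_shnum;
    *shoff = ehdr.e_shoff;
    return 0;
}
```

For dumpme.out, the values to expect are the ones quoted above: seven section headers starting at offset 0x1118.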

The source code for process reconstruction in Quenya is located primarily in rebuild.c, and Quenya may be downloaded from my site at http://www.bitlackeys.org/.