Table of Contents for
Learning Linux Binary Analysis

Learning Linux Binary Analysis by Ryan "elfmaster" O'Neill. Published by Packt Publishing, 2016.
  1. Cover
  2. Table of Contents
  3. Learning Linux Binary Analysis
  4. Learning Linux Binary Analysis
  5. Credits
  6. About the Author
  7. Acknowledgments
  8. About the Reviewers
  9. www.PacktPub.com
  10. Preface
  11. What you need for this book
  12. Who this book is for
  13. Conventions
  14. Reader feedback
  15. Customer support
  16. 1. The Linux Environment and Its Tools
  17. Useful devices and files
  18. Linker-related environment points
  19. Summary
  20. 2. The ELF Binary Format
  21. ELF program headers
  22. ELF section headers
  23. ELF symbols
  24. ELF relocations
  25. ELF dynamic linking
  26. Coding an ELF Parser
  27. Summary
  28. 3. Linux Process Tracing
  29. ptrace requests
  30. The process register state and flags
  31. A simple ptrace-based debugger
  32. A simple ptrace debugger with process attach capabilities
  33. Advanced function-tracing software
  34. ptrace and forensic analysis
  35. Process image reconstruction – from the memory to the executable
  36. Code injection with ptrace
  37. Simple examples aren't always so trivial
  38. Demonstrating the code_inject tool
  39. A ptrace anti-debugging trick
  40. Summary
  41. 4. ELF Virus Technology – Linux/Unix Viruses
  42. ELF virus engineering challenges
  43. ELF virus parasite infection methods
  44. The PT_NOTE to PT_LOAD conversion infection method
  45. Infecting control flow
  46. Process memory viruses and rootkits – remote code injection techniques
  47. ELF anti-debugging and packing techniques
  48. ELF virus detection and disinfection
  49. Summary
  50. 5. Linux Binary Protection
  51. Stub mechanics and the userland exec
  52. Other jobs performed by protector stubs
  53. Existing ELF binary protectors
  54. Downloading Maya-protected binaries
  55. Anti-debugging for binary protection
  56. Resistance to emulation
  57. Obfuscation methods
  58. Protecting control flow integrity
  59. Other resources
  60. Summary
  61. 6. ELF Binary Forensics in Linux
  62. Detecting other forms of control flow hijacking
  63. Identifying parasite code characteristics
  64. Checking the dynamic segment for DLL injection traces
  65. Identifying reverse text padding infections
  66. Identifying text segment padding infections
  67. Identifying protected binaries
  68. IDA Pro
  69. Summary
  70. 7. Process Memory Forensics
  71. Process memory infection
  72. Detecting the ET_DYN injection
  73. Linux ELF core files
  74. Summary
  75. 8. ECFS – Extended Core File Snapshot Technology
  76. The ECFS philosophy
  77. Getting started with ECFS
  78. libecfs – a library for parsing ECFS files
  79. readecfs
  80. Examining an infected process using ECFS
  81. The ECFS reference guide
  82. Process necromancy with ECFS
  83. Learning more about ECFS
  84. Summary
  85. 9. Linux /proc/kcore Analysis
  86. stock vmlinux has no symbols
  87. /proc/kcore and GDB exploration
  88. Direct sys_call_table modifications
  89. Kprobe rootkits
  90. Debug register rootkits – DRR
  91. VFS layer rootkits
  92. Other kernel infection techniques
  93. vmlinux and .altinstructions patching
  94. Using taskverse to see hidden processes
  95. Infected LKMs – kernel drivers
  96. Notes on /dev/kmem and /dev/mem
  97. /dev/mem
  98. K-ecfs – kernel ECFS
  99. Kernel hacking goodies
  100. Summary
  101. Index

Process memory viruses and rootkits – remote code injection techniques

Up until now, we've covered the fundamentals of infecting ELF binaries with parasite code, which is enough to keep you busy for at least several months of coding and experimentation. This chapter would not be complete, though, without a thorough discussion of infecting process memory. As we've learned, a program in memory is not much different from the same program on disk, and we can access and manipulate a running program with the ptrace system call, as shown in Chapter 3, Linux Process Tracing. Process infections are far stealthier than binary infections because they don't modify anything on disk, so they are usually an attempt at defeating forensic analysis. All of the ELF infection points that we just discussed are relevant to process infection, although the parasite code itself is injected differently than it is with an ELF binary. Because the target lives in memory, we must get the parasite code into memory, either by injecting it directly with PTRACE_POKETEXT (overwriting existing code) or, preferably, by injecting shellcode that creates a new memory mapping to store the code. This is where things such as shared library injection come into play. Throughout the rest of this chapter, we will discuss some methods for remote code injection (injecting code into another process).

Shared library injection – .so injection/ET_DYN injection

This technique can be used to inject a shared library (whether malicious or not) into an existing process' address space. Once the library is injected, you may use one of the infection points described earlier to redirect control flow to the shared library through PLT/GOT redirection, function trampolines, and so on. The challenge is getting the shared library into the process, and this can be done in a number of ways.

.so injection with LD_PRELOAD

It is debatable whether we can actually call this method injection, since it does not work on existing processes; instead, the shared library is loaded when the program is executed. It works by setting the LD_PRELOAD environment variable so that the desired shared library is loaded before any others. This can be a good way to quickly test subsequent techniques such as PLT/GOT redirection, but it is not stealthy and does not work on existing processes.

Illustration 4.7 – using LD_PRELOAD to inject wicked.so.1

$ cp /lib/x86_64-linux-gnu/libm-2.19.so /tmp/wicked.so.1
$ export LD_PRELOAD=/tmp/wicked.so.1
$ /usr/local/some_daemon &
$ pmap `pidof some_daemon` | grep 'wicked'
00007ffaa731e000   1044K r-x-- wicked.so.1
00007ffaa7423000   2044K ----- wicked.so.1
00007ffaa7622000      4K r---- wicked.so.1
00007ffaa7623000      4K rw--- wicked.so.1

As you can see, our shared library, wicked.so.1, is mapped into the process address space. Amateurs tend to use this technique to create little userland rootkits that hijack glibc functions. This works because the preloaded library takes precedence over all other shared libraries: if you give your functions the same names as glibc functions such as open() or write() (which are wrappers for syscalls), then your preloaded library's versions will execute instead of the real open() and write(). This is a cheap and dirty way to hijack glibc functions and should not be used if an attacker wishes to remain stealthy.
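To make the hijacking concrete, here is a minimal sketch of what a preloaded library such as wicked.so.1 might contain. The file name wicked.c, the counter wicked_write_calls, the build command, and the choice to forward through syscall() rather than look up the real wrapper are all assumptions made for this illustration; they are not taken from any real rootkit.

```c
/* wicked.c: a hypothetical preload library that interposes write().
 * Assumed build: gcc -shared -fPIC -o /tmp/wicked.so.1 wicked.c
 */
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative counter; a real hook might log or filter the buffer instead. */
unsigned long wicked_write_calls;

/* Same name and prototype as the glibc wrapper, so the preloaded copy
 * is the one that callers bind to. */
ssize_t write(int fd, const void *buf, size_t count)
{
        wicked_write_calls++;
        /* Forward straight to the kernel so the caller still gets its output. */
        return syscall(SYS_write, fd, buf, count);
}
```

With LD_PRELOAD pointing at a library built from this file, calls to write() made by the target through its PLT should land in the hook before reaching the kernel.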

.so injection with open()/mmap() shellcode

This is a way to load any file (including shared libraries) into the process address space: inject shellcode (using ptrace) into an existing process's text segment and then execute it to open() and mmap() the shared library into the process. We demonstrated this in Chapter 3, Linux Process Tracing, with our code_inject.c example, which loaded a very simple executable into the process. That same code could be used to load a shared library as well. The problem with this technique is that most shared libraries that you will want to inject require relocations. The open()/mmap() functions only load the file into memory and won't handle code relocations, so almost any shared library that you want to load won't execute properly unless it's completely position-independent code. At this point, you could choose to handle the relocations manually by parsing the shared library's relocations and applying them in memory using ptrace(). Fortunately, an easier solution exists, which we will discuss next.
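As a rough idea of what handling relocations manually involves, the sketch below applies only the simplest x86-64 type, R_X86_64_RELATIVE (value = load address + addend), to a local copy of the library image. The function name, the local-buffer approach, and the assumption that r_offset can be used directly as an offset into the image (that is, the library's first loadable segment starts at virtual address 0) are mine; against a live process each store would instead be written with ptrace().

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply R_X86_64_RELATIVE entries to a local copy of a library image.
 * load_addr is the address at which the image is mapped in the target.
 * Returns the number of relocations applied; other types are skipped
 * because they need a symbol lookup. */
size_t apply_relative_relocs(unsigned char *image, size_t image_len,
                             uint64_t load_addr,
                             const Elf64_Rela *rela, size_t count)
{
        size_t applied = 0;

        for (size_t i = 0; i < count; i++) {
                uint64_t value;

                if (ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE)
                        continue;
                /* Skip entries that would write outside the image. */
                if (image_len < sizeof(value) ||
                    rela[i].r_offset > image_len - sizeof(value))
                        continue;
                value = load_addr + (uint64_t)rela[i].r_addend;
                memcpy(image + rela[i].r_offset, &value, sizeof(value));
                applied++;
        }
        return applied;
}
```

Symbol-based types such as R_X86_64_JUMP_SLOT are deliberately left out here, which is exactly the work that the dynamic linker would otherwise do for you.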

.so injection with dlopen() shellcode

The dlopen() function is used to dynamically load shared libraries that an executable wasn't linked with in the first place. Developers often use this as a way to create plugins for their applications in the form of shared libraries. A program can call dlopen() to load a shared library on the fly, and it actually invokes the dynamic linker to perform all of the relocations for you. There is a problem, though: most processes do not have dlopen() available to them, because it exists in libdl.so.2, and a program must be explicitly linked to libdl.so.2 in order to invoke dlopen(). Fortunately, there is also a solution to this: almost every single program has libc.so mapped into the process address space by default (unless it was explicitly compiled otherwise), and libc.so has an equivalent to dlopen() called __libc_dlopen_mode(). This function is used in almost exactly the same way, but it requires that a special flag be set:

#define DLOPEN_MODE_FLAG 0x80000000

This isn't much of a hurdle. But prior to using __libc_dlopen_mode(), you must first resolve it remotely: get the base address of libc.so in the process you want to infect, resolve the symbol for __libc_dlopen_mode(), and then add the symbol value st_value (refer to Chapter 2, The ELF Binary Format) to the base address of libc to get the final address of __libc_dlopen_mode(). You can then design some shellcode in C or assembly that calls __libc_dlopen_mode() to load your shared library into the process, fully relocated and ready to execute. The __libc_dlsym() function can then be used to resolve symbols within your shared library. See the dlopen manpages for more details on using dlopen() and dlsym().
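The address arithmetic described above can be sketched as follows. Reading /proc/<pid>/maps to find the libc base is one common approach and is an assumption here, as are the function names and the file-name heuristic (a name beginning with "libc.so" or "libc-"); the st_value itself would come from parsing libc's dynamic symbol table, which is not shown.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* If a /proc/<pid>/maps line describes a libc mapping, store its start
 * address in *base and return 1; otherwise return 0. */
int libc_base_from_line(const char *line, unsigned long *base)
{
        unsigned long start, end;
        char path[512];
        const char *name;

        /* Fields: start-end perms offset dev inode path */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %511s", &start, &end, path) != 3)
                return 0;
        name = strrchr(path, '/');
        name = name ? name + 1 : path;
        if (strncmp(name, "libc.so", 7) != 0 && strncmp(name, "libc-", 5) != 0)
                return 0;
        *base = start;
        return 1;
}

/* Scan the maps file of pid; the first libc mapping is the lowest one,
 * so its start address is the base. Returns 0 if nothing was found. */
unsigned long remote_libc_base(pid_t pid)
{
        char maps[64], line[1024];
        unsigned long base = 0;
        FILE *fp;

        snprintf(maps, sizeof maps, "/proc/%d/maps", (int)pid);
        fp = fopen(maps, "r");
        if (fp == NULL)
                return 0;
        while (fgets(line, sizeof line, fp) != NULL)
                if (libc_base_from_line(line, &base))
                        break;
        fclose(fp);
        return base;
}

/* Final address = libc base in the target + st_value of the symbol. */
unsigned long remote_symbol_addr(unsigned long libc_base, unsigned long st_value)
{
        return libc_base + st_value;
}
```

The result of remote_symbol_addr() is the value that the injected code would receive as the address of __libc_dlopen_mode().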

Illustration 4.8 – C code invoking __libc_dlopen_mode()

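/* Taken from Saruman's launcher.c */
#include <dlfcn.h> /* RTLD_NOW, RTLD_GLOBAL */

#define __RTLD_DLOPEN 0x80000000 /* glibc internal dlopen flag */
#define __BREAKPOINT__ __asm__ __volatile__("int3");
#define __RETURN_VALUE__(x) __asm__ __volatile__("mov %0, %%rax\n" :: "g"(x))

/* __PAYLOAD_KEYWORDS__ is defined elsewhere in launcher.c (not shown here). */
__PAYLOAD_KEYWORDS__ void * dlopen_load_exec(const char *path, void *dlopen_addr)
{
        void * (*libc_dlopen_mode)(const char *, int) = dlopen_addr;
        void *handle;

        /* The dynamic linker maps the library and applies its relocations. */
        handle = libc_dlopen_mode(path, __RTLD_DLOPEN|RTLD_NOW|RTLD_GLOBAL);
        __RETURN_VALUE__(handle); /* leave the handle in %rax */
        __BREAKPOINT__;           /* int3 stops the process for the tracer */
}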

It is very much worth noting that dlopen() will load PIE executables too. This means that you can inject a complete program into a process and run it. In fact, you can run as many programs as you want in a single process. This is an incredible anti-forensics technique, and when using thread injection, you can run them all concurrently. Saruman is a PoC that I designed to do this. It supports two methods of injection: the open()/mmap() method with manual relocations or the __libc_dlopen_mode() method. It is available on my site at http://www.bitlackeys.org/#saruman.

.so injection with VDSO manipulation

This is a technique that I discussed in my paper at http://vxheaven.org/lib/vrn00.html. The idea is to manipulate the virtual dynamic shared object (VDSO), which has been mapped into every process address space in Linux since kernel version 2.6.x. The VDSO contains code to speed up system calls, and system calls can be invoked directly from it. The trick is to locate the code that invokes syscalls by using PTRACE_SYSCALL, which will break once execution lands on this code. The attacker can then load %eax/%rax with the desired syscall number and store the arguments in the other registers, following the proper calling convention for Linux x86 system calls. This is surprisingly easy and can be used to carry out the open()/mmap() method without having to inject any shellcode. This can be useful for bypassing PaX, which prevents a user from injecting code into the text segment. I recommend reading my paper for a complete treatment of the technique.
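The register setup can be sketched for x86-64 as below. This is not code from the paper: the function name is hypothetical, syscall_insn stands for the address of a syscall instruction that the tracer has already located, and the sketch assumes the x86-64 Linux convention (number in %rax, arguments in %rdi, %rsi, %rdx, %r10, %r8, %r9).

```c
#include <sys/user.h>

/* Stage a register set so that resuming the stopped tracee at
 * syscall_insn performs system call nr with the given six arguments. */
void stage_syscall(struct user_regs_struct *regs,
                   unsigned long long syscall_insn, long nr,
                   const unsigned long long args[6])
{
        regs->rip = syscall_insn;               /* resume at the syscall code */
        regs->rax = (unsigned long long)nr;     /* syscall number */
        regs->rdi = args[0];
        regs->rsi = args[1];
        regs->rdx = args[2];
        regs->r10 = args[3];                    /* %r10, not %rcx, for syscalls */
        regs->r8  = args[4];
        regs->r9  = args[5];
}
```

A tracer would typically fetch the registers with PTRACE_GETREGS, keep a copy, call a helper like this, write the result back with PTRACE_SETREGS, let the syscall run, read the return value from %rax, and finally restore the saved copy.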

Text segment code injections

This is a simple technique and is not very useful for anything other than injecting shellcode, which should be replaced with the original code as soon as it has finished executing. Another reason you would want to directly modify the text segment is to create function trampolines, which we discussed earlier in this chapter, or to directly modify the .plt code. As far as code injection goes, though, it is preferable to load code into the process or create a new memory mapping where code can be stored; otherwise, the modification of the text segment could easily be detected.
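A minimal sketch of the overwrite-and-restore pattern follows. The helper names are hypothetical, and the code assumes that the caller has already attached to the target with ptrace and that the target is stopped. Because ptrace transfers one word at a time, the final partial word is read first and merged so that bytes beyond the shellcode are preserved.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Overlay the first n bytes of src onto a word read from the target. */
long merge_word(long orig, const unsigned char *src, size_t n)
{
        memcpy(&orig, src, n);
        return orig;
}

/* Write len bytes of code at addr in an attached, stopped process.
 * If saved is non-NULL, it receives the bytes that were overwritten. */
int poke_text(pid_t pid, unsigned long addr, const unsigned char *code,
              unsigned char *saved, size_t len)
{
        for (size_t off = 0; off < len; off += sizeof(long)) {
                size_t n = len - off < sizeof(long) ? len - off : sizeof(long);
                long word;

                errno = 0;      /* -1 is also a valid word, so check errno */
                word = ptrace(PTRACE_PEEKTEXT, pid, (void *)(addr + off), NULL);
                if (word == -1 && errno != 0)
                        return -1;
                if (saved != NULL)
                        memcpy(saved + off, &word, n);
                word = merge_word(word, code + off, n);
                if (ptrace(PTRACE_POKETEXT, pid, (void *)(addr + off),
                           (void *)word) == -1)
                        return -1;
        }
        return 0;
}
```

Restoring the original code is then a second call with the roles reversed, for example poke_text(pid, addr, saved, NULL, len), made once the shellcode has hit its breakpoint.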

Executable injections

As mentioned previously, dlopen() is capable of loading PIE executables into a process, and I even included a link to Saruman, which is the crafty software that allows you to run programs within existing processes for anti-forensics measures. But what about injecting ET_EXEC type executables? This type of executable does not provide any relocation information except for dynamic-linking R_X86_64_JUMP_SLOT/R_386_JUMP_SLOT relocation types. This means that injecting a regular executable into an existing process is ultimately going to be unreliable, especially when injecting more complex programs. Nevertheless, I created a PoC of this technique called elfdemon, which maps the executable to some new mappings that don't conflict with the host process executable mappings. It then hijacks control (unlike Saruman, which allows concurrent execution) and passes control back to the host process once it is done running. An example of this can be found at http://www.bitlackeys.org/projects/elfdemon.tgz.
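Since the viable injection method depends on the ELF file type, an injector could start by inspecting e_type. The sketch below is only an illustration of that decision: the enum and function names are hypothetical, and it handles 64-bit ELF files only.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

enum inject_method {
        INJECT_UNSUPPORTED,     /* not a 64-bit ELF file, or an unknown type */
        INJECT_DLOPEN,          /* ET_DYN: shared library or PIE executable */
        INJECT_FIXED_MAP,       /* ET_EXEC: must live at its linked addresses */
        INJECT_MANUAL_RELOC     /* ET_REL: map it and relocate it by hand */
};

/* Choose an injection method from the first bytes of a file. */
enum inject_method pick_method(const unsigned char *file, size_t len)
{
        Elf64_Ehdr ehdr;

        if (len < sizeof ehdr || memcmp(file, ELFMAG, SELFMAG) != 0)
                return INJECT_UNSUPPORTED;
        memcpy(&ehdr, file, sizeof ehdr);
        if (ehdr.e_ident[EI_CLASS] != ELFCLASS64)
                return INJECT_UNSUPPORTED;

        switch (ehdr.e_type) {
        case ET_DYN:  return INJECT_DLOPEN;
        case ET_EXEC: return INJECT_FIXED_MAP;
        case ET_REL:  return INJECT_MANUAL_RELOC;
        default:      return INJECT_UNSUPPORTED;
        }
}
```

For ET_EXEC the injector would still have to confirm that the file's linked address ranges do not collide with the mappings of the host process.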

Relocatable code injection – the ET_REL injection

This method is very similar to shared library injection but is not compatible with dlopen(). ET_REL files (.o files) contain relocatable code, much like ET_DYN files (.so files), but they are not meant to be executed on their own; they are meant to be linked into either an executable or a shared library, as discussed in Chapter 2, The ELF Binary Format. This, however, doesn't mean that we can't inject them, relocate them, and execute their code. This can be done by using any of the techniques described earlier except dlopen(). So, open/mmap is sufficient but requires that you handle the relocations manually, which can be done using ptrace. In Chapter 2, The ELF Binary Format, we gave an example of the relocation code in Quenya, a tool that I designed. It demonstrates how to handle relocations in an object file when injecting it into an executable. The same principles can be used when injecting one into a process.
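The following sketch is not Quenya's code. It illustrates the arithmetic for two common x86-64 object-file relocation types on a local copy of a section: R_X86_64_64 stores S + A, and R_X86_64_PC32 stores the low 32 bits of S + A - P, where S is the resolved symbol address, A the addend, and P the address being patched. The function name and parameters are hypothetical; other relocation types and overflow checks are left out.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Patch one relocation in a local copy of a section.
 * sec_addr is where the section will live in the target process and
 * sym_addr is the resolved address of the symbol the entry refers to.
 * Returns 0 on success, -1 for an unsupported type or a bad offset. */
int apply_rel_entry(unsigned char *sec, size_t sec_len, uint64_t sec_addr,
                    const Elf64_Rela *rel, uint64_t sym_addr)
{
        uint64_t place = sec_addr + rel->r_offset;              /* P */
        uint64_t value = sym_addr + (uint64_t)rel->r_addend;    /* S + A */

        if (rel->r_offset > sec_len)
                return -1;

        switch (ELF64_R_TYPE(rel->r_info)) {
        case R_X86_64_64:
                if (sec_len - rel->r_offset < sizeof(uint64_t))
                        return -1;
                memcpy(sec + rel->r_offset, &value, sizeof(uint64_t));
                return 0;
        case R_X86_64_PC32: {
                uint32_t disp = (uint32_t)(value - place);      /* S + A - P */

                if (sec_len - rel->r_offset < sizeof(uint32_t))
                        return -1;
                memcpy(sec + rel->r_offset, &disp, sizeof(uint32_t));
                return 0;
        }
        default:
                return -1;
        }
}
```

After every entry has been applied to the local copy, the relocated bytes can be written into the target's new mapping with ptrace.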