Up until now, we've covered the fundamentals of infecting ELF binaries with parasite code, which is enough to keep you busy for at least several months of coding and experimentation. This chapter would not be complete, though, without a thorough discussion of infecting process memory. As we've learned, a program in memory is not much different from its form on disk, and we can access and manipulate a running program with the ptrace system call, as shown in Chapter 3, Linux Process Tracing. Process infections are far stealthier than binary infections, since they don't modify anything on disk. For that reason, process memory infections are usually an attempt at defeating forensic analysis. All of the ELF infection points that we just discussed are relevant to process infection, although injecting the actual parasite code is done differently than it is with an ELF binary. Since the target lives in memory, we must get the parasite code into memory, which can be done by injecting it directly with PTRACE_POKETEXT (overwriting existing code) or, preferably, by injecting shellcode that creates a new memory mapping to store the code. This is where things such as shared library injection come into play. Throughout the rest of this chapter, we will discuss some methods for remote code injection (injecting code into another process).
Shared library injection places a shared library (whether malicious or not) into an existing process's address space. Once the library is injected, you may use one of the infection points described earlier to redirect control flow to the shared library through PLT/GOT redirection, function trampolines, and so on. The challenge is getting the shared library into the process, and this can be done in a number of ways.
It is debatable whether the LD_PRELOAD method can actually be called injection, since it does not work on existing processes; rather, the shared library is loaded when the program is executed. It works by setting the LD_PRELOAD environment variable so that the desired shared library is loaded before any others. This can be a good way to quickly test subsequent techniques such as PLT/GOT redirection, but it is not stealthy and does not work on existing processes.
$ export LD_PRELOAD=/tmp/wicked.so.1
$ /usr/local/some_daemon

$ cp /lib/x86_64-linux-gnu/libm-2.19.so /tmp/wicked.so.1
$ export LD_PRELOAD=/tmp/wicked.so.1
$ /usr/local/some_daemon &
$ pmap `pidof some_daemon` | grep 'wicked'
00007ffaa731e000 1044K r-x-- wicked.so.1
00007ffaa7423000 2044K ----- wicked.so.1
00007ffaa7622000    4K r---- wicked.so.1
00007ffaa7623000    4K rw--- wicked.so.1
As you can see, our shared library, wicked.so.1, is mapped into the process address space. Amateurs tend to use this technique to create little userland rootkits that hijack glibc functions. This works because the preloaded library takes precedence over all of the other shared libraries, so if you give your functions the same names as glibc functions such as open() or write() (which are wrappers for syscalls), then your preloaded library's versions of the functions will execute instead of the real open() and write(). This is a cheap and dirty way to hijack glibc functions and should not be used if an attacker wishes to remain stealthy.
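To make the precedence rule concrete, here is a minimal sketch of a preloadable library that defines its own write(). The counter hooked_write_calls is a hypothetical name added only so the effect is observable, and forwarding through the raw syscall() wrapper is a choice made for this sketch; looking up the original with dlsym(RTLD_NEXT, "write") is the more common way to reach the real function.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter: exists only to show that the hook really ran. */
volatile unsigned long hooked_write_calls = 0;

/*
 * Same name and signature as the glibc wrapper. When this object is
 * searched before libc (for example via LD_PRELOAD), calls to write()
 * made through the PLT bind to this definition instead.
 */
ssize_t write(int fd, const void *buf, size_t count)
{
    hooked_write_calls++;                 /* a real hook would log or filter here */

    /* Forward to the kernel directly so the caller still gets its write. */
    return syscall(SYS_write, fd, buf, count);
}
```

Assuming the file is saved as wicked.c, building it with gcc -shared -fPIC wicked.c -o /tmp/wicked.so.1 and exporting LD_PRELOAD as shown above should route the target's write() calls through the hook.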
The open()/mmap() shellcode technique is a way to load any file (including shared libraries) into the process address space by injecting shellcode (using ptrace) into an existing process's text segment and then executing it to open() and mmap() the file into the process. We demonstrated this in Chapter 3, Linux Process Tracing, with our code_inject.c example, which loaded a very simple executable into the process. That same code could be used to load a shared library as well. The problem with this technique is that most shared libraries that you will want to inject require relocations. The open()/mmap() functions only load the file into memory and do not handle code relocations, so almost any shared library that you want to load won't execute properly unless it is completely position-independent code. At this point, you could choose to handle the relocations manually by parsing the shared library's relocations and applying them in memory using ptrace(). Fortunately, an easier solution exists, which we will discuss next.
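The open()/mmap() step itself is small. The following sketch performs it in the local process rather than as injected shellcode, purely to show what the shellcode has to accomplish; map_file_exec is a hypothetical helper name, and note that nothing here applies relocations, which is exactly the limitation described above.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a whole file readable and executable, as the injected shellcode
 * would do inside the target. Returns NULL on any failure.
 * The bytes are mapped exactly as they are on disk: no relocations
 * are processed and no segments are laid out by p_vaddr.
 */
void *map_file_exec(const char *path, size_t *len)
{
    struct stat st;
    void *mem;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    mem = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
               MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping survives the close */
    if (mem == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return mem;
}
```

Calling map_file_exec("/proc/self/exe", &len) in a test program, for instance, should return a mapping whose first four bytes are the ELF magic.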
The dlopen() function is used to dynamically load shared libraries that an executable wasn't linked with in the first place. Developers often use this as a way to create plugins for their applications in the form of shared libraries. A program can call dlopen() to load a shared library on the fly, and it actually invokes the dynamic linker to perform all of the relocations for you. There is a problem, though: most processes do not have dlopen() available to them, because it exists in libdl.so.2, and a program must be explicitly linked to libdl.so.2 in order to invoke dlopen(). Fortunately, there is also a solution to this: almost every single program has libc.so mapped into the process address space by default (unless it was explicitly compiled otherwise) and libc.so has an equivalent to dlopen() called __libc_dlopen_mode(). This function is used almost in the exact same way, but it requires a special flag be set:
#define DLOPEN_MODE_FLAG 0x80000000
This isn't much of a hurdle. But prior to using __libc_dlopen_mode(), you must first resolve it remotely: get the base address of libc.so in the process you want to infect, look up the symbol for __libc_dlopen_mode(), and then add the symbol's st_value (refer to Chapter 2, The ELF Binary Format) to the base address of libc to get the final address of __libc_dlopen_mode(). You can then design some shellcode in C or assembly that calls __libc_dlopen_mode() to load your shared library into the process, with full relocations and ready to execute. The __libc_dlsym() function can then be used to resolve symbols within your shared library. See the dlopen manpages for more details on using dlopen() and dlsym().
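The address arithmetic described above can be sketched as follows. This is a minimal illustration, not Saruman's code: maps_line_base, find_lib_base, and remote_sym_addr are hypothetical helper names, the base is assumed to be the start of the first /proc/<pid>/maps entry that mentions the library, and st_value is assumed to have been read from the library's dynamic symbol table already.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Inspect one line of /proc/<pid>/maps. If the line mentions `needle`
 * (for example "libc"), store the mapping's start address in *base
 * and return 1; otherwise return 0.
 */
static int maps_line_base(const char *line, const char *needle,
                          unsigned long *base)
{
    unsigned long start, end;

    if (strstr(line, needle) == NULL)
        return 0;
    if (sscanf(line, "%lx-%lx", &start, &end) != 2)
        return 0;
    *base = start;
    return 1;
}

/* Return the start of the first matching mapping in `pid`, or 0. */
unsigned long find_lib_base(pid_t pid, const char *needle)
{
    char path[64], line[512];
    unsigned long base = 0;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%d/maps", (int)pid);
    if ((fp = fopen(path, "r")) == NULL)
        return 0;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Entries are sorted by address, so the first hit is the lowest. */
        if (maps_line_base(line, needle, &base))
            break;
    }
    fclose(fp);
    return base;
}

/* Final address in the target = library base + the symbol's st_value. */
unsigned long remote_sym_addr(unsigned long lib_base, unsigned long st_value)
{
    return lib_base + st_value;
}
```

Which loader symbol libc actually exports can vary between glibc versions, so it is safer to treat the symbol name as an input to the lookup than to hardcode it.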
/* Taken from Saruman's launcher.c */
#define __RTLD_DLOPEN 0x80000000 //glibc internal dlopen flag
#define __BREAKPOINT__ __asm__ __volatile__("int3");
#define __RETURN_VALUE__(x) __asm__ __volatile__("mov %0, %%rax\n" :: "g"(x))
__PAYLOAD_KEYWORDS__ void * dlopen_load_exec(const char *path, void *dlopen_addr)
{
    void * (*libc_dlopen_mode)(const char *, int) = dlopen_addr;
    void *handle;

    handle = libc_dlopen_mode(path, __RTLD_DLOPEN|RTLD_NOW|RTLD_GLOBAL);
    __RETURN_VALUE__(handle);
    __BREAKPOINT__;
}

It is very much worth noting that dlopen() will load PIE executables too. This means that you can inject a complete program into a process and run it. In fact, you can run as many programs as you want in a single process. This is an incredible anti-forensics technique, and when using thread injection, you can run them all concurrently. Saruman is a proof-of-concept (PoC) tool that I designed to do this. It uses two possible methods of injection: the open()/mmap() method with manual relocations or the __libc_dlopen_mode() method. It is available on my site at http://www.bitlackeys.org/#saruman.
VDSO manipulation is a technique that I discussed in my paper at http://vxheaven.org/lib/vrn00.html. The idea is to manipulate the virtual dynamic shared object (VDSO), which is mapped into every process address space in Linux since kernel version 2.6.x. The VDSO contains code to speed up system calls, which can be invoked directly from it. The trick is to locate the code that invokes syscalls by using PTRACE_SYSCALL, which will break once it lands on this code. The attacker can then load %eax/%rax with the desired syscall number and store the arguments in the other registers, following the proper calling convention for Linux x86 system calls. This is surprisingly easy and can be used to carry out the open()/mmap() method without having to inject any shellcode. This can be useful for bypassing PaX, which prevents a user from injecting code into the text segment. I recommend reading my paper for a complete treatment of the technique.
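The register setup can be sketched for x86_64 as follows. It assumes the address of a syscall instruction has already been located (for instance with PTRACE_SYSCALL, as described above) and that the register set was fetched with PTRACE_GETREGS; prepare_syscall_regs is a hypothetical helper name, and the 32-bit case with %eax and int 0x80/sysenter would use different registers.

```c
#include <sys/user.h>

/*
 * Fill in a tracee register set so that resuming it executes one system
 * call of our choosing. x86_64 Linux convention: the number goes in rax
 * and the arguments in rdi, rsi, rdx, r10, r8, r9.
 *
 * syscall_insn: address of a syscall instruction inside the tracee
 *               (assumed to have been located beforehand).
 */
void prepare_syscall_regs(struct user_regs_struct *regs,
                          unsigned long long syscall_insn,
                          unsigned long long nr,
                          const unsigned long long args[6])
{
    regs->rip = syscall_insn;   /* resume right at the syscall instruction */
    regs->rax = nr;             /* syscall number, e.g. SYS_open or SYS_mmap */
    regs->rdi = args[0];
    regs->rsi = args[1];
    regs->rdx = args[2];
    regs->r10 = args[3];        /* r10 here, not rcx as in the function-call ABI */
    regs->r8  = args[4];
    regs->r9  = args[5];
}
```

A caller would typically save the original registers first, write the prepared set back with PTRACE_SETREGS, single-step over the instruction, read the result from rax, and then restore the saved registers.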
Injecting code directly into the text segment is a simple technique and is not very useful for anything other than injecting shellcode, which should be replaced with the original code as soon as the shellcode has finished executing. Another reason you would want to directly modify the text segment is to create function trampolines, which we discussed earlier in this chapter, or to directly modify the .plt code. As far as code injection goes, though, it is preferable to load code into the process or create a new memory mapping where the code can be stored; otherwise, the modification to the text segment could easily be detected.
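The save, overwrite, and restore cycle can be sketched with two small ptrace helpers. All of the names here (peek_bytes, poke_bytes, poke_roundtrip_demo, demo_buf) are hypothetical. For safety, the demo pokes a data buffer in a forked child instead of real code; on Linux, PTRACE_POKETEXT and PTRACE_POKEDATA behave the same way, so the mechanics carry over to a text address.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy len bytes out of a stopped tracee, one machine word at a time. */
int peek_bytes(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i += sizeof(long)) {
        size_t n = (len - i < sizeof(long)) ? len - i : sizeof(long);
        long word;

        errno = 0;   /* -1 is a valid word, so errno tells errors apart */
        word = ptrace(PTRACE_PEEKTEXT, pid, (void *)(addr + i), NULL);
        if (word == -1 && errno != 0)
            return -1;
        memcpy((unsigned char *)buf + i, &word, n);
    }
    return 0;
}

/* Copy len bytes into a stopped tracee. A partial final word is merged
 * with the tracee's existing bytes so nothing past the end is clobbered. */
int poke_bytes(pid_t pid, unsigned long addr, const void *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i += sizeof(long)) {
        size_t n = (len - i < sizeof(long)) ? len - i : sizeof(long);
        long word = 0;

        if (n < sizeof(long) &&
            peek_bytes(pid, addr + i, &word, sizeof(word)) < 0)
            return -1;
        memcpy(&word, (const unsigned char *)buf + i, n);
        if (ptrace(PTRACE_POKETEXT, pid, (void *)(addr + i), (void *)word) == -1)
            return -1;
    }
    return 0;
}

/* Stand-in for the bytes that would be overwritten in a real target. */
static unsigned char demo_buf[12] = "original!!!";

/* Save, overwrite, verify, restore inside a traced child. 0 on success. */
int poke_roundtrip_demo(void)
{
    unsigned char saved[sizeof(demo_buf)], seen[sizeof(demo_buf)];
    unsigned long addr = (unsigned long)demo_buf; /* same address after fork */
    int status, ok = -1;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);          /* stop so the parent can inspect us */
        _exit(0);
    }
    if (waitpid(child, &status, 0) == child && WIFSTOPPED(status) &&
        peek_bytes(child, addr, saved, sizeof(saved)) == 0 &&
        poke_bytes(child, addr, "PARASITE!!!", sizeof(demo_buf)) == 0 &&
        peek_bytes(child, addr, seen, sizeof(seen)) == 0 &&
        memcmp(seen, "PARASITE!!!", sizeof(seen)) == 0 &&
        poke_bytes(child, addr, saved, sizeof(saved)) == 0 &&
        peek_bytes(child, addr, seen, sizeof(seen)) == 0 &&
        memcmp(seen, demo_buf, sizeof(seen)) == 0)
        ok = 0;
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return ok;
}
```

The helpers read a full word even for a short tail, so a buffer that ends on the last bytes of a mapping would need extra care that this sketch does not include.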
As mentioned previously, dlopen() is capable of loading PIE executables into a process, and I even included a link to Saruman, which is the crafty software that allows you to run programs within existing processes as an anti-forensics measure. But what about injecting ET_EXEC type executables? This type of executable does not provide any relocation information except for the dynamic-linking R_X86_64_JUMP_SLOT/R_386_JUMP_SLOT relocation types. This means that injecting a regular executable into an existing process is ultimately going to be unreliable, especially when injecting more complex programs. Nevertheless, I created a PoC of this technique called elfdemon, which maps the executable to some new mappings that don't conflict with the host process's executable mappings. It then hijacks control (unlike Saruman, which allows concurrent execution) and passes control back to the host process once it is done running. An example of this can be found at http://www.bitlackeys.org/projects/elfdemon.tgz.
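Because the viable injection method depends on the ELF file type, an injector will usually inspect e_type before doing anything else. The sketch below shows one way to do that for 64-bit files; the inject_kind enum and classify_elf are hypothetical names used only for illustration, and the comments restate the trade-offs described in this chapter.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
typedef enum {
    INJECT_UNSUPPORTED,
    INJECT_ET_EXEC,   /* fixed addresses, no general relocation info */
    INJECT_ET_DYN,    /* shared library or PIE: dlopen()-style loading */
    INJECT_ET_REL     /* object file: relocations must be applied manually */
} inject_kind;

/* Classify an in-memory image of a 64-bit ELF file by its e_type. */
inject_kind classify_elf(const void *image, size_t len)
{
    Elf64_Ehdr ehdr;

    if (len < sizeof(ehdr))
        return INJECT_UNSUPPORTED;
    memcpy(&ehdr, image, sizeof(ehdr));   /* avoid unaligned access */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64)
        return INJECT_UNSUPPORTED;

    switch (ehdr.e_type) {
    case ET_EXEC: return INJECT_ET_EXEC;
    case ET_DYN:  return INJECT_ET_DYN;
    case ET_REL:  return INJECT_ET_REL;
    default:      return INJECT_UNSUPPORTED;
    }
}
```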
Relocatable code (ET_REL) injection is very similar to shared library injection but is not compatible with dlopen(). ET_REL files (.o files) are relocatable code, much like ET_DYN files (.so files), but they are not meant to be executed as single files; they are meant to be linked into either an executable or a shared library, as discussed in Chapter 2, The ELF Binary Format. This, however, doesn't mean that we can't inject them, relocate them, and execute their code. This can be done by using any of the techniques described earlier except dlopen(). So, open/mmap is sufficient but requires that you manually handle the relocations, which can be done using ptrace. In Chapter 2, The ELF Binary Format, we gave an example of the relocation code in the software that I designed, called Quenya. This demonstrates how to handle relocations in an object file when injecting it into an executable. The same principles can be used when injecting one into a process.
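As a reminder of what handling a relocation manually involves, here is a minimal sketch for a few common x86_64 relocation types. It is not Quenya's code: apply_rela_x86_64 is a hypothetical helper, it patches a local copy of the section (which would then be written into the target, for example with ptrace), and it assumes the symbol's final address has already been resolved.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/*
 * Apply one Elf64_Rela entry to a local copy of the section it targets.
 *   section      : local buffer holding the section's bytes
 *   section_addr : address the section will have inside the target
 *   sym_addr     : resolved address of the referenced symbol (S)
 * Returns 0 on success, -1 for a relocation type this sketch ignores.
 */
int apply_rela_x86_64(uint8_t *section, const Elf64_Rela *rel,
                      uint64_t section_addr, uint64_t sym_addr)
{
    uint8_t *where = section + rel->r_offset;
    uint64_t P = section_addr + rel->r_offset;         /* place being patched */
    uint64_t S_plus_A = sym_addr + (uint64_t)rel->r_addend;
    uint64_t v64;
    uint32_t v32;

    switch (ELF64_R_TYPE(rel->r_info)) {
    case R_X86_64_64:                   /* S + A, full 64 bits */
        v64 = S_plus_A;
        memcpy(where, &v64, sizeof(v64));
        return 0;
    case R_X86_64_PC32:                 /* S + A - P, 32 bits */
    case R_X86_64_PLT32:                /* treated like PC32 here, which assumes
                                           the call goes straight to S, no PLT */
        v32 = (uint32_t)(S_plus_A - P);
        memcpy(where, &v32, sizeof(v32));
        return 0;
    case R_X86_64_32:                   /* S + A, truncated to 32 bits */
        v32 = (uint32_t)S_plus_A;
        memcpy(where, &v32, sizeof(v32));
        return 0;
    default:
        return -1;
    }
}
```

A complete loader would also check that the 32-bit results fit without overflow and would walk every relocation section in the object file.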