Basic attributes of network activity—such as IP addresses, TCP and UDP ports, domain names, and traffic content—are used by networking and security devices to provide defenses. Firewalls and routers can be used to restrict access to a network based on IP addresses and ports. DNS servers can be configured to reroute known malicious domains to an internal host, known as a sinkhole. Proxy servers can be configured to detect or prevent access to specific domains.
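The sinkhole idea can be sketched in a few lines of Python. This is only an illustration of the decision a sinkholing DNS server makes, not a real DNS server: the sinkhole address, the blocklist contents, and the `upstream_lookup` callable are all assumptions made up for the example.

```python
# Hypothetical values for illustration only.
SINKHOLE_IP = "10.0.0.53"          # internal sinkhole host (assumed address)
BLOCKED_DOMAINS = {"badsite.com"}  # known malicious domains (assumed list)


def is_blocked(domain: str) -> bool:
    """Return True if the domain or any parent domain is on the blocklist."""
    labels = domain.lower().rstrip(".").split(".")
    # For "www.badsite.com" this checks "www.badsite.com", "badsite.com", "com".
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS for i in range(len(labels)))


def resolve(domain: str, upstream_lookup) -> str:
    """Answer with the sinkhole address for blocked domains, else defer upstream."""
    if is_blocked(domain):
        return SINKHOLE_IP
    return upstream_lookup(domain)
```

Any host that asks for a blocked name is steered to the internal sinkhole, where its connection attempts can be logged instead of reaching the attacker.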
Intrusion detection systems (IDSs), intrusion prevention systems (IPSs), and other security appliances, such as email and web proxies, make it possible to employ content-based countermeasures. Content-based defense systems allow for deeper inspection of traffic, and include the network signatures used by an IDS and the algorithms used by a mail proxy to detect spam. Because basic network indicators such as IP addresses and domain names are supported by most defensive systems, they are often the first items that a malware analyst will investigate.
The commonly used term intrusion detection system is outdated. Signatures are used to detect more than just intrusions; they also detect scanning, service enumeration and profiling, nonstandard use of protocols, and beaconing from installed malware. An IPS is closely related to an IDS. The difference is that an IDS is designed merely to detect malicious traffic, while an IPS is designed to detect malicious traffic and prevent it from traveling over the network.
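The detect-versus-prevent distinction can be shown with a minimal content-matching sketch. The signature below is a made-up pattern, not a rule from any real IDS, and the `inspect` function and its `prevent` flag are hypothetical names chosen for this example.

```python
import re
from typing import List, Tuple

# Hypothetical signature: an invented pattern, not a real IDS rule.
SIGNATURES = {
    "example-beacon": re.compile(rb"GET /beacon\?id=[0-9a-f]{8}"),
}


def inspect(payload: bytes, prevent: bool) -> Tuple[bool, List[str]]:
    """Return (forward, alerts) for one payload.

    With prevent=False (IDS-like), matching traffic is reported but still forwarded.
    With prevent=True (IPS-like), matching traffic is reported and dropped.
    """
    alerts = [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]
    forward = not (prevent and alerts)
    return forward, alerts
```

The same signature set drives both modes; only the action taken on a match differs.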
The first step in malware analysis should not be to run the malware in your lab environment, or to break open the malware and start analyzing the disassembled code. Rather, you should first review any data you already have about the malware. Occasionally, an analyst is handed a malware sample (or suspicious executable) without any context, but in most situations, you can acquire additional data. The best way to start network-focused malware analysis is to mine the logs, alerts, and packet captures that the malware has already generated.
There are distinct advantages to information that comes from real networks, rather than from a lab environment:
Live-captured information will provide the most transparent view of a malicious application’s true behavior. Malware can be programmed to detect lab environments.
Existing information from active malware can provide unique insights that accelerate analysis. Real traffic provides information about the malware at both end points (client and server), whereas in a lab environment, the analyst typically has access only to information about one of the end points. Analyzing the content received by malware (the parsing routines) is typically more challenging than analyzing the content malware produces. Therefore, bidirectional sample traffic can help seed the analysis of the parsing routines for the malware the analyst has in hand.
Additionally, when passively reviewing information, there is no risk that your analysis activities will be leaked to the attacker. This issue is explained in detail in "OPSEC = Operations Security."
Suppose we’ve received a malware executable to analyze, and we run it in our lab environment, keeping an eye on networking events. We find that the malware does a DNS request for www.badsite.com, and then does an HTTP GET request on port 80 to the IP address returned in the DNS record. Thirty seconds later, it tries to beacon out to a specific IP address without doing a DNS query. At this point, we have three potential indicators of malicious activity: a domain name with its associated IP address, a stand-alone IP address, and an HTTP GET request with URI and contents, as shown in Table 14-1.
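One way to put such indicators to work is to record them in a small structure and search existing logs for them, which is a passive step that sends nothing to the attacker. The sketch below assumes plain-text log lines. Only www.badsite.com comes from the example; the IP addresses are placeholders from the reserved documentation ranges, the URI is a placeholder, and the `Indicator` and `find_hits` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    kind: str   # "domain", "ip", or "uri"
    value: str


# Only the domain is taken from the example; every other value is a placeholder.
INDICATORS = [
    Indicator("domain", "www.badsite.com"),
    Indicator("ip", "192.0.2.10"),          # placeholder for the resolved address
    Indicator("ip", "198.51.100.20"),       # placeholder for the stand-alone beacon address
    Indicator("uri", "/placeholder/path"),  # placeholder for the GET request URI
]


def find_hits(log_lines, indicators=INDICATORS):
    """Yield (line_number, indicator) for each log line containing an indicator value.

    Matching is a case-insensitive substring test, so "192.0.2.10" would also
    match "192.0.2.100"; a real search would need stricter boundaries.
    """
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        for indicator in indicators:
            if indicator.value.lower() in lowered:
                yield number, indicator
```

For example, `list(find_hits(open("dns.log")))` would list every line of a hypothetical `dns.log` file that mentions one of the indicators.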
We would probably want to further research these indicators. Internet searches might reveal how long ago the malware was created, when it was first detected, how prevalent it is, who might have written it, and what the attackers’ objectives might be. A lack of information is instructive as well, since it can imply the existence of a targeted attack or a new campaign.
Before rushing to your favorite search engine, however, it is important to understand the potential risks associated with your online research activities.
When using the Internet for research, it is important to understand the concept of operations security (OPSEC). OPSEC is a term used by the government and military to describe a process of preventing adversaries from obtaining sensitive information.
Certain actions you take while investigating malware can inform the malware author that you’ve identified the malware, or may even reveal personal details about you to the attacker. For example, if you are analyzing malware from home, and the malware was sent into your corporate network via email, the attacker may notice that a DNS request was made from an IP address space outside the space normally used by your company. There are many potential ways for an attacker to identify investigative activity, such as the following:
Send a specific individual a targeted phishing email (known as spear-phishing) that contains a link, and watch for attempts to access that link from IP addresses outside the expected geographical area.
Design an exploit to create an encoded link in a blog comment (or some other Internet-accessible and freely editable site), effectively creating a private but publicly accessible infection audit trail.
Embed an unused domain in malware and watch for attempts to resolve the domain.
If attackers are aware that they are being investigated, they may change tactics and effectively disappear.
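The home-analysis example comes down to a simple range check on the source address of a request. The sketch below shows how little it takes for anyone watching a server to spot a request from outside the expected address space. The network range is a placeholder from the reserved documentation ranges, and `is_expected_source` is a hypothetical name.

```python
import ipaddress

# Placeholder for the address space normally used by the targeted company.
EXPECTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_expected_source(source_ip: str) -> bool:
    """Return True if the request came from inside the expected address space."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in EXPECTED_NETWORKS)
```

A request for which this check returns False, such as one made from an analyst's home connection, stands out immediately.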