Mastering 4 Stages of Malware Analysis

Examining malicious software involves a variety of tasks, some simpler than others. These efforts can be grouped into stages based on the nature of the associated malware analysis techniques. Layered on top of each other, these stages form a pyramid that grows upwards in complexity. The closer you get to the top, the more burdensome the effort and the less common the skill set.

Fully-Automated Analysis

The easiest way to assess the nature of a suspicious file is to scan it using fully-automated tools, some of which are available as commercial products and some as free ones. These utilities are designed to quickly assess what the specimen might do if it ran on a system. They typically produce reports with details such as the registry keys used by the malicious program, its mutex values, file activity, network traffic, etc.

Fully-automated tools usually don't provide as much insight as a human analyst would obtain when examining the specimen in a more manual fashion. However, they contribute to the incident response process by rapidly handling vast amounts of malware, allowing the analyst (whose time is relatively expensive) to focus on the cases that truly require a human's attention.

For a listing of free services and tools that can perform automated analysis, see my lists of Toolkits for Automating Malware Analysis and Automated Malware Analysis Services.

Static Properties Analysis

An analyst interested in taking a closer look at the suspicious file might proceed by examining its static properties. Such details can be obtained relatively quickly, because they don't involve running the potentially malicious program. Static properties include the strings embedded into the file, header details, hashes, embedded resources, packer signatures, metadata such as the creation date, etc.

Looking at static properties can sometimes be sufficient for defining basic indicators of compromise. This process also helps determine whether the analyst should take closer look at the specimen using more comprehensive techniques and where to focus the subsequent steps. Analyzing static properties is useful as part of the incident triage effort.

VirusTotal is an example of an excellent online tool whose output includes the file's static properties. For a look at some free utilities you can run locally in your lab, see my posts Analyzing Static Properties of Suspicious Files on Windows and Examining XOR Obfuscation for Malware Analysis.

Interactive Behavior Analysis

After using automated tools and examining static properties of the file, as well as taking into account the overall context of the investigation, the analyst might decide to take a closer look at the specimen. This often entails infecting an isolated laboratory system with the malicious program to observe its behavior.

Behavioral analysis involves examining how sample runs in the lab to understand its registry, file system, process and network activities. Understanding how the program uses memory (e.g., performing memory forensics) can bring additional insights. This malware analysis stage is especially fruitful when the researcher interacts with the malicious program, rather than passively observing the specimen.

The analyst might observe that the specimen attempts to connect to a particular host, which is not accessible in the isolated lab. The researcher could mimic the system in the lab and repeat the experiment to see what the malicious program would do after it is able to connect. for example, if the specimen uses the host as a command and control (C2) server, the analyst may be able to learn about specimen by simulating the attacker's C2 activities. This approach to molding the lab to evoke additional behavioral characteristics applies to files, registry keys and other dependencies that the specimen might have.

Being able to exercise this level of control over the specimen in a properly orchestrated lab is what differentiates this stage from fully-automated analysis tasks. Interacting with malware in creative ways is more time-consuming and complicated than running fully-automated tools. It generally requires more skills than performing the earlier tasks in the pyramid.

For additional insights related to interactive behavior analysis, see my post Virtualized Network Isolation for a Malware Analysis Lab, a my recorded webcast Intro to Behavioral Analysis of Malicious Software and Jake Williams' Tips on Malware Analysis and Reverse-Engineering.

Manual Code Reversing

Reverse-engineering the code that comprises the specimen can add valuable insights to the findings available after completing interactive behavior analysis. Some characteristics of the specimen are simply impractical to exercise and examine without examining the code. Insights that only manual code reversing can provide include:

Decoding encrypted data stored or transferred by the sample;
Determining the logic of the malicious program's domain generation algorithm;
Understanding other capabilities of the sample that didn't exhibit themselves during behavior analysis.

Manual code reversing involves the use of a disassembler and a debugger, which could be aided by a decompiler and a variety of plugins and specialized tools that automate some aspects of these efforts. Memory forensics can assist at this stage of the pyramid as well.

Reversing code can take a lot of time and requires a skill set that is relatively rare. For this reason, many malware investigations don't dig into the code. However, knowing how to perform at least some code reversing steps greatly increases the analyst's view into the nature of the malicious program in a comp

To get a sense for basic aspects of code-level reverse engineering in the context of other malware analysis stages, tune into my recorded webcast Introduction to Malware Analysis. For a closer look at manual code reversing, read Dennis Yurichev's e-book Reverse Engineering for Beginners.

Combining Malware Analysis Stages

The process of examining malicious software involves several stages, which could be listed in the order of increasing complexity and represented as a pyramid. However, viewing these stages as discrete and sequential steps over-simplifies the steps malware analysis process. In most cases, different types of analysis tasks are intertwined, with the insights gathered in one stage informing efforts conducted in another. Perhaps the stages could be represented by a "wash, rinse, repeat" cycle, that could only be interrupted when the analyst runs out of time.

If you're interested in this topic, check out the malware analysis course I teach at SANS Institute. The pyramid presented in this post is based on a similar diagram by Alissa Torres (@sibertor). Also, Andres Velzquez (@cibercrimen) translated this article into Spanish.

Updated May 27, 2022

Lenny Zeltser