Automating Malware Analysis with Cuckoo Sandbox

Analyzing malware can be a lengthy process.

Malware today can be simple, like something that downloads another program from a rogue server, or complicated, perhaps utilizing advanced encryption or having many component modules that extend functionality.

The time required to analyze a sample really depends on the questions you need to answer. To fully understand a sample, it can take a long time–in some cases, it may take years.

Alexander Gostev of Kaspersky Lab said in regards to the Flame cyber-espionage malware, “it will take us 10 years to fully understand everything” due to the malware’s size and complexity.

The problem for malware analysts and security researchers alike is we don’t have years to dedicate to one sample; so, we need something to speed up the process and reduce the workload. There are many tools available to do this, one of which is a sandbox.

In the context of malware analysis (and computer security in general), a sandbox is a tool that runs a program in a secure environment (e.g. a virtual machine.) We’ve discussed this concept before in more detail here.

One popular sandbox is Cuckoo, a free and open source system provided by the Cuckoo Foundation. It does a pretty good job and provides nice detailed reports of its findings.

Cuckoo is a great resource, but setup is not exactly “user-friendly”. Installation is not done through a package, but manually with much attention to detail. The provided documentation will serve you well, but it helps to have a good understanding of things like Linux, virtualization software (VIrtualBox, VMware, etc), virtual networking, and the Python programming language.

Take plenty of time setting up the sandbox and make sure you understand the configuration. If it’s not making sense, read through the documentation repeatedly until you understand the basics.

Also, don’t stress if everything isn’t right the first time you launch Cuckoo. If you have errors, read the error messages and try to figure out what’s wrong before asking for help. You’ll likely end up learning a lot about how the sandbox works this way.

If you still can’t figure it out, you can always reach out to the community for more assistance. Here are some additional tips for beginners:

First, focus on getting Cuckoo to launch without errors. Don’t worry about reporting, guests, or anything else yet.
Configure one guest machine correctly. Once you have it configured, relaunch Cuckoo and see if your machine loads. If it does, you can clone it to create more and speed up processing. This is something that will take time, as you want to create an environment that is “malware friendly”.
Check your networking. As stated by the developers, “most of the issues reported by users are related to a wrong setup of their networking.” If you’re not an expert at virtual networking, I suggest you use the default setup as outlined in the documentation.

One you’ve got everything working, you can go ahead and start it by running cuckoo.py. You should see a screen like the one below if everything went ok:

You’re going to need to give the sandbox some files to process. You do this by using the submission utility, named ‘submit.py’, located in the /utils directory. You can open a terminal tab and submit files, then return to your other tab to watch the sandbox work. The submission utility has a lot of great features, so make sure to take time and learn it for your own benefit.

After the sandbox has finished processing your samples, you’ll find the analysis data stored under /storage/analyses.

Open the results folder for a processed sample and you’ll find logs and reports. The type of reports you have will depend on how your sandbox is configured; by default HTML and JSON are enabled. To see all of the supported options, refer to the ‘reporting.conf’ file.

If you open up an HTML report, you can see a lot of information about the files you just processed.

With this information, you can easily sort your malware samples and sift out anything you’re not interested in, and focus on the more unique samples.

Cuckoo is also highly extensible, offering a lot of additional content made by those in the community. The sandbox includes a utility available to download this content using ‘community.py‘. Keep in mind you may want to get used to the sandbox and how it works before downloading content produced by others.

It’s also important to note that like us, Cuckoo isn’t perfect.

In certain cases, submitted analyses will fail. This can happen for a variety of reasons. For example, some of the malware today is designed to check if it’s inside a virtual environment or sandbox, and may not execute properly if detected.

In other instances, the malware performs code injection using “interesting” methods that confuse CuckooMon, the sandbox element that intercepts function calls and monitors the malware’s behavior. Even still, these scenarios aren’t very common, and shouldn’t drastically inhibit automation.

Cuckoo is another great tool in an analyst’s arsenal, regardless of how it’s used. Without even taking a look at the malware yourself, Cuckoo can sometimes provide analysts with enough of the information they need to get the job done.

_________________________________________________________________

Joshua Cannell is a Malware Intelligence Analyst at Malwarebytes where he performs research and malware analysis. Twitter: @joshcannell