We have a saying in Greece: “They assigned the wolf to watch over the sheep.”
In a security context, this is a word of caution about making sure the tools we use to keep our information private don’t actually cause the data leaks themselves. In this article, I will be talking about some cases that I have come across in which security tools have leaked data they were intended to secure.
The VirusTotal problem
VirusTotal (VT) is a multi-scanner in which an individual researcher is free to upload any file they believe is suspicious. They can then view results from many antivirus (AV) products as to whether or not the file is considered malware. While this is an amazing service which I am certain everyone in the infosec world uses regularly, its usage needs to be carefully thought over.
What some people don’t realize is that every file you submit to VirusTotal gets saved on VT’s servers and is fully searchable. By using an internal VT tool called Malware RetroHunting, malware hunters have the ability to search for text and binary patterns in order to find malware similar to ones that he may be analyzing or tracking.
This is a great feature, but as you can imagine, just as someone could search for [insert malicious string of your choice], they could just as easily search for “Account Number:”, which might result in loads of documents containing such data. It is important to bring awareness to this fact so that people can properly use this tool without risking their private data.
I will go through a few cases showing the misuse of VirusTotal to serve as a warning for users who might be thinking about using either second rate/ unofficial tools or adopting practices built off of VT.
Case 1: The no AV argument
I far too often hear people saying something like this: “I don’t need an AntiVirus. I send files to VT for free when they look suspicious.”
I think it should be quite obvious why this method is flawed. If you submit all documents you receive to VT, then you run the risk of leaking private information, as stated above. Now, if you exclude scanning of documents from specific “trusted” addresses (in order to not leak confidential data), then you run the risk of getting a malware phished to you from a spoofed contact. Needless to say, this is not a safe way to keep yourself protected.
Case 2: API usage
The use of VirusTotal API can also be dangerous. Bugs in the code or logic can easily cause a mass upload of private files. This is a danger whether you are building your own tools or using tools like WINJA, which automate submission of files to VT. The only recommendation here is to make sure the tools you are using are reputable or you have done your own independent code audits to make sure no bugs may lead to data leakage.
When it comes to using other reputable security tools, it is wise to read over all of the documentation and make sure you understand how and when the given tool will incorporate VT.
Case 3: VT email scanning service
I have unfortunately seen may articles and forum posts online where people have been giving advice to use the VT attachment scan service. Basically, by sending an email attachment to scan@virustotal.com, the sender can receive a response as to what VT found regarding the attachment.
Please do not take such advice unless you are sure the document you are scanning contains no private data. It is a risky game. If you are worried about malicious documents infecting your computer, then the logical conclusion would be to buy an antivirus with a good reputation and the technology to block malicious documents.
If you choose to send all your potentially private emails to VT, searchable by anyone, then you’re essentially undoing any potential security or privacy benefits by exposing all your data anyway. What damage is a spyware going to do when you’ve already sent your sensitive data out to a public database?
EXE files problem
The next case I want to talk about, while less sensitive, is a lot more likely to be overlooked.
In a corporate environment, we cannot rely on everyone to manually submit attachments or files to security engineers—all of this is automated. From my past experience and from speaking with fellow security engineers, I have seen that it is quite common for all executables entering a corporate network to automatically get scanned with various plugins tied to a given platform. I will highlight Carbon Black, an enterprise antivirus program, in this case, although many other security providers have this problem as well.
When a new exe makes its way into a network, Carbon Black stores it, but also has the ability to cross reference the given file with various plugins and tools that are built in or added to the platform. For example, you can click a bubble on any given file in your network, which will give you its results against wildfire sandbox. And of course, the topic that has received so much heat in the media this year—the VT plugin.
Now, while they have fixed the issues on submitting documents to avoid leaking data, they still do submit exes. But wait, so what? Isn’t that exactly what we want it to do?
Correct, it is. Automation is what every corporation aims for in its security infrastructure. There is nothing wrong with the root idea of submitting and scanning exes flowing through the network. However, automation sometimes comes with a tradeoff if not properly planned.
I have evaluated the security infrastructure of many corporate networks and in these evaluations, I have seen that in this attempt to scan all new exes for malware, the company’s in-house executables end up getting scanned as well.
So now, confidential exes are unknowingly being exposed and leaking arguably more sensitive data and intellectual property. In addition, think for a moment about how software developers typically code. While they are testing functionality, it is common for a developer to hard code some credentials, paths, or other revealing information for a test build. Sure, after they are done, for the production build, it will likely get changed to hide this information and make it dynamic, but in the meantime, these demo builds have been picked up by the EDR and scanned through various plugins.
Again, this is not a problem with the EDR itself, it is a problem with its implementation, entirely the responsibility of the customer using the software.
Remediation and prevention
Now this does not mean we need to abandon use of security tools for fear of data leaks; it simply means we need to make some adjustments. So what can a business do to protect against leaking their own data to the public?
There are many options which will depend upon the compliance requirements and needs of a given company, but I have a few base considerations I recommend.
Rules-based segmentation
Rather than having a blanket automation where everything is automatically scanned, I always recommend segmenting the actions taken when the EDR sees a new file based on user groups. For example, maybe users in the developers’ group do not have their binaries residing in a specific directory sent for auto scanning.
However, this is easier said than done because just simply enabling this type of rule can be catastrophic and may essentially allow a developer free rein to secretly develop malware. That’s why, when one security rule is relaxed for a given user, another rule must be increased to make up for it. So in this theoretical scenario, we have just given a developer a free pass to not have his executables scanned. So we have closed one door but opened another.
To make up for this, one thought might be to keep heavy watch on the IPs and ports that the dev machines are allowed to communicate over. If the developer needs to communicate with a specific IP for his software, he should get approval in advance from the security engineers. At this point, we can let the developer go ahead and create malware, but if his MAC address or IP is seen attempting communication with a non pre-approved IP or over a non pre-approved port, fire alerts. This type of rule is trivial to create using a good EDR platform.
The roles and expected behavior of a given employee’s machine must be fully understood beforehand to be able to keep proper control over a network.
Understand the tools you use
It is important to understand that security tools are made for generic use. The creators do not know specifically what your company does and what your privacy policies are. They do not know whether you will be developing your own software onsite or whether you are simply using the tool to scan downloaded files.
That being said, it is up to you, the user or security engineer in charge of evaluating, to make sure you understand all of the functionality and options a tool gives you.
A developer who creates a tool to scan email attachments automatically with VT is not necessarily acting maliciously. For some users, maybe a user who specifically does not create and store info in documents, this might be the best tool in the world, exactly what they need to automate their operations. For another company who sends their contracts in the form of Word documents, this might be catastrophic. At the end of the day, the responsibility cannot be blamed on the tool that behaved exactly as advertised. It’s up to the user to do her own research and understand what the tool does and how it will effect privacy and security.