Explained: False positives

What are false positives?

A false positive, sometimes written as f/p, is a term commonly used in cybersecurity to denote that a file or setting has been flagged as malicious when it’s not.

In statistics, false positives are called Type I errors: a test checks for a particular condition and wrongly gives an affirmative (positive) decision. The opposite is a false negative, or Type II error, where the test decides that the condition is not present when, in fact, it is. In this blog post, we will focus on false positives in cybersecurity, but note that false negatives in this field are commonly referred to as “misses.” So “misses” are malicious files or malicious behavior that the scanner or protection software did not detect.
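To make the two error types concrete, here is a minimal Python sketch that tallies the four possible outcomes of a scan. The scan results are made up for illustration, and the function names `confusion_counts` and `false_positive_rate` are our own, not part of any scanner.

```python
from collections import Counter

def confusion_counts(results):
    """Count outcomes from (is_malicious, was_flagged) pairs."""
    names = {
        (True, True): "true_positive",
        (False, True): "false_positive",   # Type I error
        (True, False): "false_negative",   # Type II error, a "miss"
        (False, False): "true_negative",
    }
    counts = Counter(names[(actual, flagged)] for actual, flagged in results)
    return {name: counts.get(name, 0) for name in names.values()}

def false_positive_rate(counts):
    """Share of clean files that were wrongly flagged."""
    clean = counts["false_positive"] + counts["true_negative"]
    return counts["false_positive"] / clean if clean else 0.0

# Made-up scan results: (file is actually malicious, scanner flagged it)
scan_results = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, False), (False, True),
]

counts = confusion_counts(scan_results)
print(counts)
print(false_positive_rate(counts))
```

In this invented sample, one of the four clean files was flagged, so the false positive rate is 0.25, and one of the three malicious files was a miss.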

Possible causes of false positives

The most common causes of false positives are:

  • Heuristics: decisions are based on only a few pieces of information
  • Behavioral analysis: decisions are made based on behavior, and the legitimate file shows behavior that is usually considered malicious
  • Machine learning: sometimes we see the effects of “garbage in, garbage out,” or more politely put, “training did not take certain situations into account.”

Let’s give some examples of these causes.

An example rule for a heuristic detection could be this: if this file claims to be from Microsoft, but it is not signed with the Microsoft certificate, then we assume the file has malicious intentions. A false positive could occur in the rare case that Microsoft forgot to sign the file.
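The rule above could be sketched in Python roughly as follows. This is an illustration only: the `FileInfo` fields are hypothetical stand-ins for metadata a scanner would have to extract and verify itself, and real products use far more careful signature checks than a string comparison.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileInfo:
    # Hypothetical metadata, assumed to be extracted elsewhere.
    path: str
    claimed_vendor: str        # vendor name the file claims for itself
    signer: Optional[str]      # subject of a verified signature, or None

def looks_like_fake_microsoft(info: FileInfo) -> bool:
    """Flag files that claim to be from Microsoft but lack a Microsoft signature."""
    claims_microsoft = "microsoft" in info.claimed_vendor.lower()
    signed_by_microsoft = (
        info.signer is not None and "microsoft" in info.signer.lower()
    )
    return claims_microsoft and not signed_by_microsoft

# Invented examples
impostor = FileInfo("evil.exe", "Microsoft Corporation", None)
genuine = FileInfo("tool.exe", "Microsoft Corporation", "Microsoft Corporation")
forgotten = FileInfo("patch.exe", "Microsoft Corporation", None)  # really from Microsoft
third_party = FileInfo("app.exe", "Example Software", None)

for f in (impostor, genuine, forgotten, third_party):
    print(f.path, looks_like_fake_microsoft(f))
```

Note that `impostor` and `forgotten` look identical to the rule. With so little information to go on, the heuristic cannot tell the malicious file from the unsigned legitimate one, and the latter becomes a false positive.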

One behavioral indicator of ransomware is a program that starts deleting shadow copies. Some ransomware families do this to ensure the victim has no backups. But you can imagine a cleanup utility that deletes old shadow copies, which could possibly be flagged as displaying malicious activity, right?
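A behavioral rule of this kind might look something like the sketch below, which matches command lines against two commands commonly used to delete shadow copies on Windows. The process names and events are invented, and a real behavioral engine would look at much more context than a command line.

```python
import re

# Command-line patterns commonly associated with deleting shadow copies.
SHADOW_DELETE_PATTERNS = [
    re.compile(r"vssadmin(\.exe)?\s+delete\s+shadows", re.IGNORECASE),
    re.compile(r"wmic(\.exe)?\s+shadowcopy\s+delete", re.IGNORECASE),
]

def deletes_shadow_copies(command_line: str) -> bool:
    """Return True if the command line matches a shadow copy deletion pattern."""
    return any(p.search(command_line) for p in SHADOW_DELETE_PATTERNS)

# Invented process events: (process name, command line)
events = [
    ("ransomware.exe", "vssadmin.exe Delete Shadows /All /Quiet"),
    ("cleanup_tool.exe", "vssadmin delete shadows /for=C: /oldest"),
    ("notepad.exe", "notepad.exe readme.txt"),
]

flagged = [process for process, cmd in events if deletes_shadow_copies(cmd)]
print(flagged)
```

Both the ransomware and the cleanup utility end up in `flagged`, because the rule only sees the behavior and not the intent behind it.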

Machine learning is done by feeding the system vast amounts of training data. Mistakes or ambiguities in the training data can lead to errors in the detections.
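As a toy illustration of training data that did not take a situation into account, the sketch below "learns" how often each behavior appears in malicious samples. It is nowhere near a real machine learning model, and the behavior names, samples, and the `train` and `is_flagged` helpers are all invented for this example.

```python
from collections import defaultdict

def train(samples):
    """samples: list of (set of behaviors, is_malicious).
    Returns, per behavior, the fraction of samples showing it that were malicious."""
    seen = defaultdict(lambda: [0, 0])  # behavior -> [malicious count, total count]
    for behaviors, is_malicious in samples:
        for b in behaviors:
            seen[b][0] += int(is_malicious)
            seen[b][1] += 1
    return {b: bad / total for b, (bad, total) in seen.items()}

def is_flagged(model, behaviors, threshold=0.5):
    """Flag a sample if its known behaviors are, on average, mostly malicious."""
    known = [model[b] for b in behaviors if b in model]
    if not known:
        return False
    return sum(known) / len(known) > threshold

training = [
    ({"encrypts_files", "deletes_shadow_copies", "drops_ransom_note"}, True),
    ({"encrypts_files", "deletes_shadow_copies"}, True),
    ({"opens_window", "writes_log"}, False),
    ({"writes_log", "checks_for_updates"}, False),
]
cleanup_tool = {"deletes_shadow_copies", "deletes_temp_files"}
ransomware = {"encrypts_files", "deletes_shadow_copies"}

model = train(training)
print(is_flagged(model, cleanup_tool))   # flagged: training never saw a clean tool do this

# Retrain with one clean example of the missing situation.
better_model = train(training + [(cleanup_tool, False)])
print(is_flagged(better_model, cleanup_tool))
print(is_flagged(better_model, ransomware))
```

In the first model, every sample that deleted shadow copies was malicious, so the legitimate cleanup tool is flagged. Adding a single clean example of that situation is enough, in this toy setup, to clear the tool while the ransomware sample is still flagged.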

Designing detection rules for yet-unknown malicious files or behavior is always a balancing act: you try to cover as many of them as possible without triggering any false positives. Understandably, this can go wrong sometimes.
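One way to picture this balance is a detection score with a cutoff. In the sketch below, the scores and labels are invented, and `evaluate` is a hypothetical helper; it simply counts what a given threshold would catch, miss, and wrongly flag.

```python
def evaluate(samples, threshold):
    """samples: list of (score, is_malicious).
    Returns (detected, missed, false_positives) when flagging score >= threshold."""
    detected = sum(1 for score, bad in samples if bad and score >= threshold)
    missed = sum(1 for score, bad in samples if bad and score < threshold)
    false_positives = sum(1 for score, bad in samples if not bad and score >= threshold)
    return detected, missed, false_positives

# Invented suspicion scores between 0 and 1
samples = [
    (0.95, True), (0.80, True), (0.65, True), (0.40, True),
    (0.70, False), (0.30, False), (0.10, False), (0.05, False),
]

for threshold in (0.9, 0.6, 0.2):
    detected, missed, false_positives = evaluate(samples, threshold)
    print(threshold, detected, missed, false_positives)
```

With these made-up numbers, the strict threshold of 0.9 produces no false positives but misses three of the four malicious samples, while the loose threshold of 0.2 catches all four at the cost of two false positives.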

Fun facts

A much less common cause for false detections is deliberate false positives. The most well-known false positive is the EICAR test file, a computer file that was developed by the European Institute for Computer Antivirus Research to verify the response of antivirus programs without having to use real malware. Note that Malwarebytes for Windows does not detect the EICAR file and Malwarebytes for Mac only detects it under exceptional circumstances. This is by design.

But history has also brought us deliberate false positives, planted as a way to test whether an anti-malware product is using detections made by its competitors.

Summary

False positives are alarms raised for legitimate files or behavior that were flagged as malicious when, in fact, no bad intentions were present. They are caused by rules that try to catch as many malicious events as possible and sometimes fail by picking up something legitimate.

 

Pieter Arntz
