Mysterious database exposed personal information of 80 million US households

Word has broken of yet another massive data trove exposed for anyone to see. A research team from vpnMentor discovered an exposed 24GB database hosted on a Microsoft cloud server containing the addresses, income levels, and marital statuses of users within 80 million US households.

As we’ve seen recently, many organisations aren’t taking steps to secure their customer data and every so often one makes the news. Some may have been exploited while exposed; others will have been lucky.

Occasionally, there’s a quick takedown of the exposed information; sometimes it’s nearly impossible to find out who, exactly, is responsible. At that point, the only option left is to ping someone like Microsoft to take that final step and hope they can do something about it.

What’s the damage report?

Since 80 million US households were sitting in this database, that means considerably more people could have been impacted. Across thousands of entries, the researchers couldn’t find anyone listed under the age of 40.

The exposed data included a mixture of coded and non-coded information. Non-coded items included street addresses, cities, states, counties, zip codes, latitude and longitude coordinates, ages, dates of birth, and first/last names along with middle initials. The items assigned a coded, numerical value included marital status, income, gender, dwelling type, and homeowner status.

Decoding the numbers

In practice, what the coded and non-coded entries mean is you could easily view someone’s name or address, but something like gender or title is instead assigned a numerical value. Some of the information chained to coded values may not be possible to figure out: for example, “Income [1]” or “Income [6]” may be too obscure to put a salary range on. However, if you see “Steve” and the gender assigned is “[1]” then it’s probable that 1 = male on all their records.

In this way, even where data is assigned a numerical code, you can piece together most of a person’s profile. If the income code for people listed as 70 and up is “10”, then 10 might be “retired”, “on a pension plan”, or something similar.
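
To make that reasoning concrete, here is a minimal Python sketch of how a coded field could be matched to its likely meaning by looking at a readable field alongside it. Everything in it is an assumption for illustration: the sample rows, the field names (`first_name`, `gender_code`), and the `NAME_HINTS` lookup are invented, and none of it reflects the real database's layout or the researchers' method.

```python
from collections import Counter, defaultdict

# Invented sample rows; the real field names and code values are unknown.
records = [
    {"first_name": "Steve", "gender_code": 1},
    {"first_name": "Mary", "gender_code": 2},
    {"first_name": "John", "gender_code": 1},
    {"first_name": "Linda", "gender_code": 2},
    {"first_name": "Robert", "gender_code": 1},
]

# Hypothetical lookup from a first name to the gender it usually implies.
NAME_HINTS = {
    "Steve": "male",
    "John": "male",
    "Robert": "male",
    "Mary": "female",
    "Linda": "female",
}


def infer_code_meanings(rows, hints):
    """Guess what each numeric code means from the names it appears next to."""
    votes = defaultdict(Counter)
    for row in rows:
        label = hints.get(row["first_name"])
        if label is not None:
            # Each recognisable name counts as one vote for that code's meaning.
            votes[row["gender_code"]][label] += 1
    # Keep the label seen most often for each code.
    return {code: counts.most_common(1)[0][0] for code, counts in votes.items()}


inferred = infer_code_meanings(records, NAME_HINTS)
print(inferred)  # {1: 'male', 2: 'female'}
```

The same tallying idea would apply to the age and income example: group rows by a readable field such as age band and see which code dominates each group. It only ever produces a best guess, which is why an isolated value like “Income [6]” stays obscure without something readable to compare it against.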

In fact, there are a lot of code-assigned sections alongside viewable data, so full street address + code for dwelling type + Google Maps = a quicker and easier way to assign home types to the people listed, then (say) target them with property-specific phishing attacks or other social engineering tactics.

What exactly is this database for?

Given the upper end of the ages listed in this database, the people in it could well be more susceptible to these kinds of tricks. The database was eventually taken offline by Microsoft, who have apparently notified the owner(s). Meanwhile, researchers have asked the public to try and help identify exactly who this data belongs to.

They suspect it has some sort of financial service connection, such as insurance or mortgaging, or perhaps healthcare. The specific age range shown in the data might have suggested a form of dating app for older generations, except it makes no sense for it to focus on households rather than individuals. The geolocation coordinates may point to some form of mobile app connection, as you’d typically expect to see those collected via portable apps as opposed to something filled in on the desktop.

Time to play the waiting game

No matter the purpose of the database, the good news is that it’s currently offline. It also doesn’t seem to be the case that it’s been used maliciously, for now anyway. There isn’t a huge amount anyone can do in this situation beyond advising people to be wary of the usual social engineering scams.

Ultimately, this database is large but also quite generic, with no way to say for sure exactly what it’s for. As a result, it’s a case of being on your guard and keeping some common sense handy at all times.

This isn’t something to worry about for the time being, and hopefully this tale begins and ends with “someone needs to secure their data better.”


Christopher Boyd

Former Director of Research at FaceTime Security Labs. He has a very particular set of skills. Skills that make him a nightmare for threats like you.