“Anonymized” data really isn’t—and here’s why not – Ars Technica

The simple fact is that a large data set can be used to work out certain things. For example, Netflix’s data set can show the most popular movies among men over 40.

But consider what happens when that data set becomes huge. How many restrictions would it take to find an entry that is yours, even if it was recorded anonymously? Take the previous example: have you viewed the “most popular movie among men over 40”? How many of those entries would come from your state? Your zip code? From someone born in the same year? On the same birthdate? Each added restriction shrinks the pool of matching entries, and it does not take many before only one is left.
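To make that narrowing concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are randomly generated, the field names (`state`, `zip`, `birthdate`, `watched_top_movie`) and the helpers `make_records` and `narrow` are hypothetical, and none of it reflects the real Netflix data or the method described in the Ars Technica article. It only counts how many synthetic "anonymous" entries still match one person after each extra restriction.

```python
import random
from datetime import date, timedelta


def make_records(n, seed=0):
    """Build n synthetic 'anonymized' records (no names, only attributes)."""
    rng = random.Random(seed)
    states = ["MA", "NY", "CA", "TX"]  # made-up sample of states
    records = []
    for _ in range(n):
        records.append({
            "state": rng.choice(states),
            # Hypothetical 5-digit zip codes drawn from a small pool.
            "zip": f"{rng.randrange(100):05d}",
            # Random birthdate within roughly 40 years of 1950-01-01.
            "birthdate": date(1950, 1, 1) + timedelta(days=rng.randrange(365 * 40)),
            "watched_top_movie": rng.random() < 0.5,
        })
    return records


def narrow(records, target):
    """Apply one restriction at a time; return (label, matches left) pairs."""
    steps = [
        ("watched the movie", lambda r: r["watched_top_movie"]),
        ("same state", lambda r: r["state"] == target["state"]),
        ("same zip code", lambda r: r["zip"] == target["zip"]),
        ("same birth year", lambda r: r["birthdate"].year == target["birthdate"].year),
        ("same birthdate", lambda r: r["birthdate"] == target["birthdate"]),
    ]
    remaining = records
    counts = []
    for label, matches in steps:
        remaining = [r for r in remaining if matches(r)]
        counts.append((label, len(remaining)))
    return counts


if __name__ == "__main__":
    records = make_records(100_000)
    # The person we are trying to find; their entry is in the data set too.
    me = {"state": "MA", "zip": "00042", "birthdate": date(1975, 6, 15),
          "watched_top_movie": True}
    records.append(me)
    for label, count in narrow(records, me):
        print(f"{label:>18}: {count} candidate entries left")
```

The exact counts depend on the random data, but the shape is the point: every restriction can only keep or shrink the candidate pool, and a handful of ordinary attributes is often enough to leave very few entries.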

Take a look at the Ars Technica article, and be surprised at how easy it is to reverse the anonymization and make the data readily identifiable.

‘Anonymized’ data really isn’t—and here’s why not
Companies continue to store and sometimes release vast databases of ‘anonymized’ information about users. But, as Netflix, AOL, and the State of Massachusetts have learned, ‘anonymized’ data can often be cracked in surprising ways, revealing the hidden secrets each of us are assembling in online ‘databases of ruin.’

(View the rest of the article at Ars Technica)