For reasons of its own, AOL recently released a ~450 MB file containing three months of search records for 650,000 AOL users. They took it offline fairly quickly, but by that time the file had already crossed the globe seventeen or eighteen times. Christian Beckner at Defensetech writes (formatting his):
People are already poring through the data, finding some very disturbing search patterns among a number of AOL’s users. In theory, there is no personally-identifiable information on the database, but if people ran searches that identify things about themselves, it often becomes easy to figure out who they are. In many ways, this is a worse privacy loss than the laptop stolen from the Veterans Administration employee earlier this spring, if it had been compromised.
This inadvertent disclosure of data forces the need for a public debate on the retention and use of search data by private companies, and the propriety of its use by government agencies. In January we learned that Google refused a DOJ subpoena to supply the government with exactly this kind of data – a request with which Yahoo!, AOL and MSN complied. These companies are compiling petabytes of search data on their servers, effectively archiving the collective subconscious of hundreds of millions of people.
If you have an AOL account, it would probably behoove you to check the file and find out exactly what anybody on Earth can now find out about you. On this issue I think that John Aravosis deserves credit for sounding the privacy alarm in the general blogosphere faster than practically anyone else. John was right that we all ought to be concerned about how porous our information security has become. Have you ever served in the army? You may have lost your SS # and other private information two or three times over. Maybe you have insurance, or a credit history on record, or your employer keeps private employee information on a disk somewhere. Stupid mistakes or theft have lifted thousands to millions of private records at a time from each of those sources.
If we live in an information age where information is currency, then we ought to treat information like currency. 7-11 managers don’t leave the evening’s till sitting on a curb or in an unlocked cabinet. Banks don’t let a mid-level manager carry random safe deposit boxes home in his briefcase, and the Treasury Department doesn’t transport newly-minted money by pizza delivery boy. Unless we start guarding personal data with the same care, our personal identifiers, particularly the oft-pilfered ones like Social Security #, will mean less than nothing, and cleaning up defrauded accounts will become a significant drain on the U.S. economy.
John Aravosis once pointed out that either party could make a winning issue out of protecting Americans’ privacy. God knows this has traditional conservatism written all over it, even if privacy questions mostly come from Democratic circles these days. Sadly, other than some noise from Hillary Clinton, the silence has been bipartisan.
Pb
Yeah, I read all about it on Slashdot–that’s a huge technology / civil liberties issue there, so the deafening political silence doesn’t surprise me.
I don’t, but of course I know friends and family who do. I guess I should get a copy of it and search for their info? :)
Incidentally, what are the odds that the Bush administration already has this–and not just the 450MB sample, but *all* of it? Given the Google lawsuit story a while back (for those who were paying attention to it), it wouldn’t surprise me if they did, so from that perspective, that’s a very important debate to have as well, especially now that we know the sorts of information that could be in those search queries.
The Asshole Formerly Known as GOP4Me
If the government doesn’t get to know Americans’ predilections for porn and shopping, the terrorists have won.
Freedom is messy.
Perry Como
Just one example from browsing through the AOL data: apparently user 524625 is married, lives in Memphis, has kids, is having marital problems (sex related), is having mortgage payment problems, is a Christian, is looking for a job in real estate, and had her tubes tied but is now looking to reverse the procedure. There’s other information in there, but I’d hate to get personal.
J. King
“Sadly, other than some noise from Hillary Clinton the silence has been bipartisan.”
You’re damn right, and that’s why I despise the Democrats almost as much as I do the Republicans–because they are STUPID.
Perry Como
User 375 has a fetish for grannies, Rolex Air Kings, and 1969 Dodge Super Bees.
I should have time later this week to drop the data into a DB. Mining this should prove…interesting.
Perry Como
Wheee: http://www.aolsearchdatabase.com/
The Asshole Formerly Known as GOP4Me
Well, we know from our research into his personal history that Osama Bin Laden doesn’t care for 1969 Dodge Super Bees, so we can safely eliminate User 375 from suspicion of being a front for Osama Bin Laden.
Already, the search net we’ve cast using an Internet search engine has drawn a little tighter! Once we have Osama’s IP address, rounding him up should present no serious difficulties whatsoever!
The Asshole Formerly Known as GOP4Me
Does Osama have a granny fetish, though?
Perry Como
Vote for the top 100 AOL search terms: http://aol.yogurtrat.com/top100.php
Pb
I think someone has a mother-in-law fetish…
Slide
Hey Perry, thanks for that link, I did some research and I think I determined that AOL user 738499 is our very own Darrell. Search terms:
I love Bush fan clubs
how to make nonsense sound convincing
gay Scoutmasters that destroyed America
pornography for idiots
speed posting 101
how to spot early onset alzheimers
guide to walking and chewing gum simultaneously
the murders of Bill Clinton
Democrats and Traitors
Ann Coulter nude photos
Rush Limbaugh nude photos
Karl Rove nude photos
illustrated guide to tying shoelaces
how to stop drooling
the Constitution, do we really need one?
the good deeds of the Hitler youth
vacation spots in Baghdad
cfw
Privacy violation is a tort and covered by insurance. If one put in a claim, what would be the compensable damages? What would a jury award after proper defense by a competent lawyer? Maybe a thousand? Three thousand? These are not huge potential claims from a money perspective. What is offensive is not getting paid while Google makes billions from data mining.
Solution, from my perspective, pay for my info. You can have my info for your database (with no names attached). Pay me how much? Whatever the reasonable value might be, and take care of me if some disaster happens (unlikely).
I would trade my dreams of punitive damages for something now and practical, like paying my Netflix bill each month (say $20 per month) and providing five iTunes songs a week. I say let the Googles at my info, for reasonable pay to me, and stop showing me ads for things that would never interest me.
We spend some $450 billion a year on ads. If we could reduce that to $300 billion, we would end up with a more productive economy.
Argonaut
“Solution, from my perspective, pay for my info”
Their response: “we are… search isn’t REALLY free. Email isn’t REALLY free. Online spreadsheets aren’t REALLY free.”
Internet users have gotten so used to the idea that software costs nothing that they turn up their noses at any that actually seeks to recoup development costs up front — but they’ll gladly turn over their lucrative data in return for a “free” chat program.