Search Engine Queries: Untrusted Computing?

Sometimes, I hate being right. A long time ago, in a galaxy not so far away, I wrote a post about the possible security/privacy concerns of using an external search engine on your website. I noted that the search phrases were no doubt logged by such engines and that a profile was probably created from them. Said profile would likely be used for commercial purposes, and I questioned whether that was appropriate. At the time, some criticised the post as not being realistic.

However, my concerns appear to be coming true, though in a way I had not foreseen - namely, politically.

Recently, Republican President Bush's administration subpoenaed search engine data from at least Google, Yahoo, AOL and Microsoft. Except for Google, all of them immediately caved in under the pressure and submitted at least some of the data without even trying to quash the subpoenas.

Although reports indicate no personal information was requested or released, search queries can be mined in ways that can lead back to an individual. For example, databases can be cross-referenced so that information that appears safe when each database is viewed on its own can be combined to point to an individual.
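To make the cross-referencing idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the two tables, the field names (zip, age, query, name) and the cross_reference helper are my own assumptions, not anything taken from the subpoenaed data. Neither table on its own ties a name to a query, but joining them on the fields they share does.

```python
# Hypothetical data: a "nameless" search log and a separate public directory.
search_log = [
    {"zip": "90210", "age": 34, "query": "symptoms of condition X"},
    {"zip": "10001", "age": 52, "query": "cheap flights"},
]
public_directory = [
    {"name": "Alice Example", "zip": "90210", "age": 34},
    {"name": "Bob Example", "zip": "10001", "age": 52},
    {"name": "Carol Example", "zip": "10001", "age": 29},
]


def cross_reference(log, directory, keys=("zip", "age")):
    """Join the two tables on the shared fields, keeping unambiguous matches only."""
    linked = []
    for entry in log:
        matches = [p for p in directory if all(p[k] == entry[k] for k in keys)]
        if len(matches) == 1:  # exactly one person fits this log entry
            linked.append((matches[0]["name"], entry["query"]))
    return linked


linked = cross_reference(search_log, public_directory)
for name, query in linked:
    print(name, "searched for:", query)
```

Real data would be far messier than this, but the principle is the same: the more fields two databases share, the more entries match exactly one person.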

Even a single, aggregate database that doesn't have names can be queried in ways that can lead to an individual. That is, even if the database doesn't have a name field, but does include items such as (but not limited to) race, age, and city or zip code, queries can be constructed that slice the data down to one person. I've seen it done and it's almost trivial to do.
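A minimal sketch of that slicing, again in Python and again with invented records: the field names, the values and the narrow helper are assumptions for illustration only. Each added criterion shrinks the set of matching rows until a single one is left.

```python
# Hypothetical aggregate records with no name field.
records = [
    {"race": "A", "age": 34, "zip": "90210", "query": "q1"},
    {"race": "B", "age": 34, "zip": "90210", "query": "q2"},
    {"race": "A", "age": 51, "zip": "90210", "query": "q3"},
    {"race": "A", "age": 34, "zip": "10001", "query": "q4"},
    {"race": "B", "age": 51, "zip": "10001", "query": "q5"},
    {"race": "B", "age": 28, "zip": "10001", "query": "q6"},
]


def narrow(rows, **criteria):
    """Return only the rows that match every given field value."""
    return [r for r in rows
            if all(r[field] == value for field, value in criteria.items())]


# Add one criterion at a time and record how many rows remain after each step.
criteria = {}
counts = [len(narrow(records))]
for field, value in [("zip", "90210"), ("age", 34), ("race", "A")]:
    criteria[field] = value
    counts.append(len(narrow(records, **criteria)))

print(counts)
```

With this toy data the counts go from six rows to three, then two, then one. A real database is much larger, but it also tends to have many more fields to slice on.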

Please note: I'm not saying this has occurred in this or any other instance. But since it is theoretically possible, it comes down to one question - who do you trust?

This time, apparently, no names were requested. But I'll be taking bets on how soon the proverbial nose of this camel is followed into the tent by the tail.


I believed you, even in that other not-so-distant galaxy. It's one of the reasons why I never got a search thing on my site.
For similar reasons I seldom use narrow and precise search terms when googling the net. Getting 2k results is usually no problem for me, but such broad queries are not very useful to profiling engines.


