TechCrunch saw Twitter re-writing all links to point to their site. It’s a great idea, as it gives them wonderful data. It’s a potentially bad idea from a privacy standpoint if they do anything more than count how many times a link is clicked. If the link is to a URL shortening service, they’d get a lot more data by looking at what the link redirects to. The problem is that the end URLs can be parameterized in any number of ways, including with information that can be considered personally identifiable by the latest thinking from industry associations and the FTC (per David Vladeck’s recent interview).
Q: The marketers make a distinction between personally identifiable and non-personally identifiable information, that they’re only collecting anonymized information.
A: Well, but we saw what happened. There’ve been all sorts of disclosures with allegedly anonymous data. The problem is that it’s like a mosaic. If you have the information released and you can match it to other publicly available data about somebody, you can often put together a pretty complete picture. You know, I think we’re past that debate. At least, I think the F.T.C. is past that debate; whether the rest of the world has caught up with us, I don’t know. But we don’t find that a tenable distinction. And if you look at our online behavioral advertising report we make this point, I think, pretty emphatically.
I recently went to the OMMA Behavioral conference in SF and talked to a number of chief privacy officers about problems at the intersection of data collection, advertising and privacy. The bottom line was that many companies unknowingly make false claims to consumers and violate laws and self-governing principles. One of the stories I heard was the epitome of how easy it is to get into trouble. It had to do with an SEO change made to a publisher’s site that resulted in personally identifiable information going to an ad targeting company. Previously, the targeting company had been collecting some basic URL info from the publisher. The SEO change moved potentially sensitive data from query strings into the path of URLs. The SEO team had no idea of the impact of their change; they didn’t even know about the data sharing deal. Neither consumers nor the targeting company were given any notice. The ad targeting company discovered this after the fact (and after both they and the publisher had significant liability exposure).
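To make the failure mode concrete, here is a minimal Python sketch of how it could happen. It assumes the publisher shared "basic URL info" by dropping the query string before handing a URL to the partner. That assumption, the `url_for_partner` helper, and the example URLs are all hypothetical; the story as told to me did not include these details.

```python
from urllib.parse import urlsplit, urlunsplit


def url_for_partner(url: str) -> str:
    """Hypothetical filter: keep scheme, host and path; drop query and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Made-up URLs for the same page before and after an SEO rewrite.
before_seo = "https://publisher.example/search?condition=diabetes&zip=94103"
after_seo = "https://publisher.example/search/diabetes/94103"

# Before the rewrite, the sensitive values live in the query string and are dropped.
shared_before = url_for_partner(before_seo)

# After the rewrite, the same values sit in the path and pass through untouched.
shared_after = url_for_partner(after_seo)

print(shared_before)
print(shared_after)
```

Under these assumptions, the filter itself never changed. What changed was the meaning of "path", which is why nobody on either side noticed until the data had already been shared.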
I’d love for Twitter to collect and make use of more information about how people use the service. As volumes increase, they need more metadata to create a better service. However, I hope they have experienced advisors on the privacy side, e.g., folks like Dan Jaye, who co-wrote the P3P spec and later did TACODA.