On March 12th, Infochimps, a startup data marketplace here in Austin, entered into a revenue sharing agreement with MySpace allowing the social network’s data to be available for sale on the Infochimps site.
A close look at the ReadWriteWeb (RWW) article shows that the announcement began with 22 available datasets, but only 8 MySpace datasets are now available on the Infochimps site, with no explanation of why 14 datasets were removed.
What happened here, and what can this tell us about the possible futures for data privacy?
First, MySpace got themselves into trouble by providing identifiable user data on a third-party site (i.e., Infochimps) without notifying users or having any warning in their ToS that this was possible.
Users were also rattled by the possibility of MySpace profiting from the sale of this data. Many misunderstood the arrangement entirely, thinking that MySpace had sold the data outright to Infochimps, which was not the case.
MySpace users think of the social network as ultimately private, even though they are publicly sharing much of their information within the site (at times, even their phone numbers).
MySpace is completely within their rights to provide their data firehose to any outside party – after all, it is already available for free via the API. However, many users were unaware of this fact, and because MySpace was perceived to be sharing and attempting to sell their data to third-party developers, a privacy backlash occurred.
Confused users believed that because their data – including updates, photos, and zip codes – is within MySpace and because their profile is accessible only to friends, they retained control of their information within the walls of MySpace. When users realized this wasn’t the case, an unnecessary frenzy followed, resulting in anger and the mass deletion of many MySpace accounts.
With that being said, let’s do our best to delve into the varying degrees of private and public data.
Danah Boyd’s thoughts on the subject, from her 2010 SXSW keynote presentation entitled “Making Sense of Privacy and Publicity”, are summed up by the following:
Fundamentally, privacy is about having control over how information flows. It’s about being able to understand the social setting in order to behave appropriately. To do so, people must trust their interpretation of the context, including the people in the room and the architecture that defines the setting. When they feel as though control has been taken away from them or when they lack the control they need to do the right thing, they scream privacy foul.
Just because something is publicly available doesn’t mean a user wants it to be publicized.
It’s a debate that has become increasingly prominent and heated. Whatever your own preferences, opinions and perspectives on the privacy of social network user data vary widely, even within the academic community.
So, what exactly has caused users to get so upset? Most of the MySpace backlash can be attributed to three issues:
1) Lack of awareness that user data is already very much available through the API, and for free.
2) Issues and misunderstandings around the sale of the data.
3) Uncertainty about how anonymous individuals’ data was kept.
Perhaps the easiest solution to these issues is continued market education and better communication and understanding of the ToS agreements. We, as users, cannot continue to play dumb and resort to reacting against the social network service providers. This will just continue ad infinitum until we reach an agreement on how to best approach the problem that public vs. private social network data presents to us daily.
What if, in the future, data is so privately held within the Facebooks and Twitters of the world that anyone looking to slip through and do something really useful to society is sued? Consider the case of Pete Warden, who accumulated data on 210 million public Facebook profiles, planning to give academic researchers access to his findings. Instead, he was forced to destroy it. Think of all the amazing insights, tools, and applications that could have been built on that data alone.
On the other hand, we do need some sort of security measures to ensure that if our data is accessible, at least it doesn’t get into the wrong hands (e.g., spammers, identity thieves, etc.).
While companies need to provide clearer ToS and data privacy policies, and users need to become more aware of them, that will not provide a comprehensive solution, especially given how commonly users disregard ToS.
What if, instead, users received dividends when their personal data is used for profit, much as shareholders receive dividends on stocks today? We may also need more options than just opting in or out. Could opting in depend on the circumstance rather than being a binary yes/no choice? Could users be alerted when their data is being accessed, putting the power in their hands to decide clearly which privacy terms they really agree with?
Today, groups like the Electronic Frontier Foundation (EFF) are carrying the torch on issues of data privacy and publicity, but how aware are technologists and users of the laws that directly affect us all?
Are we taking adequate measures to educate the next generation about data privacy mistakes such as those in the MySpace case? Or is this even an issue that will be of concern to the next generation, for whom issues of private vs. public are already being reconceived?
What do you think about these issues?