By Gabriel Nicholas
In 2009, back when Gmail was in beta and Facebook was a place to play FarmVille, Brian Fitzpatrick launched the Google Data Liberation Front, an engineering team at Google whose goal was to make it easier for users to move their data in and out of Google products. As Fitzpatrick and his crew helped various Google product teams build APIs and data export tools, they would triumphantly declare each service “liberated” on their Twitter account, whose avatar featured a hand, with ones and zeroes for fingers, breaking free from shackles.
The Data Liberation Front embodied Google’s “don’t be evil” ethos. As then-CEO Eric Schmidt remarked at the time, “How do you be big without being evil? We don’t trap end users. So if you don’t like Google, if for whatever reason we do a bad job for you, we make it easy for you to move to our competitor.”
Yet over a decade later, Google and its peers have only tightened their stranglehold on our data. Google, Apple, Amazon, and Facebook have used their unparalleled data access to squash competitors and become some of the most dominant companies in the world, so much so that all four are facing serious antitrust scrutiny. Many scholars see their unchecked market power as a root cause of the existential threats that the internet poses to democracy, privacy, journalism, and small businesses.
Data liberation has recently reemerged as a way to improve competition by making it easier for consumers to switch to new services, and Big Tech companies are scrambling to support it as a more palatable alternative to breakups. Mark Zuckerberg included it as one of his “Four Ideas to Regulate the Internet,” and Google, Facebook, Apple, Microsoft, and Twitter are all collaborating on an open-source framework called the Data Transfer Project to make it easier to move data between platforms.
While the internet’s competition problems have grown more complicated in the decade-plus since the Data Liberation Front was founded, data liberation methods are stuck in 2009. For data liberation to meaningfully improve competition, we need to reimagine how it works to give users more useful data, more privacy, and more bargaining power over platforms.
Though the term “data liberation” has gone out of vogue, the two most common methods of data liberation, portability and interoperability, have stayed more or less the same since the 2000s. Data portability, a term popularized by Brad Fitzpatrick, the founder of the blogging network LiveJournal, is the ability for users to take a snapshot of their data on one service and bring it to another. Interoperability is a more seamless version of the same thing that generally uses application programming interfaces, or APIs, to let software services directly send data to one another.
Portability and interoperability are increasingly popular with tech industry regulators around the world. Many countries, including Singapore, Australia, and members of the EU, already have laws requiring some form of data portability. Though these laws mostly focus on letting users archive and understand their data, rather than actively move it elsewhere, updated legislation is being considered around the world, and even in the US, to help users transfer their existing data to new services.
These laws tend to be based on an antiquated model established by one of the earliest and most successful examples of data liberation: mobile phone number portability. In 2003, the FCC required carriers to allow customers to transfer their mobile numbers to another provider for free. Before then, customers had to give up their phone numbers when they wanted to change from AT&T to Verizon, for example. This created a high switching cost, disadvantaging people trying to change carriers and reducing competition. Allowing customers to keep their mobile numbers eliminated that switching cost, and the positive effects were almost immediate — the cost of cell phone plans dropped significantly within a year.
As with phone number portability, today’s data liberation policies are aimed at allowing users to bring their data with them to new platforms. But there are two problems with this approach. First, data that’s useful for one platform isn’t necessarily useful for another in the way that phone numbers are; second, the most dominant tech companies don’t have competitors that many users would want to send their data to. Liberated data thus has to not only let customers try alternatives but also allow competitors to build those alternatives, and even entirely novel services. So far, this has failed to happen: tech companies big and small have for years been required to let users download their data under the EU’s General Data Protection Regulation, but no major services have been built using this data.
A broader approach to data portability could address a wide scope of competition challenges in the tech industry, and the first step to achieving this is to liberate data from emerging technologies before these issues develop. Take the example of smart meters, which consumers can use to optimize electricity use in their homes to lower their bills. Utilities let third-party developers build apps for these smart meters, but the utilities limit how much data those developers can access and how quickly they can access it, giving preference to their own homegrown apps. Placing portability requirements on smart meters could give consumers new ways to save on their energy bills and reduce the potential for utilities to abuse their monopolistic power.
Data liberation also needs to be reworked to better navigate thorny issues of privacy, especially for social media. On the one hand, users obviously should not be able to liberate the data of others. That kind of access is what allowed Cambridge Analytica to collect personal data on 87 million users despite only receiving permission from 270,000. On the other hand, studies show that developers struggle to build useful products out of social media data that’s taken outside of its original context. Consider, for example, Facebook’s Download Your Information tool. It allows a user to download every comment they have ever made but not the status or post they were commenting on. Even though this data could be freed from the Facebook platform, there’s no real use for it without the context.
To address this, we need to allow groups of users to liberate data that they share. Focusing on groups rather than individuals can give users access to their data and its context without infringing on the privacy of others. With group portability, friends on a messaging thread, for example, could consent to move their group chat from WhatsApp to Viber instead of leaving each person with only their side of the conversation. Enabling groups of users to move data in larger chunks could also begin to address the competitive barrier of network effects, where growth in the number of people who use a service drives still more growth, giving large social networks their “everyone you know is here” appeal.
Still, without being able to coordinate the switching of services en masse, users won’t have enough leverage to push platforms to make meaningful changes. Anouk Ruhaak, a Mozilla Fellow who studies and develops data governance models, suggests a radical solution to this problem with her concept of data trusts. The idea is that users should be able to liberate their data, pool it, and entrust it to an entity that is legally required to act in their best interests. This trustee would then wield collective bargaining power that it could leverage for better data practices for users. For instance, a data trust could withhold GPS data from Instagram for potentially thousands of users until the service was more transparent about its data collection practices. The larger the trust, the more inclined Instagram would be to listen.
These new approaches to data liberation are experimental and not without risk, but the old ones aren’t sufficient to address the tech sector’s unprecedented competition problems. As the US government shifts its regulatory attention from hearings to legislation, and as large tech companies scramble to show that they can use data liberation to regulate themselves, lawmakers will need to update their proposals to match the extent of the issue. The Data Liberation Front’s methods are now 12 years out of date. It’s time to think bigger.