The use of big data in the battle against the coronavirus (COVID-19)—particularly the use of detailed mobile phone data to track and monitor the pandemic—has spurred big privacy concerns.
This topic matters not only for data use in an emergency, but also because it establishes a precedent that can and will be replicated across sectors. Government powers, once created, rarely go away; more commonly, they are extended and repurposed for other ends.
Before discussing why we must create data sharing protocols for future use of digital data, let’s first try to understand what COVID-19 shows us about the shortcomings of deploying digital data systems without a structured agenda around use and governance.
Largely, the critics of phone-based contact tracing have highlighted four areas of concern:
- Accuracy: Critics point out that mobile phone tracing cannot accurately track dispersion and contagion. According to medical consensus, the virus spreads when people are less than six feet apart. While call detail records, GPS, and WiFi data can give us a general sense of where a person is, they cannot determine with sufficient precision whether two individuals have come within six feet of each other. And while Bluetooth may help solve the issue, that technology is not yet ubiquitous.
- Probability vs. reality: The bigger problem with mobile phone data tracking is that it measures probabilistic risk rather than real exposure. And while this is useful as a statistical tool, individual risk estimation isn’t helpful to systems known to be under extreme stress, particularly if it leads people to seek more testing than they might otherwise. In many places, providing people with information on exposure can overwhelm health systems, as many people without symptoms may flock to health centers for fear of having been exposed.
Unlike technology systems, which follow the “move fast and break things” mantra, health and legal systems work with institutional checks and balances to make sure they solve the intended problem. Validating digital contact tracing records will be a long and labor-intensive process, consuming resources that might be better allocated to other, proven response and recovery mechanisms.
- Representativeness: Another major concern is that digital contact tracing relies on smartphone data, which only works in communities with high smartphone penetration. For example, although India has a large tech industry, only 28% of people there have smartphones. And in Ethiopia, which has one of the world’s lowest smartphone penetration rates, that number stands at about 11%. Smartphone ownership is also heavily skewed toward men. Because pandemic response should reach everyone, technology markets this fragmented are not a sound basis for delivering relief equally.
- Privacy: Finally, and perhaps most importantly, data surveillance, if left unchecked, gives extra power to governments. As a recent article points out, “We often talk about surveillance as ‘harmful,’ but really what we mean is: surveillance enables a significant number of harms.”
Despite these concerns, we recognize the huge potential of harnessing big data. Its rise is a development opportunity that is too big to be ignored—but it is critical that we get data use right and put appropriate safeguards in place. This is the subject of the World Bank Group’s next World Development Report – Data for Better Lives, which focuses on unlocking data’s potential, and how it can benefit people in developing countries, while protecting against misuse.
As transport specialists, we know the importance of using data responsibly. For the last couple of years, many mobility companies have been asked to share sensitive data with governments, a trend that raises significant concerns over security and privacy. On a more positive note, transport was one of the first sectors to reap the benefits of data collection, from automatic vehicle location to automatic fare collection, among others. As we progressively move from static digital data to real-time big data collection and use, user privacy is at the top of our agenda.
So, what can we learn from the transport sector about how to use data efficiently, during and after this pandemic, while protecting user privacy?
Our recommendations are as follows:
- Focus on proven outcomes: Digital data has proven extremely helpful for monitoring and planning performance and service delivery at an aggregate level. For example, in transport, we are focusing our use of phone analytics data on tracking the performance of infrastructure, for instance to identify bottlenecks. We’re also using Call Detail Records (CDR) data to optimize public transport networks. These processes have been developed and tested before, while maintaining privacy and representativeness requirements. In the context of COVID-19, the same aggregate-level analysis could show how people move after a disaster, so that aid can be provided more efficiently, or track migration patterns on a large scale. Collecting and sharing all these proven processes can inform client countries on how to use big data.
- Support the creation of data trusts: These could be an effective way of sharing and using data. Civic data trusts move beyond single trustees and build models of fiduciary governance for the management, use, and sharing of rights to data. Digital trusts could hold digital assets, like code or a subset of digital rights, and may act as rights clearinghouses or process managers. The digital data trust would be a private organizational structure that is publicly and legally accountable and capable of balancing complex interests to serve a unifying purpose. This is similar to what the Open Traffic initiative pioneered: pooling data from multiple providers and converting it into actionable, shareable information.
- Create an agenda on effective data governance: The use of data is most efficient in places that have strong data governance and transparency requirements. This means private data should always be used transparently, with specific, stated objectives and uses, and all data should be private by design, as it has been in our use of mobile phone data for origin-destination estimation. Data collection is critical to the COVID-19 response and to broader development efforts, but it must respond to a specific need and be governed through institutional mechanisms that ensure contextual value, necessity, and privacy.
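To make the idea of aggregate, private-by-design origin-destination estimation more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method used in the projects described above: the `Trip` record, the zone labels, the `od_matrix` function, and the `min_users` threshold are all hypothetical. The sketch counts distinct pseudonymous users per origin-destination pair and suppresses any pair with too few users to be reported safely.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Trip(NamedTuple):
    """One inferred trip. All field names are hypothetical."""
    user_id: str      # pseudonymous subscriber identifier
    origin_zone: str  # coarse zone (e.g. a district), not a precise location
    dest_zone: str


def od_matrix(trips: Iterable[Trip], min_users: int = 10) -> Dict[Tuple[str, str], int]:
    """Count distinct users per (origin, destination) pair.

    Pairs with fewer than `min_users` distinct users are dropped, so that
    rarely travelled routes cannot be tied back to individuals. The default
    threshold of 10 is an assumption, not a recommended value.
    """
    users_per_pair: Dict[Tuple[str, str], Set[str]] = {}
    for trip in trips:
        pair = (trip.origin_zone, trip.dest_zone)
        users_per_pair.setdefault(pair, set()).add(trip.user_id)
    return {
        pair: len(users)
        for pair, users in users_per_pair.items()
        if len(users) >= min_users
    }


# Toy example: three users travel A -> B, only one travels A -> C.
example_trips = [
    Trip("u1", "A", "B"),
    Trip("u2", "A", "B"),
    Trip("u3", "A", "B"),
    Trip("u1", "A", "C"),
]
print(od_matrix(example_trips, min_users=3))  # {('A', 'B'): 3}
```

Only the aggregated counts leave the function; individual identifiers and low-count routes do not. A real deployment would need further safeguards, such as agreed zone sizes, time windows, and legal review, which this sketch does not attempt to cover.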