New data for development policy

What key insights have emerged from development economics in the past decade, and how should they impact the work of the World Bank? A new working paper from the Bank’s research department, Toward Successful Development Policies: Insights from Research in Development Economics, captures 13 of the most significant insights in the world of development economics.

Here’s insight #9 on moving on from traditional data collection methods. See all previous insights here: Thirteen insights for successful development policies

Until recently, most development economics research relied on traditional data types such as household and firm surveys and national accounts. Governments (or agents authorized by governments) have typically been central to the data collection efforts, and data have typically been collected for specific purposes, often to foster development. For example, the earliest social surveys were undertaken in England by Charles Booth and Seebohm Rowntree in the 1890s to measure poverty, to describe the deplorable living conditions of the poor, and to bring about social policy reform.

Traditional data methods require strong statistical capacity. Trained staff, budgetary autonomy for agencies that collect data, adequate installations, connected databases, and international partnerships are important factors in shaping successful national data systems.ref1 These resources are often scarce in low-income countries,ref2 leaving them least equipped to collect the data necessary to assess and understand the scope and nature of development problems and make inroads to solving them. Enhancing the statistical capacity of client countries therefore has been, and will continue to be, a point of emphasis for the World Bank Group.

Moving beyond traditional data

At the same time, traditional data collection methods can be relatively costly.ref3 Surveys are therefore performed infrequently,ref4 and they lack the granularity necessary to make meaningful inferences about sub-populations of interest. In contrast, so-called “digital data” from mobile phones, satellites, and other sources can be collected cost-effectively, with high frequency, and at fine levels of granularity. Digital data therefore may offer new insights for understanding and resolving some development challenges. Recent examples from empirical research include:

  • Measuring poverty, agricultural productivity, and malnutrition. Influential recent work has used satellite imagery of nighttime lighting to measure poverty by using it as a proxy for wealth.ref5 In Rwanda, researchers have mapped personal data from mobile networks to individual phone subscriber wealth. They applied their model to predict wealth throughout the country, showing that the predictions match well with those from costlier traditional surveys of the population.ref6 Similar approaches have been used to produce granular maps of crop yields and malnutrition.ref7, ref8
  • Improved targeting of public health interventions and natural disaster relief. Understanding who is most affected by natural disasters and predicting relocation patterns is key to effective humanitarian relief operations, public health interventions, and long-term reconstruction efforts. Mobile phone data have been used to reveal who is most affected by natural disasters,ref9 where people relocate in response to a disaster,ref10 and the implications of relocation for the spread of disease.ref11
  • Natural Language Processing methods: text as data. Traditional qualitative surveys require in-depth analysis of interviews coupled with participant observation in small groups. These have recently been supplemented with the analysis of narratives from much larger samples by applying Natural Language Processing methods to existing texts. Such methods have been used to analyze Wikipedia pages to predict levels of economic development,ref12 and to study discourse within village meetings in rural India.ref13
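As a rough illustration of what it means for predictions from digital data to "match well" with survey estimates, the sketch below computes a Pearson correlation between two sets of district-level wealth figures. It is not the method used in the studies cited above; the function name `pearson` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical district-level wealth indices (illustrative values only).
survey_wealth = [0.21, 0.35, 0.48, 0.52, 0.77]     # from a household survey
predicted_wealth = [0.25, 0.30, 0.50, 0.60, 0.70]  # from a phone-data model

r = pearson(survey_wealth, predicted_wealth)
print(f"correlation between survey and predicted wealth: {r:.2f}")
```

A high correlation at the district level would not by itself show that the predictions are reliable for every sub-population, which is one reason survey data remain useful as a benchmark.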

Challenges in harnessing digital data for development

Governments have been at the center of traditional data collection efforts. In contrast, digital data are typically collected by firms, which focus on their commercial value rather than their potential social value. Moreover, data are “non-rival”, meaning that a person’s call data records, location history, internet usage and medical records can be used by many firms (and governments) at the same time and for different purposes. Recent theoretical research shows that this may lead firms to respect data privacy less than is socially desirable, and they may also have incentives to inefficiently hoard data.ref14 In short, when firms own data, they may over-use and under-share them while not adequately respecting consumer privacy. Policy priorities for making digital data work for development include:

  • Data Governance: Privacy and Sharing. Establishing data governance frameworks that safeguard individual data and privacy while expanding the development benefits of those data to many stakeholders is challenging. Frameworks for supervising and enforcing new data laws that were designed for developed economies may not transfer well to weaker institutional environments.
  • Ownership: Democratizing Data.  By crowd-sourcing information to facilitate more responsive governance, and by giving citizens the ability to conduct their own surveys and analyze the data over large numbers of people, data can foster voice, government accountability, and transparency. Examples of such projects are underway in both India and Indonesia.ref15 
  • Measures to Avoid Politicization of Data. Personal data are now widely available and can be manipulated to further the private interests of firms, leading to fears of “surveillance capitalism.” There is also a fear that governments will use individuals’ digital data to pursue political objectives, centralize power, and discourage dissent.ref16, ref17 This calls for data governance frameworks that safeguard privacy and give citizens more control over the use of their personal data.
  • Making Traditional and New Data Work Together.  Most new data are generated by individuals through their interactions on mobile devices and computers. In some places, the digital user population may approximate the overall population. In other contexts, including many developing countries, large swaths of the population who are not digitally connected will not be reflected. Data collected via traditional methods, which sample the entire population, can speak to the validity of, for example, poverty maps generated using digital data, and help identify which sub-populations are systematically under-represented in these applications.
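One simple way to picture the last point is to compare each group's share of a digital sample with its share of the population as estimated by a traditional survey. The sketch below is a minimal illustration under stated assumptions: the function names, the 0.8 cut-off, and all figures are hypothetical and do not come from the working paper.

```python
def representation_ratios(survey_shares, digital_counts):
    """Ratio of each group's share in the digital data to its survey share.

    survey_shares: group -> population share from a traditional survey (0-1].
    digital_counts: group -> number of records in the digital data set.
    A ratio below 1 means the group is under-represented in the digital data.
    """
    total = sum(digital_counts.values())
    if total == 0:
        raise ValueError("digital data set is empty")
    ratios = {}
    for group, share in survey_shares.items():
        if share <= 0:
            raise ValueError(f"survey share for {group!r} must be positive")
        digital_share = digital_counts.get(group, 0) / total
        ratios[group] = digital_share / share
    return ratios


def under_represented(ratios, threshold=0.8):
    """Groups whose ratio falls below the (assumed) threshold, sorted by name."""
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical figures for illustration only.
survey_shares = {"urban": 0.3, "rural": 0.7}   # from a household survey
digital_counts = {"urban": 600, "rural": 400}  # phone subscribers observed

ratios = representation_ratios(survey_shares, digital_counts)
print(under_represented(ratios))
```

In this made-up example, rural residents account for 70 percent of the surveyed population but only 40 percent of the digital records, so a poverty map built from the digital data alone would rest disproportionately on urban users.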

Authors

Bob Cull

Lead Economist, DEC

Dean Mitchell Jolliffe

Lead Economist, Living Standards Measurement Study (LSMS), World Bank

Vijayendra Rao

Lead Economist, Development Research Group, World Bank