Big data for development: Beyond Transparency


For the development community, the focus on ‘data’ has been very much on open data: making public where aid dollars are being spent. This is no small task, and I welcome the rise of platforms and initiatives such as The World Bank’s Mapping for Results, DFID’s Project Map, aidinfo and the International Aid Transparency Initiative. Transparency about aid is very important - it raises public awareness of development work, it enhances accountability among both the givers and receivers of aid, and it can drive out waste, bureaucracy and corruption.

Big data can give insight into development challenges, such as nutrition in India. (Credit: Wen-Yan King, Flickr Creative Commons)

But we can do much more with data. Big business already gets this: companies from Tesco to Facebook have been using the data they collect to gain valuable insight on their users and drive efficiency for years. It’s time for governments and the third sector to catch up. In many cases these groups, such as microfinance organisations, local government and community health centres, already collect plenty of data, but don’t make much use of it.

Big data is a powerful tool to help design policy that really works, and bust myths by revealing what doesn’t. Esther Duflo and Abhijit V. Banerjee give a good example of this in Poor Economics. They use data on 18 countries to show that “government and international institutions need to completely rethink food policy”. The prevailing wisdom says that we must provide food grains to the very poorest to protect them from starvation: Egypt, for example, spent 3.8 billion in food subsidies in 2008-9. Yet the data reveals that the poor are not desperately striving for more calories. Food makes up only 45-77% of expenditure among the rural extremely poor and 52-74% among the urban extremely poor. Nor is the rest of their household budget dedicated to necessities: alcohol, tobacco and festivals comprise a large part of the spend. A survey from India confirms this: the number of people who consider that they do not have enough food fell from 17% in 1983 to 2% in 2004. And yet children growing up in these families persistently show stunted growth from a lack of nutrition. From the data the real problem emerges: people are not literally starving, but their diets are not nearly nutritious enough. Thus the best role for governments is not to provide more staples like rice, noodles or wheat, but to provide or subsidize more nutritious foods. (This does not, of course, apply in natural or man-made emergencies.)

Big data can be invaluable in improving public service delivery, as well as design. One major challenge for healthcare in the developing world is ensuring that limited supplies of life-saving medicines are distributed to the health facilities where they are needed. Demand for drugs, such as antimalarials, is not entirely predictable – to ensure that the right distribution is achieved, you need to use real-time data. A pilot programme called SMS for Life did just this to improve the distribution of malaria drugs at health facility level in rural Tanzania, and prevent ‘stock-outs’. Their real innovation was getting front-line workers from every clinic to send an SMS with their stock count each week. Once senior coordinating staff had access to these figures, they were able to accurately target re-stocking of the clinics. The results were dramatic: the proportion of health facilities with no stock of one or more antimalarial medicines fell from 78% to 26%, and in one of the three districts, stock-outs were eliminated by week 8 of the pilot with virtually no stock-outs thereafter.
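
As a rough illustration of the kind of calculation described above, the sketch below turns weekly stock counts into the headline figure (the share of facilities with no stock of one or more medicines) and a list of clinics to restock. The report layout, clinic names, medicine labels and numbers are all hypothetical and are not taken from the SMS for Life pilot; this is a minimal sketch, not the pilot's actual system.

```python
from collections import defaultdict

# Hypothetical weekly reports, one per clinic per week:
# (week, facility, {medicine: units in stock}). All values are made up.
REPORTS = [
    (1, "clinic_a", {"medicine_1": 0, "medicine_2": 30}),
    (1, "clinic_b", {"medicine_1": 12, "medicine_2": 0}),
    (1, "clinic_c", {"medicine_1": 25, "medicine_2": 18}),
    (1, "clinic_d", {"medicine_1": 0, "medicine_2": 0}),
    (8, "clinic_a", {"medicine_1": 20, "medicine_2": 15}),
    (8, "clinic_b", {"medicine_1": 10, "medicine_2": 22}),
    (8, "clinic_c", {"medicine_1": 30, "medicine_2": 9}),
    (8, "clinic_d", {"medicine_1": 14, "medicine_2": 0}),
]


def stock_out_share(reports):
    """Return {week: share of reporting facilities with zero stock of at least one medicine}."""
    reporting = defaultdict(set)
    stocked_out = defaultdict(set)
    for week, facility, counts in reports:
        reporting[week].add(facility)
        if any(units == 0 for units in counts.values()):
            stocked_out[week].add(facility)
    return {
        week: len(stocked_out[week]) / len(reporting[week])
        for week in sorted(reporting)
    }


def restock_list(reports, week):
    """Return [(facility, [medicines at zero])] for the given week, sorted by facility."""
    needs = []
    for report_week, facility, counts in reports:
        if report_week != week:
            continue
        empty = sorted(name for name, units in counts.items() if units == 0)
        if empty:
            needs.append((facility, empty))
    return sorted(needs)


if __name__ == "__main__":
    print(stock_out_share(REPORTS))
    print(restock_list(REPORTS, 8))
```

The share is computed over facilities that actually reported in a given week, so a clinic that sends no SMS is simply missing rather than counted as stocked. A real system would need to decide how to treat non-reporting clinics.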

Randomized Controlled Trials (RCTs) are another powerful way to use data for development. These have long been the gold standard for evidence in medicine and are gaining traction among NGOs and academics. RCTs can be used to prove the efficacy of the nutrition-based food policy set out above. For example, the Work and Iron Status Evaluation study in Indonesia provided randomly chosen adults with regular iron supplementation from fish sauce. For a self-employed male, the yearly gain in earnings as a result of being able to work harder from improved nutrition was $46 USD PPP, while a year’s supply of the sauce cost just $7 USD PPP. RCTs have also been used to measure the impact of introducing microfinance into a community in India, and to determine the effect on girls’ education of introducing community-based schools in rural Afghanistan.

So yes, open aid data is important, but it’s time that governments and NGOs started doing more to harness the insights of big data and randomized controlled trials. There are many moves in the right direction. One new organisation to watch is DataKind (originally Data Without Borders), which aims to bring data scientists together with NGOs that could use their services. The World Bank has also recently launched Global Findex, a store of data about people’s use of financial products, to inform financial inclusion policy. The data comes from a survey which covered at least 1,000 adults in each of 148 economies using randomly selected, nationally representative samples. I hope that these initiatives mark the start of a new era of D4D - data for development.


Alice E. Newton
July 30, 2012

Thanks for your comment Tim. I agree with you about a combination of kinds of data being important for development policy and practice, in particular local insights. I suppose I wanted to talk about 'big' data in this piece, not at the expense of 'small' data, but because I think the majority of ambitious large data projects in development have focussed on transparency, and at that scale I want to see more data usage by policy makers / managers themselves. But that shouldn't mean neglecting other kinds of research and collaboration.

July 28, 2012

Data for development requires not merely large volumes but precise information that leads to further action, including preventive action, rather than only an understanding of the problem.

How do we measure whether the data is important? That depends on how we perceive the problem and the proposed solution, which need certain numbers to justify our development interventions.

Data also needs to be relevant.


Tim Davies
July 28, 2012

I would very much agree that there is great potential for development policy and practice to be improved by good research, some of which will involve large datasets - but I tend to feel it is somewhat dangerous for us to focus only on 'Big Data' in trying to foster a more evidence based, and reflective, model of development.

We need a mix of big data, small localised data, local insights, good analysis and sensitive implementation of projects to improve development - not just big data alone.

Some of the cases you give, such as the stock out data collection, could also be analysed through the lens of 'good management' - where its not only the data, but the change in management relationships involved in reporting back to base regularly, that play a role in improving practice.

So - better use of data to support development: yes. But alongside a broader approach to intelligent development, involving not only big data, but also small data, conversation, collaboration and much more...

Emmanuel Letouzé
August 14, 2012


We at UN Global Pulse are glad to see that more and more people and institutions are starting to recognize the potential of "Big Data for Development", a space in which we have been working since early 2010.

Given the title of your blog post I hope you've had a chance to read our white paper, "Big Data for Development: Opportunities and Challenges", which to the best of our knowledge was the first time the phrase was used.

I share your enthusiasm for evidence-based policymaking and the open data movement, in which the World Bank and DfID have played founding roles.

However, I feel I must point out that, as far as I can tell, none of the examples you cite actually involve big data, i.e. massive amounts of digital data such as call logs, mobile-banking and credit card transactions, online user-generated content (blog posts, Tweets, etc.), online searches, satellite images, etc.

I have never heard of RCTs using big data, and it seems to me that big data is not suited for RCTs given the methodological requirements of RCT experiments. Likewise, I would argue that survey data is not big enough and realistically never could be big enough to be called big data. The bulk of big data is also unstructured (i.e. text, video, images, etc.), which creates specific challenges as well as opportunities.

Lastly, it is also important to distinguish between big data and the data typically captured through mobile phone-based ICT4D initiatives. The development field is increasingly and appropriately turning to digital data collection methodologies such as mobile surveys and SMS case reporting from rural clinics, for example. These new ways of doing business are transforming development work through a combination of increased speed, greater efficiency, and even, according to recent studies, increased accuracy. But these are not big data.

In other words, the defining features of big data are its size, which makes it difficult to process even by hundreds or thousands of powerful servers; its speed, which generates most of its value from a development policy standpoint; and the fact that it is observed data rather than volunteered.

The good news is that in spite of all these challenges there are remarkable innovations coming out of the private sector as a result of the fact that companies now have to master these new opportunities in order to survive. We in the development community have to build on what has already been done and adapt these approaches to answer different kinds of questions.


Global Pulse
August 21, 2012

Colleagues, thanks for posting this blogpost considering the potential for big data in development work with broader applications. The United Nations' Global Pulse initiative is dedicated to exploring precisely this issue.

We recently published a White Paper entitled "BIG DATA FOR DEVELOPMENT: OPPORTUNITIES & CHALLENGES," as a jumping off point to expand the discussion. It includes several examples of past research and projects, as well as a review of some of the methods and serious challenges around analyzing big data. We welcome comments and feedback.

Global Pulse has also recently concluded a series of proof of concept research projects testing the feasibility of some analytics methodologies and big data sources (such as mobile phone survey data, social media data, large databases of news media, etc.) to answer development-related questions.

We are collectively just at the beginning of this journey, and look forward to hearing people's comments, learning of new initiatives, and continuing to test the possibilities for how to responsibly and effectively harness the tremendous power of big data for development.

August 29, 2012

Big Data and analytics will help organizations to streamline the distribution of content and will address manageability, scalability and high-availability issues significantly.

December 26, 2012

Now, with Big Data Development, companies are able to make more data-driven decisions.