Since its inception, the World Bank’s Open Data initiative has generated considerable excitement and discussion on the possibilities that it holds for democratizing development economics as well as for democratizing the way that development itself is conducted around the world. Robert Zoellick, in a speech given last year at Georgetown University, expounded on the many benefits resulting directly from open data. Offering the example of a health care worker in a village, he spoke of her newfound ability to “see which schools have feeding programs . . . access 20 years of data on infant mortality for her country . . . and mobilize the community to demand better or more targeted health programs.” Beyond this, Zoellick argued that open data means open research, resulting in “more hands and minds to confront theory with evidence on major policy issues.”
The New York Times featured the Bank’s Open Data initiative in an article published earlier this month, in which it referred to the released data as “highly valuable”, saying that “whatever its accuracy or biases, this data essentially defines the economic reality of billions of people and is used in making policies and decisions that have an enormous impact on their lives.” The far-reaching policymaking consequences of the data are undeniable, but the New York Times touches upon a crucial question that has been overshadowed by the current push for transparency: what about quality?
Without placing equal emphasis on collecting data that is timely, consistent, and of high quality, few benefits can be reaped from the release of data to the public. A treasure trove of data that is rife with bias and plagued by inaccuracies is of little use to any researcher, statistician or village health care worker, regardless of whether they operate within or outside of the Bank. Indeed, inaccuracies and biases in data can result in significant harm, inasmuch as the data is used to inform the policies of developing countries.
Fortunately, the 7,000+ datasets that have been released to the public under the Open Data initiative represent some of the highest quality data currently available in a number of sectors. In some sectors, however, data quality lags severely behind. For example, despite the importance of the agricultural sector in reducing poverty and food insecurity throughout the developing world, serious weaknesses in agricultural statistics persist. According to the 2008 findings of the FAO’s Agricultural Bulletin Board on Data Collection, Dissemination and Quality of Statistics, only two of the forty-four countries in Sub-Saharan Africa are considered to have high standards in data collection, while standards in twenty-one countries remain low. As a result, the quality of the agricultural statistics collected in many countries is questionable, rendering the data ineffectual in guiding policy decisions aimed at benefitting the poor.
It is therefore crucial that the current mandate for open data go hand in hand with an equally strong mandate for better data – data collected based on sound survey and sample design, free from bias or error, and disseminated in a timely fashion. Without improvements in data collection methodology, the data that constitutes the bulk of the development community’s knowledge about the realities of life in many of the world’s poorest countries will continue to suffer from inaccuracy and error. Furthermore, many sectors that play major roles in the livelihoods of the extreme poor but on which little data is available – livestock, fisheries, and forestry, to name a few – will continue to be overlooked by both the international community and national governments with regard to future policy and flows of assistance or aid.
There are a number of ongoing efforts to improve data quality and coverage in the developing world. One such initiative is the Global Strategy to Improve Agriculture and Rural Statistics, a multi-institution initiative endorsed at the 41st Session of the United Nations Statistical Commission. The Strategy assigns a pivotal role to methodological research in order to improve the quality and policy relevance of the available information specific to the agricultural sector. In accordance with the main tenets of the Global Strategy, the Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA) project is collaborating with the Ministries of Agriculture and National Statistics Offices of its partner countries in Sub-Saharan Africa to design and implement systems of multi-topic, nationally representative panel household surveys with a strong focus on agriculture. The project is implemented by the World Bank’s Living Standards Measurement Study team, which has been championing the cause of better data since its creation in 1980 under World Bank President Robert McNamara, as a response to his urgent call for better, more timely micro-data. The LSMS-ISA project represents the team’s latest effort to improve data quality, motivated by the widespread problems faced by agricultural statistics in much of the developing world. In addition to increasing the availability of data in Sub-Saharan Africa, the project is conducting methodological work to enhance the quality of survey data in several areas, including crop productivity, livestock and climate change. The data generated by the project is intended to shed light on the links between agriculture and poverty reduction in the region, as well as to foster innovation and efficiency in statistical research in the sector.
The clarion call for open data heralds an era where people around the world will gain the ability to more knowledgeably campaign for their rights and to make their voices heard. As governments increasingly respond to the need for transparency and move to freely share their data with citizens, individuals will be empowered to take an active part in their country’s development and in the improvement of their livelihoods. However, as we move forward into a world where statistics play an increasingly powerful role in setting priorities and determining the direction of future policy efforts, it is essential to recognize that we cannot realize a more open and inclusive model for citizen-centric development without identifying the areas in which existing data is insufficient or problematic and working to bridge those gaps.
We must recognize that open data is not enough.