
Applications open for third round of funding for collaborative data innovation projects

World Bank Data Team's picture
Photo Credit: The Crowd and The Cloud


The Global Partnership for Sustainable Development Data and the World Bank Development Data Group are pleased to announce that applications are now open for a third round of support for innovative collaborations for data production, dissemination, and use. This follows two previous rounds of funding awarded in 2017 and earlier in 2018.

This initiative is supported by the World Bank’s Trust Fund for Statistical Capacity Building (TFSCB) with financing from the United Kingdom’s Department for International Development (DFID), the Government of Korea and the Department of Foreign Affairs and Trade of Ireland.

Scaling local data and synergies with official statistics

This year’s call for proposals has two themes. The first, scaling local data for impact, targets innovations with an established proof of concept that benefits local decision-making. The second, fostering synergies between the communities of non-official data and official statistics, looks for collaborations that take advantage of the relative strengths and responsibilities of official (i.e., governmental) and non-official (e.g., private sector, civil society, social enterprises and academia) actors in the data ecosystem.

Official Statistics in a Post-Truth World

Haishan Fu's picture
Photo Credit:  2018 Edelman Trust Barometer Report

I've been thinking about the role of data and digital technology in today's information landscape. New platforms and technologies have democratized access to much of the world’s knowledge, but they’ve also amplified disinformation that affects public discourse. In this context, the official statistics community plays a critical role in bringing credible, evidence-based information to the public.
 
A “post-truth” society is not an inevitable state of affairs that we must accept; it's an unacceptable state of affairs that we must address. To do so, we need reliable data that are trusted by the public. Institutions like national statistical offices must go beyond their traditional data production remit to become a trusted, visible force for reason in people’s lives by building trust, embracing relevance, and communicating better.

If development data is so important, why is it chronically underfinanced?

Michael M. Lokshin's picture

Few will argue against the idea that data is essential for the design of effective policies. Every international development organization emphasizes the importance of data for development. Nevertheless, raising funds for data-related activities remains a major challenge for development practitioners, particularly for research on techniques for data collection and the development of methodologies to produce quality data.

If we focus on the many challenges of raising funds for microdata collected through surveys, three reasons stand out in particular: the spectrum of difficulties associated with data quality; the problem of quantifying the value of data; and the (un-fun) reality that data is an intermediate input.

Data quality

First things first – survey data quality is hard to define and even harder to measure. Every survey collects new information; it’s often prohibitively expensive to validate this information and so it’s rarely done. The quality of survey data is most often evaluated based on how closely the survey protocol was followed.

The concept of Total Survey Error sets out a universe of factors which condition the likelihood of survey errors (Weisberg 2005). These conditioning factors include, among many other things: how well the interviewers are trained; whether the questionnaire was tested and piloted, and to what degree; and whether the interviewers’ individual profiles could affect the respondents’ answers. Measuring some of these indicators precisely is effectively impossible, since most of them are subjective by nature. It may be even harder to separate the individual effects of these components in the total survey error.

Imagine you are approached with a proposal to conduct a cognitive analysis of your questionnaire. Take the question “How often were you bothered by the pain in the stomach over the last year?” A cognitive psychologist will tell you that this is a badly formulated question: the definition of “stomach” varies drastically among respondents, and “last year” could be interpreted as the last calendar year, the 12 months back from now, or the period from January 1st until now. As one respondent said: “It hurt like hell, but it did not bother me, I am a Marine...” (from a seminar by Gordon Willis)

Beyond Proof of Concept: do we have the right structure to take disruptive technologies to production?

Michael M. Lokshin's picture
Figure 1: Azure Cognitive Services Algorithm compliments authors’ youthful appearances

“Every company is a technology company”. This idea, popularized by Gartner, can be seen unfolding in every sector of the economy as firms and governments adopt increasingly sophisticated technologies to achieve their goals. The development sector is no exception, and like others, we’re learning a lot about what it takes to apply new technologies to our work at scale.

Last week we published a blog about our experience in using Machine Learning (ML) to reduce the cost of survey data collection. This exercise highlighted some challenges that teams working on innovative projects might face in turning their ideas into useful implementations. In this post, we argue that:

  1. Disruptive technologies can make things look easy. The cost of experimentation, especially in the software domain, is often low. But quickly developed prototypes belie the complexity of creating robust systems that work at scale. There’s a lot more investment needed to get a prototype into production than you’d think.

  2. Organizations should monitor and invest in many proofs of concept because they can relatively inexpensively learn about their potential, quickly kill the ones that aren’t going anywhere, and identify the narrower pool of promising approaches to continue monitoring and investing resources in.

  3. But organizations should also recognize that the skills needed to make a proof of concept are very different to the skills needed to scale an idea to production. Without a structure or environment to support promising initiatives, even the best projects will die. And without an appetite for long-term investment, applications of disruptive technologies in international development will not reach any meaningful level of scale or usefulness.

The 2018 Atlas of Sustainable Development Goals: an all-new visual guide to data and development

World Bank Data Team's picture

“The World Bank is one of the world’s largest producers of development data and research. But our responsibility does not stop with making these global public goods available; we need to make them understandable to a general audience.

When both the public and policy makers share an evidence-based view of the world, real advances in social and economic development, such as achieving the Sustainable Development Goals (SDGs), become possible.” - Shanta Devarajan

We’re pleased to release the 2018 Atlas of Sustainable Development Goals. With over 180 maps and charts, the new publication shows the progress societies are making towards the 17 SDGs.

It’s filled with annotated data visualizations, which can be reproducibly built from source code and data. You can view the SDG Atlas online, download the PDF publication (30Mb), and access the data and source code behind the figures.

This Atlas would not be possible without the efforts of statisticians and data scientists working in national and international agencies around the world. It is produced in collaboration with the professionals across the World Bank’s data and research groups, and our sectoral global practices.
 

Trends and analysis for the 17 SDGs

Event: 50 Years of Measuring World Economies – Wednesday May 23, 2018 at 4pm EST

Nada Hamadeh's picture
Join us live online or in person on Wednesday at 4pm for the "50 Years of Measuring World Economies" event, held at the World Bank’s James D. Wolfensohn Atrium in Washington, DC.
 
The International Comparison Program (ICP) – the world’s largest global data initiative led by the World Bank under the auspices of the United Nations Statistical Commission – is celebrating its 50th anniversary this year. Since the initiation of the ICP as a modest research project at the University of Pennsylvania by Irving Kravis, Alan Heston and Robert Summers in 1968, the Program has grown to cover about 200 countries and 20 global, regional and sub-regional agencies.
 

To commemorate this milestone, World Bank Group Chief Executive Officer Kristalina Georgieva, 2015 Nobel Laureate in economics Sir Angus Deaton, and Georgetown University Provost Robert M. Groves will come together at an event to discuss the challenges and opportunities for investing in evidence for sustainable development. In addition, Lawrence H. Summers, the 71st Secretary of the US Treasury and son of ICP co-founder Robert Summers, will share a recorded tribute. A video produced by the World Bank for the occasion will showcase the history and impact of the ICP.

Survey specialists and data scientists meet: machine learning to measure a person’s height from a picture.

Michael M. Lokshin's picture
A test subject holding a reference image, and a silhouette derived from the photo by the TensorFlow/DeepLab semantic image segmentation model.

Human body measurements are used to evaluate health trends in various populations. We wanted a simple way to reliably measure someone’s height as part of a field interview, using a photo of them holding a reference object. We’ve developed an approach and would highlight two things we learned during the process:

  • With an iteratively refined method, it’s possible to get a measure of someone’s height accurate to 1% from a well-composed image of them holding a calibrated paper printout. We plan to integrate this functionality into the free World Bank Survey Solutions CAPI tool.

  • We found that working with an in-house team of survey specialists and data scientists was the best way to tackle this problem. Only when we combined our domain knowledge and field experience with our data science skills and a healthy dose of creative problem solving were we able to develop a working prototype.
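
The arithmetic behind the reference-object idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's actual method: it assumes a segmentation step has already produced the person's height and the printout's height in pixels, that both lie in roughly the same plane facing the camera, and that lens distortion and perspective can be ignored. The function names and the example numbers are hypothetical.

```python
def estimate_height_cm(person_px: float, reference_px: float, reference_cm: float) -> float:
    """Scale a pixel height to centimetres using a reference object of known size.

    person_px:    person's height in the image, in pixels (hypothetical input)
    reference_px: reference printout's height in the image, in pixels
    reference_cm: true height of the printout, in centimetres
    """
    if person_px <= 0 or reference_px <= 0 or reference_cm <= 0:
        raise ValueError("all measurements must be positive")
    cm_per_pixel = reference_cm / reference_px
    return person_px * cm_per_pixel


def relative_error(estimate_cm: float, true_cm: float) -> float:
    """Absolute error as a fraction of the true height."""
    return abs(estimate_cm - true_cm) / true_cm


if __name__ == "__main__":
    # Made-up numbers: an A4 sheet (29.7 cm tall) spanning 297 pixels.
    height = estimate_height_cm(person_px=1720.0, reference_px=297.0, reference_cm=29.7)
    print(f"estimated height: {height:.1f} cm")
```

Under these assumptions, every centimetre of error in the assumed size of the printout scales directly into the estimate, which is one reason a calibrated printout matters.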

Q1 2018 update of World Development Indicators available

World Bank Data Team's picture

The World Development Indicators database has been updated. This is a regular quarterly update to 1,600 indicators and includes both new indicators and updates to existing indicators. 

This release features updates for national accounts, balance of payments, demography, health, labor market, poverty and shared prosperity, remittances, and tourism series. New estimates are also available for electricity-related indicators from the Global Tracking Framework, adjusted net savings, law and regulation towards gender equality from Women, Business and the Law, ownership of financial accounts from the Global Findex, mobile and internet, and education series.

New indicators include those for health expenditures, value added per worker by sector, sex-disaggregated indicators on the completeness of birth registration, export/import unit value index, population exposed to PM2.5 pollution by interim target level and net ODA provided. For the latest list of additions, deletions, and changes in codes, descriptions, definitions, see here.

To accompany the data, a new online edition of World Development Indicators featuring stories, documentation and discovery tools will be available in May 2018. 

Data can be accessed via various means including:

- The World Bank’s main multi-lingual and mobile-friendly data website, http://data.worldbank.org 
- The DataBank query tool: http://databank.worldbank.org which includes archived versions of WDI
- Bulk download in XLS and CSV formats, and directly from the API
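
For readers who prefer the API route, a request URL can be assembled along these lines. This is only a sketch: the post does not give the endpoint, so the base URL, the path layout, and the query parameters (`format`, `date`, `per_page`) are assumptions based on the public World Bank Indicators API (v2), and the helper name `indicator_url` is hypothetical. The country and indicator codes are examples only.

```python
from urllib.parse import urlencode

# Assumed base endpoint of the World Bank Indicators API; not stated in the post.
BASE_URL = "https://api.worldbank.org/v2"


def indicator_url(country: str, indicator: str, start_year: int, end_year: int,
                  per_page: int = 100) -> str:
    """Build a request URL for one indicator, one country, and a range of years."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    query = urlencode(
        {"format": "json", "date": f"{start_year}:{end_year}", "per_page": per_page},
        safe=":",  # keep the year-range separator readable in the URL
    )
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


if __name__ == "__main__":
    # Example: total population (SP.POP.TOTL) for Kenya (KEN).
    print(indicator_url("KEN", "SP.POP.TOTL", 2010, 2017))
```

The resulting URL can be passed to any HTTP client to retrieve the series; the request itself is left out so that the sketch runs offline.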
 

Why time use data matters for gender equality—and why it’s hard to find

Eliana Rubiano-Matulevich's picture
Photo: © Stephan Gladieu / World Bank

Time use data is increasingly relevant to development policy. This data shows how many minutes or hours individuals devote to activities such as paid work, unpaid work including household chores and childcare, leisure, and self-care activities. It is now recognized that individual wellbeing depends not just on income or consumption, but also on how time is spent. This data can therefore improve our understanding of how people make decisions about time, and expand our knowledge of wellbeing.

Time use data reveals how, partly due to gender norms and roles, men and women spend their time differently. There is an unequal distribution of paid and unpaid work time, with women generally bearing a disproportionately higher responsibility for unpaid work and spending proportionately less time in paid work than men.

How do women and men spend their time?

In a forthcoming paper with Mariana Viollaz (Universidad Nacional de La Plata, Argentina), we analyze gender differences in time use patterns in 19 countries (across 7 regions and at all levels of income). The analysis confirms the 2012 World Development Report findings of daily disparities in paid and unpaid work between women and men.

Can modern technologies facilitate spatial and temporal price analysis?

Marko Rissanen's picture

The International Comparison Program (ICP) team in the World Bank Development Data Group commissioned a pilot data collection study utilizing modern information and communication technologies in 15 countries―Argentina, Bangladesh, Brazil, Cambodia, Colombia, Ghana, Indonesia, Kenya, Malawi, Nigeria, Peru, Philippines, South Africa, Venezuela and Vietnam―from December 2015 to August 2016.

The main aim of the pilot was to study the feasibility of a crowdsourced price data collection approach for a variety of spatial and temporal price studies and other applications. The anticipated benefits of the approach were the openness, accessibility, level of granularity, and timeliness of the collected data and related metadata, traits rarely found in the datasets typically available to policymakers and researchers.

The data was collected through a privately-operated network of paid on-the-ground contributors that had access to a smartphone and a data collection application designed for the pilot. Price collection tasks and related guidance were pushed through the application to specific geographical locations. The contributors carried out the requested collection tasks and submitted price data and related metadata using the application. The contributors were subsequently compensated based on the task location and degree of difficulty.

The collected price data covers 162 tightly specified items for a variety of household goods and services, including food and non-alcoholic beverages; alcoholic beverages and tobacco; clothing and footwear; housing, water, electricity, gas and other fuels; furnishings, household equipment and routine household maintenance; health; transport; communication; recreation and culture; education; restaurants and hotels; and miscellaneous goods and services. The use of common item specifications was aimed at ensuring the quality, as well as the intra- and inter-country comparability, of the collected data.

In total, 1,262,458 price observations were collected during the pilot, ranging from 196,188 observations for Brazil to 14,102 observations for Cambodia. The figure below shows the cumulative number of collected price observations and outlets covered for each pilot country and month.

Figure 1: Cumulative number of price observations collected during the pilot
