Published on Data Blog

Reaching displaced populations in the Democratic Republic of Congo using spatial sampling

Aerial view of Beni, North Kivu region, Democratic Republic of Congo. Photo: World Bank / Vincent Tremeau

Sample surveys remain the main instrument for gathering reliable information on the socio-economic conditions of households in most low- and middle-income countries. Each survey sample is drawn from a sampling frame for the target population of interest. However, the sampling frames available in these contexts often struggle to meet the basic criteria of being current, comprehensive, and informative on the variable(s) of interest.

This problem is amplified when target populations escape conventional sampling frames, as is often the case with displaced populations in countries like the Democratic Republic of Congo (DRC), which has the largest internally displaced population in Africa, according to the Office of the United Nations High Commissioner for Refugees (UNHCR). Systematic data collection there is challenging, since existing sampling frames date back to the last population census, held 40 years ago.

Area-based sampling frames, built from publicly available geospatial resources, are a potential solution to this problem. However, user-friendly tools have not been widely available to construct and leverage area-based sampling frames.


Innovative approach to tackle challenges 

In response, at the World Bank we developed a novel spatial sampling methodology and user-friendly tools that: 

  • Leverage Open Buildings—an open dataset that outlines buildings using satellite imagery; 
  • Construct a spatial grid sampling frame; 
  • Generate a (random) sample of grid cells from this frame, as well as the grid cell boundaries and geographic coordinates for navigation and orientation purposes; 
  • Integrate the sample with the World Bank’s Survey Solutions Computer-Assisted Personal Interview (CAPI) data collection platform.
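
The frame-building and sampling steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the World Bank tools themselves: the function names, the 100-metre cell size, the synthetic building points, and the use of simple random sampling without replacement are all assumptions made for the example. The post does not describe the selection scheme the actual tools use.

```python
import math
import random
from collections import Counter


def build_grid_frame(building_points, cell_size):
    """Count buildings per grid cell; cells with no buildings stay out of the frame."""
    counts = Counter()
    for x, y in building_points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return dict(counts)


def cell_bounds(cell, cell_size):
    """Return (xmin, ymin, xmax, ymax) of a grid cell, for navigation maps."""
    col, row = cell
    return (col * cell_size, row * cell_size,
            (col + 1) * cell_size, (row + 1) * cell_size)


def cell_centroid(cell, cell_size):
    """Return the centre point of a grid cell, for orientation in the field."""
    xmin, ymin, xmax, ymax = cell_bounds(cell, cell_size)
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def draw_sample(frame, n_cells, seed):
    """Simple random sample of cells without replacement (assumed scheme)."""
    rng = random.Random(seed)
    return rng.sample(sorted(frame), n_cells)


# Synthetic building centroids in projected metres (illustrative data only).
point_rng = random.Random(0)
buildings = [(point_rng.uniform(0, 1000), point_rng.uniform(0, 1000))
             for _ in range(500)]

frame = build_grid_frame(buildings, cell_size=100)
sample = draw_sample(frame, n_cells=5, seed=42)

for cell in sample:
    print(cell, frame[cell], cell_bounds(cell, 100), cell_centroid(cell, 100))
```

Fixing the seed makes the draw reproducible, so the same sample can be regenerated and audited later. A real frame would use building footprints from Open Buildings in place of the synthetic points.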

These tools were in turn used by the DRC’s National Statistics Institute (Institut National de la Statistique, or INS, in French) to conduct the 2022 Socio-Economic Survey in the Grand Kasaï region, in southern DRC. This survey captured internally displaced persons, returnees, repatriated refugees, and their host communities. It was supported by the World Bank Development Data Group, UNHCR and the World Bank-UNHCR Joint Data Center on Forced Displacement (JDC).

The two main tools used in the DRC were the Spatial Sampling Application and the Grid Sample Replacement Application. The former was used to generate the spatial grid frame and draw the required sample, while the latter was developed to generate replacement grid cells when those selected in the original sample were ineligible for the survey, mainly due to inaccessibility or lack of residential structures. 
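
The replacement step can be illustrated with a short sketch. It is a hypothetical example rather than the logic of the Grid Sample Replacement Application: the function name, the toy 10 x 10 frame, and the rule of drawing replacements at random from cells not yet sampled are assumptions, since the post does not specify how replacements are chosen.

```python
import random


def draw_replacements(frame_cells, sampled, ineligible, seed):
    """Map each ineligible cell to a replacement drawn at random from unused cells."""
    rng = random.Random(seed)
    # Cells already in the sample (eligible or not) cannot serve as replacements.
    pool = sorted(set(frame_cells) - set(sampled))
    if len(ineligible) > len(pool):
        raise ValueError("not enough unused cells left in the frame")
    picks = rng.sample(pool, len(ineligible))
    return dict(zip(ineligible, picks))


# Toy frame of 100 grid cells, identified by (column, row).
frame_cells = [(col, row) for col in range(10) for row in range(10)]
sampled = [(0, 0), (3, 4), (7, 2), (9, 9)]

# Cells found in the field to be inaccessible or without residential structures.
ineligible = [(3, 4), (9, 9)]

replacements = draw_replacements(frame_cells, sampled, ineligible, seed=1)
final_sample = [replacements.get(cell, cell) for cell in sampled]
print(final_sample)
```

Keeping the mapping from each dropped cell to its replacement preserves a record of why the final sample differs from the original draw.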

Figure 1. Spatial Sampling Tool


Figure 2. Grid Cell Replacement Tool


Across the 137 grid cells and 27 health zones selected for the survey, the household listing operations identified 35,859 households and 27,859 homes. From the resulting listing, 4,079 households were randomly selected and interviewed for the main survey. 

The tools accurately listed households within the designated areas and facilitated easy identification of sampled dwellings and households during the main survey. This was achieved using preloaded information from the listing survey, including GPS coordinates, dwelling descriptions, and phone numbers. 

One of the main challenges experienced during survey implementation was poor and unstable internet connection. With that in mind, tools and protocols were further refined to enable the use of offline maps and data synchronization via Wi-Fi or Bluetooth connections.

Figure 3. Training and pretest pictures





Proof of concept  

These user-friendly tools, together with a widespread use of spatial resources, can help mitigate challenges in the field and strongly contribute to better listing, as well as better final survey data. The development and practical application of the grid sampling and replacement tools in the DRC represent a proof of concept that confirms the robustness of the methodology as a whole.

Implementing a household survey with a spatial grid-based area frame, carried out by a national statistical office team using the Survey Solutions platform, shows that these tools can be scaled to similar data-deprived contexts. We also built capacity at the DRC INS and fostered partnerships among key stakeholders, helping to ensure that the data collected on displaced populations is relevant for policymaking and operations.

Finally, because the tools are free and open source, other users can access them, which can help improve data quality and keep sampling frames up to date.

Michael Wild

Senior Statistician, LSMS, World Bank

Harriet Kasidi Mugera

Senior Data Scientist, Development Economics Data Group, World Bank

Sergiy Radyakin

Senior Economist, Development Data Group

Victor Daye

Survey Specialist, Development Data Group, World Bank
