Improved targeting for mobile phone surveys: A public-private data collaboration: Guest post by Kristen Himelein and Lorna McPherson


Mobile phone surveys have been rapidly deployed by the World Bank to measure the impact of COVID-19 in nearly 100 countries across the world. Previous posts on this blog have discussed the sampling and implementation challenges associated with these efforts, and coverage errors are inherent to the approach. The survey methodology literature has shown that mobile phone survey respondents in the poorest countries are more likely to be male, urban, wealthier, and more highly educated. This bias can stem from phone ownership, as mobile phone surveys are at best representative of mobile phone owners, a group which, particularly in poor countries, may differ from the overall population; or from differential response rates among these owners, with some groups more or less likely to respond to a call from an unknown number. In this post, we share our experiences in trying to improve representativeness and boost sample sizes for the poor in Papua New Guinea (PNG).

A challenging context

PNG is a challenging context by any measure. It is the largest country in the Pacific region and one of the most culturally and linguistically diverse in the world. The overwhelmingly rural population of over 8 million people is spread across 600 islands, with geography ranging from small tropical atoll islands and coastal regions to the rugged highlands and low-lying valleys of the Highlands Region. Data are scarce, and PNG is one of only a few countries outside of sub-Saharan Africa to be classified by the World Bank as being in “extreme data deprivation.” The country’s poverty rate was last measured (more than 10 years ago) at around 40 percent of the population, and there is a history of instability and violence in many areas. Mobile phone ownership is rare: the most recent national household survey, the 2016-2018 Demographic and Health Survey (DHS), found that only 34 percent of adults between 15 and 49 owned a phone, substantially below even the low-income country average of 55 percent in the overall population.

Initial approach

Digicel PNG was the implementing partner for the first and second rounds of these surveys. The benefits of working with Digicel include its widespread national coverage, including in rural and remote areas, and its dominant market share. Both reduce the potential coverage problems that arise when respondents are outside the range of the cell network or have an account with another provider. The first round of the survey was implemented in late June 2020 and covered 3,115 households. The sampling strategy was a random digit dial approach with province-level stratification based on the last tower ping of the user. To encourage response in the first round, a text message blast was sent to all potential respondents informing them of the purpose and timing of the survey, and letting them know that they would receive a 3 PGK credit for participating. It was hoped that the monetary incentive would increase connection rates, and ultimately response rates, particularly in poorer households. The system for calling random numbers only connected to an interviewer once a live respondent answered, and to achieve the 3,115 sample size, 45,747 unique calls were made. Of completed calls, 33 respondents were excluded for being under the minimum age of 18 and 6 refused.
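To put the calling effort in perspective, the figures above can be turned into a rough yield calculation. This is a back-of-the-envelope sketch, not an official response rate. It assumes (the post does not say) that the 33 age exclusions and 6 refusals are counted in addition to the 3,115 completed interviews.

```python
# Figures reported in the post for round 1.
unique_calls = 45_747
completed_interviews = 3_115
excluded_under_18 = 33
refusals = 6

# Share of dialed numbers that ended in a completed interview.
completion_yield = completed_interviews / unique_calls

# Average number of numbers dialed per completed interview.
calls_per_complete = unique_calls / completed_interviews

# ASSUMPTION: exclusions and refusals are separate from the 3,115 completes,
# so respondents who reached an interviewer = completes + exclusions + refusals.
reached_interviewer = completed_interviews + excluded_under_18 + refusals
refusal_share = refusals / reached_interviewer

print(f"Completion yield: {completion_yield:.1%}")
print(f"Calls per completed interview: {calls_per_complete:.1f}")
print(f"Refusals among those reaching an interviewer: {refusal_share:.2%}")
```

Under that assumption, roughly one dialed number in fifteen produced a usable interview, while refusals among people who actually spoke to an interviewer were negligible. Most of the loss therefore came from numbers that never connected to a live respondent.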

While successful, the round 1 results were, as we had feared, heavily skewed towards the top of the wealth distribution (as measured using the methodology from the DHS). The survey achieved only 10 percent of the sample in the bottom four deciles of the wealth index, compared with 48 percent in the top two deciles. We were also reasonably sure that the differences reflected bias in the respondents rather than changing conditions on the ground since the DHS. First, about half of the items included in the index are based on relatively stable dwelling characteristics. Furthermore, analysis done in the companion project in neighboring Solomon Islands showed almost no movement in the asset index between 2012 and 2015, but a similar skew in the mobile phone results. We also checked during the pilot to see if the crisis was leading to households selling assets, but respondents reported almost no change in ownership of the included assets since the pre-crisis period.

Adjustments for round 2

Implementation of the second round of surveys began in early December 2020, and we immediately noticed a problem with attrition. Following requirements in PNG, SIM cards that are not registered with a valid ID within six months are disconnected. As such, many Papua New Guineans, particularly those in the poorest and most remote communities, replace their SIM cards relatively frequently. Approximately two-thirds of respondents from the first round could not be reached in the second.

To maintain the required sample size between the two rounds, we knew we would have to add households. While unfortunate from a panel data perspective, this gave us the opportunity to shift the methodology to address some of the issues with the skew. Replacement households were selected using the same geographic stratification, but with a new targeting approach, this time based on subscriber characteristics from the Digicel database. To proxy poorer households, the team targeted subscribers who did not send text messages, on the assumption they were less likely to be literate. Similarly, subscribers who received only incoming calls, or for whom the majority of credit was not purchased but transferred from other subscribers, were thought more likely to be poor. These subscriber characteristics had previously been used only for Digicel’s marketing campaigns and to target corporate social responsibility projects, so we had little sense of how successful this application would be. But it worked.
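The targeting logic described above can be sketched as a simple filter over subscriber records. This is an illustration only. The field names are hypothetical, since the post does not describe Digicel's database schema, and combining the three proxies with a logical OR is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    # All field names are hypothetical; the real schema is not described in the post.
    subscriber_id: str
    sms_sent: int
    outgoing_calls: int
    incoming_calls: int
    credit_purchased: float
    credit_transferred_in: float


def sends_no_texts(s: Subscriber) -> bool:
    # Proxy 1: no outgoing text messages (assumed to signal lower literacy).
    return s.sms_sent == 0


def receives_calls_only(s: Subscriber) -> bool:
    # Proxy 2: receives calls but never places any.
    return s.outgoing_calls == 0 and s.incoming_calls > 0


def mostly_transferred_credit(s: Subscriber) -> bool:
    # Proxy 3: more than half of all credit came from other subscribers.
    total = s.credit_purchased + s.credit_transferred_in
    return total > 0 and s.credit_transferred_in > total / 2


def is_likely_poor(s: Subscriber) -> bool:
    # ASSUMPTION: meeting any one proxy is enough to be targeted.
    return sends_no_texts(s) or receives_calls_only(s) or mostly_transferred_credit(s)


# Made-up records for illustration only.
subscribers = [
    Subscriber("A", 40, 25, 30, 50.0, 0.0),  # active user, buys own credit
    Subscriber("B", 0, 12, 9, 20.0, 0.0),    # never sends texts
    Subscriber("C", 5, 0, 14, 10.0, 0.0),    # incoming calls only
    Subscriber("D", 3, 6, 8, 2.0, 9.0),      # credit mostly transferred in
]

targeted_ids = [s.subscriber_id for s in subscribers if is_likely_poor(s)]
print(targeted_ids)
```

In practice a filter like this would be applied within each province stratum, so that the geographic design of the sample is preserved while the draw is tilted towards subscribers more likely to be poor.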

Results

In the first round, the mean unweighted wealth index score was 0.33 compared to the weighted DHS mean of 0.08. For those households re-interviewed in round 2, the mean score was 0.34, compared to a mean score of 0.22 for the newly targeted households. The share in the bottom four deciles increased to over 15 percent and the share in the top two deciles dropped below 40 percent, despite the high share of returning households in the latter group. The figure below shows the cumulative distribution function of the wealth index observed in the DHS and the two rounds of the mobile phone survey. Though the DHS distribution (green line) still clearly lies well to the left of either of the mobile phone rounds, there is a marked shift from the round 1 results (red line) to the round 2 results (blue line).
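For readers who want to reproduce this kind of comparison, the sketch below shows one way to compute the share of a phone sample falling in the bottom four and top two deciles of a reference distribution. The scores are toy numbers, not survey data, and the function names are ours. A real analysis would use the DHS wealth index scores and sampling weights.

```python
from bisect import bisect_right


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running / total >= q:
            return value
    return pairs[-1][0]


def ecdf(sample, x):
    """Empirical CDF: share of the sample with a value at or below x."""
    ordered = sorted(sample)
    return bisect_right(ordered, x) / len(ordered)


# Toy reference distribution standing in for the weighted DHS wealth index.
reference_scores = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.1, 0.3, 0.6, 0.9, 1.4]
reference_weights = [1.0] * len(reference_scores)

# Toy phone-survey scores, skewed towards wealthier households.
phone_scores = [-0.6, -0.2, 0.2, 0.4, 0.5, 0.7, 0.8, 1.0, 1.1, 1.5]

# Decile cutoffs are taken from the reference distribution.
p40 = weighted_quantile(reference_scores, reference_weights, 0.40)
p80 = weighted_quantile(reference_scores, reference_weights, 0.80)

share_bottom_four = ecdf(phone_scores, p40)
share_top_two = 1.0 - ecdf(phone_scores, p80)

print(f"Share in bottom four deciles: {share_bottom_four:.0%}")
print(f"Share in top two deciles: {share_top_two:.0%}")
```

A perfectly representative sample would put 40 percent of respondents below the first cutoff and 20 percent above the second. The gap between those benchmarks and the observed shares is a simple summary of the skew.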

Cumulative distribution functions of the wealth indices across surveys

Future implications

While certainly good news for the current and future rounds of the COVID-19 household phone surveys, both in PNG and in many other countries, these results have broader implications. Now that there is a demonstrated relationship between the subscriber characteristics and the wealth index, it is possible to better understand and further model these relationships to target the lower deciles of the distribution more precisely. In addition, spending and phone use patterns are highly disaggregated in both time and geography. Understanding the linkages to traditional development data, and potentially using machine learning techniques to model those relationships, opens the possibility of creating dynamic poverty maps that are representative down to the district level and could be updated in near-real-time. This prospect would be exciting in any context but holds particular potential for a data-deprived country like PNG.

To paraphrase Humphrey Bogart: this is the beginning of a beautiful collaboration.

 

Kristen Himelein is a Senior Economist / Statistician at the World Bank Group's Poverty & Equity Global Practice.

Lorna McPherson is Senior Vice President and Chief Sales Officer at Digicel PNG.

 
