An important policy response by many governments to the socioeconomic impact of the COVID-19 pandemic has been expanding social assistance. This effort faces many hurdles, not least of which is how to expand quickly in contexts where comprehensive and current social registries or other appropriate databases are not available. How to decide who gets the cash?
I was excited to read the new Nature publication of Aiken et al.’s NBER working paper, “Machine learning and phone data can improve targeting of humanitarian aid.” Even before the NBER working paper came out last July, I had heard about this effort with the government of Togo. It is a compelling application of mobile phone data for policy making. It is also a practical application of poverty mapping using satellite data rather than census data (which are often dated or otherwise not available), something I blogged about way back in 2014 in a preview of such maps that were constructed in 2015 for Sierra Leone and Nigeria (sadly neither report is online).
The paper offers a rigorous study of the performance of the mobile phone data approach against two alternatives: geographic targeting (the mobile phone data approach does better) and a hypothetical comprehensive social registry (the mobile phone data approach does worse, though a comprehensive and well-maintained registry is unlikely to exist any time soon). An important caveat is that they cannot compare it to community-based targeting approaches.
I am not going to summarize the paper in this blog. The Nature paper is six pages, though dense and accompanied by annexes with all the details. I leave it to readers to make time for it. Instead, I am offering six ponderings about this impressive work.
1. On the practical
How tricky is it to get the detailed mobile phone data? And more than basic mobile phone data is likely necessary: the study shows that simple mobile phone expenditure data perform worse than the detailed mobile phone data used in the paper. Notably, Togo has only two operators. There would likely be practical challenges to obtaining such data, especially in settings with several providers – challenges which are not discussed in the paper.
2. On ML
How important was machine learning for improving targeting (as the title says)? The paper does not show how much the machine-learning algorithms – whether applied to the mobile phone data or used to construct the poverty map – gain over traditional (and easier-to-implement) linear regressions.
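To make that comparison concrete, here is a minimal sketch of the benchmarking exercise I have in mind: cross-validated performance of a plain linear regression against a gradient-boosting model on the same phone-derived features. Everything here is hypothetical – the file name, the feature columns, and the outcome variable are illustrative, not taken from the paper.

```python
# Hypothetical sketch: does ML actually beat a linear benchmark on the
# same phone-derived features? File and column names are illustrative.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("survey_linked_to_cdr.csv")  # hypothetical linked dataset
X = df[["calls_per_day", "sms_per_day", "topup_amount", "n_contacts"]]
y = df["log_consumption"]

for name, model in [
    ("linear regression", LinearRegression()),
    ("gradient boosting", GradientBoostingRegressor()),
]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.3f}")
```

If the gap between the two numbers turns out to be small, the "machine learning" in the title is doing less of the work than the phone data themselves.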
3. On surveys
The paper makes the case that mobile phone data targeting can compete with fielding special surveys to apply a proxy means test to target households. Though exclusion errors are higher, once you factor in cost and speed, it is likely a cheaper and quicker (if technically more demanding) approach; see the sketch at the end of this section for how an exclusion error is computed. But this does not mean we don't need surveys. The mobile phone data approach relies on a household consumption or income survey – one that includes mobile phone numbers so it can be linked to the mobile phone data. The paper also shows that the success of the mobile phone data approach depends on these survey data being current and representative. They show that using household survey data roughly 18 months out of date can almost wipe out the gains of mobile phone data targeting over geographic targeting based on a poverty map.
By extension, this also means that census data are needed (for the household survey sample design and weights) even if they are not used for poverty mapping or other approaches to first-stage targeting.
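Since exclusion errors are the currency of all these comparisons, here is a minimal simulated sketch of how one is computed – the share of the truly poor whom the method fails to select. The 25% quota and the noise level are illustrative assumptions, not figures from the paper.

```python
# Simulated sketch of computing a targeting exclusion error: among the
# truly poor, the share NOT selected by a noisy proxy of consumption.
import numpy as np

rng = np.random.default_rng(0)
true_consumption = rng.lognormal(0.0, 0.5, size=10_000)
proxy = true_consumption * rng.lognormal(0.0, 0.3, size=10_000)  # noisy estimate

quota = 0.25  # illustrative program quota: target the poorest quarter
poor = true_consumption <= np.quantile(true_consumption, quota)
selected = proxy <= np.quantile(proxy, quota)

exclusion_error = np.mean(poor & ~selected) / np.mean(poor)
print(f"exclusion error: {exclusion_error:.1%}")
```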
4. Hello, anyone there?
They conducted a phone survey in 2020 to assess the performance of the different targeting approaches, drawing a sample of mobile phone numbers from the mobile phone data. The response rate was 35% – that is, the share of selected phone numbers for which someone both answered and completed the survey. This strikes me as surprisingly low, and it raises concerns about benchmarking the performance of the different approaches using this survey as ground truth.
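To see why this worries me, consider a toy simulation (all numbers invented for illustration) in which poorer households are less likely to answer the phone. The nonresponse pattern is purely an assumption, but it shows how a 35% response rate could distort the benchmark itself:

```python
# Invented-numbers simulation: if poorer people are less likely to answer
# a phone survey, a 35% response rate can bias the "ground truth".
import numpy as np

rng = np.random.default_rng(1)
consumption = rng.lognormal(0.0, 0.5, size=100_000)
# Illustrative assumption: above-median households answer more often.
p_respond = np.where(consumption > np.median(consumption), 0.5, 0.2)
responded = rng.random(100_000) < p_respond  # overall rate ~ 35%

poverty_line = np.quantile(consumption, 0.4)  # illustrative poverty line
print(f"response rate:                 {responded.mean():.0%}")
print(f"true poverty rate:             {(consumption < poverty_line).mean():.0%}")
print(f"poverty rate among responders: {(consumption[responded] < poverty_line).mean():.0%}")
```

In this toy setup the survey understates poverty substantially, which would flatter any targeting method benchmarked against it.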
5. Parity by gender… unknown.
The authors note that improved targeting from the mobile phone data approach for the country as a whole might not mean improved targeting for subgroups. This could be the case if the algorithm performs poorly for specific groups – a concern often raised about the use of ML. To explore this possibility, they examine targeting performance across several subgroups, one of which is sex. They conclude that the mobile phone data approach does not have greater exclusion errors for women compared to men. But what they actually assess is parity by sex of the household head. And as I have noted before (this old blog), comparing female- and male-headed households is not the same as comparing outcomes between women and men.
6. Targeting method is not necessarily a main factor driving exclusion.
Table 2 is a great reality check that a robust targeting approach is not the only thing that matters. This table breaks out all the aspects of the program design that can exclude people from being considered for eligibility. As the paper states, “…targeting errors [from different targeting approaches] are an important source of programme exclusion, but that real-world programmes also face structural and environmental constraints to inclusion.” Of course, having a phone is one such aspect (both for this targeting approach and for the transfer itself, which was delivered via mobile payment). But at least in Togo, where 85% of people reside in a household with a phone, this is not the main constraint to inclusion. Among other constraints, successful registration by phone is required for eligibility, and just one third of the population in eligible areas succeeded in registering. This could reflect selection – a rich household might not bother with the hassle of registering. But in a very low-income country, a non-trivial fraction of the two thirds who did not complete registration are likely eligible yet stand no chance of being included.
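A back-of-the-envelope calculation using only the two shares cited above makes the point: because registration presupposes phone access, registration – not phone ownership – sets the ceiling on inclusion, whatever the targeting method.

```python
# Back-of-the-envelope using the two shares cited in the paragraph above.
# Registration (which already presupposes phone access) caps inclusion
# regardless of how accurate the targeting algorithm is.
phone_access = 0.85        # share living in a household with a phone
registration_rate = 1 / 3  # share of the eligible-area population registered

ceiling = registration_rate  # a perfect algorithm can reach no one else
print(f"phone access:     {phone_access:.0%}")
print(f"registered:       {registration_rate:.0%}")
print(f"coverage ceiling: {ceiling:.0%} of people in eligible areas")
```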