

How can machine learning and artificial intelligence be used in development interventions and impact evaluations?

David McKenzie

Last Thursday I attended a conference on AI and Development organized by CEGA, DIME, and the World Bank’s Big Data groups (website, where they will also add video). This followed a World Bank policy research talk last week by Olivier Dupriez on “Machine Learning and the Future of Poverty Prediction” (video, slides). These events highlighted a lot of fast-emerging work, which I thought, given this blog’s focus, I would try to summarize through the lens of thinking about how it might help us in designing development interventions and impact evaluations.

A typical impact evaluation takes a sample S, assigns some of its units a treatment Treat, and is interested in estimating something like:
Y(i,t) = b(i,t)*Treat(i,t) + D’X(i,t) for units i in the sample S
We can think of machine learning and artificial intelligence as possibly affecting every term in this expression:
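As a concrete illustration of the estimating equation above, here is a minimal sketch on simulated data. It assumes a single cross-section, a constant treatment effect b, and one covariate; the data, the helper name `ols`, and all parameter values are made up for illustration and do not come from any study discussed here. It uses only the standard library (a small normal-equations solver) so that it runs without extra packages.

```python
import random

def ols(y, columns):
    """Least-squares coefficients of y on the given columns (normal equations)."""
    k, n = len(columns), len(y)
    # Build the augmented system [X'X | X'y].
    A = [[sum(columns[r][i] * columns[c][i] for i in range(n)) for c in range(k)]
         + [sum(columns[r][i] * y[i] for i in range(n))]
         for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][k] / A[r][r] for r in range(k)]

# Simulated sample S (all values are assumptions for illustration).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]         # one covariate X(i)
treat = [rng.randint(0, 1) for _ in range(n)]   # randomly assigned Treat(i)
true_b = 0.5                                    # constant treatment effect
y = [true_b * t + 1.2 * xi + rng.gauss(0, 0.5) for t, xi in zip(treat, x)]

# Regress Y on an intercept, Treat, and X; the second coefficient estimates b.
intercept, b_hat, d_hat = ols(y, [[1.0] * n, treat, x])
print(f"estimated b = {b_hat:.3f}, estimated D = {d_hat:.3f}")
```

In practice one would use a regression package that also reports standard errors; the point here is only to show which pieces of the expression are data (S, Treat, X, Y) and which are estimated (b, D).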

How hard are they working?

Markus Goldstein
I was at a conference a couple of years ago and a senior colleague, one whom I deeply respect, summarized the conversation as: “our labor data are crap.” I think he meant that we have a general problem when looking at labor productivity (for agriculture in this case): not only are we asking survey respondents for heroic recall of days and tasks, but we also aren’t doing a good job of measuring effort.

Odds are you’re measuring son preference incorrectly

Seema Jayachandran
When investigating son-biased fertility preferences, the Demographic and Health Surveys (DHS) offer the go-to survey questions:
  • If you could go back to the time you did not have any children and could choose exactly the number of children to have in your whole life, how many would that be?
  • How many of these children would you like to be boys, how many would you like to be girls, and for how many would it not matter if it’s a boy or a girl?

Building Grit in the Classroom and Measuring Changes in it

David McKenzie

About a year ago I reviewed Angela Duckworth’s book on grit. At the time I noted that there were compelling ideas, but that two big issues were that her self-assessed 10-item Grit scale could be very gameable, and that there was really limited rigorous evidence as to whether efforts to improve grit have lasting impacts.

A cool new paper by Sule Alan, Teodora Boneva, and Seda Ertac makes excellent progress on both fronts. They conduct a large-scale experiment in Turkey with almost 3000 fourth-graders (8-10 year olds) in over 100 classrooms in 52 schools (randomization was at the school level, with 23 schools assigned to treatment).
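To make the design concrete, school-level (cluster) randomization of this kind can be sketched as follows. This is an illustrative sketch only: the school IDs, the seed, the two-classrooms-per-school roster, and the use of a simple unstratified draw are all assumptions, not details of how the authors actually randomized.

```python
import random

def assign_clusters(cluster_ids, n_treated, seed):
    """Randomly pick n_treated clusters for treatment; returns {cluster_id: 0 or 1}."""
    ids = sorted(set(cluster_ids))      # fixed order so the draw is reproducible
    rng = random.Random(seed)
    treated = set(rng.sample(ids, n_treated))
    return {cid: int(cid in treated) for cid in ids}

# Hypothetical IDs for the 52 schools; 23 are assigned to treatment.
schools = [f"school_{i:02d}" for i in range(1, 53)]
assignment = assign_clusters(schools, n_treated=23, seed=2017)

# Hypothetical roster: every classroom inherits its school's assignment,
# so treatment never varies within a school.
classrooms = [(f"{s}_class_{j}", s) for s in schools for j in (1, 2)]
classroom_treat = {c: assignment[s] for c, s in classrooms}

print(sum(assignment.values()), "of", len(assignment), "schools treated")
```

Because assignment varies only across schools, any analysis of classroom- or student-level outcomes would need to account for clustering at the school level.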

List Experiments for Sensitive Questions – a Methods Bleg

Berk Ozler

About a year ago, I wrote a blog post on issues surrounding data collection and measurement. In it, I talked about “list experiments” for sensitive questions, about which I was not sold at the time. However, now that I have a bunch of studies going to the field at different stages of data collection, many of which are about sensitive topics in adolescent female target populations, I am paying closer attention to them. In my reading and thinking about the topic and how to implement it in our surveys, I came up with a bunch of questions surrounding the optimal implementation of these methods. In addition, there is probably more to be learned on these methods to improve them further, opening up the possibility of experimenting with them when we can. Below are a bunch of things that I am thinking about and, as we still have some time before our data collection tools are finalized, you, our readers, have a chance to help shape them with your comments and feedback.
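For readers unfamiliar with the method, the standard list-experiment design can be sketched as follows: a control group is read a list of non-sensitive items, a treatment group is read the same list plus the sensitive item, each respondent reports only how many items apply, and the difference in mean counts estimates the prevalence of the sensitive item. The counts below are made up purely for illustration, and the function name is hypothetical; this is the textbook difference-in-means estimator, not any particular implementation discussed in the post.

```python
from math import sqrt
from statistics import mean, variance

def list_experiment_estimate(control_counts, treatment_counts):
    """Difference in mean item counts and its standard error.

    control_counts:   counts from respondents shown only the non-sensitive items
    treatment_counts: counts from respondents shown those items plus the sensitive one
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    se = sqrt(variance(treatment_counts) / len(treatment_counts)
              + variance(control_counts) / len(control_counts))
    return estimate, se

# Made-up responses (number of items each respondent says apply to them).
control = [1, 2, 2, 3, 1, 2, 0, 2]
treatment = [2, 2, 3, 3, 1, 3, 1, 2]

prevalence, se = list_experiment_estimate(control, treatment)
print(f"estimated prevalence of sensitive item: {prevalence:.3f} (SE {se:.3f})")
```

The standard error is typically large relative to a direct question with the same sample size, since the non-sensitive items add noise to the counts.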

Skills and agricultural productivity

Markus Goldstein
Do skills matter for agricultural productivity? Rachid Laajaj and Karen Macours have a fascinating new paper out which looks at this question. The paper is fundamentally about how to measure skills better, and they put a serious amount of work into that. But for those of you dying to know the answer – skills do matter, with cognitive, noncognitive, and technical skills explaining about 12.1 to 16.6 percent of the variation in yields. Before we delve into that

Towards a survey methodology methodology: Guest post by Andrew Dillon

When I was a graduate student and setting off on my first data collection project, my advisors pointed me to the ‘Blue Books’ to provide advice on how to make survey design choices. The Glewwe and Grosh volumes are still an incredibly useful resource on multi-topic household survey design. Since the publication of these volumes, the rise of panel data collection, increasingly in the form of randomized control trials, has prompted a discussion abo