external validity

Towards policy irrelevance? Thoughts on the experimental arms race and Chris Blattman’s predictions

David McKenzie's picture

Chris Blattman posted an excellent (and surprisingly viral) post yesterday with the title “why I worry experimental social science is headed in the wrong direction”. I wanted to share my thoughts on his predictions.
He writes:
Take experiments. Every year the technical bar gets raised. Some days my field feels like an arms race to make each experiment more thorough and technically impressive, with more and more attention to formal theories, structural models, pre-analysis plans, and (most recently) multiple hypothesis testing. The list goes on. In part we push because we want to do better work. Plus, how else to get published in the best places and earn the respect of your peers?
It seems to me that all of this is pushing social scientists to produce better quality experiments and more accurate answers. But it’s also raising the size and cost and time of any one experiment.

A Curated List of Our Postings on Technical Topics – Your One-Stop Shop for Methodology

David McKenzie's picture
This is a curated list of our technical postings, to serve as a one-stop shop for your technical reading. I’ve focused here on our posts on methodological issues in impact evaluation – we also have a whole lot of posts on how to conduct surveys and measure certain concepts that I’ll leave for another time. Updated August 20, 2015.
Random Assignment

Getting beyond the mirage of external validity

Markus Goldstein's picture
This post is coauthored with Eliana Carranza
No thoughtful technocrat would copy a program in every detail for a given context in her or his country. That's because they know (among other things) that economics is not a science but a social (or dismal even) science, and so replication in the fashion of chemistry isn't an option. For economics, external validity in the strict scientific sense is a mirage.

What do 600 papers on 20 types of interventions tell us about how much impact evaluations generalize? Guest post by Eva Vivalt

This is the first in our series of posts by students on the job market this year.

Impact evaluations are often used to justify policy, yet there is reason to suspect that the results of a particular intervention will vary across different contexts. The extent to which results vary has been a very contentious question (e.g. Deaton 2010; Bold et al. 2013; Pritchett and Sandefur 2014), and in my job market paper I address it using a large, unique data set of impact evaluation results.
I gathered these data through AidGrade, a non-profit research organization I founded in 2012 that collects data from academic studies in the process of conducting meta-analyses. Data from meta-analyses are the ideal data with which to answer the generalizability question, as they are designed to synthesize the literature on a topic, involving a lengthy search and screening process. The data set currently comprises 20 types of interventions, such as conditional cash transfers (CCTs) and deworming programs, gathered in the same way, double-coded and reconciled by a third coder. There are presently about 600 papers in the database, including both randomized controlled trials and studies using quasi-experimental methods, as well as both published and working papers. Last year, I wrote a blog post for Development Impact based on this data, discussing what isn't reported in impact evaluations.

External validity as seen from other quantitative social sciences - and the gaps in our practice

Jed Friedman's picture
For impact evaluation to inform policy, we need to understand how the intervention will work in the intended population once implemented. However, impact evaluations are not always conducted in a sample representative of the intended population, and sometimes they are not conducted under implementation conditions that would exist at scale-up.

Learn to live without external validity

Berk Ozler's picture
We promised some time ago to review the recent working paper by Pritchett and Sandefur on external validity, and the title of this post is the main take-away for me: my name is Berk Özler and I agree with this specific message. However, while I’d like to say that there is much more here, I am afraid that I, personally, did not find more to write home about...

Questioning the External Validity of Regression Estimates: Why they can be less representative than you think.

David McKenzie's picture
A common critique of many impact evaluations, including those using both experimental and quasi-experimental methods, is that of external validity – how well do findings from one setting export to another? This is especially the case for studies done on relatively small samples, although as I have ranted before, there appears to be a double standard in this critique when compared to both other disciplines in economics and to other development literature.

Why similarity is the wrong concept for External Validity

David McKenzie's picture
I’ve been reading Evidence-based policy: a practical guide to doing it better by Nancy Cartwright and Jeremy Hardie. The book is about how one should go about using existing evidence to move from “it works there” to “it will work here”. I was struck by their critique of external validity as it is typically discussed.

Thinking about the placebo effect as a “meaning response” and the implication for policy evaluation

Jed Friedman's picture

In recent conversations on research, I’ve noticed that we often get confused when discussing the placebo effect. The mere fact of positive change in a control group administered a placebo does not imply a placebo effect – the change could be due to simple regression to the mean.
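
To make the regression-to-the-mean point concrete, here is a minimal simulation sketch. The setup is entirely hypothetical and not taken from the post: each person has a stable underlying score, every measurement adds independent noise, and people are enrolled in the "placebo group" only if their noisy baseline falls below a cutoff. Nothing is administered between the two measurements, so any average improvement comes purely from selection on a noisy baseline. The function name, cutoff, and noise levels are all illustrative assumptions.

```python
import random
import statistics


def simulate_regression_to_mean(n=10_000, cutoff=-1.0, seed=0):
    """Return (mean change, number enrolled) for an untreated group
    selected on a low noisy baseline. All parameters are illustrative."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n):
        true_score = rng.gauss(0.0, 1.0)                 # stable underlying level
        baseline = true_score + rng.gauss(0.0, 1.0)      # noisy first measurement
        followup = true_score + rng.gauss(0.0, 1.0)      # noisy second measurement
        # Enroll only people who looked bad at baseline; no treatment is given.
        if baseline < cutoff:
            changes.append(followup - baseline)
    return statistics.mean(changes), len(changes)


mean_change, n_enrolled = simulate_regression_to_mean()
print(f"enrolled: {n_enrolled}, mean change with no treatment: {mean_change:.2f}")
```

Under these assumptions the enrolled group improves on average even though nothing was done to it, because part of each low baseline reading was bad luck that does not repeat at follow-up. Separating a genuine placebo effect from this artifact would need a comparison with an untreated group selected in the same way.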