Published on Development Impact

A Curated List of Our Postings on Technical Topics – Your One-Stop Shop for Methodology

This is a curated list of our technical postings, to serve as a one-stop shop for your technical reading. I’ve focused here on our posts on methodological issues in impact evaluation – we also have a whole lot of posts on how to conduct surveys and measure certain concepts curated here. It is currently updated up to July 20, 2023.



IEanalytics: introducing the Development Impact Evaluation wiki


Random Assignment

Allocating treatment and control with multiple applications per applicant and ranked choices

Is optimization just re-randomization redux? Thoughts on the "don't randomize, optimize" papers

Be an optimista, not a randomista, when you have small samples

Tips for randomization in the wild: adding a waitlist

How to randomize in the field

Stratified randomization and the FIFA world cup

Doing stratified randomization with uneven numbers in the Strata

How to randomize using many baseline variables

Public randomization ceremonies

Designing experiments to measure spillover effects

Mechanism experiments and opening up the black box

Sample weights and RCT design

Response to the policymaker complaint that randomized experiments take too much time

Have RCTs taken over development economics?

Definitions in RCTs with interference

What does a game-theoretic model with belief-dependent preferences teach us about how to randomize?

Are we over-investing in baselines?

Most good you can do, but for whom? (on spillovers)

How to document that randomization was indeed done randomly

A beginners guide to doing/thinking about doing an RCT/field study in China

Doing a baseline when you have very little time before the program starts or people know their status

You're probably doing it right: experimental design with heterogeneous treatment effects

Why I am now more cautious about using or recommending matched pair randomizations and like matched quadruplets instead

Using randomized experiments to learn about market competition

(How) should I stratify when randomizing at the group level?

Bringing informative priors into your experiment with Bayesian impact evaluation

We randomized, but did we really though?

More on using a waitlist as part of your experimental design


Ethical Issues with randomized experiments and other research

Incorporating participant welfare and ethics into RCTs

Reporting requirements for ethical considerations in economics RCTs

Research with adolescents: issues surrounding consent

Questions around consent in cluster impact evaluations

Clinical Equipoise

Doing experiments with socially good, but privately bad treatments

Taking ethical validity seriously, and response 1, response 2, and response 3

Sometimes it is ethical to lie to your study subjects

The ethics of machine learning

The ethics of a control group in randomized impact evaluations

Hurting people while trying to help 


Pre-analysis plans and reporting

Pre-registration of studies to avoid fishing and allow transparent discovery

A joint test of orthogonality when testing for baseline balance and balance tests after stratification

A pre-analysis plan check-list

The New Trial Registries

What isn’t reported in impact evaluations but maybe should be

Randomization checks: testing for joint orthogonality

An addendum to pre-analysis plans: pre-specifying when not to use data

A pre-analysis plan is the only way to take your p-values seriously

Trouble with pre-analysis plans? try these 3 weird tricks

Should we require balance t-tests on observables with randomized experiments?

Registered reports: piloting a pre-results review process at the JDE

Declaring and diagnosing research designs

Power to the Plan

How good are pre-analysis plans in practice? and lessons for writing/reviewing your next one

Registering studies when all you want is a little more credibility

A saner approach to pre-analysis plans

Pre-analysis plans and registered reports: what the new opinion piece does and does not imply


Propensity Score Matching

Guido Imbens on clustering standard errors with matching

Testing different matching estimators as applied to job training programs

The covariate balanced propensity score

What do you need to do to make matching more convincing? Rhetorical vs statistical checks

The journey of a matching paper



Difference-in-Differences

The often unspoken assumptions behind diff-in-diff

How big data helped us estimate the impact of an intervention with 0.8% take-up

What Are We Estimating When We Estimate Difference-in-Differences?

Difference-in-Differences in practice: the case of mobile phones and farmer productivity

Revisiting the Difference-in-Differences Parallel Trends Assumption Part I: Pre-Trend Testing

Revisiting the Difference-in-Differences Parallel Trends Assumption: Part II what happens when parallel trends are violated?

When your difference-in-differences has too many differences.

An adversarial or "long and squiggly" test of the plausibility of parallel trends in difference-in-differences analysis

DiD you see Beta? Part 1, Part 2

Explaining why we should believe your DiD assumptions

A new synthesis and key lessons from the recent difference-in-differences literature

What to do about parallel trends when you only have baseline data


Regression Discontinuity

Curves in all the wrong places: Gelman and Imbens on why not to use higher-order polynomials in RD

Regression discontinuity with an implicit index

Tools of the trade: the regression kink design

How should you draw an RDD graph?



Bunching

Ready, set, bunch

We got bunching, now what?


Spatial Discontinuity

Using spatial variation in program performance to identify impacts

Spatial Jumps


Event Study Designs

Econometrics Sandbox: Event Study Designs & Co.

Econometrics Sandbox: Randomization Inference for Event Study designs


Synthetic Controls

Evaluating an Argentine tourism policy using synthetic controls: tan linda que enamora?

The synthetic control method, as applied to regulatory reforms

Do I need to “recruit” a control group for my trial?


Instrumental Variables

Rethinking identification under the Bartik/Shift-Share instruments

Judge leniency designs: Now not just for Crime Studies

Just a little Bartik exposure

I'm not a fan of leave-one-out/spatial instruments


Machine learning

How can machine learning and artificial intelligence be used in development interventions and analysis?

The ethics of machine learning


Other Evaluation Methods

Impact as narrative

Small n impact evaluation methods

Can we trust shoestring evaluations?

Using case studies to explore and explain complex interventions

Bringing informative priors into your experiment with Bayesian impact evaluation



Analysis

Another reason to prefer Ancova: dealing with measurement changes between baseline and follow-up

Endogenous stratification: the surprisingly easy way to bias your heterogeneous treatment effects and what to do instead

Why is difference-in-difference estimation still so popular in experimental analysis?

Regression adjustment in randomized experiments (part one, part two)

When to use survey weights in analysis

Adjustments for multiple hypothesis testing

An overview of multiple hypothesis testing in Stata (now superseded by An updated overview of multiple hypothesis testing commands in Stata)

Bounding approaches to deal with attrition

Linear probability models versus probits

Dealing with multiple lotteries

Estimating standard errors with small clusters (part one, part two)

Decomposition methods

Estimation of treatment effects with incomplete compliance

What does Alwyn Young's paper mean for analysis of experiments?

Winsorizing, testing balance, and dealing with incomplete take-up

You ran a field experiment, should you then run a regression?

Sometimes, increasingly, estimating the ITT is not enough in experiments

Finally an easy way to do randomization inference in Stata

When should you cluster standard errors? New wisdom from the econometrics oracle

Your go to regression specification is biased: here is the simple fix

What should you do when your random assignment gets compromised?

Econometrics Sandbox: Attrition strikes back!

Be careful with inference from 2x2 experiments and other cross-cutting designs and What should you do with experiments with factorial designs

Experimental design and Ancova when everyone has the same baseline outcome value

Randomly Drawn Equators? (randomization inference to account for spatial correlation)

Using a Surrogate Index to estimate long-term treatment impacts from a short-term follow-up

If your follow-up survey has attrition, what should you do for your descriptive analysis?

What's new in the analysis of heterogeneous treatment effects

Stratum fixed effects: what kind of covariates are they?

Missing values of baseline covariates in RCTs

Interpreting treatment effects on an inverse hyperbolic sine outcome variable and alternatives

A crowd-sourced checklist of the top 10 little things that drive us crazy with regression output


Power Calculations and Improving Power

Should I work with only a subsample of my control group if I have take-up problems?

Power calculations: what software should I use?

Does the intra-cluster correlation matter for power calculations if I am going to cluster my standard errors?

Power calculations for propensity score matching

Power calculations 101: dealing with incomplete take-up

Collecting more rounds of data to boost power

Improving power in small samples

Did you do your power calculations using standard deviations? Do them again.

Power calculations for regression discontinuity (part 1, part 2, part 3)

Power calculation software for randomized saturation experiments

Statistical power and the funnel of attribution

Should you over-sample compliers if budget is limited and you are concerned about take-up?

Take-up and the Inverse-Square Rule for Power Calculations Revisited: When does power not fall quite so drastically with take-up, and when does lower take-up increase power?

Why ex-post power using estimated effect sizes is bad, but an ex-post MDE is not

Back of the envelope power calcs

Different-sized baskets of fruit: how unequally sized clusters can lead your power calculations and analysis astray

You're probably doing it right: experimental design with heterogeneous treatment effects & when should you assign more units to a study arm?

Design sandbox: power calculations and optimal design for cost effectiveness (part 1, part 2).

Six questions about doing power calculations

Can I pool together data from experiments in multiple countries to improve sample size and power?

Heterogeneity analysis and statistical power in field experiments

Seven ways to improve statistical power without increasing n


On External Validity

Getting beyond the mirage of external validity

All those external validity issues with impacts? They apply to costs too

External validity as seen from other quantitative social sciences and the gaps in our practices

Towards a more systematic approach to external validity: understanding site selection bias

Weighting for external validity

Will that successful intervention over there get results here?

Learn to live without external validity

Why the external validity of regression estimates can be less than you think

Why similarity is the wrong concept for external validity

A rant on the external validity double standard

Towards policy irrelevance: on the experimental arms race

What's wrong in how we do impact evaluation?

What's in a title? Signaling external validity through titles in development economics

A framework for taking evidence from one setting to another

Informing policy with research that is more than the sum of the parts

External Validity Musings

Threats and opportunities in taking promising results to scale


Jargony Terms in Impact Evaluations

A proposed taxonomy of behavioral responses to evaluation

Quantifying the Hawthorne effect

The Hawthorne Effect

The John Henry Effect

Placebo effects

Clinical Equipoise

Social Desirability Bias/Experimenter Demand Effects and more on Experimenter Demand Effects


Stata (and R) Tricks

Generating regression and summary statistics tables in Stata

Graphing impacts with Standard Error Bars

Calculating the intra-cluster correlation

Generating regression and summary statistics tables in Stata: A checklist and code

Stata code for correlated random coefficient models

Finally an easy way to do randomization inference in Stata

IEanalytics: introducing ietoolkit

Five small things I've learned recently

Tools of the trade: Using iemargins to graph impacts with standard error bars (IE Analytics update)

(love Stata but think R may be more your thing? Check out DIME's R training course for advanced Stata users!)

Nice and fast tables in Stata

Visual libraries for Stata and R

iefieldkit to document primary data collection and cleaning in Stata

Making visually appealing maps in Stata

An updated overview of multiple hypothesis testing commands in Stata

Can AI write your Stata code?



Replication

Worm wars: the anthology

Worm wars: a review of the reanalysis of the Miguel and Kremer deworming study

Response to Brown and Wood's response

Brown and Wood's response on "how scientific are scientific replications"

How scientific are scientific replications?

The infinite loop of failure of replication in economics

More replication in economics

A cynic's take on papers with novel methods to improve transparency

What development economists talk about when they talk about reproducibility

New opportunities for replications

Stata linter produces Stata code that sparks joy

How to make user-written Stata commands really reproducible

How to make sure your research is replicable: evidence from 55 papers


Systematic reviews and meta-analysis

How systematic is that systematic review? The case of learning outcomes

How standard is a standard deviation? A cautionary note on using SDs to compare across impact evaluations

Should we give up on SDs for measuring effect size?

What do 600 papers on 20 types of interventions tell us about what types of interventions generalize?

If you want your study included in a systematic review, this is what you should report

How to do meta-analysis using a Bayesian hierarchical model and when does it make sense to do so?

Do financial literacy interventions actually work better than I think they do? (and thoughts about meta-analyses)


Book Reviews for Books on Impact Evaluation

Review of Banerjee and Duflo's Poor Economics (and authors' reply)

Review of Manzi's Uncontrolled

Review of Imbens and Rubin's Causal Inference

Review of Glennerster and Takavarasha's Running Randomized Experiments

Review of Gerber and Green's Field Experiments

Review of Karlan and Appel's Failing in the Field

Review of Ogden's Experiments in Development from Every Angle

Review of Leigh's Randomistas: how radical researchers changed the world

Review of Gugerty and Karlan's The Goldilocks Challenge: Right-fit evidence for the social sector

Review of Luca and Bazerman's The Power of Experiments: Decision-making in a data-driven world

Review of Cunningham's Causal Inference: The Mixtape

Words that resonate from Blattman's Why we Fight

Review of List's The Voltage Effect and List et al's The Scale-up Effect in Early Childhood and Public Policy


Getting Published 

11 tips for making a short presentation based on your research

Gotcha? tips and tricks for the economics seminar

10 journals for publishing a short economics paper

How to publish statistically insignificant results in development

Writing a papers and proceedings paper

Have descriptive development papers been crowded out by impact evaluations?

Make your research known - 10 tools for increasing consumption of your research

What makes a paper "development economics" and making this clear for a general audience

Pre-results Review at the Journal of Development Economics: Lessons learned so far

Experiences so far with the JDE's Short Papers

State of Development Journals 2017

The State of Development Journals 2018

The State of Development Journals 2019: Quality, Acceptance Rates, Review Times, and Open Science

The State of Development Journals 2020: Quality, Acceptance Rates, Review Times, and Multiple Rounds of Revisions

The State of Development Journals 2021: Quality, Acceptance Rates, Review Times, and how much did the pandemic change submissions

The State of Development Journals 2022: Quality, Acceptance Rates, Review Times, and What's new

The State of Development Journals 2023: Quality, Acceptance Rates, Review Times, and What's new

Publishing stats and news from the AEA journals 2023

See also our curated miscellanea for interviews with journal editors




Florence Kondylis

Research Manager, Lead Economist, Development Impact Evaluation

David McKenzie

Lead Economist, Development Research Group, World Bank
