Endogenous stratification: the surprisingly easy way to bias your heterogeneous treatment effect results and what you should do instead

A common question of interest in evaluations is “which groups does the treatment work for best?” A standard way to address this is to look at heterogeneity in treatment effects with respect to baseline characteristics. However, there are often many such possible baseline characteristics to look at, and really the heterogeneity of interest may be with respect to outcomes in the absence of treatment. Consider two examples:
A: A vocational training program for the unemployed: we might want to know if the treatment helps more those who were likely to stay unemployed in the absence of an intervention compared to those who would have been likely to find a job anyway.
B: Smaller class sizes: we might want to know if the treatment helps more those students whose test scores would have been low in the absence of smaller classes, compared to those students who were likely to get high test scores anyway.
How to overcome the (almost insurmountable) task of tracking poverty trends without good consumption data?

Just imagine a scenario where your counterpart, the Minister of Economic Development in country X, is soon to present the latest poverty trends to Congress, at a hearing on the country's next 5-year (or 10-year) economic development plan. As a development practitioner, you are tasked with supporting the minister with the technical analysis, despite the notorious challenge that the most recent round of household survey data is not comparable to earlier rounds because of various changes in survey design.
Generating Regression and Summary Statistics Tables in Stata: A checklist and code

As a research assistant working for David, I've had to create many, many regression and summary statistics tables. Just the other day, I sent David a draft of some tables for a paper that we are working on. After re-reading the draft, I realized that I had forgotten to label dependent variables and add joint significance tests in a couple of regression tables. To avoid forgetting these details in the future, and to potentially help other researchers, I thought I'd post a checklist for generating regression and summary statistics tables.
Tools of the Trade: Graphing Impacts with Standard Error Bars
This week I finally got around to learning how to make a graph that displays the means of different treatment groups for a range of outcomes, along with standard error bars to show whether there is a significant difference between groups. Here is an example:
Tools of the Trade: Intra-cluster correlations

In clustered randomized experiments, random assignment occurs at the group level, with multiple units observed within each group. For example, education interventions might be assigned at the school level, with outcomes measured at the student level, or microfinance interventions might be assigned at the savings group level, with outcomes measured for individual clients.
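
To make the school example concrete, here is a minimal sketch of how an intra-cluster correlation (ICC) might be estimated from student outcomes grouped by school. It uses the standard one-way ANOVA estimator with an adjustment for unequal cluster sizes; this is one common estimator, not necessarily the one the post goes on to use. The function names `anova_icc` and `design_effect` and the test scores below are hypothetical, chosen only for illustration.

```python
from statistics import fmean


def anova_icc(clusters):
    """One-way ANOVA estimate of the intra-cluster correlation.

    `clusters` maps a cluster label (e.g. a school) to a list of
    unit-level outcomes (e.g. student test scores).
    """
    sizes, means = [], []
    ss_within = 0.0
    for values in clusters.values():
        if not values:
            continue  # skip empty clusters
        m = fmean(values)
        sizes.append(len(values))
        means.append(m)
        ss_within += sum((y - m) ** 2 for y in values)

    k = len(sizes)
    n_total = sum(sizes)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 clusters and more units than clusters")

    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted average cluster size; equals the common size when balanced.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    if denom == 0:
        raise ValueError("outcomes show no variation")
    return (ms_between - ms_within) / denom


def design_effect(icc, cluster_size):
    """Variance inflation from clustering with equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


if __name__ == "__main__":
    # Hypothetical test scores for three schools (illustrative data only).
    scores = {
        "school_1": [52, 55, 49, 58],
        "school_2": [61, 66, 63],
        "school_3": [45, 47, 50, 44, 48],
    }
    icc = anova_icc(scores)
    print(f"estimated ICC: {icc:.3f}")
    print(f"design effect at 20 students per school: {design_effect(icc, 20):.2f}")
```

The design effect shows why the ICC matters for clustered designs: even a modest correlation among students in the same school can substantially inflate the variance of the estimated treatment effect relative to individual-level randomization.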