

Weekly links July 28: overpaid teachers? Should we use p=0.005? beyond mean impacts, facilitating investment in Ethiopia, and more…

David McKenzie
  • Well-known blog skeptic Jishnu Das continues to blog at Future Development, arguing that higher wages will not lead to better quality or more effective teachers in many developing countries – summarizing evidence from several countries that i) doubling teacher wages had no impact on performance; ii) temporary teachers paid less than permanent teachers do just as well; and iii) observed teacher characteristics explain little of the differences in teacher effectiveness.
  • Are we now all doomed never to find significance? In a paper in Nature Human Behaviour, a multi-disciplinary list of 72 authors (including economists Colin Camerer, Ernst Fehr, Guido Imbens, David Laibson, John List and Jon Zinman) argues for redefining the statistical significance threshold for the discovery of new effects from 0.05 to 0.005. They suggest that results with p-values between 0.005 and 0.05 instead be described as “suggestive”. They claim that for a wide range of statistical tests this would require an increase in sample size of around 70%, but would of course reduce the incidence of false positives. Playing around with power calculations, it seems that studies powered at 80% for an alpha of 0.05 have about 50% power for an alpha of 0.005. It also implies using a t-stat cutoff of 2.81 instead of 1.96. Then of course if you want to further adjust for multiple hypothesis testing…
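For readers who want to reproduce these back-of-the-envelope numbers, a quick normal-approximation power calculation (a sketch of the standard two-sided formula, not the authors’ exact computation) recovers the ~50% power, the ~70% larger sample, and the 2.81 cutoff:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard normal quantile function

def power_at_stricter_alpha(alpha_old=0.05, power_old=0.80, alpha_new=0.005):
    """Power at alpha_new for the effect size a study was designed to
    detect with power_old at two-sided alpha_old (normal approximation)."""
    delta = z(1 - alpha_old / 2) + z(power_old)  # implied effect, in SE units
    return NormalDist().cdf(delta - z(1 - alpha_new / 2))

def sample_size_ratio(alpha_old=0.05, alpha_new=0.005, power=0.80):
    """Factor by which n must grow to keep the same power at the stricter
    alpha; n scales with (z_alpha + z_power)^2."""
    return ((z(1 - alpha_new / 2) + z(power)) /
            (z(1 - alpha_old / 2) + z(power))) ** 2
```

Here `power_at_stricter_alpha()` comes out just under 0.50, `sample_size_ratio()` just under 1.70, and `z(1 - 0.005/2)` is the 2.81 critical value.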

Biased women in the I(C)T crowd

Markus Goldstein
This post is coauthored with Alaka Holla

The rigorous evidence on vocational training programs is, at best, mixed. For example, Markus recently blogged about some work looking at the long-term impacts of job training in the Dominican Republic. In that paper, the authors find no impact on overall employment, but they do find a change in the quality of employment, with more workers holding jobs that come with health insurance, for example.

A new answer to why developing country firms are so small, and how cellphones solve this problem

David McKenzie
Much of my research over the past decade or so has tried to help answer the question of why there are so many small firms in developing countries that don’t ever grow to the point of adding many workers. We’ve tried giving firms grants, loans, business training, formalization assistance, and wage subsidies, and found that, while these can increase sales and profits, none of them get many firms to grow.

Weekly links July 21: a 1930s RCT revisited, brain development in poor infants, Indonesian status cards, and more…

David McKenzie

What a new preschool study tells us about early child education – and about impact evaluation

David Evans
When I talk to people about impact evaluation results, I often get two reactions:
  1. Sure, that intervention delivered great results in a well-managed pilot. But it doesn’t tell us anything about whether it would work at a larger scale. 
  2. Does this result really surprise you? (With both positive results and null results, I often hear, “Didn’t we already know that intuitively?”)

A recent paper by Dillon et al. – “Cognitive science in the field: A preschool intervention durably enhances intuitive but not formal mathematics” – provides answers to both of these, as well as giving new insights into the design of effective early child education.

False positives in sensitive survey questions?

Berk Ozler

This is a follow-up to my earlier blog post on list experiments for sensitive questions, which, thanks to our readers, generated many responses via the comments section and email: more reading for me – yay! More recently, my colleague Julian Jamison, who is also interested in the topic, sent me three recent papers that I had not been aware of. This short post discusses those papers and serves as a coda to the earlier post…

Randomized response techniques (RRT) are used to elicit more valid data than direct questioning (DQ) on sensitive topics, such as corruption or sexual behavior. Using a randomization device, such as dice, these techniques introduce noise into the respondent’s answer, concealing her response to the sensitive question while still allowing the researcher to estimate the overall prevalence of the behavior in question. They are attractive in principle but, in practice, as we have recently been trying to implement them in fieldwork, one worries about implementation details and the cognitive burden on respondents: in real life, it is not clear that they provide enough of an advantage to warrant use over and above DQ.
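To make the mechanics concrete, consider the forced-response variant of RRT: with some known probability the respondent answers truthfully, and otherwise a private coin flip forces a “yes” or a “no”. A minimal simulation (all parameter values here are hypothetical, chosen only for illustration) shows how the researcher inverts the known noise mechanism to recover prevalence:

```python
import random

def simulate_rrt(true_prevalence=0.20, p_truth=2/3, n=100_000, seed=1):
    """Forced-response RRT: with probability p_truth the respondent answers
    truthfully; otherwise a private coin flip forces 'yes' or 'no'.
    Returns the researcher's prevalence estimate."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        if rng.random() < p_truth:
            yes += rng.random() < true_prevalence  # truthful answer
        else:
            yes += rng.random() < 0.5              # forced coin-flip answer
    observed_yes_rate = yes / n
    # Invert the known mechanism:
    # observed = p_truth * prevalence + (1 - p_truth) * 0.5
    return (observed_yes_rate - (1 - p_truth) / 2) / p_truth
```

With a large sample the estimate lands close to the true 20% prevalence, but note the cost that motivates the skepticism above: the injected noise inflates the variance of the estimate relative to (truthful) direct questioning.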

Trouble with pre-analysis plans? Try these three weird tricks.

Owen Ozier
Pre-analysis plans increase the chances that published results are true by restricting researchers’ ability to data-mine. Unfortunately, writing a pre-analysis plan isn’t easy, nor is it without costs, as discussed in recent work by Olken and by Coffman and Niederle. Two recent working papers - “Split-Sample Strategies for Avoiding False Discoveries,” by Michael L.
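The basic mechanics behind the split-sample idea – explore hypotheses freely on a random subset of the data, then test only the surviving hypotheses on the held-out remainder – can be sketched as follows (the function and parameter names are my own, not from the paper):

```python
import random

def split_sample(data, explore_frac=0.3, seed=0):
    """Randomly partition observations into an exploration set (for
    unrestricted hypothesis generation) and a disjoint, held-out
    confirmation set (for the pre-specified final tests)."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    cut = int(len(data) * explore_frac)
    explore = [data[i] for i in idx[:cut]]
    confirm = [data[i] for i in idx[cut:]]
    return explore, confirm
```

The key design choice is that the partition is made once, at random, before any analysis: whatever data-mining happens on the exploration set cannot contaminate p-values computed on the confirmation set.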

What does a game-theoretic model with belief-dependent preferences teach us about how to randomize?

David McKenzie

The June 2017 issue of the Economic Journal has a paper entitled “Assignment procedure biases in randomized policy experiments” (ungated version). The abstract summarizes the claim of the paper:
“We analyse theoretically encouragement and resentful demoralisation in RCTs and show that these might be rooted in the same behavioural trait –people’s propensity to act reciprocally. When people are motivated by reciprocity, the choice of assignment procedure influences the RCTs’ findings. We show that even credible and explicit randomisation procedures do not guarantee an unbiased prediction of the impact of policy interventions; however, they minimise any bias relative to other less transparent assignment procedures.”

Of particular interest to our readers might be the conclusion “Finally, we have shown that the assignment procedure bias is minimised by public randomisation. If possible, public lotteries should be used to allocated subjects into the two groups”

Given this recommendation, I thought it worth discussing how they get to this conclusion, and whether I agree that public randomization will minimize such bias.

Weekly links July 7: Making Jakarta Traffic Worse, Patient Kids and Hungry Judges, Competing for Brides by Pushing up Home Prices, and More…

David McKenzie
  • In this week’s Science, Rema Hanna, Gabriel Kreindler, and Ben Olken look at what happened when Jakarta abruptly ended its HOV rules – showing how traffic got worse for everyone. A nice example of using Google traffic data – MIT News has a summary and discussion of how the research took place: “The key thing we did is to start collecting traffic data immediately,” Hanna explains. “Within 48 hours of the policy announcement, we were regularly having our computers check Google Maps every 10 minutes to check current traffic speeds on several roads in Jakarta. ... By starting so quickly we were able to capture real-time traffic conditions while the HOV policy was still in effect. We then compared the changes in traffic before and after the policy change.” All told, the impact of changing the HOV policy was highly significant: after the HOV policy was abandoned, the average speed of Jakarta’s rush-hour traffic declined from about 17 to 12 miles per hour in the mornings, and from about 13 to 7 miles per hour in the evenings.
  • From NPR’s Goats and Soda: 4-year-old kids of Cameroonian subsistence farmers take the marshmallow test, as do German kids – who do you think did best?