
Setting up your own firm for a firm experiment

David McKenzie

The typical approach to examining how workers, consumers, or governments interact with a firm has been for researchers to find a willing firm owner and convince them to run experiments. Examples include Bandiera et al. working with a UK fruit farm to test different payment incentives for immigrant workers; Bloom et al. working with a Chinese travel agency to test the effect of letting workers work from home; and Adhvaryu et al. working with an Indian garment firm to measure the impacts of soft-skills training for workers and of introducing LED lighting. However, finding and persuading a firm to run the experiment a researcher would like to do can be hard, and many of these collaborations came about because the researcher had a former student or relative who runs the firm.

So what should you do if you lack a connection, or you want to do something that you cannot persuade a firm to do?

Recently, a number of researchers have taken a different approach, which is to set up and run a firm themselves in order to answer their research questions. I thought I would give some examples of this work, and then discuss some of the issues that arise and things to think about when deciding whether to pursue this research strategy.

Weekly links January 18: an example of the problem of ex-post power calcs, new tools for measuring behavior change, plan your surveys better, and more...

David McKenzie
  • The Science of Behavior Change Repository offers measures of stress, personality, self-regulation, time preferences, etc. – with instruments for both children and adults, and information on how long the questions take to administer and where they have been validated.
  • Andrew Gelman on post-hoc power calculations – “my problem is that their recommended calculations will give wrong answers because they are based on extremely noisy estimates of effect size... Suppose you have 200 patients: 100 treated and 100 control, and post-operative survival is 94% for the treated group and 90% for the controls. Then the raw estimated treatment effect is 0.04 with standard error sqrt(0.94*0.06/100 + 0.90*0.10/100) = 0.04. The estimate is just one s.e. away from zero, hence not statistically significant. And the crudely estimated post-hoc power, using the normal distribution, is approximately 16% (the probability of observing an estimate at least 2 standard errors away from zero, conditional on the true parameter value being 1 standard error away from zero). But that’s a noisy, noisy estimate! Consider that effect sizes consistent with these data could be anywhere from -0.04 to +0.12 (roughly), hence absolute effect sizes could be roughly between 0 and 3 standard errors away from zero, corresponding to power being somewhere between 5% (if the true population effect size happened to be zero) and 97.5% (if the true effect size were three standard errors from zero).” A short numeric sketch reproducing this arithmetic appears after this list.
  • The World Bank’s data blog uses metadata from hosting its Survey Solutions tool to ask how well people plan their surveys (and read the comments for good context when interpreting the data). Some key findings:
    • Surveys usually take longer than you think they will: 47% of users underestimated the amount of time they needed for the fieldwork, and after requesting more server time, many went on to request a further extension
    • Spend more time piloting questionnaires before launching: 80% of users revise their surveys at least once after surveying has started, and “a surprisingly high proportion of novice users made 10 or more revisions of their questionnaires during the fieldwork”
    • Another factoid of interest: “An average nationally representative survey in developing countries costs about US$2M”
  • On the EDI Global blog, Nkolo, Mallet, and Terenzi draw on the experiences of EDI and the recent literature to discuss how to deal with surveys on sensitive topics.
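
To make the arithmetic in the Gelman quote above easy to check, here is a minimal Python sketch (purely illustrative, not Gelman's own code; it assumes scipy is available) that reproduces the standard-error and post-hoc power calculation:

```python
# Reproduces the back-of-the-envelope numbers in the Gelman quote (illustrative only).
import math
from scipy.stats import norm

n_t, n_c = 100, 100        # treated and control patients
p_t, p_c = 0.94, 0.90      # post-operative survival rates

est = p_t - p_c            # raw estimated treatment effect: 0.04
se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)  # ~0.04

# Post-hoc power: take the noisy estimate as the true effect and ask how often
# an estimate would land at least 1.96 standard errors from zero.
z_true = est / se
power = (1 - norm.cdf(1.96 - z_true)) + norm.cdf(-1.96 - z_true)
print(f"estimate = {est:.3f}, se = {se:.3f}, post-hoc power = {power:.0%}")
# Prints roughly 18%; Gelman's "approximately 16%" treats the estimate as
# exactly one standard error from zero, so the small gap is just rounding.
```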

Education spending and student learning outcomes

David Evans

How much does financing matter for education? The Education Commission argued that to achieve access and quality education “will require total spending on education to rise steadily from $1.2 trillion per year today to $3 trillion by 2030 (in constant prices) across all low- and middle-income countries.” At the same time, the World Bank’s World Development Report 2004 showed little correlation between spending and access to school, and the World Development Report 2018 (for which I was on the team) shows a similarly weak correlation between spending and learning outcomes. (Vegas and Coffin, using a different econometric specification, do find a correlation between spending and learning outcomes up to US$8,000 per student annually.)


Sources: the left-hand figure is from WDR 2004; the right-hand figure is from WDR 2018.

And yet, correlation is not causation (or in this case, a lack of correlation is not necessarily a lack of causation)! Last month, Kirabo Jackson put out a review paper on this topic: Does School Spending Matter? The New Literature on an Old Question. This draws on a new wave of evidence from the United States’ experience, moving beyond correlations to efforts to measure the causal impact of spending changes. (Jackson and various co-authors have contributed significantly to this literature.) I’ll summarize his findings and then discuss what we might expect to be the same or different in low- or middle-income contexts.

When it comes to modern contraceptives, history should not make us silent: it should make us smarter.

Berk Ozler

On January 2, 2019, the New York Times ran an Op-Ed piece by Drs. Dehlendorf and Holt, titled “The Dangerous Rise of the IUD as Poverty Cure.” It comes from two respected experts in the field, whose paper with Langer on quality contraceptive counseling I had, in pure coincidence, listed just days earlier as one of my favorite papers that I read in 2018. It is penned to warn the reader about the dangers of promoting long-acting reversible contraceptives (or LARCs, as the IUD and the implant are often termed) with a mind towards poverty reduction. Citing the shameful history of state-sponsored eugenics, which sadly took place both in the U.S. and elsewhere, they argue that “promoting them from a poverty-reduction perspective still targets the reproduction of certain women based on a problematic and simplistic understanding of the causes of societal ills.”

What started as an Op-Ed with an important and legitimate concern starts unraveling from there. A statement that no one I know believes, and that goes unreferenced (in an otherwise very well-referenced Op-Ed) – “But there is a clear danger in suggesting that ending poverty on a societal level is as simple as inserting a device into an arm or uterus” – is followed by: “Providing contraception is critical because it is a core component of women’s health care, not because of an unfounded belief that it is a silver bullet for poverty.” In the process, the piece risks undermining its own laudable goal: promoting the right and ability of women – especially adolescents, minorities, and the disadvantaged – to make informed personal decisions about whether and when to have a child, to improve their own individual welfare first and foremost.

Weekly links January 11: it’s not the experiment, it’s the policy; using evidence; clustering re-visited; and more...

David McKenzie
  • “Experiments are not unpopular, unpopular policies are unpopular” – Mislavsky et al. on whether people object to companies running experiments. “Additionally, participants found experiments with deception (e.g., one shipping speed was promised, another was actually delivered), unequal outcomes (e.g., some participants get $5 for attending the gym, others get $10), and lack of consent, to be acceptable, as long as all conditions were themselves acceptable.” A caveat to note: the results are based on asking MTurk subjects (and one sample of university workers) whether they thought it was OK for companies to do this.
  • Doing power calculations via simulations in Stata – the Stata blog provides an introduction to how to do this; a simple Python illustration of the same idea appears after this list.
  • Marc Bellemare has a post on how to use Pearl’s front-door criterion for identifying causal effects – he references this more comprehensive post by Alex Chinco, which provides some examples of its use in economics.
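
For readers who want the flavor of the simulation approach to power calculations without Stata, here is a minimal Python sketch (not the Stata blog's code; the sample sizes and effect size are made up) of the basic idea: repeatedly draw fake data under an assumed effect and count how often a test rejects.

```python
# Simulation-based power calculation (illustrative sketch, not the Stata blog's code).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulated_power(n_per_arm=100, effect=0.2, sd=1.0, alpha=0.05, n_sims=2000):
    """Share of simulated trials in which a two-sample t-test rejects at level alpha."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n_per_arm)
        treated = rng.normal(effect, sd, n_per_arm)
        _, p = stats.ttest_ind(treated, control)
        rejections += p < alpha
    return rejections / n_sims

print(simulated_power())   # roughly 0.29 for a 0.2 SD effect with 100 per arm
```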

Changing gender attitudes, one teenager at a time

Markus Goldstein
I’ve been trying to figure out how to get my kids to do more household chores. Luckily, help was forthcoming from a recent paper by Diva Dhar, Tarun Jain, and Seema Jayachandran. They go to Indian secondary schools with an intervention designed to increase support for gender equality among adolescents. And yes, it does work, including getting boys to do more chores.

Attrition rates typically aren’t that different for the control group than for the treatment group – really? And why?

David McKenzie

When I start discussing evaluations with government partners, and note the need for us to follow and survey over time a control group that did not get the program, one of the first questions I always get is “Won’t it be really hard to get them to respond?” I often answer with reference to a couple of case examples from my own work, but I now have a new answer, courtesy of a new paper on testing for attrition bias in experiments by Dalia Ghanem, Sarojini Hirshleifer and Karen Ortiz-Becerra.

As part of the paper, they conduct a systematic review of field experiments with baseline data published in the top 5 economics journals plus the AEJ Applied, EJ, ReStat, and JDE over the years 2009 to 2015, covering 84 journal articles. They note that attrition is a common problem, with 43% of these experiments having attrition rates over 15% and 68% having attrition rates over 5%. The paper then discusses what the appropriate tests are for figuring out whether this is a problem. But I wanted to highlight this panel from Figure 1 in their paper, which plots the absolute value of the difference in attrition rates between treatment and control. They note “64% have a differential rate that is less than 2 percentage points, and only 10% have a differential attrition rate that is greater than 5 percentage points.” That is, attrition rates aren’t much different for the control group than for the treatment group.
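
To make the differential-attrition numbers concrete, here is a minimal Python sketch (with made-up follow-up rates, and a standard two-proportion z-test rather than the tests proposed in the paper) of how one might compute and compare attrition rates across arms:

```python
# Computing attrition by arm and differential attrition (illustrative numbers only).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

n_t, n_c = 500, 500                       # baseline respondents per arm
attrited_t = rng.binomial(n_t, 0.12)      # attritors in treatment (true rate 12%)
attrited_c = rng.binomial(n_c, 0.14)      # attritors in control (true rate 14%)

rate_t, rate_c = attrited_t / n_t, attrited_c / n_c
diff = rate_t - rate_c
print(f"attrition: treatment {rate_t:.1%}, control {rate_c:.1%}, "
      f"differential {abs(diff):.1%}")

# Simple two-proportion z-test of equal attrition rates across arms. This is only
# the usual first check; the paper discusses which further tests are appropriate.
p_pool = (attrited_t + attrited_c) / (n_t + n_c)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
z = diff / se
print(f"z = {z:.2f}, p = {2 * norm.sf(abs(z)):.3f}")
```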

Power to the Plan. Guest Post by Clare Leaver, Owen Ozier, Pieter Serneels, and Andrew Zeitlin

Owen Ozier

The holidays are upon us. You might like to show off a bit by preparing something special for the ones you love. Why not make a pre-analysis plan this holiday season? You’re thinking, I do that every year, but we want to tell you about a new twist: using a dash of endline data!
