A saner approach to pre-analysis plans


When I did my dissertation, back in the pre-RCT (in economics) days, it was a roller coaster. I would run some regressions. Get some results that were somewhat consistent and stable. And take them back to the theory I started with. Often, they wouldn’t quite line up. So, it would be back to thinking, and more exploration of the data. This is what I love about economics: taking ideas from theory into the data and trying to make sense of it all – some detective work, some stepping back and taking a whole new angle on things, and the occasional freefall into intellectual darkness from which one might emerge a couple of days (and a few regressions) later with a glimmer of having found something that might make sense.

My dad was a scientist. He worked in a lab, where they wore white coats and did experiments around a clear hypothesis, with sterile conditions and precisely measured things coming together in a precise way. Me, I loved the messiness of the world outside the lab. I wanted to be a social scientist, understanding (just a bit) how people interacted and why they acted the way they did.  And this meant inquiry which moved while you were in it and might not even be a linear process. 

So, it was with some dismay that I watched the “scientification” of economics. The latest big move in this direction was the push for pre-analysis plans (PAPs). To me, the idea that we could know enough to describe all possible analyses ex ante, and that this would be the only set of results acceptable in a journal, was depressing.

A new paper by Esther Duflo and a bevy of influential coauthors gives me some hope that it’s not that bad. The main thrust of their argument is that we need to approach PAPs with some judiciousness. 

Duflo and co. make the argument that PAPs came from a good place. We needed a way to make sure that null results have some hope of seeing the light of day. We also needed to avoid cherry-picking results, especially when there are less rosy or less clear ones among the outputs of our regressions. Finally, and this is one that has grown more important as researchers get into the business of designing interventions, we might have a conflict of interest. I know I respect and like most of the people I work on interventions with, and I dread a result that indicates their intervention isn’t working or, worse, is actually hurting people. And I do worry this will skew how I approach the results.

In principle, the PAP can deal with these issues – it ties my hands and makes things more transparent, since I will deposit it in a registry. But if editors require strict adherence to it, or research teams self-police to only consider PAP-specified results as valid (I’ve seen this), Duflo and co. argue that this could be detrimental to scientific progress.

The problem is that you cannot fully specify all contingencies, nor, as they point out, is it worth spending a very large amount of time trying to get close. Instead, they offer some guidelines for a saner approach to PAPs.

First, keep them slim. Duflo and co. argue that a PAP covering the main fields in the AEA registry (e.g., primary outcomes, experimental design, randomization details) is sufficient in many cases.

Second, register the trial as soon as you can, ideally before the intervention goes to the field – but that won’t always be possible or the best course. They argue for the AEA registry, but there are other options out there (3ie, EGAP, and the Center for Open Science). One important thing they point out is that if you are shooting for a medical journal or Science, you also have to register with an approved clinical trial registry – even if you are doing social science research.

Third, it’s ok to be uncertain. They put it nicely: “we view the PAP as closer to a historical record of initial thoughts and plans, rather than a binding, legal contract for what can be done and presented.”

Finally, they make a case for producing a populated PAP with your results (a nice public good) that you can file, and then a research paper – these are distinct outputs and “should be treated as such.”

Duflo and co. also include a number of helpful FAQs in their paper (e.g., How specific do I need to be on the primary outcome variable?) with a promise to keep updating this section.

So, this seems to me like a better approach than getting stuck in a 30-page, heavily constrained and limiting document. Instead, we get the benefits of a commitment device for some systematic thinking ex ante and a public record of what is going on, but with room for discovery and roller coaster riding. Into the beautiful mess!

Authors

Markus Goldstein

Lead Economist, Africa Gender Innovation Lab and Chief Economists Office

Join the Conversation

ROBERT PICCIOTTO
April 27, 2020

Markus
Neither the Duflo et al. paper nor your commentary refers to the vast evaluation literature about data analysis and interpretation. See for example Chapter 12 of Evaluation (2nd Edition) – the Carol Weiss classic. Nor are you acknowledging the relevance of theory-based evaluation models and approaches, let alone facing up to the imperative of systems thinking in theory construction. This illustrates the parsimony of economics that Albert O. Hirschman rightfully deplored. To be sure, you are respectfully hinting that the randomistas may not fully appreciate the messiness of a complex social world. This is a good start – but while RCTs are part of the evaluator's tool kit, a bolder critique of PAPs seems warranted ...
Best
Bob