A cynic’s take on papers with novel methods to improve transparency


What is the signal we should infer from a paper using a novel method that is marketed as a way to improve transparency in research?

I got to thinking about this issue after seeing a lot of reactions on Twitter like “Awesome John List!”, “This is brilliant”, etc. about a new paper by Luigi Butera and John List that investigates in a lab experiment how cooperation in an allocation game is affected by Knightian uncertainty/ambiguity. Contrary to what the authors had expected, they find that adding uncertainty increases cooperation. The bit they are getting plaudits for is the following passage in the introduction:

 “This surprising result serves as a test case for our mechanism: instead of sending this paper to a peer-reviewed journal, we make it available online as a working paper, but we commit never to submit it to a journal for publication. We instead offered co-authorship for a second, yet to be written, paper to other scholars willing to independently replicate our study. That second paper will reference this working paper, will include all replications, and will be submitted to a peer reviewed journal for publication. Our mechanism allows mutually-beneficial gains from trade between the original investigators and other scholars, alleviates the publication bias problem that often surrounds novel experimental results”.

The idea is an interesting one, and certainly much easier in lab experiments where the cost of replication is relatively low. But what should we infer about the paper from this emphasis on the novel methodology, and from the commitment not to publish until replicated? One hypothesis is that the authors’ main interest all along was in improving the methodology for research, and that they designed the initial lab experiment with this in mind. But an alternative, cynical, view is that the authors realized that the unexpected and unexplained results in the experiment were going to be difficult to publish unless they gave the paper a new twist, and that this emphasis on a new method for transparency provided a way.

It made me think of three other papers:

  1. Work by Marcel Fafchamps and Julien Labonne on whether politicians’ relatives get better jobs, using administrative data from the Philippines. They find, using an RD design, that relatives of current office-holders have jobs in better-paying occupations. Their methodological innovation was to write “To address concerns about specification search and publication bias (Leamer 1978, Leamer 1983, Glaeser 2006), we propose and implement a split sample approach. We asked a third party to split the data into two randomly generated, non-overlapping subsets, A and B, and to hand over sample A to us. This version of the paper uses sample A to narrow down the list of hypotheses we wish to test and to refine our methodology. Once the review process is completed, we will apply to sample B, to which we do not have access yet, the detailed methodology (including the exact list of definitions of dependent and control variables, estimation strategy and sample) that has been approved by the referees and editor, and this is what will be published.” For publication, the working paper was then split into a paper published using the results of the full sample, and a separate methods paper which describes when the inference gain from pre-commitment with sample splitting overcomes the loss in power. Marcel notes that the editor and referees did not wish to pre-commit to publishing the paper before it was demonstrated that the findings held in the full sample.
  2. The famous pre-analysis plans paper in the QJE by Casey, Glennerster and Miguel in which they test a community-driven development (CDD) program in Sierra Leone, and essentially find not much impact.
  3. My work on wage subsidies and soft skills training in Jordan. Here we ran an expectations elicitation exercise with audiences the first time we presented it, and then used this to show that the results of the wage subsidy were different from what the average development economist would expect. But when we submitted the paper, one of the referees noted “it certainly makes a good blog post, but I don’t see the case for putting this in the paper. This is not something that would convince me of the importance of the findings” and we were asked to take this out of the paper (the paper was then split into paper 1 and paper 2, with paper 2 including this).
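
As a rough illustration of the split-sample idea in the first example above, here is a minimal Python sketch of what the third party's step might look like. The function name `split_sample`, the seed value, the record IDs, and the choice of two equal halves are all assumptions made for illustration; the quoted passage only says the subsets are randomly generated and non-overlapping.

```python
import random


def split_sample(record_ids, seed):
    """Randomly partition record IDs into two non-overlapping subsets, A and B.

    Hypothetical helper: the equal-halves split is an assumption, not
    something the quoted paper specifies.
    """
    rng = random.Random(seed)      # the seed would be kept by the third party
    shuffled = list(record_ids)    # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    sample_a = sorted(shuffled[:midpoint])   # handed to the authors now
    sample_b = sorted(shuffled[midpoint:])   # withheld until review is complete
    return sample_a, sample_b


# Illustrative IDs only, not real data.
ids = list(range(1, 11))
sample_a, sample_b = split_sample(ids, seed=2017)
print("A:", sample_a)
print("B:", sample_b)
```

The point of routing the split through a third party is that the authors cannot see sample B while they refine the specification on sample A.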

Suppose Casey et al. had found that the CDD program had had massive positive impacts on a wide range of outcomes. Would the paper then have stressed the pre-analysis plan as much? A big part of the emphasis was on showing that it would have been possible to data-mine a successful or unsuccessful impact across the many measures they had, but if impacts were much stronger, results would probably be less sensitive to the measure used.
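
To see why many outcome measures leave room for data-mining, a back-of-the-envelope calculation helps. The sketch below assumes, purely for illustration, that the measures are tested independently at the 5% level and that the true effect on every one of them is zero; real outcome measures are usually correlated, so this is only indicative. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(num_measures, alpha=0.05):
    """Chance that at least one of `num_measures` independent tests is
    significant at level `alpha` when every true effect is zero."""
    return 1 - (1 - alpha) ** num_measures


# Number of outcome measures a researcher could choose among (illustrative).
for k in (1, 5, 10, 20):
    print(k, round(prob_at_least_one_false_positive(k), 3))
```

Under these assumptions, a researcher free to pick among 10 measures would find at least one "significant" impact about 40% of the time, which is the kind of flexibility a pre-analysis plan is meant to remove.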

That is, in each case, the results themselves can become of secondary interest to the method. The cynical take is then that an emphasis on methods is a way to get people interested in what would otherwise be less interesting papers. An alternative take is that the methods were the goal of the paper in the first place. So perhaps pre-specifying the goal of pre-specification/methods papers in advance would solve this? I also wonder whether the signal is different when senior researchers write such papers than when such a paper is, for example, a job market paper.

This issue of what the signal readers will draw from methodological innovations is one that I continue to wonder about. Here are two recent examples I’ve been thinking about:
  1. Given the well-documented bias journals have against non-U.S. papers, and especially against some small developing countries, I have often thought about writing “The purpose of this paper is to provide proof of concept of an economic mechanism. As such, the paper should be read with this goal in mind, and the specific country and context in which this proof was documented will be revealed upon paper approval”, or perhaps just “this study was conducted in a country in the Western Hemisphere”.
  2. I’ve been intrigued by the “results-free review” process that has been trialed by some political scientists – in which researchers submit papers that contain no mention of their results. I’ve thought about this in the context of experiments where I would think a null result is an interesting finding – perhaps just submitting a paper and holding back the results (and obviously not presenting them anywhere beforehand). [Note that here I am thinking of this differently from the polisci context: the author decides to do this when they already have results, but wants the paper to be judged on whether it asks an interesting question and uses appropriate methods.]
But I think the natural tendency of reviewers will be to view the signal in such submissions as saying that 1) the paper is from a tiny country they don’t care about; and 2) the paper probably doesn’t find much in the way of significant impacts.

Let me be clear, I think the methods in each of the papers I’ve described are really interesting and useful tools. But what I struggle with is how to avoid them being seen as “gimmicks” to sell papers which otherwise have not very interesting results. One approach is to make the paper fully centered on the method, and then, as in many econometric theory papers, the application becomes secondary and just serves as an illustration. The other is perhaps to think of more examples of papers where the results themselves are strong and would stand on their own, but where there has also been a big methodological innovation, so that the signal that strong methods = weak results no longer dominates in people’s minds. Any ideas for papers to point to in this regard?


David McKenzie

Lead Economist, Development Research Group, World Bank

Joe Cummins
May 01, 2017

There is the whole literature Lalonde started that looks at job training and the real-world ability of quasi-experimental methods to recover experimental results:
The original Synthetic Control paper on smoking is probably a good example of an applied paper that is really a methods paper and where the empirical result is interesting in and of itself: https://economics.mit.edu/files/11859
Ludwig & Miller's work on Head Start also has some methods-y RD stuff that is interesting in its own right. I think Doug once told me about half the cites that paper had were for methods, and half for the empirical contribution: http://home.uchicago.edu/ludwigj/papers/QJE_Headstart_2007.pdf
I think of the Kremer et al. paper Incentives to Learn as being useful for the non-parametric treatment effects estimates using local-linear regressions separately across groups...but maybe that is just because that is the first time I'd seen them used well. Ditto for the Bitler et al. "What Do Mean Impacts Miss" paper, which smartly uses QTEs to investigate heterogeneity in treatment effects.
Econ is sometimes a bit hostile to "methods" papers that are neither pure econometric theory nor focused on an empirically important result. There are some notable exceptions: "How Much Should We Trust Differences-in-Differences Estimates?" and the subsequent clustering literature; some paper someone wrote called "In Pursuit of Balance". But if readers/editors don't immediately see the usefulness of the method, it is hard to convince them you've added much to the literature. I think that is mostly reasonable - after all, I'm with Goethe on the whole "Moreover, I hate everything that only instructs me without increasing or immediately stimulating my own activity."
But because of that general hostility to "methods" papers (or maybe better said a lack of journal space for them), I think you are right that the current structure incentivizes a kind of (accidental) obfuscation, where shoehorning a methods paper into an empirical paper can confuse the reader about what is really important in the work. It is a tough balance for both individual researchers and the field as a whole. Although...looking above, maybe the rule is just that if your paper is on inference, you can do a pure methods paper with simulations/Monte Carlos, but if it is on point estimates, you have to do a meaningful application.

Rachel Glennerster
May 01, 2017

In our case (Casey et al.) this was an example where we thought a zero result was pretty interesting, but stressing the methodology was the only way to get a top journal to care about a small developing country. Our first draft talked about the novel transparency methodology (which we had to commit to years before the final results came through), but we ended up playing it up as the only way to get any coverage of a paper on a small African economy, even though this was the first RCT of a program on which the World Bank spent an estimated $50 billion over 10 years. Was that a sellout? Or cynical? I don't think so; I think it's the reality of working on Africa. You have to be more creative to get published well. Having said that, people have found the discussion of the practicalities of doing pre-analysis in the paper useful.

Dan Stein
May 02, 2017

I love this post. I've shared similar skepticism when reading these "accidental methods" papers in the past. Thanks to Rachel above for being up-front about the fact that "stressing the methodology was the only way to get a top journal to care..." But this is really kind of sad: think of the hundreds of extra person-hours that went into taking interesting experiments with null results and massaging them into a new form that catches the interest of editors. That seems like a dead-weight loss to me. I give kudos to journals like the Journal of Development Effectiveness, which is explicitly committed to publishing null results. But I wish there were an outlet which both had this commitment and generated significant citations.