
Working Papers are NOT Working.


In research, as in life, first impressions matter a lot. Most sensible people don’t go on a first date disheveled, wearing sweatpants and their favorite raggedy hoodie from their alma mater, but rather wait to break those out well into a relationship. Working papers are the research equivalent of sweatshirts with pizza stains on them, but we wear them on our first date with our audience.

It is common practice in economics to publish working papers. There are formal working paper series, such as NBER, BREAD, IZA, and the World Bank Policy Research Working Paper Series. With the proliferation of the internet, however, people don’t even need these formal series: you can simply post your brand-new paper on your website and voilà, you have a working paper. Put that into your CV! Journals are giving up double-blind refereeing (AEJ is the latest) because it is too easy to use search engines to find the working paper version (see the recent comments on Blattman’s blog, which suggest that abandoning double-blind peer review is far from obviously a good idea). But do the benefits of making findings public before peer review outweigh the costs? I recently became very unsure…

In economics, publication lags, even at journals that are fast, can be long: it is not uncommon to see articles that state “Submitted December 2007; accepted August 2010.” I’ll grant you, a fair bit of that period may be due to the authors sitting on a resubmission, but it is common to wait 4-5 months for a first decision, and then similar times for subsequent decisions on revise and resubmits. Still, research findings are public goods, and working papers are a way to get this information out to parties who can benefit from it while the paper is under review.

But, that assumes that the findings are ready for public consumption at this preliminary stage. By preliminary, I mean papers that have not yet been seriously reviewed by anyone familiar with the methods and the specific topic. Findings, and particularly interpretations, change between the working paper phase and the published version of a paper: if they didn’t, then we would not need peer-reviewed journals. Sometimes, they change dramatically. (BTW, the promise that the blogosphere would serve as the great source of good comments on our working papers simply has not come true. Useful comments require time and careful reading, which is not how stuff online is consumed.)

Now, back to the point about first impressions: When a new working paper comes out, especially an eagerly awaited one (like the first randomized experiment on microfinance), people rush to read it (or, rather, skim it). It gets downloaded many times, gets blogged about, etc. Then, a year later, a new version comes out (maybe even the published version). Many iterations of papers simply improve on the original premise, provide more robustness checks, etc. But, interpretations often change; results get qualified; important heterogeneity of impacts is reported. And sometimes, main findings do change. What happens then?

People are busy. Most of them had only read the abstract (and maybe the concluding section) of the first draft working paper to begin with. Worse, they had just relied on their favorite blogger to summarize it for them. But, guess what? Their favorite blogger has moved on and won’t be re-blogging the new version of the working paper. Many won’t even know that there is a more recent version. The newer version will not be read by many, other than a few dedicated followers of the topic or the author. They will cling to their beliefs based on the first draft: first impressions matter. By the time your paper is published, it is a pretty good paper – your little masterpiece. The publication will cause an uptick in downloads, but still, for many, all they’ll remember is the sweatshirt, and not the sweat that went into the masterpiece.

Of course, we can update working papers. But, unless we can alert everyone that there is a new version of a paper (AND make them read it and understand the changes since the first draft), this is of little use. Even when I am specifically looking for more recent versions of a paper, I am usually unable to find the most recent one with a simple Google search (Try it here for the Miracle of Microfinance. Now, go to Duflo’s web page for her papers and look for the same paper: what do you see?). Also, some papers remain working papers for a long time: this one by Duflo, Dupas, and Kremer, which came out a couple of weeks ago, first appeared in 2006 and was updated in 2010. The authors likely did not intend to publish the findings until now (they were collecting biomarker data on STIs until recently, but kept the public informed on short-term and medium-term impacts of the interventions on schooling and fertility). The findings, naturally, seem to have evolved.

There is another problem: people who are invested in a particular finding will find it easy to take away from a working paper a message that confirms their prior beliefs. They will happily accept the preliminary findings of the working paper and go on to cite it for a long time (believe me, well past the updated versions of the working paper and even the eventual journal publication). People who don’t buy the findings will also find it easy to dismiss them: the results are not peer-reviewed. At least the peer-review process brings a degree of credibility and makes it harder for people to summarily dismiss findings they don’t want to believe.

I have some firsthand experience with this, as my co-authors and I have a working paper whose findings changed significantly over time. In March 2010, we put out a working paper on the role of conditionalities in cash transfer programs, which we simultaneously submitted to a journal. The paper reported one-year effects of an intervention using self-reported data on school participation. The reviews, which were fast (about a month, as good as it gets), suggested that we not only report longer-term data but also use alternative measures of schooling that are less subject to reporting bias. We followed this advice and, in December 2010, updated our working paper, which now presented two-year impacts using enrollment and attendance data collected from schools, in addition to independent achievement tests, and resubmitted it to the same journal, again simultaneously. After one more revise and resubmit, the paper is now forthcoming, and the final version (more or less) can be found here.

What’s the problem? Our findings in the March 2010 version suggested that CCTs that had regular school attendance as a requirement to receive cash transfers did NOT improve school enrollment over and above cash transfers with no strings attached. Our findings in the December 2010 version DID. The difference was NOT that we had longer-term data: if we use self-reported enrollment to examine one-year or two-year impacts, the results are the same (see Table III, panel A in the paper linked above). Rather, the difference was caused by the kind of data that we were using: we supplemented self-reports with administrative data, enrollment data collected from schools, monthly attendance ledgers, and independent achievement tests in math and languages. These additional data all lined up to refute the findings based on self-reported school participation. It turns out that asking school-age people whether they are attending school is not the best way of assessing impacts of schooling interventions (a paper I have with Sarah Baird on this is forthcoming in a special issue of the JDE on measurement, and I blogged earlier about similar evidence here).

However, the earlier (and erroneous) finding that conditions did not improve schooling outcomes was news enough that it stuck. Many people, including good researchers, colleagues at the Bank, bloggers, and policymakers, think that UCTs are as effective as CCTs in reducing dropout rates – at least in Malawi. And, this is with good reason: it was US who screwed up, NOT them! Earlier this year, a magazine writer contacted me to ask whether there was a new version of the paper because her editor had uncovered the updated findings while fact-checking the story before clearing it for publication. As recently as yesterday, comments on Duncan Green’s blog suggested that his readers, relying on his earlier posts and other blogs, are not aware of the more recent findings. Even my research director was misinformed about our findings until he had to cite them in one of his papers and popped into my office.

Many working papers will escape this fate; what happened to ours is definitely not the norm. But, no one can tell me that working papers don’t improve and change over time as the authors are pushed by reviewers who are doing their best to be skeptical and provide constructive criticism. It turns out, though, that those efforts are mainly for the academic crowd or for the few diligent policymakers who are discerning users of evidence. We don’t approve drugs based on a news release of the success of a trial. We need peer review to confirm the findings (and further studies before approval). Why is it OK to prescribe economic policy based on a working paper? Are we sure that the people who are doing the prescribing have all the information they need? Is it because bad economic policy kills people more slowly than a bad drug?

So, what if we chose to not have working papers? There is no doubt that the speed with which journals publish submitted papers would have to change. Some journals pay reviewers: this could become more prevalent to encourage speedy but thorough reviews. And, these days, journal articles, with all the requested online appendices, the data, do-files, etc., are much more attractive than working papers, and I don’t think they are more academic. If you can write well and make your findings accessible to policymakers, you can do just as well with a journal article as with a working paper.

If we didn’t have working papers, we could also go back to double-blind reviews. No, it won’t be perfect, but double-blind reviewing was there for a reason. I see serious equity concerns with single-blind reviews (those of you out there who receive a paper to review: if you are not sure who the authors are by the time you read the abstract, please resist the urge to Google the title). This should be our default position until we study the effects of single- vs. double-blind reviews in economics a bit more.

The biomedical field does not have working papers, and turnaround, on average, is much quicker. Colleagues from that field never understand how our papers can remain unpublished for so long, even though they have sometimes been aware of the results for years. People have recently been calling for economics to borrow trial registration, CONSORT guidelines, etc. from the biomedical field (I have my doubts that these would adequately address the issues). Let’s instead borrow faster publication, without sacrificing the quality of peer review, if we can.

Update (7/5/2011): On Slate today, Dave Johns has the perfect follow-up to this post, an article called "Social contagions debunked": http://www.slate.com/id/2298208/pagenum/all/#p2


Authors

Berk Özler

Lead Economist, Development Research Group, World Bank
