Andrew Gelman has a post from last week that discusses the value of preregistration of studies as being akin to the value of random sampling and RCTs, which allow you to make inferences without relying on untestable assumptions. His argument, nicely laid out in this paper, is that we don’t need to assume nefarious practices by study authors, such as specification searching or selective reporting, to worry about whether the p-value reported in the paper we’re reading is correct.
Researchers put a lot of effort into developing survey questionnaires designed to measure key outcomes of interest for their impact evaluations. But every now and then, despite all the piloting and fine-tuning, some questions end up “not working”. The result is data that are so noisy, or missing for so many observations, that you may not want to use them in the final analysis. Just as pre-analysis plans have a role in specifying in advance which variables you will use to test which hypotheses, perhaps we also want to specify some rules in advance for when we won’t use the data we’ve collected. This post is a first attempt at doing so.
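To make the idea concrete, here is a minimal sketch of what such pre-specified rules might look like in code. The particular thresholds (a 20% missingness cap, a minimum amount of variation) and the function name are illustrative assumptions of mine, not rules proposed in the post; the point is only that the criteria are written down before seeing the endline data.

```python
# Hypothetical pre-specified data-quality rules (thresholds are
# illustrative assumptions, not taken from the post itself).
MAX_MISSING_SHARE = 0.20   # drop a variable if over 20% of responses are missing
MIN_DISTINCT_VALUES = 2    # drop a variable if it shows essentially no variation

def variable_is_usable(values):
    """Apply the pre-registered rules to one survey variable.

    `values` is a list of responses for one question, with None
    marking item non-response.
    """
    n = len(values)
    nonmissing = [v for v in values if v is not None]
    missing_share = (n - len(nonmissing)) / n
    distinct = len(set(nonmissing))
    return missing_share <= MAX_MISSING_SHARE and distinct >= MIN_DISTINCT_VALUES
```

Writing the rule as a function of the data alone means the drop/keep decision cannot be quietly conditioned on which results it produces.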
I’ve been asked several times what I think of Alwyn Young’s recent working paper “Channelling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results”. After reading the paper several times and reflecting on it, I thought I would share some thoughts, with a particular emphasis on what I think it means for people analyzing experimental data going forward.
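For readers less familiar with the method Young’s title refers to: a randomization (Fisher permutation) test computes a p-value by re-randomizing treatment assignment under the sharp null of no treatment effect for anyone, rather than relying on asymptotic approximations. A minimal sketch, assuming a completely randomized design and a difference-in-means test statistic (this is the generic procedure, not Young’s specific implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_test(outcome, treated, n_perm=10_000, rng=rng):
    """Fisher randomization test for the sharp null of no treatment effect.

    Repeatedly shuffles the treatment labels and compares the observed
    difference in means to the resulting permutation distribution.
    """
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    obs = outcome[treated].mean() - outcome[~treated].mean()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(treated)  # re-randomize assignment
        diff = outcome[perm].mean() - outcome[~perm].mean()
        if abs(diff) >= abs(obs):
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction avoids p = 0
```

Because the p-value comes from the actual randomization distribution, its validity does not depend on large-sample approximations, which is central to Young’s comparison with conventional robust standard errors.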
Chris Blattman published an excellent (and surprisingly viral) post yesterday with the title “why I worry experimental social science is headed in the wrong direction”. I wanted to share my thoughts on his predictions.
“Take experiments. Every year the technical bar gets raised. Some days my field feels like an arms race to make each experiment more thorough and technically impressive, with more and more attention to formal theories, structural models, pre-analysis plans, and (most recently) multiple hypothesis testing. The list goes on. In part we push because we want to do better work. Plus, how else to get published in the best places and earn the respect of your peers?
It seems to me that all of this is pushing social scientists to produce better quality experiments and more accurate answers. But it’s also raising the size and cost and time of any one experiment.
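One of the newer demands Blattman lists, multiple hypothesis testing, is easy to illustrate. When a paper tests many outcomes, the chance of at least one spurious “significant” result grows, so adjusted p-values are reported. A minimal sketch of one standard correction, the Holm step-down procedure (chosen here as a familiar example; it is not singled out in the quoted post):

```python
def holm_adjust(pvals):
    """Holm step-down adjustment for a family of p-values.

    The k-th smallest p-value (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and adjusted values are forced to be non-decreasing
    in the sorted order so the step-down logic is preserved.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, adj)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted
```

With three outcomes and raw p-values of 0.01, 0.04, and 0.03, the adjusted values are 0.03, 0.06, and 0.06: only the first outcome survives at the 5% level, which is exactly the kind of tightening of the evidentiary bar the quote describes.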