When Development Impact shut down for August, I had ambitious goals. Unfortunately I didn’t meet them all (why does that always happen?). However, I did manage to madly review almost 60 proposals for the funding of prospective impact evaluations financed by various organizations and donors. Many of these proposals were excellent (unfortunately not all could be funded), and it was surprisingly informative to read so many proposals in such a condensed time. Among the unsuccessful proposals (only in my eyes of course; my review was only one input into the final decision) there were several categories of error that repeatedly arose. You (our reader) may find these errors to be rather basic, but I encountered them so often that I think they bear reviewing here.
Not all quasi-experimental designs are created equal
Fortunately there was little confusion over the use of the phrase “experimental design” (alas, a handful of proposals stated they would conduct a randomized study but instead proposed drawing a random sample from a population to conduct an observational study – hopefully this conceptual confusion will grow ever smaller over time). The majority of the proposals, however, did not put forth randomization as the mode of identification, and virtually all of these proposals described their study as quasi-experimental.
So far so good. There are many rigorous quasi-experimental methods that come close to approximating the identifying conditions enabled by randomization, and I am by no means a “randomization or the highway” reviewer. However, a simple comparison among program beneficiaries and non-beneficiaries, without any attempt to understand the forces of selection that determine program participation, does not pass the evidentiary bar. And unfortunately these types of descriptive proposals were numerous.
At first I was confused by the “quasi-experimental” designation for these studies. I am used to the term referring to methods such as regression discontinuity, IV, difference-in-differences with matched pre-intervention trends, etc. In other words, methods that attempt to infer causality on the basis of a handful of explicit (and sometimes testable) assumptions. A well-reviewed book by colleagues from last year, Impact Evaluation in Practice, has a good introduction to “the IE toolkit” and the methods that I have come to know as quasi-experimental.
So why was the designation “quasi-experimental” applied to purely descriptive studies? Well I learned that, at least in relation to other fields, my designation of “quasi-experimental” is too narrow.
For example, the broader social research field defines a quasi-experimental study as any attempt at causal inference where the researcher does not have control over treatment assignment. This includes what I would consider a naïve comparison of means. Within the emergent program of identification methods under the rubric “Impact Evaluation”, however, there is a decided hierarchy of rigor among the various methods. In other words there is a dividing line between quasi-experimental and “queasy-experimental” methods (I took that nifty designation from the link above).
Study context often dictates the applicable study method and sometimes there may not be much choice – I was highly favorable towards a proposed interrupted time series analysis because of the importance of the study question and the contextual inability to construct a more rigorous counterfactual. Nevertheless a call for Impact Evaluation Proposals will place a high premium on internal validity. If the analytic method at the heart of your proposal is not rigorous enough given the question and context, it is probably not worth the time to submit to an IE funding window.
An informed power analysis includes consideration of cluster effects/design effects
A power analysis in a proposal for a prospective evaluation is a good thing. In fact any proposal without an attempt at a power analysis has a major strike against it – I start to wonder if the authors are hiding something or whether they understand the principles behind the determination of sufficient study size. Fortunately few proposals had absolutely no power analysis, although a disturbing number of submissions ignored two fairly fundamental and related aspects:
If inference is to be made on the basis of survey data, there is almost always a relevant cluster effect that will affect the determination of sufficient sample size. (This is the design effect in the public health lingo.) A surprising number of studies completely ignored this, even when it was clearly applicable given the nature of their sampling plan or the nature of treatment exposure.
On a related note, if there are relatively few study clusters (say fewer than 30) then we are less likely to be in an asymptotic world, and the power analysis should reflect this. Berk and I have discussed this general issue before here, here, and here.
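To make the first point concrete, here is a minimal sketch of how a cluster effect inflates the required sample size. It assumes equal-sized clusters, a two-arm comparison of means with the effect expressed in standard deviation units, and the standard design effect formula 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intra-cluster correlation. The function names and the example values are illustrative assumptions, not figures from any proposal. Note also that the calculation relies on normal critical values, which is exactly the asymptotic approximation that becomes optimistic when there are few clusters.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def n_per_arm_srs(effect_sd: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Per-arm sample size under simple random sampling (no clustering).

    Two-sided test of a difference in means, effect in SD units,
    using normal (asymptotic) critical values.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return 2.0 * (z_alpha + z_beta) ** 2 / effect_sd ** 2


def clusters_per_arm(effect_sd: float, cluster_size: int, icc: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Clusters needed per arm once the design effect is applied."""
    n_clustered = n_per_arm_srs(effect_sd, alpha, power) * design_effect(cluster_size, icc)
    return math.ceil(n_clustered / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: 0.2 SD effect, 20 respondents per cluster, ICC of 0.05.
    print(design_effect(20, 0.05))
    print(math.ceil(n_per_arm_srs(0.2)))
    print(clusters_per_arm(0.2, 20, 0.05))
```

With these hypothetical inputs the design effect is 1.95, so the roughly 393 respondents per arm that a naive calculation suggests become about 39 clusters of 20 (780 respondents) per arm. A proposal that ignores clustering here would be underpowered by nearly half, and with a cluster count this close to 30 the normal approximation itself deserves scrutiny.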
The stipulated content or format of the proposal is not meant to be optional
You (the author) may have a sterling 35-page CV, but if the proposal format stipulates a maximum 5-page CV for all PIs, the submitted sterling 35-page CV will likely do more harm than good. More importantly, calls for proposals typically stipulate a specific researcher bio or CV format. Often the funding agency wants the researcher to explicitly list other IE or IE-like research experience involving study design, management, analysis, and dissemination. Ignoring these formats forces the reviewer to comb through a generic CV and infer the standardized information. This can lead to errors and, as you might expect, does not generate any good will from the reviewer.
If the proposal guidelines ask all submitters to discuss a specific issue, such as the possibility of Hawthorne or John Henry effects, then the proposal needs to include this discussion. If the particular issue doesn’t apply, then the researcher should argue why not. It simply doesn’t look good for a proposal to ignore a request for information or discussion when the previous eight proposals that I just read all satisfied the same request. I start to wonder how interested the authors actually are in the proposed study, or whether they understood the issue being raised.
So, yes, the above observations can be considered relatively elemental. However when a deadline is approaching and a team rushes to coordinate all inputs in order to create a seamless proposal, I imagine that even these lessons can be forgotten. Please don’t. And I hope I never forget as well(!).