If you are like most people working with quantitative data in development, getting too many statistically significant results is probably not your most pressing problem. On the contrary, if you are lucky enough to find a star, whether it's of the 1%, 5% or 10% type, there are plenty of star-killers to choose from. In what is perhaps the only contribution to the rare genre of 'econometrics haiku', Keisuke Hirano reflects on one of them: T-stat looks too good // Try clustered standard errors - // Significance gone (quoted in Angrist and Pischke's Mostly Harmless Econometrics).
Nevertheless, here is another star-killer: a trial registry. A trial registry is a publicly available database where researchers and evaluators can register a research design prior to initiating a study. Outcome measures and subgroups are crucial aspects to register, and importantly, any future changes to a registration are visible to everyone. Medical researchers have done this for a long time, and trial registration has been required in the US since the Food and Drug Administration Modernization Act of 1997.
In a recent paper, my co-authors and I pick up on earlier suggestions that hypotheses should be registered in advance (e.g. Duflo et al 2007). We argue that a trial registry for non-medical development interventions can create two levels of credibility for results. If the outcome, subgroups and analysis strategy have been specified and registered in advance, the results belong to level one. If, on the other hand, there is no registration, as with secondary analysis of the data, then the results belong to level two. Secondary analysis is useful and should continue, but the opportunity to achieve level-one credibility should not be forgone.
One reason for establishing a trial registry is that the available evidence is likely to be significantly biased. DeLong and Lang (1992) look at the distribution of significance levels and find that published research is biased toward statistically significant results. Two possible reasons are data mining and journal acceptance bias. A trial registry can serve as a check against data mining and can give an overview of all studies on a topic, thereby reducing publication bias, or at least helping us assess its magnitude. Edward Glaeser of Harvard once argued that randomized controlled trials make data mining "essentially disappear" (Glaeser 2006, page 20). We disagree, for the simple reason that RCTs are often based on large LSMS-style surveys, which allow considerable flexibility in the choice of outcome measures. Moreover, analyses of subgroups are common. On top of that, a trial registry will be relevant not only to researchers running randomized controlled trials, but to all studies involving targeted data collection.
Fortunately, we have heard that a trial registry is currently under way. In our paper, we review the medical literature on trial registries. Building on that paper, I present three mistakes that fifty years of trial registry history in medicine suggest should be avoided.
Mistake number one: Include only a sub-discipline, like development economics
If a trial registry does not include all trials within a category of interest, it cannot act as a bulwark against publication bias. Because of this, the builders of a trial registry should strongly consider opening the registry to social science in general. Even though this is a larger and certainly slower task, it may be well worth the effort, even for the narrow discipline of development economics. Developments in medicine were surprisingly slow. The first trial registry in medicine was established in the mid-1970s and included only clinical trials related to cancer. Even then, it was clear that a comprehensive trial registry for medicine would be more beneficial than separate sub-registries, but it was not until 2000 that a comprehensive trial registry, ClinicalTrials.gov, became available (Dickersin and Rennie 2003). Furthermore, use of this registry only took off in 2005. One reason for the delay was that several subfields maintained their own registries, and with these in place the incentive for common action simply wasn't there.
For development interventions, it is highly likely that many studies could be registered in more than one registry. Today, a trial on the returns to education in Malawi might fit both the current trial registry for education and a future trial registry for development interventions. So even when thinking about development economics in isolation, there are benefits to being inclusive.
Mistake number two: Fail to align researcher incentives
A trial registry alone is unlikely to make a difference. A necessary second step is that researchers actually register their trials. The history of trial registries in medicine suggests that they will do so only if it matters for the likelihood of publication. The graph below shows cumulative registrations in ClinicalTrials.gov from 2000 onwards. During the first five years, the number rises steadily by about 390 new registrations per month. From September 13th, 2005, this rate jumps to about 1,300 new registrations per month. On that date, the International Committee of Medical Journal Editors enacted a policy under which no article building on original trial data would be accepted if the trial was unregistered. Even though the ICMJE is a closed organization with few members, its Uniform Requirements for Manuscripts, which include the trial registration requirement, were quickly adopted by many journals; today the list counts 850 medical journals. Even without any legal or financial power, this was a true game changer.
In social science, the same may very well happen if the organizers of the registry are successful in gathering the top journals from various fields, agreeing on common standards for input and prioritizing participation of important stakeholders over speed of implementation. Getting donors on board and mobilizing support among governments in developing countries should also help.
Mistake number three: Collect low quality data
The quality of the registry will depend on the quality of the data in it. ClinicalTrials.gov has adopted a wide range of procedures to ensure data quality and credibility:
- All entries are checked manually for consistency
- All entries must be approved by the submitting institution's organizational account holder. This guards against spurious registrations.
- Several procedures guard against double registration: sponsors must confirm entries, automatic searches check for similarities, and a system of unique IDs enables checking against other registries.
- Spell checkers check both entries and search strings against a dictionary of medical language. AEA keywords could serve as a starting point for such a dictionary.
Necessary but not sufficient
We should not have too high expectations, and we should be careful that a trial registry does not instill a false sense of security. A trial registry will not solve all of our credibility issues. Further, its effective implementation is likely to require time and money. But being clear about the arguments in favour of a registry and drawing on the experience from medicine should provide a good starting point.
Ole Dahl Rasmussen, PhD Student at University of Southern Denmark and a microfinance and evaluation advisor to DanChurchAid.
Missing your Friday links? – follow David on twitter @dmckenzie001