Published on Development Impact

Guest Post by Sebastian Galiani: Replication in Social Sciences: Generalization of Cause-and-Effect Constructs


I agree with the general point raised by Berk in his previous post on this blog (read it here). We need to discuss when and how to conduct scientific replication of existing research in the social sciences. I also agree with him that, at least in economics, pure replication analysis (which, in my view, is the only genuine replication analysis) is of secondary interest; I hope to return to this issue in a future contribution to this blog. Instead, I believe that we should emphasize replication of relevant and internally valid studies in both similar and different environments. There is now excessive confidence in the knowledge gathered by a single study in a particular environment, perhaps as a result of a misconstruction of the virtues of experimentation in the social sciences. As Donald T. Campbell once wrote (1969):

“Too many social scientists expect single experiments to settle issues once and for all. This may be a mistaken generalization from the history of great crucial experiments in physics and chemistry. In actuality the significant experiments in the physical sciences are replicated thousands of times…. Because we social scientists have less ability to achieve “experimental isolation,” because we have good reasons to expect our treatment effects to interact significantly with a wide variety of social factors many of which we have not yet mapped, we have much greater needs for replication experiments than do physical sciences”.

In general, we may not presume that an estimated causal relationship is universally true in the sense that it holds under all conditions with all types of people and in any circumstance. All causal statements are inevitably contingent. Thus it is clearly useful to learn as much as possible about these contingencies and, where possible, identify the relationships that hold more consistently than others.
Any causal study faces two sources of threats to its validity: internal and external (see Campbell, 1957). Most of the research effort is normally devoted to dealing with threats to internal validity. This refers to whether one can validly draw the inference that, within the context of the study, the average differences in the dependent variables were caused by the differences in the relevant explanatory variables. External validity, instead, is concerned with the extent to which a causal relationship holds over variations in persons, settings and time. Thus, whenever it is possible, once an identification strategy for a causal construct is deemed reasonably valid internally, it is worth inquiring about the external validity of the results obtained.
As noted by Fisher (1935) in his seminal work:
“Any given conclusion… has a wider inductive base when inferred from an experiment in which the quantities of other ingredients have been varied, than it would have from any amount of experimentation in which these had been kept strictly constant”.
Related to external validity is the idea of causal generalization, which is concerned with specifying the range of application of a causal mechanism that has been identified with at least one instance of a treatment and outcome and at least one sample of persons and settings. In practice, there is a sense in which all causal generalization is about interpolation and extrapolation. Such an exercise inevitably relies on social science theory (in addition to statistical theory). Rubin (1992) suggests that causal generalization is about estimating a response surface, i.e., mapping a third variable to an estimated causal relationship. Even though this is difficult to attain in practice, a response surface is a useful way to think about causal generalization (see Shadish, Cook and Campbell, 2002).
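To make the response-surface idea concrete, here is a minimal sketch, not Rubin's own procedure. It assumes we already have a treatment-effect estimate and standard error from each of several replications, plus one site-level moderator (the "third variable"), and fits a precision-weighted line through them. The function names and all the numbers are hypothetical and for illustration only; they are not taken from any study cited here.

```python
from typing import List, Tuple


def fit_response_surface(
    moderator: List[float], effects: List[float], std_errors: List[float]
) -> Tuple[float, float]:
    """Fit effect = intercept + slope * moderator by weighted least squares.

    Each replication is weighted by the inverse of its sampling variance,
    so more precise estimates count for more. Hypothetical helper.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    m_bar = sum(w * m for w, m in zip(weights, moderator)) / w_sum
    e_bar = sum(w * e for w, e in zip(weights, effects)) / w_sum
    s_mm = sum(w * (m - m_bar) ** 2 for w, m in zip(weights, moderator))
    if s_mm == 0:
        raise ValueError("the moderator must vary across replications")
    s_me = sum(
        w * (m - m_bar) * (e - e_bar)
        for w, m, e in zip(weights, moderator, effects)
    )
    slope = s_me / s_mm
    intercept = e_bar - slope * m_bar
    return intercept, slope


def predict_effect(intercept: float, slope: float, moderator_value: float) -> float:
    """Read the fitted surface at a new value of the moderator."""
    return intercept + slope * moderator_value


def is_extrapolation(moderator: List[float], moderator_value: float) -> bool:
    """True when the new setting lies outside the range actually studied."""
    return not (min(moderator) <= moderator_value <= max(moderator))


# Made-up inputs: three replications that differ in one site characteristic.
site_moderator = [0.2, 0.5, 0.9]
site_effects = [0.30, 0.20, 0.05]
site_std_errors = [0.10, 0.08, 0.09]

b0, b1 = fit_response_surface(site_moderator, site_effects, site_std_errors)
for new_site in (0.6, 1.2):
    label = "extrapolation" if is_extrapolation(site_moderator, new_site) else "interpolation"
    print(f"moderator={new_site}: predicted effect {predict_effect(b0, b1, new_site):.3f} ({label})")
```

With only a handful of replications, such a fit is at best suggestive. The value of the sketch is that it separates interpolation within the range of settings already studied from extrapolation beyond it, where the prediction rests mainly on theory.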
From this perspective, Cruces and Galiani (2007) investigate the extent to which the cause-and-effect construct identified by Angrist and Evans (1998) can be generalized to the context of two developing countries where, compared to the US, fertility was known to be higher and female education levels lower. That is, they investigate whether childbearing also leads to a reduction in female labor supply in such different socioeconomic environments. They find that the estimates for the US can be generalized both qualitatively and quantitatively to Mexico and Argentina.
In the same spirit, Galiani et al. (2014) provide empirical evidence on the causal effects that upgrading slum dwellings has on the living conditions of the extremely poor in El Salvador, Mexico and Uruguay. This paper experimentally evaluates the impact of a housing project run by the NGO TECHO, which provides basic pre-fabricated houses to members of extremely poor population groups in Latin America. The main objective of the program is to improve household well-being. The findings of the study show that better houses have a positive effect on overall housing conditions and general well-being. In two out of the three countries, the research team also documents improvements in children’s health. What is more, the one case in which better housing does not seem to have health effects among children is the one in which the experiment took place in a better, more urbanized environment in which services were more accessible. There are no other noticeable robust effects on the possession of durable goods or on labor outcomes. The results of this study are unusually robust in terms of both internal and external validity because they are derived from similar experiments in three different Latin American countries.
Thus, I believe there is a potentially high reward in replicating valid empirical strategies for relevant cause-and-effect constructs, and this kind of study should receive much more attention among both academics and policy makers. Ultimately, the external validity of causal estimates is established by replication in new data sets (Angrist, 2003). In addition, external replication of reasonably valid identification strategies would lead, to the extent that it is possible, to causal generalization. In conclusion, replication studies would broaden our knowledge about our cause-and-effect constructs of interest.
Angrist, J. (2003): “Treatment Effect Heterogeneity in Theory and Practice”, NBER WP 9708, Cambridge, MA, US.
Angrist, J. and W. Evans (1998): “Children and their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size”, American Economic Review 88(3), pp.450-77.
Campbell, D. T. (1969): “Reforms as experiments”, American Psychologist 24, pp. 409-29.
Campbell, D. T. (1957): “Factors relevant for the validity of experiments in social settings”, Psychological Bulletin 54, pp. 297-312.
Cruces, G. and S. Galiani (2007): “Fertility and female labor supply in Latin America: New causal evidence”, Labour Economics 14, pp. 565-73.
Fisher, R. (1935): The Design of Experiments, Oliver and Boyd, London.
Galiani, S., P. Gertler, R. Cooper, S. Martinez, A. Ross and R. Undurraga (2014): “Shelter from the storm: Upgrading housing infrastructure in Latin American slums”, Mimeo.
Rubin, D. (1992): “Meta-Analysis: Literature Synthesis or Effect-Size Surface Estimation?”, Journal of Educational Statistics 17, pp. 363-74.
Shadish, W., T. Cook and D. Campbell (2002): Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Houghton Mifflin Company, New York.
Sebastian Galiani is Professor at the Department of Economics at the University of Maryland. He is a research affiliate to the NBER, BREAD and JPAL. He is also Visiting Professor at Universidad Torcuato Di Tella in Argentina where he is Co-Director of LICIP, a research lab on crime, institutions and policies.
