Increasingly, impact evaluations are moving beyond simply testing a single treatment against a control group and towards testing competing interventions designed to achieve a given outcome. Examples include Berk’s work on testing conditional versus unconditional cash transfers, Drexler et al.’s work testing standard business training versus a simplified rules-of-thumb approach, and Banerjee et al.’s work comparing a teacher helper (balsakhi) to computer-assisted learning for remedial education. These types of studies are important in helping us choose among a menu of policy options.
A second class of studies employs what is known as a 2x2 design, in which one group gets treatment A, another gets treatment B, a third group gets both A and B, and the fourth group is the control group. As I discussed earlier this year, historically these were done to allow researchers to piggyback a second or third experiment on top of an existing one, so that the additional intervention could be tested more cheaply. Think of Pascaline Dupas’ work on Sugar Daddies, which was implemented in schools that were also part of an evaluation of an HIV prevention curriculum, or Dean Yang and co-authors layering an experiment which offered discounts to remitters on top of one which offered remitters bank accounts with different degrees of control. These studies tested for complementarities between the treatments, but complementarities were not the focus of the work, and power to detect them appears to have been low. I think the common approach here is a footnote in one paper acknowledging that a second intervention was done, noting there was no significant interaction, and thereby justifying saving the other treatment for a second paper.
However, there are a number of interventions where testing for complementarities between treatments is an explicit research question. For example, Gine and Mansuri ask whether business training is more or less effective when coupled with access to credit; in an experiment in Jordan, David and co-authors ask whether the combination of soft skills and a wage voucher is more effective than either alone; and in ongoing work in Sri Lanka, David and co-authors are testing whether labor and capital, labor and skills, and capital and skills are complementary treatments. To date, though, few studies have found evidence of strong complementarities.
Financial Literacy for Migrants in Indonesia
A recent experiment we conducted with Yoko Doi of the World Bank’s Indonesia country office is an exception, and provides a case where the whole is greater than the sum of the two parts. The research paper is here, while a 2-page impact note summarizes some of the policy implications. In this post we’ll concentrate on the findings concerning complementarity.
The context is a pilot program on financial literacy for overseas migrant workers and their families, developed as a partnership between the Government of Indonesia and the World Bank and implemented in the Greater Malang area and Blitar District of East Java. There has long been a concern that families have trouble saving enough of the remittances they receive, and so the training was designed to enhance financial knowledge and build up savings. The key question, then, was who should be trained. Standard approaches either train just the migrants (in pre-departure seminars or at destination) or just their remaining family (typically once the migrant has left), so we wanted to know which would work best, and whether training both the migrant and their family would be even more successful.
Our pilot worked with 400 Indonesian migrant workers and their households. Almost all of the workers were women, about to go abroad to work as housemaids in Hong Kong, Malaysia, Singapore, and Taiwan. Workers were randomly assigned into one of four groups:
· Group A: financial literacy training was provided to the migrant worker only.
· Group B: financial literacy training was provided only to the family member of the worker who would be responsible for receiving remittances.
· Group C: financial literacy training was provided to both the migrant and their family member.
· Group D: control group, in which neither the migrant nor their family member was trained.
If we were following the first approach discussed above, we would concentrate on comparing A vs B as alternative ways to raise savings. If we wanted to see which approach is most effective, we might compare A vs B vs C. But if we want to see whether there are complementarities, we would also like to test whether C > A + B, where each letter stands for that group’s impact relative to the control group D. That is, we ask whether the effect of getting both treatments is greater than one would predict from adding up the effects of the two treatments alone.
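To make the C > A + B comparison concrete, here is a minimal sketch of how the interaction could be computed from raw group outcomes. It is not the estimation code from our paper: the function name `complementarity`, the dictionary keys, and the toy data are all made up for illustration, and the standard error assumes four independent groups with no clustering or covariates.

```python
from math import sqrt
from statistics import mean, variance


def complementarity(a, b, c, d):
    """Estimate the interaction (complementarity) term in a 2x2 design.

    a: outcomes for Group A (migrant trained only)
    b: outcomes for Group B (family member trained only)
    c: outcomes for Group C (both trained)
    d: outcomes for Group D (control)
    """
    # Each treatment effect is measured relative to the control group.
    effect_a = mean(a) - mean(d)
    effect_b = mean(b) - mean(d)
    effect_c = mean(c) - mean(d)

    # What we would predict for the combined treatment if effects were additive.
    predicted_sum = effect_a + effect_b

    # Positive values mean the whole exceeds the sum of the parts.
    interaction = effect_c - predicted_sum

    # Standard error of mean(c) - mean(a) - mean(b) + mean(d),
    # assuming the four groups are independent samples (an assumption).
    se = sqrt(sum(variance(g) / len(g) for g in (a, b, c, d)))

    return {
        "effect_a": effect_a,
        "effect_b": effect_b,
        "effect_c": effect_c,
        "predicted_sum": predicted_sum,
        "interaction": interaction,
        "se": se,
    }


# Made-up 0/1 indicators for "household has savings"; NOT the study's data.
group_a = [0, 0, 1, 1]
group_b = [0, 1, 0, 1]
group_c = [1, 1, 1, 1]
group_d = [0, 0, 0, 1]

result = complementarity(group_a, group_b, group_c, group_d)
print(result)
```

With no other controls, this interaction estimate is the same number as the coefficient on the A×B term in a regression of the outcome on dummies for A, B, and their product. In practice one would run that regression so that covariates and appropriate standard errors can be added.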
So what do we find?
The figure below shows some illustrative results. Training both the migrant and their family leads to more financial awareness, more financial record-keeping, a greater likelihood of savings, and a lower likelihood of taking a loan than training just one or the other. Savings levels are also higher with the combined treatment (not shown on the graph). But more than that, the purple bars, which show the impacts of the combined treatment, show stronger impacts than the green bars, which show what one would predict from just adding up the effects of the two separate treatments. That is, the whole is more than the sum of the parts!
Note that we only measure outcomes for the migrant’s family, not the migrant (who is overseas at the time of the follow-up survey), so migrant-only training may have had more impact on the migrant than on the family remaining in Indonesia. Nevertheless, the results are striking for two reasons. First, existing financial literacy interventions have struggled to find any impacts on outcomes like saving. We think our context is an example of intervening at a “teachable moment”, where households potentially had both the interest to learn about money management and the opportunity soon after to put what was learned into practice. Second, the complementarities suggest that both the desire to change savings behaviors and the ability to put this new knowledge into effect may be much greater when both the remittance sender and the receiver have this training.
The existence of these complementarities is significant for both research and policy. On the research side, our study emphasizes the need to think harder at the design stage about possible complementarities (noting that these could also be negative!). But there is a greater point to be made for policy. If such complementarities exist, then identifying them is pure arbitrage: bonus bang for the marginal policy $. Anyone else know of studies which have found strong complementarities in a 2x2 design?