Coauthored with Raka Banerjee and Talip Kilic
In case you missed it, Part I of this two-part blog post outlines the main reasons that you should consider incorporating Computer Assisted Personal Interviewing (CAPI) into your survey efforts. We’ll now try to even things out by going over the many pitfalls to watch out for when switching to CAPI.
First of all, simple mistakes can be very costly. Imagine if you run into a problem with your paper questionnaire mid-way through the fieldwork process. You get the word out to your supervisors, they tell your enumerators, and then your enumerators fix the errors in each copy of their questionnaire – it’s a bit of a pain, but it’s not too difficult. With CAPI, however, you will have to have your electronic questionnaire reprogrammed (meaning a programmer needs to be kept on-duty for just such an occasion), and then somehow get the revised versions of the questionnaire to all your teams, even those teams in far-flung rural areas with little or no internet access. Meanwhile, the entire fieldwork process may have to be shut down while the teams wait for the revised application, whereas with a paper questionnaire the enumerators could have instantly made the fix and continued with their work.
At the end of the day, with CAPI, you will only be as good as your programming. The paper we mentioned yesterday by Caeyers and co. contains an interesting unintended experiment: the programmers forgot 13 validation checks when programming the CAPI. Their findings: no significant difference between CAPI and pen and paper for the questions that were missing those checks.
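To make "validation check" concrete, here is a minimal sketch in Python of the kind of range and consistency rules a CAPI application might run on each interview. It is only an illustration: the field names (`age`, `owns_land`, `plot_area_ha`, `household_size`, `n_children`) and the thresholds are hypothetical, and real CAPI packages express such rules in their own syntax.

```python
def validate_response(record):
    """Return a list of problems found in one interview record.

    All field names and limits below are hypothetical examples,
    not taken from any actual questionnaire.
    """
    problems = []

    # Range check: age must be present and plausible.
    age = record.get("age")
    if age is None or not (0 <= age <= 120):
        problems.append("age missing or out of range (0-120)")

    # Skip-pattern check: no plot area should be reported
    # if the household says it owns no land.
    if record.get("owns_land") == "no" and record.get("plot_area_ha") not in (None, 0):
        problems.append("plot area reported but household owns no land")

    # Cross-question consistency check.
    size = record.get("household_size")
    children = record.get("n_children")
    if size is not None and children is not None and children > size:
        problems.append("more children than household members")

    return problems
```

If a rule like one of these is left out of the program, the tablet will accept the inconsistent answer just as silently as a sheet of paper would, which is exactly the situation in the unintended experiment.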
Because of the dire consequences of mistakes, the single most important issue to take into account is that CAPI involves high start-up costs – and by high, we mean very, very high. Programming errors are bound to happen, but for the reasons mentioned above, it’s extremely important to minimize them. First of all, this means high programming costs – much higher than programming the data entry application for a paper questionnaire. This also means higher piloting costs, as you’re more likely to have bugs and problems in the piloting stage, which will need to be followed up with more programming. Finally, this means higher training costs. With a paper questionnaire, your enumerators get to see all the questions and understand how they are linked as they conduct the interview. However, with CAPI, your enumerators will need to develop a more intuitive sense of the questionnaire, as all they’ll be physically holding in their hands is a tablet. Electronic questionnaires can be designed to minimize the potential confusion to the enumerator, but this still points to the downside of the multidimensionality allowed for by the CAPI setting. You may also lose some experienced enumerators because of the technology barrier, although this may not be as much of a concern as one might expect. Indeed, an interesting paper by Marcus Bohme and Tobias Stohr suggests that older and less computer-literate folks will take longer to train in CAPI – but makes the point that you need to be mindful that they may have other advantages in the field (e.g. when age proximity matters for interview quality).
In the previous post, we mentioned that you’ll be getting data faster. That’s true, but in some cases, you might also end up with no data. As a device-based technology, CAPI is susceptible to the same sorts of problems as any such technology: if an enumerator’s hard drive fails before s/he has a chance to upload data to the server, it’s not like there is a paper version lying around – that data is simply gone. So you need to have solid backup systems in place. (On the other hand, while we’ve seen goats eat paper questionnaires, they probably don’t like the taste of silicon.)
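As a rough illustration of what one piece of a "solid backup system" could look like, the Python sketch below copies a completed interview file to a second storage location (say, an SD card) and checks that the copy is byte-for-byte identical before trusting it. The function names, file names, and folder layout are our own assumptions for the example; this is not how any particular CAPI package handles backups.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_interview(source, backup_dir):
    """Copy one interview file into backup_dir and verify the copy.

    Hypothetical helper: raises IOError if the copy does not match
    the original, so a corrupted backup is never silently accepted.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies contents and file metadata

    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source.name} does not match the original")
    return target
```

Running something like this at the end of every interview, rather than at the end of the day, limits the loss from a device failure to the interview in progress.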
This leads us to another important issue, which is security. With CAPI, your enumerators will be transferring and backing up confidential data containing sensitive information about identifiable respondents across the internet. If you decide to use a free file sharing system for this, it raises serious questions about whether you can actually deliver the confidentiality that you promised the respondents at the beginning of the interview (for example, one popular service recently suffered a security breach affecting some of the data stored in its cloud). There are solutions for this – Virtual Private Networks (VPNs), secure file transfer protocols such as SFTP, etc. – but this needs to be taken into account at the outset of any CAPI survey effort.
You need to buy computers and/or tablets and/or mobile phones for each enumerator! This has all sorts of implications – most obviously, buying a device for each enumerator costs much more money than printing a bunch of surveys on paper. Depending on your field staff needs for your survey, you may simply have too many enumerators to be able to afford the cost of buying a machine for each one. You are then also dependent upon that computer/tablet/mobile phone to work properly. If an enumerator suffers machine failure, that enumerator is then out of commission until their device can be fixed or replaced. Paper, as an inherently less buggy technology, is far less likely to go wrong. Finally, you’re dependent on the thing that your machine is dependent on – electricity. It’s no secret that many of the places that most need improved data also lack a regular, reliable supply of electricity, and that has consequences for the progress of your fieldwork. This might mean that your teams have to schlep generators around. This might also mean that your survey managers are going to have a hard time ensuring that any updates/revisions to the CAPI application are received in a timely fashion (and without implications for data quality).
Hardware is not the only machine-related concern: software is also an issue. The CAPI software package that you choose for programming your questionnaire can drastically change the parameters of your survey effort. Software packages vary significantly in their strengths and weaknesses, and you have to make the best choice for a given survey based on a number of issues, including sample size, questionnaire length and complexity, the need for visual aids or other tools during the interview process, and many other considerations. To help you with the choice, the Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA) project, in collaboration with the IRIS center at the University of Maryland, recently published a comparative assessment of the existing software programs for the development of CAPI applications, which is freely available at www.worldbank.org/lsms-isa. The World Bank’s LSMS-ISA project and the Development Research Group’s Computational Tools team are also currently in the process of developing a free CAPI software package that is expected to be released to the public within the next year or so (and we’ll blog about that, so you won’t miss it).
Lastly, there is still limited empirical evidence that a field operation implemented using CAPI yields better data quality than a well-supervised pen-and-paper operation with field-based data entry featuring similar consistency checks (and with comparable tools to improve the quantification of non-standard unit-item combinations). We’ve discussed two such papers here; one other paper worth looking at is by Fafchamps and coauthors, which is a bit more agnostic on the quality gains (at least for firm profits and sales).
The central take-home message regarding CAPI at this point concerns the importance of getting it right. There is no computerized substitute for a well-designed, well-supervised fieldwork effort. The key here is that the basic principles of data quality control – accuracy checks, data cleaning, etc. – are no different when built into a CAPI application than when they are implemented in surveys that feature pen-and-paper interviewing with field-based data entry. And CAPI tools are only useful insofar as enumerators and field supervisors take advantage of the available facilities and act on inconsistencies accordingly. CAPI can greatly improve the speed and accuracy of data capture, but to do so, it must be implemented correctly. This means far greater up-front investments in the survey effort prior to fieldwork in order to ensure that the CAPI application is well-designed, bug-free, and correctly incorporates all necessary data quality checks. Ultimately, the decision to use CAPI has to take into account its constraints and disadvantages as well as its potential rewards. The first and foremost goal is, as always, better data.