Paper v Plastic Part I: The survey revolution is in progress



Coauthored with Raka Banerjee and Talip Kilic

In the eternal quest for better data, one of the most exciting modes of data collection in survey efforts across the globe is Computer-Assisted Personal Interviewing, or CAPI. CAPI integrates interviewing and data entry through the use of a handheld device, such as a tablet computer or a netbook, preloaded with an electronic questionnaire. Pre-programmed consistency checks identify potential errors on the fly, so that the enumerator can take corrective action by double-checking the information with the respondent, or by letting the programmer know that there’s something wrong with the code. If it’s done correctly, the result is a marked increase in both the efficiency of the data collection process and the accuracy of the data collected.
CAPI is by no means a new methodology – since the late 1980s, several household surveys have been implemented on CAPI platforms in developed country settings, including in the Netherlands, Norway, the United Kingdom, and the United States. More recently, there has been an increasing number of applications in developing countries; we know of CAPI surveys that have been done in Burkina Faso, China, Colombia, Ethiopia, Ghana, Guatemala, Haiti, Latvia, Kenya, the Philippines, Swaziland, Tanzania, and Uganda. Implementing CAPI in developing country contexts has become increasingly feasible, as the cost of purchasing the necessary equipment to conduct a CAPI survey – mobile phones, PDAs, netbooks, or tablet computers – continues to decrease.
Why should we care? Because CAPI allows survey designers to accommodate idiosyncratic behavior more effectively. Going through a survey no longer has to be a linear or static process – instead, with CAPI, you can use filters to create a whole range of versions of the questionnaire, and these can evolve as the interview evolves. If you try this with paper (aka PAPI – paper and pen interviewing), your enumerators are almost surely going to get stuck in a quagmire of skip-code mistakes. Let’s look at some of the positive features of using CAPI in surveys:
First of all, this multidimensionality (as opposed to linearity) gets you greater precision. Imagine an enterprise survey. The costs for different types of enterprises (retail vs. manufacturing vs. service, for instance) have different structures. You have two main options with paper: generic questions and a lot of training, or lots of skip codes and potential errors. Either way, the data’s unlikely to be good quality. With CAPI, though, the skip codes are automated (and here the key is in the programming) and presto, you can break your enterprises into as many types as you want and ask customized questions based on the type. A nice paper by Bet Caeyers, Neil Chalmers and Joachim De Weerdt compares the error rates of paper and CAPI surveys in a Tanzania data collection experiment. In terms of skip codes, they find that there is an average of 10 routing errors per survey for paper, and 0 for full-blown CAPI.
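To make the idea of automated skip codes concrete, here is a minimal sketch in Python. It is not taken from any particular CAPI package; the question ids, wording, and enterprise types are hypothetical, and a real questionnaire would have far more branches.

```python
# Hypothetical questionnaire: ids, wording, and types are illustrative only.
# "applies_to" is None for questions asked of everyone, or a set of
# enterprise types for questions that are skipped for all other types.
QUESTIONS = [
    {"id": "ent_type", "text": "Type of enterprise?", "applies_to": None},
    {"id": "stock_cost", "text": "Cost of goods bought for resale?", "applies_to": {"retail"}},
    {"id": "raw_materials", "text": "Cost of raw materials?", "applies_to": {"manufacturing"}},
    {"id": "labor_cost", "text": "Wage bill last month?", "applies_to": None},
]


def route(answers):
    """Return the ids of the questions to ask, given the answers so far."""
    ent_type = answers.get("ent_type")
    asked = []
    for question in QUESTIONS:
        applies_to = question["applies_to"]
        # Ask the question if it is universal or matches the enterprise type.
        if applies_to is None or ent_type in applies_to:
            asked.append(question["id"])
    return asked


if __name__ == "__main__":
    print(route({"ent_type": "retail"}))
    print(route({"ent_type": "manufacturing"}))
```

The enumerator never sees the questions that do not apply, so there is no skip instruction to misread; the routing logic lives in one place and can be tested before fieldwork.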
You can also get more detail. Let’s say you cared about the sex as well as the relationship of the person from whom the respondent borrowed money. Or you really wanted to know the 4-digit code of the industry the respondent worked in. Doing this on paper will either require a lot of extra questions or one heck of a code list. In CAPI, you can make a multi-stage code list, which will walk the enumerator quickly through to the precise level of detail you want.
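A multi-stage code list can be thought of as a small tree that the enumerator walks one level at a time. The sketch below assumes a two-level list; the sector names, activity names, and 4-digit codes are placeholders made up for illustration, not an official industry classification.

```python
# Hypothetical two-stage code list. Labels and codes are placeholders.
INDUSTRY_CODES = {
    "Agriculture": {"Cereals": "9001", "Bananas": "9002"},
    "Manufacturing": {"Bakery": "9101", "Clothing": "9102"},
}


def next_step(path=()):
    """Walk the code list along the choices made so far.

    Returns ("choose", options) while more detail is needed, or
    ("code", value) once the path reaches a final code.
    """
    node = INDUSTRY_CODES
    for choice in path:
        node = node[choice]
    if isinstance(node, dict):
        return ("choose", sorted(node))
    return ("code", node)


if __name__ == "__main__":
    print(next_step())                          # first screen: sectors
    print(next_step(("Agriculture",)))          # second screen: activities
    print(next_step(("Agriculture", "Bananas")))  # final code
```

Each screen shows only a handful of options, so the enumerator reaches a detailed code in a couple of taps instead of scanning one long list.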
Another way of enhancing precision is through the media aspect – with CAPI, you can incorporate photos, video and even voice recordings into your survey.  For example, say you want to capture the amount of bananas produced by a farmer. Farmers tend to report bananas in bunches, but this is super-noisy. Using CAPI, your enumerator can pop up photos of three differently-sized bunches with a recognizable object appearing in the picture as a reference point, and the farmer can pick the size that best resembles his typical bunch. Imagine the size of the book of photos your enumerators would be carrying if you tried to do this with paper.
CAPI gives you more flexibility by letting the level of reporting adapt during an interview. Let’s take an agricultural survey. Farmers may think of their inputs as crop-specific, their harvest as parcel-specific, and their sales as aggregate (or not – the point here is different units/levels of aggregation). With CAPI, the enumerator can adjust the reporting level on the fly in order to suit the respondent’s preferences. You may want to force certain types of reporting for analysis purposes, and this too is possible.
You can get better accuracy through automated routing and error checks pre-programmed into the electronic questionnaire. There’s effectively no limit to the number and complexity of data quality control measures that can be built in. On one end of the spectrum, you can program error checks that flag improbable values or inconsistencies between modules in a pop-up window, asking the enumerator to confirm that the respondent really is 116 years old, for example. Or you can ensure that certain questions are automatically enabled or disabled based on the respondent’s answers. On the extreme end, you can program in reference tables of (i) KG-equivalent conversion factors for consumption items and non-standard unit combinations and (ii) caloric conversions for consumption of items in KG-equivalent terms, and compute a basic report for the enumerator to check whether recorded consumption information is within the realm of possibility given the household composition. You can program these either as “hard checks”, where the enumerator can’t go on, or as “soft checks”, where the computer asks them to confirm the improbable answer.
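The difference between hard and soft checks can be sketched in a few lines of Python. The thresholds here (a soft check above 100 years, a hard check outside 0 to 130) are assumptions chosen for illustration; a real survey would set its own limits.

```python
# Assumed thresholds, for illustration only.
SOFT_MAX_AGE = 100   # above this, ask the enumerator to confirm
HARD_MAX_AGE = 130   # above this (or below zero), refuse the entry


def check_age(age, confirmed=False):
    """Classify an age entry as "ok", "soft", or "hard".

    "hard": the entry is impossible and the interview cannot proceed.
    "soft": the entry is improbable and needs explicit confirmation.
    """
    if age < 0 or age > HARD_MAX_AGE:
        return ("hard", "Age must be between 0 and %d." % HARD_MAX_AGE)
    if age > SOFT_MAX_AGE and not confirmed:
        return ("soft", "Age %d is unusual. Please confirm with the respondent." % age)
    return ("ok", "")


if __name__ == "__main__":
    print(check_age(35))
    print(check_age(116))
    print(check_age(116, confirmed=True))
    print(check_age(-3))
```

A soft check lets a genuine 116-year-old through once the enumerator confirms, while a hard check blocks entries that cannot be true under any circumstances.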
Unsurprisingly, this can drastically reduce call backs, as much of the enumerator error in a survey can either be corrected on the spot or avoided entirely thanks to automatic skip patterns. Caeyers and co. again give us some sense of the magnitude of all this in their experiment in Tanzania. The score: full CAPI reduces the number of impossible entries by 0.49 per questionnaire, and the number of improbable entries falls from 1.35 in paper to 0.63 in CAPI. Not huge numbers on the face of it, but think of the cost of dropping the questionnaires with impossible answers -- it can add up quickly. (One interesting aside: this automated checking of errors may weaken enumerators’ skills if they ever have to go back to paper. Caeyers and co. find that the number of years of CAPI experience seems to significantly increase the number of errors on paper questionnaires.)
Yet another benefit of CAPI – enumerators can get instant feedback on their own performance (rather than waiting weeks or months for headquarters to catch onto problems with a paper questionnaire). This can be a powerful learning tool towards helping them improve their interviewing methods, even as the fieldwork progresses. It’s hard to think of another way that this might be achieved, particularly in such a real-time fashion.
Uploading data from a previous round of a panel survey for the purpose of updating the info in the current round  (or correcting wrong data from the previous round!) is cleaner and easier to track on a CAPI platform compared to a survey operation based on paper.
You get your data faster. Yes, you no longer have to pay for data entry or double data entry – instead, all of your data is entered on the spot, dramatically cutting the time that it takes for the raw data to reach you. The knock-on implication of this is that you can get a good look at your data faster, meaning that any issues that are only apparent at the analysis level can be corrected while the survey is still in the field! Not only do you get your data faster, but the survey itself actually takes less time (not counting the preparation time, which is likely to be longer – more on that tomorrow). In their experiment in Tanzania, Caeyers and co. found that for full CAPI, interview times were 10 percent shorter than with PAPI (for an 80-minute survey).
CAPI also may give you the ability to get more confidential data.  Let’s take a set of sensitive questions, such as a module on domestic violence, or on personal sexual history. If your respondent doesn’t feel comfortable talking about these issues with the enumerator, they might ordinarily refuse to answer, maybe even stopping the survey altogether. However, with CAPI, you can hand the tablet to the respondent, allowing them to input answers themselves without worrying about what the enumerator will think. 
Monitoring is a huge improvement with CAPI. Not only can you automatically include date/time/GPS location stamps on every survey (you can even have a time stamp on each question, depending on the software), but with the rapid transmission of data, supervisors at headquarters can notice and correct for any patterns of errors much more quickly. 
CAPI serves as a new frontier in terms of validation for survey design improvement. This is because you can experiment more easily – for example, you can easily randomize the order of drop downs, or the order of sections, allowing for the inclusion of simple validation tests into all types of surveys, and improving general knowledge on best practices in survey design.
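As a rough illustration of randomizing the order of drop-down options, the Python sketch below shuffles a list with a random number generator seeded by an interview id. The function name and the id format are hypothetical; the seeding is a design choice so that the order shown in any given interview can be reproduced later at the analysis stage.

```python
import random


def randomized_order(options, interview_id):
    """Return the options in a random order that is fixed per interview.

    Seeding with the (hypothetical) interview id means the same interview
    always yields the same order, so analysts can reconstruct what the
    respondent saw.
    """
    rng = random.Random(interview_id)
    shuffled = list(options)  # copy, so the master list is left untouched
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    options = ["Bank", "Moneylender", "Relative", "Neighbor", "Cooperative"]
    print(randomized_order(options, "HH-0001"))
    print(randomized_order(options, "HH-0002"))
```

Storing the interview id alongside the answers is then enough to test whether the position of an option affects how often it is chosen.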
Lastly, implementing a survey on a CAPI platform gives the survey designers the opportunity to manage the entire survey process electronically. Aside from the enumerator conducting the interview electronically, with CAPI you also have survey managers assigning clusters of interviews to their survey teams electronically, team leaders assigning individual households to their interviewers electronically, and, perhaps most importantly, survey managers and team leaders tracking progress electronically. There’s even more than this – think of how much easier a tracking effort in a panel survey would be if survey managers could share tracking cases across teams with ease – electronically!
But it’s not all good news . . . Check back on Wednesday for the second half of this two-part blog post, which pours some much-needed cold water on all the pros of CAPI we’ve listed here.

Join the Conversation

July 25, 2012

Great post! One more CAPI project in motion is this survey in South Sudan, conducted by the National Bureau of Statistics there:…

July 24, 2012

This is well-put! I'm looking forward to reading the second half of this blog post! - Michelle McConnaughay, CAI Coordinator, Innovations for Poverty Action