Where babies, I mean, datasets, come from…



For most researchers, datasets come into the world in Stata format.  For those privileged few with the opportunity to collect primary data, conception happens in a concept note or grant proposal, which grows into a survey design, a sample calculation, then a questionnaire.  At this point, though, most of us are content to let the stork intervene, carrying a freshly powdered dataset to our office, ready to be nursed into a strong and gifted paper.[1]  We gloss over the messy birth process, in which flesh-and-blood interviewers and respondents must interact to convert questionnaire into data, bringing with them their shortcomings, prejudices, and other foibles.  In a new working paper, “Interviewer Effects in Subjective Survey Questions: Evidence from Timor-Leste,” I explore some of these human interactions by studying interviewer effects. 

The paper uses data from Timor-Leste, which include standard information on socioeconomic characteristics and consumption-based welfare measures, as well as a specialized module on access to justice.  The latter contained a series of subjective questions on controversial topics: corruption, community dispute resolution, women’s rights, and land claims.  I also draw on an unusual supplement to the data: information on each interviewer’s age, gender, and personal opinions about the same subjective questions.  With these, I address two primary research questions: (1) Are interviewer effects present in the survey data, and, if so, is their magnitude greater for the sensitive subjective questions than for more objective questions?  (2) What conclusions, if any, can be drawn about the nature of these interviewer effects? 

At the risk of spoiling the suspense: interviewer effects were present, and they were more prominent in subjective than in objective questions.  The hypothesis is that, for the more delicate topics, respondents are more likely to give the answer they perceive the interviewer would like to hear (as opposed to, say, reporting the presence of invasive plant species on one’s farm).  Discerning what the interviewer wants to hear could be based on the interviewer’s physical characteristics (such as wanting to appear more progressive on gender issues to female interviewers) or picked up from social cues, intentional or unintentional, given off by the interviewer (such as voice inflection when the questions are asked).  Using the interviewer opinion data, I attempt to separate the two phenomena. 

The full analysis is in the paper for those so inclined, but for the rest I can assure you that it is all properly econometric: fixed-effects logit and multilevel mixed-effects logistic regression models, variance decomposition, empirical best linear unbiased predictors, and so on.  The results show that the interviewer’s observable characteristics explain the majority of the interviewer-effect variation, which is analogous to the subtle cues given off by an interviewer on a question about race mattering less than the fact that the interviewer is black.  The findings also suggest that female respondents are more susceptible to the influence of interviewers’ opinions, and that people are more resistant to influence on questions about which they feel strongly. 
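For readers who want the intuition behind the variance decomposition, here is a minimal sketch, not the paper's actual estimation. It simulates a random-intercept logistic model in which each interviewer shifts the latent propensity to answer "yes," then computes the interviewer share of latent variance using the standard result that a logistic residual has variance π²/3. All parameter values (50 interviewers, 40 respondents each, an interviewer variance of 0.5) are illustrative assumptions, not numbers from the paper.

```python
import math
import random

random.seed(0)

# Illustrative assumptions only; none of these values come from the paper.
N_INTERVIEWERS = 50   # number of interviewers
N_RESPONDENTS = 40    # respondents per interviewer
SIGMA_U2 = 0.5        # interviewer-level variance on the latent (logit) scale

# Simulate binary answers from a random-intercept logistic model:
# a respondent assigned to interviewer j says "yes" with probability
# logit^-1(beta0 + u_j), where u_j ~ N(0, SIGMA_U2).
data = []
for j in range(N_INTERVIEWERS):
    u_j = random.gauss(0.0, math.sqrt(SIGMA_U2))  # interviewer effect
    for _ in range(N_RESPONDENTS):
        eta = -0.2 + u_j                          # latent linear predictor
        p = 1.0 / (1.0 + math.exp(-eta))
        data.append((j, 1 if random.random() < p else 0))

# Between-interviewer spread in observed "yes" rates: nonzero spread here
# is what an interviewer-effects analysis is detecting.
rates = [
    sum(y for i, y in data if i == j) / N_RESPONDENTS
    for j in range(N_INTERVIEWERS)
]
mean_rate = sum(rates) / len(rates)
rate_sd = math.sqrt(sum((r - mean_rate) ** 2 for r in rates) / len(rates))

# Variance decomposition on the latent scale: the logistic residual has
# variance pi^2 / 3, so the interviewer share (intraclass correlation) is:
icc = SIGMA_U2 / (SIGMA_U2 + math.pi ** 2 / 3)

print(f"SD of interviewer-level yes-rates: {rate_sd:.3f}")
print(f"Latent-scale interviewer variance share: {icc:.3f}")
```

In a real analysis the interviewer variance would be estimated from the data (for example with a mixed-effects logit) rather than assumed, and a larger estimated share for subjective questions than for objective ones is the kind of pattern the paper reports.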

While overall the paper is admittedly not the final word on interviewer effects, it does highlight the importance of sniffing a bit closer at newly arrived bundles of data.

[1] At some point later this sweet child becomes the lazy ne’er-do-well who lives in our basement and eats Cheetos all day, and for whom our fondest wish is that it clears the revise-and-resubmit process and never comes back.  But I digress.

