Notes from the field: collecting gender disaggregated data in practice



So I have blogged in the past about the potential and the use of gender disaggregated data, but my work this past week in Ghana made me realize (again and in new ways) how complicated it can get in practice.  

Let me start with a recap as to why you might want to do this. Obviously you want to collect disaggregated data for any kind of gender analysis. For example, if you are doing intrahousehold analysis, you would want separate measures of male and female non-earned income, and the like. But there may also be arguments for doing this if you simply want a better picture of what is going on in the household overall. Let’s take the case of expenditure. When Chris Udry and I did a survey in Ghana, we asked men and women to report separately on their own food expenditure and on the expenditure of their spouse. Taking one of the rounds of data, this gives you three possible measures of consumption (in old Ghana cedis): women’s report of the total (about 195,000), men’s report of the total (around 254,000) and the sum of their own reports (around 299,000). (It turns out there is very little purchasing of the same item, so the fact that the sum of own reports is higher is not due to double counting.) Obviously, these measures give you quite different pictures and, in ongoing work with Michael Boozer and Tavneet Suri, we show that you get significantly different measures of poverty when you use these different measures.
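To get a feel for how far apart these three measures are, here is a back-of-the-envelope sketch in Python. It uses only the rounded figures quoted above, so the percentages are approximate; the variable and function names are mine, not anything from the survey files.

```python
# Rounded figures quoted in the text, in old Ghana cedis.
measures = {
    "women_report_total": 195_000,
    "men_report_total": 254_000,
    "sum_of_own_reports": 299_000,
}


def pct_gap(a, b):
    """Percent by which a exceeds b."""
    return 100.0 * (a - b) / b


# How much larger is the sum of own reports than each measure?
gaps = {
    name: round(pct_gap(measures["sum_of_own_reports"], value), 1)
    for name, value in measures.items()
}

for name, gap in gaps.items():
    print(f"sum of own reports vs {name}: {gap}% higher")
```

On these rounded numbers, the sum of own reports is roughly half again as large as the women’s report of the total, which is the kind of gap that can move a household across a poverty line.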

Now let’s say you want to do this in practice, in the context of a multi-topic/multi-module survey. The ideal way is to have separate interviews for men and women, particularly when information is sensitive. My experience is that it is hard to divine a priori what the sensitive areas will be – for example, it could be that domestic violence is something the couple can talk about together, but the existence of her business is not (I described some of how this worked in practice in this post). I have also leaned towards having women interviewed by women and men interviewed by men, but I haven’t seen good data on whether this actually makes a difference (it obviously would in some contexts, such as when women have restricted contact with outsiders). In addition, the structure of the survey will now be that you have separate modules (or even entire questionnaires) for men and women. And then you need to build in flexibility for those who are not currently married or living with a partner – the big decision here will be whether you ask another adult of the opposite sex in the household to respond or whether you have only the head respond (it depends on what you are after and opens up the larger question of what a household is, which is a topic for another post).

Working this week on another, recent survey in Ghana brought home to me the multiple ways this can be really difficult in practice. What follows is a set of things to watch out for (and additions to this list are more than welcome). First, it is essential that you train the heck out of the enumerators and then go with them during the first set of surveys. Let’s take the example of a roster of all of the plots a household is farming, with men and women reporting in their separate interviews about the farms that they control. Here, we had decided to have one common plot roster for the household, the idea being that this would avoid double counting should the man and the woman claim the same plot as their own. But for enumerators trained in the standard multi-topic surveys this is hard to grasp – they are used to the roster for a given respondent being used only for that respondent. So we faced a problem where some enumerators created separate rosters for men and women, which means a household very likely ended up with two different plots carrying the same number, while households interviewed by enumerators who did it correctly did not. Separate interviews built on a shared roster also generate a host of logistical issues, including the first interviewer (e.g. the one who interviews the husband) getting a copy of the household plot roster to the interviewer of the other person (e.g. the wife). Another issue is the multitude of household structures one encounters in the field. What we did for this was to make an extensive checklist for each possible household situation indicating which modules should be asked. The checklist helps, but my guess is you still end up missing more chunks of the questionnaire than you do in a normal survey. Indeed, my colleague Niklas Buehren pointed out that a better option might be to just print completely different versions of the complete questionnaire for each different household type.
Computer assisted interviews would also go a long way to help solve this problem (even more so if you could move data across interviewers instead of just from the interviewer to the editor).  
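To make the checklist idea concrete, here is a minimal sketch of how a computer-assisted setup might flag skipped modules. The household situations and module names below are entirely hypothetical, made up for illustration; our actual checklist was longer and specific to that questionnaire.

```python
# Hypothetical household situations and module names, for illustration only.
REQUIRED_MODULES = {
    "couple": {"household_roster", "plot_roster", "male_module", "female_module"},
    "single_female_head": {"household_roster", "plot_roster", "female_module"},
    "single_male_head": {"household_roster", "plot_roster", "male_module"},
}


def missing_modules(household_type, completed):
    """Return the required modules that have not been completed, sorted by name."""
    required = REQUIRED_MODULES[household_type]
    return sorted(required - set(completed))


# A couple household where only the husband's interview has come in so far.
print(missing_modules("couple", ["household_roster", "male_module"]))
```

A check like this would only be as good as the mapping behind it, so it still depends on someone working through every household situation in advance, just as the paper checklist does.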

A final problem that we saw (in an earlier survey) is that enumerators who think for themselves will doubt the utility of collecting what could be the same data twice (for example in the cross-reporting on expenditure I discussed above).   It takes really careful training and no small amount of cajoling to get them round to collecting this and not copying it from someone else.

I was also surprised to learn this week that the potential problems do not end with the collection of the data.   Let’s go back to the case of the household plot roster.   If the household has two plots, one controlled by the woman and one by the man, the correctly done plot roster will list two plots numbered one (the woman’s) and two (the man’s).   When the questionnaire is administered, the woman, in her interview, will report on plot 1 and the man on plot 2.   Now let’s take a case where there is a table for the enumerator to fill out on plots within the main body of the questionnaire.   The usual way to do this is to leave a blank column on the left where the enumerator fills in the plot number and then answers questions about that plot across the row.   Here, the first line in the woman’s questionnaire will have the number 1 and in the man’s questionnaire the first line will have number 2.   Now, when this comes to the data editor prior to data entry, the problem arises.   Editors who are schooled in the usual household survey will be perturbed by seeing a plot ID of 2 in the first line of the male’s filled out questionnaire. In our case, these conscientious editors changed the 2 to a 1 and so it was entered.   So what was fixed with the enumerators (problem of two household plots with the same number) was undone in data editing.  
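One way to catch this kind of well-meant renumbering is to compare the plot IDs in each respondent’s questionnaire against the household plot roster. The sketch below uses the two-plot household from the example; it assumes, for simplicity, that the roster records a single controller per plot, and the function and variable names are hypothetical rather than anything from our data entry system.

```python
def flag_plot_mismatches(roster, entries):
    """Return (respondent, plot_id) pairs that the roster does not support.

    roster:  dict mapping plot_id -> respondent who controls the plot
             (assumes one controller per plot, a simplification)
    entries: list of (respondent, plot_id) pairs as entered from the questionnaires
    """
    return [(resp, plot) for resp, plot in entries if roster.get(plot) != resp]


# The household from the example: plot 1 is the woman's, plot 2 is the man's.
roster = {1: "woman", 2: "man"}

as_collected = [("woman", 1), ("man", 2)]  # what the enumerators wrote down
as_edited = [("woman", 1), ("man", 1)]     # after the editor changed the 2 to a 1

print(flag_plot_mismatches(roster, as_collected))
print(flag_plot_mismatches(roster, as_edited))
```

If plots can be jointly claimed, the roster would need to hold a set of respondents per plot and the comparison would change accordingly.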

While collecting gender disaggregated data certainly has its uses, this is indicative of the very creative problems that can arise.   Indeed, these problems are probably increasing in the amount of experience the survey team has in administering more standard questionnaires.   And clearly, every single step of the process needs careful supervision or else there will be a large amount of cleanup work to be done.    



Markus Goldstein

Lead Economist, Africa Gender Innovation Lab and Chief Economists Office

Susan Watkins
March 04, 2013

Markus, do you know the following papers by Mariano Sana and Alex Weinreb?

Sana, Mariano and Alexander A. Weinreb. 2008. Insiders, Outsiders and the Editing of Inconsistent Survey Data. Sociological Methods & Research 36:515-541.

This was an experiment to determine whether missing data and inconsistent responses were better corrected by (1) interviewers (2) supervisors (3) data managers (4) data analysts. There's also a PAA paper, Sana, Weinreb and Guy Stecklov, 2011, that expands on the 2008 piece.

Weinreb, Alexander A. and Mariano Sana. 2008. The Effects of Questionnaire Translation on Demographic Data Analysis. Population Research and Policy Review.

There's also Miller, K., E. Zulu and S. Watkins. 2001. Husband-Wife Survey Responses in Malawi. Studies in Family Planning 32(2):161-174. It shows inconsistent results on topics the spouses should agree on, and the results are consistent across the Penn Malawi data, the MDHS, the KDHS and a Penn Kenya study.

The Penn study in Malawi has also used ethnographic data to learn what people say to each other in informal conversations in their social networks. While this obviously doesn't say which specific response is likely to be incorrect, it does provide some information on what sorts of questions are likely to be biased.

Markus Goldstein
March 07, 2013

Many thanks, Susan. This is clearly worthy of another post!