Time use data are widely recognized as difficult to collect – expensive and complicated. Yet time use data are critical for understanding, among other things, productivity in income-generating activities, as well as the constraints people (especially women) face in allocating time and pursuing opportunities. A couple of weeks ago, Florence blogged on new work on valuing people’s time, particularly in income activities without a wage or pay slip. Here I focus on the challenges of measuring time use, many of which are discussed in this UN Women report produced from the impressive work of Nancy Folbre and Jacques among others.
Despite the demand for such data, there are big gaps in the availability of time use data in nationally representative surveys. Filling this gap will likely require new ways of collecting these data that are cheaper but still reliable. Folbre has called for more methodological analysis and efforts to hybridize time diaries and activity lists, that is, to develop ‘light diaries’. This is what is offered in the new NBER working paper by Field, Pande, Rigol, Schaner, Stacy, and Moore.
Field et al. develop and test a short, low-cost time use data collection method which they refer to as a “hybrid” approach since it combines assisted time diary elements with stylized survey questions. They compare it to measures from a traditional time use approach (the survey-based assisted diary method used by India’s National Sample Survey) and a ‘gold standard’ approach which entails brief 2-3 minute interview visits every hour for a day.
Spoiler: the hybrid approach produces data that (depending on the measure) are closer to the gold standard, and it is simpler for low-literacy respondents, requires less enumerator training, and is shorter to field than the traditional approach. There are, however, shortcomings, which are discussed below.
Designing the hybrid approach
To design their hybrid, Field et al. conducted open-ended, semi-structured conversations which focused on where respondents reported undertaking activities (inside or outside the dwelling, to better understand mobility) and for what purpose activities were undertaken (described as production, consumption, or both). Consistent with my priors, this detailed study from Sri Lanka, and on-going debates about low female labor force participation in India, they find that what a researcher would label as an income-generating activity (such as caring for livestock, which sometimes earned money through sales of output like milk or eggs) would often be classified by women respondents as a household chore. They also noted the potential to miss time spent caring for children while undertaking other activities, which they call passive childcare.
Informed by these discussions, they develop their hybrid approach, which starts with respondents narrating, in chronological order, the activities they undertook on the previous day. Enumerators then convert respondents’ narratives into stylized time use categories – allocating 24 “hour” tokens to 8 major activities which were represented by pictures. This spares respondents from aggregating time themselves (such as adding up total hours spent cooking) and does not require literacy. But it requires rounding up to whole hours. Enumerators also probed about passive caregiving and aggregated that.
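To make the rounding trade-off concrete, here is a minimal sketch of turning a narrated day into 24 whole-hour tokens. It is not the authors' procedure: in the study enumerators do this conversion by hand, and the largest-remainder rounding rule, the function name `allocate_hour_tokens`, the activity labels, and the example minutes are all assumptions made up for illustration.

```python
from collections import defaultdict


def allocate_hour_tokens(episodes, total_tokens=24):
    """Allocate whole-hour tokens to activities (hypothetical rounding rule).

    episodes: list of (activity, minutes) pairs covering the narrated day.
    Uses largest-remainder rounding so the tokens always sum to total_tokens.
    """
    minutes = defaultdict(float)
    for activity, mins in episodes:
        minutes[activity] += mins

    day_minutes = sum(minutes.values())
    if day_minutes <= 0:
        raise ValueError("episodes must cover a positive amount of time")

    # Fractional number of tokens each activity would get without rounding.
    exact = {a: total_tokens * m / day_minutes for a, m in minutes.items()}
    # Start from the whole-hour part of each share.
    tokens = {a: int(x) for a, x in exact.items()}

    # Give the leftover tokens to the largest fractional remainders.
    leftover = total_tokens - sum(tokens.values())
    by_remainder = sorted(exact, key=lambda a: exact[a] - tokens[a], reverse=True)
    for a in by_remainder[:leftover]:
        tokens[a] += 1
    return tokens


# Invented example day (minutes sum to 1,440); labels are illustrative only.
narrative = [
    ("sleep", 480),
    ("cooking", 150),
    ("livestock care", 45),
    ("farm work", 300),
    ("active childcare", 20),
    ("leisure", 445),
]
tokens = allocate_hour_tokens(narrative)
print(tokens)
```

In this made-up day the 20 minutes of active childcare end up with zero tokens, while 45 minutes of livestock care are rounded up to a full hour. That is the kind of distortion for short activities that any whole-hour scheme has to accept.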
Testing the hybrid approach
They tested the three methods on a sample of households as part of an RCT in northern Madhya Pradesh. These are households that appeared on the payroll of a public workfare program and had at least one married, unbanked woman. The final sampled individuals are poor, have few years of schooling, and predominantly belong to disadvantaged castes.
In a deviation from many survey methods studies, rather than simply randomizing the method across respondents, they randomize methods and also make repeat visits to respondents. On day 1, all respondents complete the gold standard. The following day, they are randomly assigned either the hybrid or the traditional method, and report on time use for the day before. Then, at least one week after that (day 3), they are randomly assigned one of the 3 methods. So, for example, a person is interviewed on Monday with the gold standard; time use data about that same Monday are then collected the next day (Tuesday); and then they report on another day at least a week later. Day 3 was included to explore the potential “priming” effect of completing the gold standard on day 1 on the recall reporting the next day. We will return to the issue of priming. In addition to comparing regression-adjusted mean hours across activities, the authors also explore differences in time, over-reporting (zero hours in the gold standard but non-zero in traditional or hybrid), and under-reporting.
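The within-person comparison measures described above can be sketched in a few lines. This is only an illustration, not the paper's code. Over-reporting follows the definition given here (zero hours in the gold standard, non-zero in the recall module). Under-reporting is assumed to be the mirror image (non-zero in the gold standard, zero in the recall module), which is my reading rather than a quoted definition. The function name `classify_reports`, the activity labels, and the hours are hypothetical.

```python
def classify_reports(gold_hours, recall_hours):
    """Compare one respondent's recall module to their gold standard day.

    Both arguments map activity -> hours for the same reference day.
    Returns, per activity, the hours difference (recall minus gold) and a flag.
    """
    result = {}
    for activity in sorted(set(gold_hours) | set(recall_hours)):
        gold = gold_hours.get(activity, 0)
        recall = recall_hours.get(activity, 0)
        if gold == 0 and recall > 0:
            flag = "over-report"       # definition stated in the text
        elif gold > 0 and recall == 0:
            flag = "under-report"      # assumed mirror-image definition
        else:
            flag = "agree on participation"
        result[activity] = {"difference": recall - gold, "flag": flag}
    return result


# Invented hours for a single respondent-day.
gold = {"cooking": 2.5, "active childcare": 0.5, "leisure": 3.0, "outdoor chores": 0}
recall = {"cooking": 3, "active childcare": 0, "leisure": 2, "outdoor chores": 1}

report = classify_reports(gold, recall)
for activity, row in report.items():
    print(activity, row)
```

Note that the flags only capture whether an activity shows up at all. The hours difference is kept alongside them because a module can agree on participation and still be off on duration.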
Both the hybrid approach and the traditional approach perform well compared to the gold standard in terms of hours by activity, with one exception. The traditional approach over-estimates passive caregiving and under-estimates leisure. I had trouble getting my head around this and wondered about the priming effect of the gold standard which might be more acute for passive caregiving (drawing attention to this) and for the traditional approach. Still, considering that the traditional and gold standard approaches use the same time use categories and coding, which differ from the hybrid approach, the hybrid results are impressive. However, for low duration activities (which most often are active caregiving and outdoor chores), the hybrid approach struggles. It is more likely to miss low and medium time intensity activities than the traditional approach. Underreports for high intensity activities are, by contrast, rare. And the hybrid approach performs slightly better than the traditional module in terms of avoiding overreporting activity categories not reported in the gold standard – again, a question of priming…?
What about priming…
To check if completing a gold standard survey on day 1 influences recall reporting on day 2 (about day 1 time use), they draw on the data from day 3. Day 3 is not a within person comparison but compares results from the three methods randomly assigned. They assume that if there is priming, the differences from the gold standard would be larger for the latter (the third interview) than for the day 2 reporting. They do not find this. But it seems a bit muddled to compare the within-person result (that is, what I reported by gold standard on day 1 compared to my next-day recall) to the across-person result (what I report by gold standard compared to my peer who had a hybrid or traditional approach). I am left wondering if the gains from the within respondent approach are worth the priming concerns which I have despite the day 3 results.
Still, I am a survey methods nerd and much appreciate the hybrid approach they have developed and tested. As the authors note, the fit of this method will depend on whether low duration activities are important in your study and to what extent you want to measure multitasking. But if broad category groupings and larger time increments fit your needs, their hybrid approach might be a fit for you. I look forward to seeing continued efforts to study cheaper and easier-to-implement time use modules, including the potential for apps as tested in this paper from Zambia.