Published on Development Impact

What do we measure when we measure food consumption?

Guest post by John Gibson and Alberto Zezza

A key theme of this blog is measurement. As Jed argued in an early post on the subject, “one of the most fundamental welfare constructs in economics” is consumption and “accurate consumption measurement has been a long standing challenge for applied work”. A recently published special issue of Food Policy entitled “Food Counts” has 13 papers on measuring food consumption and expenditures with household surveys. The summary intro paper and a few of the other papers are Open Access.

Compared to the 2012 Journal of Development Economics special issue on measurement, the focus here is narrower. A good bridge between the two special issues is the consumption survey experiment in Tanzania, which was blogged about here and here, and which yields a new paper in the special issue. If you also find it somewhat disturbing that the Tanzania dataset has been the only methodological experiment on the measurement of consumption in low- and middle-income countries in the last decade or so, you will welcome the release of this special issue. The geographic scope of the volume is compelling: it includes papers using datasets from places as diverse as Niger, the US, Bangladesh, Brazil, and Canada, to name a few.

Only two of the special issue papers use experiments (Niger and Tanzania), but there are several findings from this collection of papers that will interest impact evaluators.

In terms of motivating this work, measurement errors in food expenditure data are unlikely to be well-behaved, in the sense of following the classical assumptions of being mean zero and uncorrelated with everything else. Classical measurement errors provide comfort to applied econometricians: they cause no bias on the left-hand side, and on the right-hand side they merely attenuate, giving a conservative lower bound to any estimated causal impacts. In contrast, non-classical errors matter even on the left, and may cause regression coefficients to be exaggerated. Taken together, the survey design lessons that can be learned from these studies can help keep the non-classical component of measurement error in check.
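To see why the classical case is the comforting one, here is a minimal simulation (purely illustrative, not drawn from the special issue): the same mean-zero, independent noise attenuates an OLS slope when it contaminates the regressor, but leaves the slope unbiased when it contaminates the outcome.

```python
import numpy as np

# Illustrative simulation: classical (mean-zero, independent) measurement
# error attenuates an OLS slope when it sits in the regressor, but leaves
# the slope unbiased when it sits in the outcome.
rng = np.random.default_rng(0)
n = 200_000
true_beta = 1.0

log_c = rng.normal(0.0, 1.0, n)          # "true" log consumption
y = true_beta * log_c + rng.normal(0.0, 0.5, n)
noise = rng.normal(0.0, 1.0, n)          # classical measurement error

def ols_slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_rhs = ols_slope(log_c + noise, y)  # error in the regressor
slope_lhs = ols_slope(log_c, y + noise)  # error in the outcome

# Theoretical attenuation factor here: var(x) / (var(x) + var(noise)) = 0.5
print(f"RHS error: slope {slope_rhs:.2f} (attenuated toward 0.5)")
print(f"LHS error: slope {slope_lhs:.2f} (still near 1.0)")
```

Non-classical error, by contrast, can be correlated with true consumption or with covariates, so no such tidy bound applies.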

The volume has lots to say that applies to development research broadly, particularly regarding survey design decisions that have had specialists scratching their heads since time immemorial. One peculiar aspect of food consumption, which makes the measurement task even trickier, is that different analysts (economists, nutritionists, food security specialists) look for slightly different things in a measure of food consumption, as they use it in different ways. A nice feature of the special issue is that the papers are written by experts from different disciplines, who contribute their different perspectives. Nutritionists are more interested in quantities, and in having specific food items (e.g. iron- or vitamin A-rich foods, or foods that are or can be vehicles for fortification) singled out in the list of food items included in a questionnaire, even when these may account for a relatively low budget share and hence be of lesser interest to poverty economists.

The papers in the special issue also tackle a range of survey design issues one at a time, drawing either on experiments or on existing datasets that allow comparisons across methods applied to the same populations. Here are some titbits from a subset of the papers; the volume has more:
  • Diary versus recall, and the appropriate reference period.  Diary surveys are often thought to yield higher consumption reports than recall surveys, since they should suffer less memory loss. However, in Niger a 7-day recall gave annualized per capita total consumption that was 28% higher than a 7-day diary. A likely reason is telescoping. A decomposition of incidence errors and value errors (conditional on incidence) in Tanzania shows that short recall overstates the value of consumption, conditional on any being reported. That short recall nevertheless gets close to the benchmark (from tightly supervised diaries) is due to the happy coincidence of offsetting errors, which need not always offset.
  • Food away from home and pre-prepared foods.  Most household surveys do a poor job of measuring food away from home (FAFH). This error contributed to dead-end debates about why economic reform did not seem to improve nutrition in India. It is heartening, therefore, that Fiedler and Yadav find that adding specific, common FAFH items to the food recall list, and collecting individual household member data on broad categories of meals, by type and by source, significantly improves accuracy in a food survey in India.
    The volume also includes the paper Renos Vakis blogged about some time ago, showing how in Peru extreme and moderate poverty estimates change dramatically, but in opposite directions, when FAFH is accounted for. Extreme poverty is significantly higher once FAFH is accounted for, driven by the higher calorie cost of food bought outside the home relative to home-made meals, which raises the poverty line (so more people “become” poor). By contrast, moderate poverty is significantly lower when FAFH is taken into account, this time driven by increases in household FAFH expenditures that raise overall expenditures (and thus offset the effect of the poverty line increase).
  • Individual versus household consumption.  The special issue authors have economics and nutrition backgrounds. In economics, problems in valuing household public goods and observing individual consumption without distorting it mean that we typically settle for dividing household-level resources by some nominal or adjusted measure of household size. In nutrition, the unit of analysis is necessarily the individual, and this may heighten doubts some have about the value of household consumption and expenditure surveys (HCES) for nutritional analysis. Encouragingly, Sununtnasuk and Fiedler find 91% agreement when comparing the adequacy of nutrients across individuals using household-level HCES data versus using the gold standard of 24-hour individual recall.
  • Food acquisitions versus food consumption.  HCES originated as a way to get expenditure weights for a consumer price index, almost always on an acquisitions basis. Current interest in using HCES to measure consumption has produced a mix of questions: food purchases on an acquisitions basis, but food from other sources (gifts, own production, occasionally stocks) on a consumption basis. Using acquisition data gives higher apparent calorie availability, so switching to consistently asking about consumption may introduce non-comparability into measured hunger.
  • Length and specificity of food lists in recall surveys.  Some recall surveys try to save interview time by compressing lists of specific food groups into broad categories, or using a subset list. It seems that little time is saved by these cuts, while considerable error is added. 
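The opposite-direction movements in the Peru FAFH result above can be reproduced with a stylized calculation (the numbers below are hypothetical, not the Peru estimates): including FAFH raises the extreme (food) poverty line, because FAFH calories cost more, while also raising measured household expenditures.

```python
# Stylized example (hypothetical numbers): accounting for food away from
# home (FAFH) can move extreme and moderate poverty in opposite directions.
# Households: (total expenditure, FAFH expenditure), per capita per day.
households = [(1.8, 0.0), (2.1, 0.4), (2.6, 0.5), (3.4, 0.9), (5.0, 1.5)]

def headcount(expenditures, line):
    """Share of households below the poverty line."""
    return sum(e < line for e in expenditures) / len(expenditures)

# Without FAFH: expenditures exclude FAFH, and the extreme (food) poverty
# line reflects the cheaper calorie cost of home-prepared food.
exp_no_fafh = [t - f for t, f in households]
extreme_line_no, moderate_line = 2.0, 3.0

# With FAFH: expenditures include FAFH, but FAFH calories cost more, so
# the extreme poverty line rises.
exp_with_fafh = [t for t, _ in households]
extreme_line_with = 2.7

print("extreme: ", headcount(exp_no_fafh, extreme_line_no),
      "->", headcount(exp_with_fafh, extreme_line_with))
print("moderate:", headcount(exp_no_fafh, moderate_line),
      "->", headcount(exp_with_fafh, moderate_line))
```

With these made-up numbers the extreme headcount rises (the higher line dominates) while the moderate headcount falls (the extra expenditure dominates), mirroring the Peru finding.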
The set of papers makes important advances on some of the research questions it starts with (see above), but many areas remain open and will require more (ideally experimental) work. We know little about how respondents actually answer these survey questions. Most designs are based on the idea that respondents count and recall each episode, so shortening the recall period and using more specific food groups makes the memory task easier. Yet it is likely that rules of thumb are often used to estimate consumption of particular foods. Attempts to support this estimation strategy, such as the ‘usual month’ recall that asks in how many months a food is consumed, how many times per month, and the typical amount (and value) each time, fail badly. If we knew more about what causes a switch to using rules of thumb, and what improves their accuracy, we might design better surveys. The experimental evidence from Tanzania suggests that methods to improve the accuracy of recall surveys should consider a dual-track approach: (a) prompting households to report any positive consumption, since recall modules under-report consumption incidence for almost any food group, and (b) improving the accuracy of the consumption value reported, conditional on the household reporting any consumption.

On the latter, even though early LSMS work (dating back 30 years) advocated for bounded recall, little is known to date about the data quality implications of using bounded versus unbounded recall. The idea of bounded recall is that questions can be phrased as “since my last visit…” to provide a distinct event that reduces telescoping. Two visits also allow measures like anthropometrics to be taken twice, with averaging to reduce the effects of random measurement error. For bulky root crops, and where metric units are rarely used (and metric reports rarely trusted), a first visit lets teams distribute simple measuring devices, such as an empty sack, so recalled quantities can be reported in sack-loads, with weighing trials to create metric equivalents. Despite these advantages, bounded recall is rarely used. Experiments could show whether it is worthwhile.

Finally, another important area where survey managers can learn from more research is how best to deploy survey resources. This relates to David’s “more T in experiments” work on allocating resources over the N and T dimensions. Within-year autocorrelations for consumption and calories are fairly low, so seeing the same household at a different time of the year, say six months later, gives genuinely new information about it. Yet many surveys concentrate their resources on a sequence of short, adjacent survey visits. The special issue has papers using seven consecutive 2-day recalls in Bangladesh, and three consecutive 10-day diaries in Mongolia. Evidence from Ghana is that data quality declines rapidly with such designs. In contrast, in an earlier paper John shows that fewer visits, with longer gaps between them, combined with using the intra-year correlations to correct the extrapolation to annual totals, may be far more informative about poverty and hunger. Data from even just two repeated visits, extrapolated with a correction based on the correlations between the same household’s expenditures in different months of the year, give much smaller errors in estimates of inequality and poverty than extrapolating to annual totals from expenditure reports in only some months of the year.
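The logic of correlation-corrected extrapolation can be sketched in a small simulation (a stylized illustration with made-up variance components, not John's actual estimator): naive extrapolation multiplies up the transitory month-to-month component along with the persistent household component, while shrinking toward the sample mean removes much of that error.

```python
import numpy as np

# Illustrative sketch, not the paper's actual estimator: annualizing two
# observed months inflates the transitory component twelvefold; a shrinkage
# correction based on the within-year correlation structure does better.
rng = np.random.default_rng(1)
n, s_between, s_within = 50_000, 0.5, 1.0   # made-up variance components

mean_i = 3.0 + rng.normal(0.0, s_between, n)             # persistent component
months = mean_i[:, None] + rng.normal(0.0, s_within, (n, 12))
annual_true = months.sum(axis=1)

obs = months[:, [0, 6]].mean(axis=1)   # two visits, six months apart
naive = 12 * obs                       # simple extrapolation to a year

# Shrinkage factor implied by the variance components (assumed known here;
# in practice it would come from estimated intra-year correlations).
k = s_between**2 / (s_between**2 + s_within**2 / 2)
corrected = 12 * (obs.mean() + k * (obs - obs.mean()))

rmse_naive = float(np.sqrt(np.mean((naive - annual_true) ** 2)))
rmse_corrected = float(np.sqrt(np.mean((corrected - annual_true) ** 2)))
print(f"RMSE, naive extrapolation:     {rmse_naive:.2f}")
print(f"RMSE, corrected extrapolation: {rmse_corrected:.2f}")
```

In this stylized setup the naive estimates also show more cross-sectional dispersion than the true annual totals, which is why inequality measures suffer under naive extrapolation.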
