Measurement is on my mind. Partly because of the passing of Alan Krueger (credited with having a major influence on the development of empirical research – notably through his influential book Myth and Measurement). But also because a couple of weeks ago, I attended an all-day brainstorming meeting on “Methods and Measurement” hosted by the Global Poverty Research Lab at Northwestern University and IPA. The workshop covered a range of topics on gaps and innovations in research methods related to measurement, such as: integrating data sources and applying new methods (such as satellite data and machine learning combined with household surveys to get improved yield estimates), untangling complex socioeconomic data (such as mapping social networks), crafting measurement of concepts where we lack consensus (e.g. financial health), and bringing new tech into our survey efforts (using smartphones, physical trackers, etc.).
The topic of measurement and survey design is not new to Development Impact. David kindly provides a curated list of posts (updated as of fall 2018) on issues of measurement, survey design, sampling, survey checks, managing survey teams, reducing attrition, and all the behind-the-scenes work needed to get the data we need. In just the last few months, there have been at least three more posts (1 & 2 & 3).
You might think that the measurement agenda in development economics is mainly about “new” areas – non-cognitive skills, social networks, mental health, subjective well-being, intrahousehold decision making, and so on. But there is also (still) a big agenda in boring, traditional topics. Consider labor.
The “official” definitions behind labor statistics continue to be redefined. A fact probably known by few readers: the 19th International Conference of Labour Statisticians (convened by the ILO in 2013) set forth new definitions of work and employment, including the change that subsistence farmers would no longer be classified as employed (but they are working; read the report to understand the difference). From this, a work stream now exists on how to adapt household surveys in light of these new definitions, not least because it is not clear how to define subsistence farming. For example, the Women’s Work and Employment Partnership, a collaboration of the FAO, ILO, and the World Bank, is conducting methodological research on measuring work in three countries – Ghana, Malawi and Sri Lanka.
Our measurement approaches (and thereby our surveys and our research) fall short in many areas where employment categories are blurred, including in dependent self-employment (consider the case of mini-bus drivers in Tanzania [gated and ungated]) and in measuring casual wage work (and the prospect that surveys are vastly under-measuring rural wage work in Africa). And, surprisingly, as a result we still struggle with some basic concepts that are fundamental in development economics, including farm labor productivity. Measures of labor productivity are critical to understanding, among other areas, the extent of labor (mis)allocation and patterns of structural change in Africa.
Our study in Tanzania [gated and ungated] set out to examine how one aspect of survey design, the recall period over which surveys solicit responses, affects estimates of farm labor (not to be confused with a different Tanzania study on measuring work, in this blog by Markus). We conducted a randomized survey experiment in the Mara region of Tanzania during the long rainy season of 2014. One group of households received weekly in-person surveys throughout the entirety of the season, during which they would report on the agricultural labor done by each household member on each plot. A second group was surveyed weekly by phone – with much lower costs than face-to-face interviews but not much tested in terms of reliability (for an exception, see this paper on surveying microenterprises). A third set of households was interviewed only at the end of the season, which yields the kind of data that we normally get from household or farm surveys to assess farm labor productivity.
We found that the end-of-season recall survey overestimated time on each farm plot by a factor of nearly four (4!) compared to weekly surveys. This extreme inflation of time farming is due in large part to the cognitive burdens of reporting an estimate of time worked over several months, especially since that work is not regular like a 9-to-5 office job. Read the paper for insights from the fields of social and cognitive psychology to understand the results.
Whether this mis-reporting matters depends on what you want to measure. While we find that hours per person per plot are significantly higher in the recall survey, this gap disappears when measuring total household farm hours. This is because while hours per person per plot are over-estimated, the number of plots and the people reported to work on the farm are under-reported in the recall survey! Maybe two wrongs do make a right: combining the over-reported hours with under-reported plots and people results in the same aggregate farm labor measure. Does this mean all is well with our data? That depends on what is being studied. The inflated hours per person per plot can result in significantly understating agricultural labor productivity, depending on how one tackles the analysis.
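To make the arithmetic concrete, here is a minimal sketch in Python. Every number in it is made up for illustration (none comes from the study), and the names `SurveyArm`, `HARVEST_PER_PLOT_KG` and `HOUSEHOLD_HARVEST_KG` are hypothetical. It assumes, purely for simplicity, that harvest is measured identically in both survey arms, so that only the labor denominator differs.

```python
from dataclasses import dataclass


@dataclass
class SurveyArm:
    """Hypothetical per-household averages reported in one survey arm."""
    name: str
    hours_per_person_plot: float  # hours one person reports on one plot
    plots_reported: int           # number of plots the household lists
    workers_per_plot: float       # average number of people listed per plot

    def hours_per_plot(self) -> float:
        return self.hours_per_person_plot * self.workers_per_plot

    def total_household_hours(self) -> float:
        return self.hours_per_plot() * self.plots_reported


# Illustrative values only: recall inflates hours per person-plot
# but lists fewer plots and fewer workers.
weekly = SurveyArm("weekly", hours_per_person_plot=20.0,
                   plots_reported=4, workers_per_plot=3.0)
recall = SurveyArm("recall", hours_per_person_plot=80.0,
                   plots_reported=2, workers_per_plot=1.5)

# Assumed harvest figures, held fixed across arms (an assumption, not a finding).
HARVEST_PER_PLOT_KG = 300.0
HOUSEHOLD_HARVEST_KG = 1200.0

for arm in (weekly, recall):
    plot_productivity = HARVEST_PER_PLOT_KG / arm.hours_per_plot()
    household_productivity = HOUSEHOLD_HARVEST_KG / arm.total_household_hours()
    print(f"{arm.name}: total hours = {arm.total_household_hours():.0f}, "
          f"plot-level kg/hour = {plot_productivity:.2f}, "
          f"household-level kg/hour = {household_productivity:.2f}")
```

With these made-up inputs, both arms add up to 240 total household hours, so a household-level productivity measure is unaffected. Plot-level productivity, however, comes out at 2.5 kg per hour in the recall arm against 5.0 in the weekly arm, which is the sense in which the choice of analysis level matters.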
But does what happens in surveys in rural Tanzania happen in other rural parts of the region? Here we are lucky to have a second study on this topic done in Ghana. Although they also find over-estimated hours when households are surveyed with long recall periods (leading to significantly underestimating plot-level labor productivity), the size of the bias is much smaller than in the Tanzania study. And while households under-report both plots and people, as they did in Tanzania, the under-reporting of plots is much smaller in Ghana. They attribute some of the different results to higher education levels among the households in Ghana. A forthcoming third study in Malawi will shed more light.
Stepping back to the start of this blog, I have been thinking about how to build more momentum on studying measurement. First, bring the money. Doing surveys to study survey methods and measurement is an expensive undertaking. A shorter route is embedding survey methods research as an add-on to a research project. Bravo to IPA and the Global Poverty Research Lab at Northwestern University for the call for proposals to offer funding to do just this. Second, journal editors: please support publication of this work, even if it is not your typical development econ paper. Studies that point us towards improved measurement are a critical public good and can help keep researchers from making the same measurement mistakes all over again. Special issues of journals can help (hats off to the JDE for the 2012 Symposium on Measurement and Survey Design). Without publication outlets, needed replication work (like the Ghana study above) is much less likely to happen.