An increasing number of economists analyze subjective welfare data – which records a subject’s “happiness” or “life satisfaction” – as a complement to more traditional money-based measures of wellbeing such as income or consumption. Both the promise and the pitfalls of subjective welfare (SW) measures have been widely discussed, including in this blog here and here and here. One major challenge is the concern that fixed personal characteristics (such as someone’s “natural optimism”) determine SW responses to a far larger degree than time-varying economic factors. If that is the case, then the usefulness of SW data for informing economic policy is unclear. Now two recent papers teach us more about the interpretive difficulties of SW in the presence of fixed individual characteristics.
Jed Friedman's blog
Consumption or income, valued at prevailing market prices, is the workhorse metric of human welfare in economic analysis; poverty is almost universally defined in these terms, and the growth of national economies measured as such. Yet for almost as long as economic analysis has utilized these measures, various shortcomings have been noted in the ability of these constructs to comprehensively capture welfare. One example – these measures can’t fully account for access to non-market goods. More famously, with Amartya Sen’s emphasis on human functionings and capabilities, these measures may not fully capture an individual’s ability to achieve and exhibit agency.
In part inspired by this view that people intrinsically value capabilities and functionings, as opposed to money-metric measures per se, a burgeoning sub-field of poverty research has proposed various measures of subjective, or self-reported, well-being (SWB). SWB is widely seen as multi-dimensional and impossible to capture with a single question. Hence there are numerous approaches to the measurement of SWB, most notably combinations of evaluative/cognitive approaches, such as those that inquire about life satisfaction, and hedonic/affective approaches, such as those that ask about happiness.
I think it’s uncontroversial to claim that the field of economics is of two minds about the usefulness of SWB: these measures hold some promise for comprehensive welfare assessment, yet there are various interpretive challenges. I’ve blogged about some of these challenges in the past. Most concerning is the worry that salient characteristics such as gender and education, which naturally vary in any population, influence how SWB questions are understood and reported, thus complicating cross-group comparisons. Now two recent papers have made advances in the field and, taken together, highlight both the pitfalls and the promise of SWB.
Authored by Elizabeth Frankenberg, Duncan Thomas, and Jed Friedman
Ten years after the devastating 2004 Indian Ocean tsunami, Aceh provides an example of remarkable resilience and recovery that reflects the combination of individual ingenuity, family and community engagement and the impact of domestic and international aid. The tsunami devastated thousands of communities in countries bordering the Indian Ocean. Destruction was greatest in the Indonesian provinces of Aceh and North Sumatra, where an estimated 170,000 people perished and the built and natural environment was damaged along hundreds of kilometers of coastline. In response, the Indonesian government, donors, NGOs and individuals contributed roughly $7 billion in aid and the government established a high-level bureau based in Aceh to organize recovery work.
To shed light on how individuals, communities, and families were affected by and responded to the disaster in the short and medium term, we established the Study of the Tsunami Aftermath and Recovery (STAR). Beginning in 2005, STAR has followed over 30,000 people who were first enumerated in 2004 (pre-tsunami) in 487 communities (community location depicted in the figure below), as part of a population-representative household survey conducted by Statistics Indonesia. Interviews were conducted annually for 5 years after the tsunami; the ten-year follow-up is currently in the field. We ascertained survival status for 98% of the original pre-tsunami respondents and have interviewed 96% of survivors. The study is designed to provide information on the short-term costs and longer-term recovery for people in very badly damaged communities and in comparison communities where the disaster had little direct impact.
My summary of recent attempts to quantify the Hawthorne effect a few weeks back led to some useful exchanges with colleagues and commenters who pointed me to further work I hadn’t yet read. It turns out that, historically, the term “Hawthorne effect” has been used quite inconsistently. The term has referred not only to (a) behavioral responses to a subject’s knowledge of being observed – the definition we tend to use in impact evaluation – but also to (b) behavioral responses to simple participation in a study, or even (c) a subject’s wish to alter behavior in order to please the experimenter. Of course all these definitions are loosely related, but it is important to be conceptually clear in our use of the term, since there are several distinct inferential challenges to impact evaluation arising from the messy nature of behavioral responses to research. The Hawthorne effect is only one of these possible challenges. Let me lay out a classification of different behavioral responses that, if and when they occur, may threaten the validity of any evaluation (with a strong emphasis on may).
Many who work on impact evaluation are familiar with the concept of the Hawthorne effect and its potential risk to the accurate inference of causal impact. But if this is a new concept, let’s quickly review the definition and history of the Hawthorne effect:
When state institutions find it a challenge to deliver services in under-resourced areas, it’s common for policy makers to consider leveraging existing local non-state capacity to help. This involvement of NGOs or CBOs is meant to supplement the state as service provider, but a recent paper by Ashis Das, Eeshani Kandpal, and me demonstrates possible pitfalls with this extension approach. Just as the implementation capacity of governments is a key determinant of government program performance, NGO capacity is a key determinant of NGO performance, and under-resourced areas are likely to contain under-resourced local organizations. We find this to be the case in our study context of malaria control in endemic regions of India. Besides highlighting this challenge, our results also highlight the difficulties that small-scale evaluations, especially those implemented by non-state actors, present to the generalizability of findings. Implementation capacity can be a key confounder of generalizability, yet it is not often measured or even discussed; the current practice of impact evaluation needs to think harder about measures that capture implementation capacity in order to generalize IE results to other contexts.