There has been a lot of work recently on measuring women’s agency. Together with colleagues from a range of institutions that care a lot about gender and do a lot of surveys, we did a survey a couple of years back. More recently, Aletheia Donald and I did a post about some work showing how disagreement over decision making might reveal facets of power that individual responses alone would miss.
An interesting new paper by Seema Jayachandran, Monica Biradavolu and Jan Cooper seeks to give us concrete guidance on how to measure agency more simply – especially when agency isn’t the main focus of the survey at hand. Jayachandran and co. take a very nice qual-quant approach to figure out which key questions you should be asking.
The key issue here is: what’s the gold standard against which these survey questions should be measured? Enter the qualitative work. Jayachandran and co. conduct 45-minute semi-structured interviews with a sub-sample of their respondents (more on the sample in a minute). These interviews cover five selected domains: women’s decision making around kids’ education, kids’ health and household expenditures, plus their own fertility and mobility. From these interviews they get a rich set of data that lets them pull out things like resistance in addition to the usual restrictions. (Cool methodological side note: during the qualitative work they hired a “distractor” who kept the rest of the household engaged in a discussion so interviews could be private.)
Jayachandran and co. and their research team then code these qualitative interviews to reduce the answers in each domain to a 1-4 scale (fertility turns out to be more complicated and needed a two-stage process).
Now, in case you’re a diehard experimental economist and think this is all a bunch of malarkey, Jayachandran and co. also run a real-stakes game with their respondents to set up another potential standard against which to measure their survey questions. It turns out the game didn’t work well (I’ve had similar experiences – where games don’t correlate well with survey questions, but may line up with some behaviors – that’s a topic for a future post).
So putting aside the game and returning to the qualitative work, Jayachandran and co. now have a standard against which to measure our frequently used quantitative survey questions. To put together the candidate questions, they assemble a long list of questions that have been used to measure women’s agency by a number of reputable survey and research folks. They toss out questions that overlap and end up with 64 questions.
So, off to northern India to put this to the test. They sample married women with a child under 10 across 21 villages. 443 of them get the quantitative questions, with a random subset of 210 getting the qualitative questions as well.
Now, what to do with all of the data? Wait, did you say machine learning? Indeed, this is one of the approaches Jayachandran and co. use. They use LASSO stability selection, which takes repeated sub-samples of the survey data and looks at which questions do the best job of predicting the results of the qualitative work. They cap the number of questions they want to end up with at five, since this is a reasonable add-on to a survey that isn’t focused on agency.
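For readers who want to see the mechanics, here is a minimal sketch of the stability selection idea in plain Python. It is not the authors' implementation: the data are synthetic (only questions 0, 1 and 2 drive the made-up qualitative score), the penalty of 0.4 and the 40 half-sample draws are arbitrary choices, and every function name is hypothetical. The LASSO itself is fit with a simple coordinate descent loop.

```python
import random

def make_synthetic_survey(n_respondents=200, n_questions=12, seed=0):
    """Fake data: answers on a 1-4 scale; only questions 0, 1, 2 drive the score."""
    rng = random.Random(seed)
    answers = [[rng.randint(1, 4) for _ in range(n_questions)]
               for _ in range(n_respondents)]
    score = [1.0 * row[0] + 0.9 * row[1] + 0.8 * row[2] + rng.gauss(0, 1)
             for row in answers]
    return answers, score

def standardize(values):
    """Center to mean 0 and scale to unit variance (constant columns become 0)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd if sd > 0 else 0.0 for v in values]

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; anything within the penalty becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso(columns, y, penalty, max_sweeps=100, tol=1e-6):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + penalty*||b||_1.
    Assumes every column is standardized and y is centered."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for j, col in enumerate(columns):
            # Correlation of column j with the residual, adding back its own fit.
            rho = sum(c * r for c, r in zip(col, residual)) / n + beta[j]
            updated = soft_threshold(rho, penalty)
            delta = updated - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = updated
                biggest_change = max(biggest_change, abs(delta))
        if biggest_change < tol:
            break
    return beta

def selection_frequencies(answers, score, penalty=0.4, n_subsamples=40, seed=1):
    """Share of random half-samples in which each question gets a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(answers), len(answers[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)
        columns = [standardize([answers[i][j] for i in rows]) for j in range(p)]
        mean_y = sum(score[i] for i in rows) / len(rows)
        y = [score[i] - mean_y for i in rows]
        for j, b in enumerate(lasso(columns, y, penalty)):
            if b != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

def top_questions(frequencies, k):
    """Indices of the k questions selected most often."""
    ranked = sorted(range(len(frequencies)),
                    key=lambda j: frequencies[j], reverse=True)
    return ranked[:k]

answers, score = make_synthetic_survey()
frequencies = selection_frequencies(answers, score)
print("selection frequency per question:", frequencies)
print("top 3 questions:", top_questions(frequencies, 3))
```

The point of the repeated sub-sampling is that a question which only looks predictive in one particular draw of respondents will not be picked often, while a question that genuinely tracks the qualitative score keeps showing up.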
The second approach they take is backward sequential selection (aka how I get dressed in the time of COVID). This approach builds an index from the survey question answers, regresses it against the qualitative answers, and then repeatedly tosses out the question whose removal leads to the smallest loss of R-squared.
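A rough sketch of backward sequential selection, again in plain Python and again on synthetic data, might look like the following. Two assumptions are mine rather than the paper's: the index is an equal-weighted average of standardized answers, and all questions are coded so that a higher answer means more agency. With a single index, the R-squared of the regression is just the squared correlation between the index and the qualitative score. All names are hypothetical.

```python
import random

def make_synthetic_survey(n_respondents=200, n_questions=10, seed=0):
    """Fake data: answers on a 1-4 scale; only questions 0, 1, 2 drive the score."""
    rng = random.Random(seed)
    answers = [[rng.randint(1, 4) for _ in range(n_questions)]
               for _ in range(n_respondents)]
    score = [1.0 * row[0] + 0.9 * row[1] + 0.8 * row[2] + rng.gauss(0, 1)
             for row in answers]
    return answers, score

def standardize(values):
    """Center to mean 0 and scale to unit variance (constant columns become 0)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd if sd > 0 else 0.0 for v in values]

def r_squared_of_index(columns, kept, score):
    """R-squared from a one-variable regression of the score on the index,
    where the index is the equal-weighted mean of the kept questions."""
    n = len(score)
    index = [sum(columns[j][i] for j in kept) / len(kept) for i in range(n)]
    mean_i = sum(index) / n
    mean_s = sum(score) / n
    cov = sum((a - mean_i) * (b - mean_s) for a, b in zip(index, score))
    var_i = sum((a - mean_i) ** 2 for a in index)
    var_s = sum((b - mean_s) ** 2 for b in score)
    if var_i == 0 or var_s == 0:
        return 0.0
    return cov * cov / (var_i * var_s)

def backward_selection(answers, score, keep=5):
    """Start with every question and drop one at a time until `keep` remain."""
    p = len(answers[0])
    columns = [standardize([row[j] for row in answers]) for j in range(p)]
    kept = list(range(p))
    while len(kept) > keep:
        # Try removing each remaining question; drop the one whose removal
        # leaves the highest R-squared (that is, the smallest loss).
        drop = max(kept, key=lambda j: r_squared_of_index(
            columns, [q for q in kept if q != j], score))
        kept.remove(drop)
    return kept, r_squared_of_index(columns, kept, score)

answers, score = make_synthetic_survey()
kept, r2 = backward_selection(answers, score, keep=3)
print("questions kept:", kept)
print("R-squared of the final index:", round(r2, 3))
```

Unlike the LASSO version, this procedure is greedy: once a question has been thrown out it never comes back, which is one reason the two methods need not agree on the final list.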
Interestingly, these two approaches converge on three of the top five questions (although they don’t rank them the same). These are:
· Is her opinion heard when an expensive item like a bicycle or cow is purchased for the household?
· Does she need permission from other household members to buy clothing for herself?
· Is she permitted to visit women in other neighborhoods to talk with them?
The LASSO stability approach also picks a question on whether she can buy things in the market without permission and another on who she has to consult on children’s health care. The backward sequential selection chooses instead whether she is allowed to go alone to meet her friends for any reason and who in the household decides to pay schools for a relative of hers.
Jayachandran and co. note that these are fairly specific. None of the more general questions (e.g. on a ladder, where do you see yourself…) show up in the top questions by either method. And they also provide a bit of a cost-benefit in terms of interview length. Using the top 5 questions explains about 27-29 percent of the variation in the qual measures. On the other hand, using all 64 of the questions that Jayachandran and co. tested (and which take 45 minutes to administer) explains 53 percent of the variation. So five questions get you roughly half of the explanatory power of the full set.
So this is helpful work – giving us some core questions that seem to capture a chunk of much more in-depth and richer qualitative measures. However, as Jayachandran and co. note, this is from one sample in northern India, and an obvious next step would be to take this to another context. To that you can bring their neat new approach of MASI – machine learning and semi-structured interviews.
A couple of other things to think about. First, this is a measure of the stock of agency, not of changes in it. It’s another question altogether to ask which are the most dynamic components of agency. Second, that brings us to using this in the context of an impact evaluation: not only would you want to think about what would change, but also in which domain(s). The quest continues.