The NYT coverage of the new working paper, titled “Gender and the Dynamics of Economics Seminars,” (Dupas et al. 2021) got the expected Twitter outrage. I’d be comfortable speculating that most people did not read the paper carefully and, instead, relied on the NYT’s scant coverage of the paper, which, frankly, was no more enlightening than simply reading the abstract (it quickly moved on to the rest of the literature on gender discrimination in economics).
That we have a huge problem in economics is not in doubt, but the article and the social media takes don’t do justice to the findings of this novel paper. The findings in the paper are much more nuanced than the impression you would have gotten from the coverage, but it requires actually going through the evidence presented in the paper. Fortunately, Development Impact is here with an earnest attempt. So, please spend the next 15 minutes with me to look at the evidence presented in the paper a little more closely…
Not only are there more men in economics but they also do more than their share of talking…
The authors collected data on 85 seminars between January and May of 2019, with multiple coders at each seminar using a data collection tool specifically designed to get at attendance and the quantity and quality of questions. The seminars are a mix of regular departmental seminars and job market talks (JTs, using the authors’ preferred abbreviation), where the speakers are overwhelmingly PhD students or post-docs. This time of the year (right after the annual meetings for economists, where employers interview job seekers) includes a significant number of JTs. Had the period been September to December, there would have been no JTs. This will become important later, so please make a mental note of it for now…
39% of all talks and 49% of JTs had a female presenter (implying a one-in-three share for women in regular department seminars). The average attendance at all seminars is about 33, which goes up to 42 for JTs. Of these attendees, 68 and 73 percent are men, respectively. The mean number of questions is 30 and 35, respectively, of which 83 and 89 percent are asked by men. So, men definitely ask more than their share of questions. You might remember that I provided evidence on this from World Bank seminars two years ago and wrote an update last year. While you can certainly write “MMmMMmMMFMMMMMM” on the back of a lunch napkin to count questions at seminars, the Qualtrics tool used by multiple coders to collect better data systematically across institutions is a huge improvement. Hence, the new paper confirms the earlier findings that men talk more than would be suggested by their share of attendance at seminars.
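To make "more than their share" concrete, here is a minimal Python sketch that divides each group's share of questions by its share of attendance, using only the percentages quoted above. The women's shares are computed as the remainder, which assumes every attendee and questioner was coded as either a man or a woman; that assumption and the variable names are mine, not the paper's.

```python
# Shares of men quoted in the post; women's shares are taken as the
# remainder, which assumes every attendee and questioner was coded as
# either a man or a woman (an assumption made for this sketch).
men_shares = {
    "all seminars": {"attendance": 0.68, "questions": 0.83},
    "job market talks": {"attendance": 0.73, "questions": 0.89},
}


def representation_ratio(question_share, attendance_share):
    """Share of questions divided by share of attendance.

    A value above 1 means the group asks more questions than its
    headcount alone would suggest; below 1 means fewer.
    """
    return question_share / attendance_share


ratios = {}
for seminar_type, share in men_shares.items():
    men = representation_ratio(share["questions"], share["attendance"])
    women = representation_ratio(1 - share["questions"], 1 - share["attendance"])
    ratios[seminar_type] = {"men": men, "women": women}
    print(f"{seminar_type}: men {men:.2f}, women {women:.2f}")
```

Under these inputs, men ask roughly 1.2 times as many questions as their attendance share would suggest in both settings, while the ratio for women is about one half or lower.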
Before we dive into the heterogeneity of findings that I find important for careful interpretation of the results, let’s quickly recap what the authors find: There are 3.3 more questions asked to female presenters across both types of seminars, which goes up to 6.1 during job market talks (Table 3, columns 6 and 8). These findings are highly significant statistically and imply that about 51-55% of the questions in seminars during, say, a semester, are directed at women depending on the type of seminar.
Job market talks look particularly problematic…
Now, I’d like to focus on Table 5. The first row replicates the findings presented above, but let’s do one thing using the estimates presented. The 3.3 more questions to women at ALL seminars are a weighted average of the additional questions to women at regular seminars and at job market seminars. These can be distinctly different events: Beatrice Cherrier, a historian of economics, had a wonderful thread on Twitter about functions of seminars in science. I highly recommend clicking on that here and I will briefly return to this issue later. What are the numbers in regular seminars vs. JTs?
The paper does not give the answer to this, but we can come pretty close by making our own calculations. [Please note that these are approximations to give us an idea of the heterogeneity of questions across types of seminar series. With the raw data, they can easily be made more precise.] We know that the sample of seminars in this study is composed of approximately 40% JTs and 60% regular seminars (I ignore the NBER summer institute evidence, which is treated separately in the paper). So, I can approximately recover the numbers of questions asked to men and women at regular econ seminars vs. JTs, by using these shares as weights that produce the 3.3 additional questions finding. When I do that (please see the table below), I find that there are 1.3 more questions asked to women at regular seminars (23.37 to women vs. 22.05 to men). This contrasts starkly with the 6.1 more questions to female presenters at JTs (36.9 vs. 30.8). My suspicion is that if the authors ran this regression separately, i.e. for the sample excluding JTs, a coefficient estimate of about 1.3 might not be statistically significant.
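The back-of-the-envelope logic can be sketched in a few lines of Python. This is only an illustration under the stated 40/60 split: the function name is hypothetical, and it treats the pooled figure as an exact weighted average of the two seminar types, which the regression coefficients need not satisfy exactly.

```python
def back_out_regular(overall, jt, jt_share):
    """Solve overall = jt_share * jt + (1 - jt_share) * regular for `regular`.

    Treats the pooled figure as an exact weighted average of the two
    seminar types, which is an approximation for regression coefficients.
    """
    return (overall - jt_share * jt) / (1.0 - jt_share)


JT_SHARE = 0.40  # approximate share of job market talks in the sample

# Additional questions to female presenters (3.3 overall, 6.1 at JTs).
extra_questions_regular = back_out_regular(overall=3.3, jt=6.1, jt_share=JT_SHARE)

# Average attendance (about 33 overall, 42 at JTs).
attendance_regular = back_out_regular(overall=33, jt=42, jt_share=JT_SHARE)

print(f"extra questions to women, regular seminars: {extra_questions_regular:.2f}")
print(f"average attendance, regular seminars: {attendance_regular:.1f}")
```

With these rounded inputs the sketch returns roughly 1.4 additional questions, close to but not identical to the 1.3 in the post, which presumably rests on slightly different weights or unrounded figures. The same weights give an average attendance of 27 at regular seminars.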
The evidence on heterogeneity across regular seminars vs. JTs implies two things: First, a lot more questions are asked at JTs, partly because the overall attendance is higher. Attendance goes up by 56% (from 27 to 42; own calculations using Table 1). Second, the number of questions to women goes up by approximately the same percentage (58%, from 23.4 to 36.9; see table below), while questions asked to men increase by only 39% (from 22 to 30.8). That JTs have a higher differential in questions asked to female vs. male presenters is interesting in itself. Equally interesting, however, may be the fact that this differential is much smaller in regular seminars. The authors briefly touch on this finding on page 16 without dwelling on it, only to say that “…the seminar culture in economics may impact how the profession assesses candidates for hire.” I, myself, want to stick mainly to numbers and facts here and avoid speculation, but I will briefly raise some questions at the end that may provide fodder for the authors to do some more digging and for the readers to discuss. Before we get to that, however, things get more interesting when we examine heterogeneity by gender of the questioners AND type of seminar…
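The percentage changes quoted in this paragraph are easy to recompute. The sketch below uses the rounded levels reported in the post (with 22.05 for questions to men at regular seminars); the dictionary keys are just labels chosen for the illustration.

```python
def pct_increase(regular, jt):
    """Percentage change when moving from regular seminars to job market talks."""
    return 100.0 * (jt - regular) / regular


# (regular seminar level, job market talk level), as quoted in the post.
levels = {
    "attendance": (27, 42),
    "questions to women": (23.4, 36.9),
    "questions to men": (22.05, 30.8),
}

increases = {name: pct_increase(reg, jt) for name, (reg, jt) in levels.items()}
for name, value in increases.items():
    print(f"{name}: +{value:.1f}%")
```

Attendance and questions to women both rise by a bit under 60 percent, while questions to men rise by about 40 percent with these rounded inputs, in line with the 39% quoted above.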
Who is doing the asking?
Table 5 breaks down the additional questions to women by gender of the person asking the question. As students don’t ask many questions and the coefficients for them are never statistically significant, I will focus on questions from faculty. The evidence here (Table 5, rows 2-3, columns 6-7) shows that male faculty ask 5.8 more questions at JTs presented by women than they do at JTs presented by men; the corresponding figure for female faculty is 0.75 more questions. My calculations suggest that male faculty ask 27 questions to male job market candidates but 32.8 to female job market candidates. The same figures for female faculty are 3.5 and 4.25, respectively (please see the Job Market seminars column in my table above). It’s interesting to note here that the ratio of the average number of questions asked to female vs. male presenters is 1.21, with no difference by the gender of the questioner.
This, however, implies something interesting about regular (non-JT) seminar series. My calculations suggest that male faculty ask male speakers 18.1 questions at regular department seminars, while that number is actually lower for female speakers (17.4), implying that male faculty ask more questions at seminars presented by women during JTs but not during regular department seminars. The same figures for female faculty are 4.2 and 5.7, respectively, meaning that women do ask more questions at seminars presented by women during regular seminars. If we were to reproduce the coefficient estimates in column 6 for regular seminars instead of ALL seminars, my estimates would be 0.7 fewer questions to women in row 2 (by male faculty) and 1.5 more questions to women in row 3 (by female faculty). I suspect that the former would not be statistically significant, while the latter might be borderline significant (at the 10% level). The ratio of the average number of questions asked to female vs. male presenters is now markedly different by the gender of the questioner: 0.96 among male faculty vs. 1.36 among female faculty (please see the Regular seminars column in my table above). This is a distinctly different pattern than that found in job market seminars, with respect to the gender of the questioner.
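For readers who want to check the gaps and ratios in the last two paragraphs, the following Python sketch recomputes them from the levels quoted in the post. The data structure is only a convenient layout for this illustration, not something taken from the paper.

```python
# Average questions from faculty, as quoted in the post:
# (seminar type, questioner) -> (to male presenters, to female presenters)
faculty_questions = {
    ("job market", "male faculty"): (27.0, 32.8),
    ("job market", "female faculty"): (3.5, 4.25),
    ("regular", "male faculty"): (18.1, 17.4),
    ("regular", "female faculty"): (4.2, 5.7),
}

summary = {}
for key, (to_men, to_women) in faculty_questions.items():
    summary[key] = {
        "gap": to_women - to_men,    # additional questions to female presenters
        "ratio": to_women / to_men,  # questions to women relative to men
    }

for (seminar_type, questioner), stats in summary.items():
    print(f"{seminar_type:10s} {questioner:14s} "
          f"gap {stats['gap']:+.2f}  ratio {stats['ratio']:.2f}")
```

The ratios come out at about 1.21 for both male and female faculty at job market talks, but at roughly 0.96 and 1.36, respectively, at regular seminars.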
Patronizing and hostile questions…
While the NYT article quoted the abstract of the paper by saying that “…women were more likely to get questions that were patronizing or hostile,” it did not give a great sense of the magnitudes of these effects. Table 2 shows that one question is categorized as hostile out of every 10 seminars, when we consider ALL seminars. This number doubles to two out of 10 seminars during JTs, which implies that it is approximately 0.036 per seminar during regular departmental talks. An average department that has five flyouts during the recruitment season can expect to have one question categorized as hostile, while approximately 28 department seminars over the course of two semesters (reasonable to think of 14-week semesters during which there are regular weekly seminars) would also produce one hostile question. Questions categorized as patronizing are about five times as common: you would expect to encounter one per JT and one every other department seminar (Table 2). These figures give us an idea of the absolute magnitude of the problem…
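To translate these per-seminar rates into "how often would a department see one," here is a small Python sketch. It takes the two rates from the paragraph above as given and assumes, as the post does, five flyouts and two 14-week semesters of weekly seminars; the variable names are illustrative.

```python
# Hostile questions per seminar, as quoted in the post.
hostile_rate = {"job market talk": 0.2, "regular seminar": 0.036}

# How many seminars it takes, on average, to observe one hostile question.
seminars_per_hostile_question = {
    kind: 1.0 / rate for kind, rate in hostile_rate.items()
}

flyouts = 5               # job market talks in one recruitment season
weekly_seminars = 2 * 14  # two 14-week semesters of weekly seminars

expected_jt = flyouts * hostile_rate["job market talk"]
expected_regular = weekly_seminars * hostile_rate["regular seminar"]

for kind, n in seminars_per_hostile_question.items():
    print(f"{kind}: one hostile question every {n:.1f} seminars")
print(f"expected hostile questions over {flyouts} flyouts: {expected_jt:.2f}")
print(f"expected hostile questions over {weekly_seminars} regular seminars: "
      f"{expected_regular:.2f}")
```

Both scenarios come out at about one hostile question, which is the sense in which five job market talks and a full year of regular seminars are comparable on this measure.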
Table 7 presents the findings with respect to how many more questions considered as hostile, patronizing, supportive, disruptive, or demeaning are directed toward female compared with male presenters. I will ignore the coefficient estimate (0.305) on ‘patronizing’ in column 6 (for all seminars) with a p-value=0.10. But, please note here that the same estimate for JTs is 0.108, meaning that the additional patronizing questions to women have to be coming from departmental seminars. In fact, my rough calculations suggest that the number of patronizing questions to male and female speakers is 1 vs. 1.1, respectively, during JTs, but 0.04 vs. 0.48 during regular seminars. Similar calculations produce 0.15 vs. 0.25 hostile questions during JTs but 0 and 0.08 during regular seminars (please see my table below).
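As a consistency check on these rough numbers, the sketch below recombines the gender gaps in patronizing questions by seminar type into a pooled gap, using the approximate 40/60 split between JTs and regular seminars mentioned earlier in the post. Treating the pooled coefficient as a simple weighted average of the two gaps is an assumption of this illustration.

```python
JT_SHARE = 0.40  # approximate share of job market talks in the sample

# Patronizing questions per seminar, as quoted in the post.
patronizing = {
    "job market": {"men": 1.0, "women": 1.1},
    "regular": {"men": 0.04, "women": 0.48},
}


def gap(by_gender):
    """Additional questions to female presenters relative to male presenters."""
    return by_gender["women"] - by_gender["men"]


gap_jt = gap(patronizing["job market"])
gap_regular = gap(patronizing["regular"])

# Weighted average of the two gaps, to compare with the pooled estimate.
pooled_gap = JT_SHARE * gap_jt + (1 - JT_SHARE) * gap_regular

print(f"gap at job market talks: {gap_jt:.2f}")
print(f"gap at regular seminars: {gap_regular:.2f}")
print(f"implied pooled gap: {pooled_gap:.3f}")
```

The implied pooled gap of about 0.30 lines up with the 0.305 coefficient for all seminars, and the gap at job market talks of about 0.1 lines up with the 0.108 estimate, so nearly all of the pooled difference comes from regular seminars.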
The differential numbers of questions categorized as patronizing and hostile directed at men vs. women provide us with an insight and a question. The insight is stark: during regular departmental seminars, there are virtually no hostile or patronizing questions directed at male speakers: female speakers get the whole lot. Fortunately, the absolute numbers are small: women encounter one hostile (patronizing) question for every 12 (two) seminars. Job market talks are more ‘egalitarian’ when it comes to patronizing questions (about equally likely for male and female speakers, at one such question per seminar) but women are still more likely to face hostile questions than men during JTs.
The question then is the gender composition of who is asking these types of questions during seminars, especially regular ones. Small sample sizes might hamper the analysis here because the absolute numbers of such questions get very small, making heterogeneity analysis by the gender of the questioner harder. However, we should remember that additional questions to women at regular seminars come from women. Do the additional patronizing/hostile questions follow the same pattern? They do not need to: male faculty might be asking a smaller number of hostile/patronizing questions during regular seminars and a much larger number of such questions, disproportionately to women, during JTs. I cannot separate the averages from the marginal effects because the authors provide a gender breakdown of the questioner for the overall number of questions, but not by the ‘tone of question.’ The authors can certainly answer this question and let the readers know whether additional hostile and patronizing questions to women are being asked by male or female faculty during regular department seminars (and JTs).
Now, some speculation and questions for future research…
Here I leave the evidence and share some thoughts and questions that the paper raised for me. Above, we dived deeper into more questions asked at seminars presented by women by (a) seminar type and (b) gender of the questioner. The first interesting finding is that the main culprit for the larger number of questions asked at seminars presented by women is the job market talk. These seminars, while having the same format, are distinctly different creatures compared with regular seminar series. Even setting aside the fact, as Beatrice Cherrier nicely described here, that they serve a different purpose, the composition of the audience can be very different. This is because folks who normally don’t attend each other’s seminars now mix together in the same audience. So, once a year, you are part of a culture that is not yours: That, in itself, is likely to make people behave differently than they would in their own seminar, where they might feel more at home (both with the crowd around the table and the topic discussed by the speaker). There might also be a lot more clarification questions in JTs, because a larger share of the audience is unfamiliar with the topic of the paper than in regular seminars. As these same attendees need to provide feedback on the candidate (and sometimes even vote), they may want to ask more questions, but also feel awkward and not want to sound stupid at the same time. People who feel comfortable with the topic at hand, on the other hand, may want to ask more (and tougher, more probing) questions, just to make sure that they can find any “bodies buried” before making a job offer to a candidate.
Second, as Table 2 shows, by far the largest category of question is one of ‘clarification.’ Suppose that people want to ask a clarification question that they suspect might sound stupid. If they believe (rightly or wrongly) that they are more likely to get a kinder answer from a female than a male speaker, this would produce the kinds of patterns we see in this paper. In that sense, it may be that it is not female presenters who are getting more questions, it is male presenters who are getting fewer. Without more context, data, and a normative judgment, it is hard to say that this is clearly bad for women. Making the necessary clarifications that at least some audience members need is all the more pertinent in a job market talk: you would want those people to ask their questions and receive satisfactory answers, especially if they are from other fields and will be voting on whether to make you an offer in a few days/weeks.
Third, departmental seminars are different yet: invitations to such talks may often come through networks, meaning that at least some people in the audience are familiar with the speaker. They may be current, former, or future collaborators; they might be former colleagues; they might even be friends. What might this imply for the finding that women are more likely to ask more questions to women during such seminars? One hypothesis is that they are more likely to be familiar with each other, so they feel more at ease to interrupt and ask a question. This would be true if one’s (professional) network is more likely to include researchers of the same gender. Another is that women anticipate more dismissive responses from male speakers and, hence, ask them fewer questions. Yet another is the possibility that women have internalized the ‘culture of economics’ and also discriminate more against female speakers: women do ask more questions to female presenters in both types of seminars, so this possibility should not be discounted. Here, it would certainly help to know the gender distribution of the patronizing, supportive, or hostile questions, because there may be gender differences in the motivation to ask a question: some audience members may ask the speaker a question to help them out while others might be trying to trip them up. If women not only ask more questions to women but are also equally likely to ask them hostile or patronizing questions, the problem is deeper and different than the surrounding discussion has suggested so far.
Finally, a word of caution about the tone of questions. The authors should be commended for the use of a sophisticated tool for data collection. However, to categorize a question as patronizing or hostile, especially during a few seconds after the question is asked, is no easy feat. As the coders would certainly hear the answer to the question (and the ensuing possible back and forth), it is hard to isolate a hostile question from a hostile interaction. Even if the coders were not told the purpose of the survey at the time of the data collection, these are smart graduate students, who can put two and two together. So, the self-reporting bias may go in the direction of overreporting certain tones (which are expected to be found in the ensuing research), and perhaps differentially. They may also underreport hostile and patronizing questions if they are already hardened by the ‘culture of economics’ (someone told me that they would be really interested in seeing how a sociology or a public health graduate student might have coded the same seminars). It might be useful, in the future, to obtain IRB approval for video recordings of seminars, have them coded independently (perhaps even with the help of machines), and analyze those. That would of course come with its own quantum issues of measurement of behavior affecting said behavior.
This is clearly important research and adds to our understanding of the issues we face. But, the discussion surrounding the evidence presented in the paper that I have seen online (OK, Twitter) and the NYT article has been too coarse, too simplistic, too reactionary. We need to think harder about what our seminars are about; under what conditions/assumptions more questions to one sub-group than another can be interpreted to be a bad thing; who is hurt by such differential treatment; who is doing the differential treating; how to categorize types and tones of questions and so on. We also need to think about how the gender dimension is interacting with other characteristics of the speaker, such as race and home institution. As always, these imply the need for more research, hopefully in the very near future, hopefully involving researchers from different fields. A pre-analysis plan would also help. In the meantime, the readers should remember that had the authors collected data from September to December (instead of January to May), when there are no recruitment seminars, we might be reading a different paper and there might not be a NYT article.
"Men ask more questions, and women get more hostile questions"... Should set your alarms going that something was omitted from the data, when it was readily available.
Anecdotally: Mythbusters did a fun experiment where they had a female cast member work at a coffee shop, and minimize or stuff her bra during different shifts. In short, when men experienced a stuffed bra, they tipped maybe 20% more. Everyone screamed sexism of course. Then they looked at the data - women tipped *200%* more. Everyone went very quiet and tried to dodge the issue.
As noted too - more or even hostile questions aren't necessarily "bad". Human communication is messy. It might be that people are choosing not to voice concerns when a man is speaking - which is bad. They may feel free to voice those concerns with a woman - which is good.
Looking at everything with the simple rubric of "man bad, woman good" is dumb and unhelpful. The New York Times reporting is a great example of cherry picking data to make a (bone-headed) political point.
Thanks. I got curious, so I Googled the Mythbusters episode (not linking here to avoid future spam, but it's very easy to find the three-minute 2014 clip on this). They say that women tipped 40% more (not 200%). It's also not clear to me how they separated the tips by gender, as it is not practical to have separate jars by gender or to discern the amount of coins someone drops into the jar. So, your point taken, but the anecdote seems to be just that...
The Mythbusters were monitoring the tipping interaction via camera and threw a switch on a physical chute as each event happened, so tips from each gender went into separate containers (under the counter) - they then added it up at the end. What was hard to separate out was that Kari did three days, first day 'normal', second day 'flat' and third day 'busty'. It was possible that she had become better at getting tips by the third day, and her approach just worked better on women than men. But it was certainly not the expected result.
Hi thanks for this dive! (saw this via Marginal Revolution link)
1) Maybe I am crazy but doesn't Figure 6 show that seminars presented by women have a larger audience, hence more questions? It seems that they control for this in Table 5, but is this the case for the rest of their results too?
2) (Note: I am not an economist) I have an alternative theory: I can imagine (from my experience attending Biostats seminars) that if you are presenting a paper in econometrics there will be fewer questions asked, on average, simply because the topic is more difficult to follow (via presentation), than if you are presenting a paper in applied micro. I can also imagine that more senior attendees will try to debunk an applied micro paper more often than an econometrics paper, not necessarily because it's lower quality, but simply because it's much easier to provide quick criticism and "look smart". So could these findings be a result of differential gender distributions within different economic fields rather than a "proof" of discrimination?
Thanks. I did not know about the MR link - nice to have some crossover traffic ;-)
On your first question, I don't think that the authors actually control for it in Table 5. They seem to run robustness checks in Table A.4 and show that the coefficients only decline marginally when they control for attendance. There is a discussion of this on Page 16 (Figure 6) as well, where the authors agree that both higher attendance and more questions per attendee are likely to be at play... My take on this is that this is again coming from JTs: as I pointed out in the post, questions to women go up at the same rate as the attendance from regular seminars, whereas questions to men don't go up as much. If you just consider regular dept. seminars, it's likely that the extra questions are explained more so by attendance than Qs per attendee...
For your second question, the authors say in the abstract that their results are not explained by women presenting in different fields. All the tables have fixed effects for seminar series and JEL codes.
I'll tweet your comment and maybe the authors might chime in to respond with much better answers than I have...