Published on Data Blog

Building gender-aware data systems is critical to creating a level playing field for women and girls

Portrait of Abou amid millet stalks dried in the sun
Photo: © Stephan Gladieu / World Bank

Data can create economic and social value, which can be multiplied through repurposing and reuse. But are these social and economic benefits derived from data equitably shared, particularly in countries that would stand to gain the most from the production, use, and repurposing of data? 

To advance an affirmative answer to this question, the World Development Report (WDR) 2021: Data for Better Lives calls for a new social contract for data that ensures better representation for marginalized people. This means that data captured and analyzed in data systems reflect the needs of ALL people. While data are generally undersupplied and underused for the purposes of improving lives, they are severely undersupplied and underused for the purposes of helping girls and women, who represent half of the world’s total population. 

How severe are the gaps in data on women and girls?

Throughout the ongoing COVID-19 pandemic, gaps in sex-disaggregated data have been pronounced, yielding at best a partial understanding of the differential impacts of the crisis on men and women (see Chapter 2 of the WDR 2021 for a discussion). In March 2020, only 61% of reported COVID-19 cases were disaggregated by sex, with only 26 countries providing these data. By November 2020, reporting had grown to 80 countries, but the proportion, surprisingly, still stood at 60%.

Proportion of reported COVID-19 cases disaggregated by sex

Source: WDR 2021, Data for Better Lives. Based on contributions from Mayra Buvinic (Center for Global Development), Lorenz Noe (Data2x), and Eric Swanson (Open Data Watch), with inputs from the WDR 2021 team.

Prior to the pandemic, only ten of the 54 gender-specific indicators (19%) in the Sustainable Development Goals (SDGs) were widely available (that is, based on international standards for measurement), and only a fourth of those gender-specific indicators were from relatively recent data (2010 or later). In seven of the ten countries where the recent economic contraction is most severe, less than 38% of SDG indicators are available by sex. Several assessments have consistently highlighted gaps in the gender data that are critical for designing and evaluating policies.

Gender data that are available are improving lives

Violence against women and girls (VAWG) is a global pandemic. One out of three women and girls (35%) worldwide between the ages of 15 and 49 has experienced physical violence, sexual violence, or both. Data from the Gender Data Portal show that at least 200 million girls and women have undergone female genital mutilation (FGM), and in at least 11 countries, more than half of women ages 15–49 have undergone FGM. We know these facts because representative population-based studies have been undertaken to understand the prevalence of VAWG. These studies have used a standardized methodology in more than 90 countries across all regions and all income groups.

Prevalence of female genital mutilation in women ages 15-49

Adapted from Kashiwase and Pirlea 2019.
Source: Data are drawn from the World Bank Gender Data Portal (SH.STA.FGMS.ZS), using data from Demographic and Health Surveys, Multiple Indicator Cluster Surveys, and UNICEF.
Note: FGM = female genital mutilation; UNICEF = United Nations Children’s Fund.
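
For readers who want to work with the underlying series, the sketch below shows one possible way to pull the indicator cited above (SH.STA.FGMS.ZS) from the World Bank's public Indicators API and keep the most recent non-missing value for each country. It is not the method used to produce the chart. The helper names (`indicator_url`, `latest_by_country`, `fetch_latest`) and the `per_page` default are illustrative assumptions, not part of any World Bank library.

```python
import json
from urllib.request import urlopen

INDICATOR = "SH.STA.FGMS.ZS"  # indicator code cited in the source note above


def indicator_url(indicator, per_page=20000):
    """Build a World Bank Indicators API (v2) URL covering all countries.

    The per_page value is an assumption chosen to avoid paging.
    """
    return (
        "https://api.worldbank.org/v2/country/all/indicator/"
        f"{indicator}?format=json&per_page={per_page}"
    )


def latest_by_country(records):
    """Keep the most recent non-missing (year, value) pair per country code."""
    latest = {}
    for rec in records:
        if rec.get("value") is None:
            continue  # skip country-years with no survey estimate
        code = rec["countryiso3code"]
        year = int(rec["date"])
        if code not in latest or year > latest[code][0]:
            latest[code] = (year, rec["value"])
    return latest


def fetch_latest(indicator=INDICATOR):
    """Download the series (requires network access) and summarise it.

    Assumes the usual two-element JSON response: metadata, then records.
    Regional aggregates are returned alongside countries and are not
    filtered out here.
    """
    with urlopen(indicator_url(indicator)) as resp:
        _meta, records = json.load(resp)
    return latest_by_country(records or [])
```

Calling `fetch_latest()` returns a dictionary keyed by three-letter country code. Because survey years differ from country to country, the year is kept next to each value so that comparisons across countries can take the timing of each survey into account.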

A long-running and rich example of the value of granular data is the Demographic and Health Surveys, which cover topics such as HIV/AIDS and gender-based violence. Over the last few decades, data from 82 of these surveys, disaggregated by sex, have been used as inputs for developing laws banning domestic violence, developing HIV education programs, and more. In Vietnam, a survey on gender-based violence revealed that more than half of women have experienced physical, sexual, or emotional abuse; that nearly half of these had physical injuries as a result; and that seven in eight did not seek any help. These data spurred a public discussion about the topic, informed the National Strategy on Gender Equality, and led to the introduction of counseling, health, legal, and shelter services for women subject to violence at home.

What can we do to close gender data gaps?

First, advance the development, dissemination, and implementation of international standards that will improve the granularity, accuracy, comparability, and policy-relevance of gender data. As emphasized by the WDR 2021, standards can significantly enhance the quality of data in data systems and increase their usefulness. 

For example, relying on proxy respondents to elicit individual-level information (a common cost-saving mechanism in large-scale household surveys) has been shown to produce inaccurate estimates of gender differences in asset ownership, labor market outcomes, and control of income. Likewise, imprecise definitions of employment in the Middle East and North Africa have been shown to blur the lines between unemployment and informality and distort the role of women in national labor markets. Initiatives such as the World Bank Living Standards Measurement Study – Plus (LSMS+) and UN Women’s Women Count are contributing towards the development and adoption of best practices in individual-level data collection.

Second, invest in data production on specialized topics that are not covered by standard surveys, including time use and gender-based violence. It is critical to promote existing standards while developing improved methods for these types of data collection.

Third, make gender data and analysis publicly available and accessible in a transparent manner. As data are made more openly available and accessible, it is also important to promote the use of existing data for further analysis of gender issues. These efforts should be supported by initiatives to strengthen gender data literacy among both government officials and the general populace to encourage better understanding and use of gender data.

Fourth, promote interoperability and the safe integration of gender data sources to maximize their value for development. For instance, the Gender-Based Violence Information Management System (GBVIMS) facilitates the safe, ethical, effective, and efficient standardization and coordination of service-based data, which can be combined with data representative of a given population to yield important insights. While such efforts are critical, it is also important to ensure that investments in gender-based violence data systems do not divert limited funds and staffing away from the provision of services to the survivors of violence. Separate streams of investment—and greater investment—in service provision and data systems are necessary. 

Fifth, we need to ensure that the big data used as inputs for machine learning represent women and men without bias. Notwithstanding the opportunities that big data and machine learning approaches offer, algorithms trained on unrepresentative data can amplify discrimination against individuals and reinforce existing racial, gender, and economic inequalities. As one example, women, especially in low- and middle-income countries, have limited access to mobile phones, the internet, and bank accounts, which limits their representation in data that are used in training machine learning models for targeting interventions or behavioral insights. When women are missing, less visible, or reflected in an unrepresentative way in training data, policies, rules, or programs based on machine learning algorithms will amplify those biases.
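
As a rough illustration of what checking representation can look like in practice, the sketch below compares the share of records from women and men in a dataset against a reference population share and derives simple reweighting factors. Everything in it is an assumption made for illustration: the function names, the 3-to-7 sample, and the 50/50 reference shares (the latter following the statement above that women and girls make up half of the world's population). It is not a method described in the WDR 2021.

```python
from collections import Counter


def group_shares(labels):
    """Share of each group among the records in a dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def reweighting_factors(labels, reference_shares):
    """Weight per group = reference share / observed share.

    Groups that are under-represented relative to the reference get a
    weight above 1; over-represented groups get a weight below 1.
    """
    observed = group_shares(labels)
    return {
        group: reference_shares[group] / share
        for group, share in observed.items()
    }


# Hypothetical training sample: 3 records from women, 7 from men.
sample = ["female"] * 3 + ["male"] * 7
reference = {"female": 0.5, "male": 0.5}  # assumed population shares

print(group_shares(sample))
print(reweighting_factors(sample, reference))
```

Reweighting of this kind only corrects the headcount. It cannot make the women who do appear in the data representative of those who are absent, for instance women without a mobile phone or bank account, which is why better data collection remains necessary.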

Finally, even though the pandemic created new demands for statistics, it also interrupted the supply. More than half of low-income and lower-middle-income countries reported that the COVID-19 pandemic affected national statistical offices’ ability to produce socioeconomic statistics. This problem requires immediate attention, and building effective, gender-aware data systems will require sustained financial and human capital investments.

To download the full report, click here.

To learn more about the World Development Report 2021, please visit this website.


Malarvizhi Veerappan

Senior Data Scientist, Development Data Group, World Bank

Talip Kilic

Senior Program Manager, Living Standards Measurement Study (LSMS), World Bank
