Published on Data Blog

Data Dive 2019: Hackathon brings data scientists together for development



In October 2019, members of Data Science DC (a meetup within Data Community DC) teamed up with World Bank staff for a Data Dive Hackathon at the World Bank. Their mission? Design innovative, data-driven solutions to international development challenges, ranging from matching workers with jobs to smarter lending for the world’s poorest. Armed with datasets from the Development Data Hub and plenty of caffeine, eleven teams got to work.

One of the Hackathon’s big draws was the newly launched Global Jobs Indicators Database (JoIn), the result of a collaboration between the Development Data Group and the Jobs Group. JoIn includes 73 standardized labor supply indicators (with disaggregation by age, sex, location and education level). It covers 112 countries and is drawn from more than 800 surveys developed from household surveys. The disaggregated data can help economists and country analysts understand the types of jobs created with economic transformation in countries, and who gets them. By tracking indicators over time, JoIn allows users to compare their country with successful episodes of jobs-related economic growth in other similar countries.

For the Jobs and Economic Transformation theme, Hackathon participants could choose among three challenging jobs-related tasks: 1) developing a web-based data visualization tool or app to depict the Jobs indicators; 2) creating a jobs and economic transformation taxonomy; or 3) mapping worker characteristics to certain occupations and activities drawn from a sample of population census data in client countries. Beyond the Jobs and Economic Transformation challenge, Hackathon teams were also invited to present creative analyses on three other themes: the relationship between government spending and fragility, conflict, and violence; predicting and addressing risk factors for loan performance; and leveraging investments in technology and human capital for sustainable development.

So, what did we learn? Check out a few highlights from the participants below:

  • The first group used JoIn data to create prototype data visualization platforms. They tracked unemployment over time to build a Tableau data services platform and an interactive R Shiny dashboard.

  • A second group used time-series data (trends in GDP and in the JoIn jobs indicators) to check whether countries could be grouped by the jobs challenges or “jobs syndromes” they face, rather than by traditional country groupings. They used k-means clustering to see whether clear country groupings emerged by considering demographics and stages of economic transformation.

  • The third group used employment variables from population census data to look for related patterns between education/skills and occupations in sectors.

  • The group looking at risk factors for loan performance challenged our entire notion of loan default risk. They discovered that countries that tend to default on most of their loans (and therefore have high default percentages) are actually struggling to pay back a much smaller amount of capital than most other countries owe. They reasoned that smaller countries in this position could be penalized for failing to pay back a large portion of a (relatively) tiny loan, even more so than larger countries that had failed to pay back more money (but a smaller percentage) on bigger loans.

  • Another team examined the relationships between development spending, education, and gender. They affirmed that investments in education and female school enrollment really do matter for sustainable development: as more women enroll in school, fertility rates and maternal mortality rates decrease while GDP per capita increases! In the team's analysis, every 1% increase in government spending on education corresponded to a 1.1% increase in female enrollment in secondary school.
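
The clustering idea described by the second group can be sketched in a few lines. This is only an illustration, not the team's actual code: the country labels, the two features (share of employment in agriculture and share of workers in wage jobs) and all values are invented for the example and are not taken from JoIn. The sketch uses plain k-means (Lloyd's algorithm) with a simple farthest-point initialization so that the result is reproducible.

```python
import math

# Hypothetical, made-up values for illustration only; NOT JoIn data.
# Each row: (label, % of employment in agriculture, % of workers in wage jobs)
countries = [
    ("Country A", 70.0, 15.0),
    ("Country B", 65.0, 20.0),
    ("Country C", 68.0, 18.0),
    ("Country D", 30.0, 55.0),
    ("Country E", 35.0, 50.0),
    ("Country F", 5.0, 85.0),
    ("Country G", 8.0, 80.0),
]
points = [(agri, wage) for _, agri, wage in countries]


def farthest_point_init(pts, k):
    """Deterministic start: take the first point, then repeatedly add the
    point that is farthest from all centroids chosen so far."""
    centroids = [pts[0]]
    while len(centroids) < k:
        centroids.append(
            max(pts, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(pts, k, max_iterations=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when assignments settle."""
    centroids = farthest_point_init(pts, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in pts
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


labels, centroids = kmeans(points, k=3)
for (name, _, _), label in zip(countries, labels):
    print(f"{name}: cluster {label}")
```

With real indicators measured on different scales, the features would normally be standardized before clustering, and the number of clusters would be chosen by inspecting the data rather than fixed at three in advance.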

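The loan-performance group's point, that a high default percentage and a large unpaid amount are different things, can be shown with simple arithmetic. The two borrowers and every figure below are hypothetical, chosen only to make the contrast visible; they are not World Bank lending data.

```python
# Hypothetical borrowers with made-up figures (millions of US dollars);
# NOT World Bank lending data.
borrowers = [
    {"name": "Small borrower", "lent": 10.0, "unpaid": 8.0},
    {"name": "Large borrower", "lent": 1000.0, "unpaid": 100.0},
]


def default_share(borrower):
    """Fraction of the amount lent that has not been repaid."""
    return borrower["unpaid"] / borrower["lent"]


# Ranking by percentage and ranking by amount can point at different borrowers.
worst_by_share = max(borrowers, key=default_share)
worst_by_amount = max(borrowers, key=lambda b: b["unpaid"])

for b in borrowers:
    print(
        f"{b['name']}: {default_share(b):.0%} of lending unpaid, "
        f"{b['unpaid']:.0f}m outstanding"
    )
print("Highest default share:", worst_by_share["name"])
print("Largest unpaid amount:", worst_by_amount["name"])
```

A risk measure based only on the default share would flag the small borrower, even though the large borrower accounts for far more unpaid capital in this invented example.
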


So, what’s the common thread between all these findings? Well, for one thing, there are a lot of interested data scientists in the DC area keen to help the Bank look for patterns in open data on important topics. Second, preparation and data analysis take time: reaching robust conclusions would have required more time than we had. You can see the final presentations here.

Having launched JoIn, analytic tools and guidelines for Jobs Diagnostics, the Jobs Group is now launching a broader external community of practitioners to share experiences and further the research.

All these innovative solutions literally run on data, and we believe that data is a global public good. At the World Bank, we’ve worked to make our data free and publicly accessible, and we hope that it will unlock the future of sustainable development. We invite you all to dive into our data to see what we can accomplish together. For more information, check out the above-mentioned datasets (and many more) on the World Bank’s Development Data Hub.

 

Data Community DC (DC2) is a non-profit 501(c)(3) organization committed to connecting and promoting the work of data professionals (both experts and novices) in the National Capital Region by fostering education, opportunity, and professional development through high-quality, community-driven events. This hackathon matches that mission and is an example of the tremendous talent and interest in our community. We are looking forward to future collaboration with the World Bank and other interested organizations.


Authors

Haishan Fu

Chief Statistician of the World Bank and Director of the Development Data Group

Janet Dobbins

President at Data Community DC, Inc. and Vice President, Business Development & Strategic Partnerships at Statistics.com
