
March 2019

More ways to access the Doing Business data: improvements, tips & tricks

By the World Bank Data Team

The World Bank Group’s Doing Business project collects objective data on 11 areas of business regulations and their enforcement across the largest business cities of 190 economies.

To provide wide-ranging access, we make the World Bank Group’s Doing Business data available through various channels. Apart from the Doing Business website, the full dataset is available via:

  1. DataBank, a query tool that allows users to select indicators, economies and time periods
  2. The Data Catalog, a one-stop shop for all of the World Bank's development data
  3. For more experienced users, the Indicators API, which makes it easy to access the data programmatically (an API allows users to download data without interacting with a user interface)
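For readers who want to try the programmatic route, a minimal sketch of querying the World Bank Indicators API is shown below. It uses only the Python standard library; the indicator code `IC.REG.DURS` (time required to start a business, in days) is one example from the Doing Business dataset, and the helper names are our own.

```python
import json
from urllib.request import urlopen

BASE = "https://api.worldbank.org/v2"

def indicator_url(country, indicator, date=None, fmt="json"):
    """Build a World Bank Indicators API request URL."""
    url = f"{BASE}/country/{country}/indicator/{indicator}?format={fmt}"
    if date:
        url += f"&date={date}"
    return url

def parse_response(payload):
    """The API returns [metadata, observations]; flatten to (year, value) pairs."""
    meta, rows = json.loads(payload)
    return [(row["date"], row["value"]) for row in (rows or [])]

# Live request (requires network access):
# with urlopen(indicator_url("BR", "IC.REG.DURS", date="2015:2018")) as resp:
#     print(parse_response(resp.read()))
```

The same URLs can be pasted into a browser, which makes the API a convenient bridge between the DataBank interface and fully scripted workflows.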

In an effort to further improve the usability of the data, the Data Group, in close collaboration with the Doing Business team, has recently updated the Doing Business DataBank platform to offer users a set of enhanced analytical tools. We highlight a few of these tools and features and hope you will find them useful for your research:

  1. Simplified user interface and indicator search

The Doing Business data are organized into 11 areas of business regulation, including “Starting a business”, “Registering property” and “Getting electricity”, among others. Each of these topics has several components. For example, “Getting electricity” is composed of four sub-indicators, including the number, cost and time of each regulatory procedure required to connect to an electricity grid, as well as the reliability of supply and transparency of tariffs. The DataBank interface allows users to search for indicators within each topic in a simple and coherent way: all indicators within a topic are collapsed into a single category, and users can browse all topics in a sidebar.

Making Analytics Reusable

Fernando Hoces de la Guardia (BITSS) leads an interactive session using R Markdown to create dynamic, reproducible documents blending code and writing.

This is a guest blog post by the Berkeley Initiative for Transparency in the Social Sciences (BITSS), DIME Analytics, and Innovations in Big Data Analytics teams. This post was written by Benjamin Daniels, Luiza Andrade, Anton Prokopyev, Trevor Monroe and Fernando Hoces de la Guardia. The workshop also included presentations by Mireille Raad and Dunstan Matekenya.

Since 2005, the share of empirically-based papers published in development economics journals has skyrocketed, reaching more than 95% by 2015. Today, lab-style research groups and teams typically maintain in-house capacity for the entire research workflow. This development means that new, scalable methods for ensuring high-quality research design, data collection, analysis, and publications are needed for evidence to remain transparent and credible. We call these workflows “reusable analytics”, because they are research processes that can be verified by outside teams, or repurposed for a different analysis by the same team later on. Research teams almost universally plan to adopt such processes, but there is also a pervasive sense that actually making analytics reusable is costly and difficult. Therefore, our analytics teams are currently putting extensive effort into selecting and developing flexible tools and processes that can be used over and over again—so we can deliver recommendations and trainings for easy-to-learn and easy-to-use reusable analytics.

The DIME Analytics group, the Innovations in Big Data Analytics Team, and the Berkeley Initiative for Transparency in the Social Sciences (BITSS) recently co-hosted a hands-on workshop on “Making Analytics Reusable”. The workshop offered training in some core tools for reusable analytics: code collaboration using Git and GitHub, dynamic documents using Stata, R, and Python, and team task management using GitHub issues and project boards. The oversubscribed attendance reflects growing demand for modern principles and practices that accelerate learning, transparency, reproducibility and efficiency in research and policy analysis.
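The core idea behind dynamic documents, whichever tool implements it, is that figures in the prose are computed from the data rather than pasted in by hand, so re-running the script on updated data regenerates the report. The workshop itself used R Markdown and Stata; the rough Python analogue below is our own illustrative sketch, not workshop material.

```python
from statistics import mean

def build_report(title, values):
    """Render a Markdown report whose numbers are computed, not hard-coded,
    so the document stays in sync with the underlying data."""
    return "\n".join([
        f"# {title}",
        "",
        f"The sample contains {len(values)} observations "
        f"with a mean of {mean(values):.2f}.",
    ])

# Re-running with a revised dataset regenerates the whole report:
report = build_report("Enumeration check", [12, 15, 9, 18])
print(report)
```

Tools like R Markdown and Jupyter extend this pattern with inline code chunks, but the reusability payoff is the same: the analysis and the write-up cannot silently drift apart.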

Feeling Ambivalent on International Women’s Day

By Haishan Fu
Photo: Lakshman Nadaraja/World Bank

On the eve of International Women’s Day, I was at a UN WOMEN side event in NYC when my phone started buzzing with well wishes for a happy women’s day from my friends in Asia, filling me with, of all things, ambivalence. To be honest, the day always leaves me with mixed feelings: despite the great strides the world has made in women’s rights, for me it’s also a reminder of how many women still don’t enjoy our basic human rights.

As we’ve returned from women’s day to what in many ways is still a man’s world, I wanted to share three thoughts about the intersection of women’s rights with our data world today.

Demystifying machine learning for disaster risk management

By Giuseppe Molinario

To some, artificial intelligence is a mysterious term that sparks thoughts of robots and supercomputers. But the truth is machine learning algorithms and their applications, while potentially mathematically complex, are relatively simple to understand. Disaster risk management (DRM) and resilience professionals are, in fact, increasingly using machine learning algorithms to collect better data about risk and vulnerability, make more informed decisions, and, ultimately, save lives.

Artificial intelligence (AI) and machine learning (ML) are often used interchangeably, but AI is the broader concept. Artificial (General) Intelligence evokes images of Terminator-like dystopian futures, but in reality, what we have now, and will have for a long time, is simply computers learning from data in autonomous or semi-autonomous ways, a process known as machine learning.

The Global Facility for Disaster Reduction and Recovery (GFDRR)’s Machine Learning for Disaster Risk Management Guidance Note demystifies the concepts of machine learning and artificial intelligence, and illustrates specific case studies of ML applied to DRM. The Guidance Note is useful to a variety of stakeholders, from disaster risk management practitioners in the field to risk data specialists to anyone else curious about this area of computer science.

Machine learning in the field

In one case study, drone and street-level imagery were fed to machine learning algorithms to automatically detect “soft-story” buildings or those most likely to collapse in an earthquake. The project was developed by the World Bank’s Geospatial Operations Support Team (GOST) in Guatemala City, and is just one of many applications where large amounts of data, processed with machine learning, can have very tangible and consequential impacts on saving lives and property in disasters.

The map above illustrates the “Rapid Housing Quality Assessment”, showing the agreement between soft-story buildings identified by ML and those identified by experts (Sarah Antos/GOST).