World Bank data infrastructure: shortening the path from data to insights

By reducing the costs of working with data, the World Bank’s Development Data Hub helps increase data use and knowledge production.

 

Data is not valuable in a vacuum. It becomes valuable only once knowledge (information or insight) is extracted from it and used to make decisions, shape policies, and change behaviors.

Data scientists, analysts, and researchers spend a significant amount of time and effort extracting knowledge from data and communicating it. Because this extraction can be expensive, it is important to find ways to reduce its cost. A robust, well-designed data infrastructure contributes by reducing friction at each step of a data analytics project: storing, searching, accessing, understanding, cleaning, transforming, analyzing, and visualizing data. Lowering that cost can go a long way toward increasing data use and knowledge production.

Making data assets easier to access, understand, and use has been, and continues to be, a main focus of the World Bank Data Group. The World Development Indicators (WDI) is a good example of the Data Group’s work in this area. The same goal motivated the design and implementation of the World Bank’s Development Data Hub (DDH), which currently houses the WDI and over 10,000 other datasets. The DDH was built to ensure that World Bank data assets are:

  • Easy to store – By providing a safe place to keep development data that has been collected or procured
  • Easy to find – By providing a search engine indexing all World Bank development data
  • Easy to access – By providing direct download and API access services
  • Easy to understand – By publishing rich metadata along with each dataset
  • Easy to use – By leveraging existing standards for data and metadata
  • Easy to combine – By nudging data producers to use existing standards when possible (for instance, using ISO3 country codes in addition to country names to make mapping easier)
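
To illustrate the "easy to combine" point, here is a minimal sketch of why a shared ISO3 code helps when two datasets are merged. The two record lists, the field names (indicator_a, indicator_b), and the join_on helper are all hypothetical, and the numeric values are placeholders rather than real statistics. The only thing taken from practice is that different sources often spell the same country differently.

```python
# Hypothetical records: the values are placeholders, not real statistics.
source_a = [
    {"name": "Egypt, Arab Rep.", "iso3": "EGY", "indicator_a": 1.0},
    {"name": "Korea, Rep.", "iso3": "KOR", "indicator_a": 2.0},
    {"name": "Kenya", "iso3": "KEN", "indicator_a": 3.0},
]
source_b = [
    {"name": "Egypt", "iso3": "EGY", "indicator_b": 10.0},
    {"name": "South Korea", "iso3": "KOR", "indicator_b": 20.0},
    {"name": "Kenya", "iso3": "KEN", "indicator_b": 30.0},
]


def join_on(key, left, right):
    """Inner-join two lists of dicts on `key` and return the merged rows."""
    index = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            # Fields from `left` take precedence when both sides share a field.
            merged.append({**match, **row})
    return merged


by_name = join_on("name", source_a, source_b)
by_iso3 = join_on("iso3", source_a, source_b)

print(len(by_name))  # 1: only "Kenya" is spelled the same in both sources
print(len(by_iso3))  # 3: every country matches on its ISO3 code
```

Joining on names silently drops two of the three countries, while joining on ISO3 codes keeps all of them. This is the kind of friction that publishing standard codes alongside names is meant to remove.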

Because disseminating knowledge should also be cheap, the products derived from World Bank data assets should be easy to share, publish, and disseminate. Existing tools such as the World Bank Tableau server or the RStudio Connect server enable World Bank data professionals to share the results of their analysis with their clients, colleagues, or the entire world in just a few clicks.

Finally, this process of knowledge extraction should be transparent and easily reproducible. The World Bank GitHub account allows World Bank data professionals to easily share their code and collaborate with other researchers within and outside the Bank.

In the same way a robust transport infrastructure shortens the path from point A to point B, the World Bank Data Group aims to continuously shorten the path from data to insight by building a stronger data infrastructure. 

Join the Conversation

FERNANDO
February 05, 2020

True.... data infrastructure like the DDH is crucial, but data quality is still key. Datasets provided by WB are great for multilateral purposes, but I find them less useful for sub-national projects. It would be great if WB datasets could aggregate progressively more granular data, integrating, perhaps, data from national statistics institutions.