Introducing WBGAPI: A new Python package for accessing World Bank data



Version 1.0.4 of the wbgapi Python package is now available. The package has been in the Python Package Index for almost a year, and the latest version adds several new features that make exploring and searching databases easier and more interactive.

Python packages for World Bank data have been around for a while, but WBGAPI is relatively new. I wrote this package to take advantage of some improvements in the API that have also been around for a while but were difficult to understand or use and were not well supported in other packages. I also wanted to include better pandas support and, in general, make it easier to retrieve data without a lot of extra code.

The README file provides an overview, and the package itself provides extensive documentation through Python's help() function. But just to get you started, here is a quick overview of 5 features that make WBGAPI unique.

1. WBGAPI makes databases easier to understand and use

The World Bank API sometimes gives the illusion that all indicators reside in one big database, because a request for an indicator does not have to say which database it should come from.

Actually, the API consists of over 63 databases for a total (as of this writing) of 17,517 indicators. If you request an indicator such as population (SP.POP.TOTL) that is part of the World Development Indicators (WDI), the API returns the data from the WDI. But if you request one of the natural capital indicators (say NW.NCA.FORE.TO for forests) that is not in the WDI, the indicator comes from another database such as Wealth Accounts. And if you request a bunch of different indicators, it's possible they will come from different databases unless you explicitly specify which database you want for each indicator. To make it even more complicated, different databases often contain different countries and time periods, and are updated on different schedules. All of these details are available in the API, if you know how to find them.
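To make the ambiguity concrete, here is a small sketch that only builds request URLs and never calls the API. It assumes the public v2 endpoint layout and the source query parameter, and it assumes that 2 is the id of the WDI; the helper name indicator_url is made up for illustration.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(indicator, economy="all", source=None):
    """Build a request URL for one indicator; `source` pins the database if given."""
    params = {"format": "json"}
    if source is not None:
        params["source"] = source
    return f"{API_ROOT}/country/{economy}/indicator/{indicator}?{urlencode(params)}"


# No database named: the API decides which database answers each request.
print(indicator_url("SP.POP.TOTL"))
print(indicator_url("NW.NCA.FORE.TO"))

# Database named explicitly (2 is assumed here to be the WDI's id).
print(indicator_url("SP.POP.TOTL", source=2))
```

The first two URLs look identical in shape even though, as described above, they are answered from different databases.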

WBGAPI takes a different approach that makes it clearer and more consistent which database your data is coming from and what that database includes. By default, data requests are made against the WDI; it's not possible to inadvertently get data from a database you did not specify.

WBGAPI provides an easy way to list all available databases:
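A minimal sketch of what that can look like, assuming wbgapi is installed (pip install wbgapi) and the API is reachable. The source.info() and source.list() calls belong to the package; the wrapper names print_databases and database_names are made up for illustration, and the import sits inside each function so that nothing touches the network until a function is called.

```python
def print_databases():
    """Print wbgapi's formatted table of available databases."""
    import wbgapi as wb  # third-party package, imported lazily

    print(wb.source.info())


def database_names():
    """Return {database id: database name} from the iterable wb.source.list()."""
    import wbgapi as wb

    return {row["id"]: row["name"] for row in wb.source.list()}


# Example calls (each one makes a live API request):
#   print_databases()
#   database_names()
```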

(Several sample outputs below have been condensed or truncated for length)

2. Easier search and discovery

WBGAPI includes info() functions for exploring the databases, indicators, countries and other elements of the API. These are optimized for both interactive mode and Jupyter notebooks, but you can access the same information programmatically as iterable objects. For example:
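The sketch below shows the programmatic route, assuming wbgapi is installed and that series.list() and economy.list() yield dictionaries with id and value keys. The function names are hypothetical, and the imports are lazy so the snippet makes no API calls until a function is invoked.

```python
def matching_series(keyword):
    """Return {indicator id: indicator name} for series matching `keyword`."""
    import wbgapi as wb  # third-party package, imported lazily

    return {row["id"]: row["value"] for row in wb.series.list(q=keyword)}


def economy_names():
    """Return {economy code: name} for the economies in the current database."""
    import wbgapi as wb

    return {row["id"]: row["value"] for row in wb.economy.list()}


# Example calls (each one makes a live API request):
#   matching_series("population")
#   economy_names()
```

In an interactive session, wb.series.info(q="population") would display the same matches as a formatted table.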

You can also search metadata at the database level, or access the metadata for a single series or country/economy.
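Both kinds of metadata lookup can be sketched as follows, assuming wbgapi is installed. The calls wb.search(), wb.series.metadata.get() and wb.economy.metadata.get() come from the package; the wrapper names are invented for this example, and the objects they return are simply passed through, since they print themselves in readable form.

```python
def search_database(phrase):
    """Search the metadata of the current database for `phrase`."""
    import wbgapi as wb  # third-party package, imported lazily

    return wb.search(phrase)


def series_metadata(series_id):
    """Fetch the metadata record for one indicator, e.g. 'SP.POP.TOTL'."""
    import wbgapi as wb

    return wb.series.metadata.get(series_id)


def economy_metadata(economy_id):
    """Fetch the metadata record for one country/economy, e.g. 'COL'."""
    import wbgapi as wb

    return wb.economy.metadata.get(economy_id)


# Example calls (each one makes a live API request):
#   search_database("fossil fuels")
#   series_metadata("SP.POP.TOTL")
#   economy_metadata("COL")
```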

3. Simple but powerful data queries

WBGAPI lets you request a single indicator or any number of indicators from a database; the same is true for countries, aggregates and time periods. For example, you can request 2 indicators for 40 countries, or 40 indicators for 2 countries (or all countries) in a single request. If your request exceeds the API limit in some way, WBGAPI will chunk it into multiple API calls.
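The chunking idea can be illustrated without the API at all. This is a generic sketch, not WBGAPI's actual implementation: the helper name chunked, the batch size of 3 and the placeholder indicator codes are all assumptions chosen for the example.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Seven placeholder codes (not real indicators), split into batches of three.
indicators = [f"IND.{n:03d}" for n in range(1, 8)]
batches = list(chunked(indicators, 3))
print([len(batch) for batch in batches])  # [3, 3, 1]
```

Each batch would become one API call, and the results would be stitched back together so the caller sees a single response.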

Here are just a few examples building on pandas data frames (pandas is optional but makes WBGAPI much more powerful). Note that lengthy outputs have been abbreviated for simplicity’s sake:
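As a rough sketch of such queries, assuming wbgapi and pandas are installed: wb.data.DataFrame() and its time, mrv and labels parameters are part of the package, while the wrapper names population and latest_values are hypothetical. The imports are lazy so that no request is made until a function is called.

```python
def population(economies, years):
    """One indicator (total population) for several economies and years."""
    import wbgapi as wb  # third-party package, imported lazily

    return wb.data.DataFrame("SP.POP.TOTL", economies, time=years)


def latest_values(series, economies="all", periods=1):
    """Several indicators at once, limited to the most recent `periods` values."""
    import wbgapi as wb

    return wb.data.DataFrame(series, economies, mrv=periods, labels=True)


# Example calls (each one makes a live API request):
#   population(["USA", "CAN", "MEX"], range(2010, 2020))
#   latest_values(["SP.POP.TOTL", "NY.GDP.PCAP.CD"], ["KEN", "TZA"], periods=5)
```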
