
Trade

Going Deeper into TCdata360 Data Availability Leaders and Laggers

Reg Onglao's picture

Note: This is the second post in a series on data availability within the context of TCdata360; each post focuses on a different aspect of data availability. The first post can be viewed here.

With open data comes missing data. In this blog series, we hope to explore data availability by looking at it from various perspectives within the context of the TCdata360 platform[1]: by country, dataset, topic, and indicator.

In our previous blog post, we looked at country-level data availability over time through an interactive motion bubble plot inspired by the famous Gapminder visualization. In this follow-up post, we’ll still look at data availability through a geographical lens, but we'll now look into country classifications and other details that aren’t evident in a bubble plot, as well as the data availability leaders and laggers over time.

Overall Data Availability Leaders and Laggers

First, let’s compare individual countries to get a better sense of country-level differences in data availability. We computed each country’s overall data availability by taking the median of its data availability across all years (1955-2016). Looking at the top 20 and bottom 20 countries in terms of overall data availability reveals a few interesting patterns.
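As a rough illustration of the calculation described above, the following Python sketch takes the median of yearly availability values for each country and then picks the top and bottom entries. The country names, the yearly values, and the function names are invented for illustration; they are not TCdata360 figures, and measuring yearly availability as a share between 0 and 1 is an assumption.

```python
from statistics import median

# Hypothetical yearly data availability (share of indicators with a value).
# These numbers are invented for illustration, not taken from TCdata360.
yearly_availability = {
    "Country A": [0.10, 0.35, 0.60, 0.80],
    "Country B": [0.05, 0.10, 0.20, 0.25],
    "Country C": [0.30, 0.50, 0.70, 0.90],
}

def overall_availability(by_country):
    """Return the median availability across all years for each country."""
    return {country: median(values) for country, values in by_country.items()}

def leaders_and_laggers(by_country, n=20):
    """Return (top n, bottom n) countries ranked by overall availability."""
    overall = overall_availability(by_country)
    ranked = sorted(overall.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n], ranked[-n:]

leaders, laggers = leaders_and_laggers(yearly_availability, n=1)
print("Leader:", leaders[0][0])
print("Lagger:", laggers[-1][0])
```

The median is used here, as in the post, because it is less sensitive than the mean to a few unusually sparse or unusually complete years.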

Watch the Growth of Trade country-level data availability in TCdata360

Reg Onglao's picture

Note: This is the first post in a series on data availability within the context of TCdata360; each post focuses on a different aspect of data availability.

With open data comes missing data. We know that not all indicators are created equal and that some are better covered than others. The same goes for countries, where coverage can range from near universal, as for the United States of America, to very sparse indeed, as for Saint Martin (French part).

TCdata360 is no exception. While our data spans over 200 countries and 2000+ indicators, it suffers from some of the same gaps as many other datasets do: uneven coverage and quality. With that basic fact in mind, we have set about exploring what our data gaps tell us: we have 'data-fied' our data gaps, so to speak.

In the next few blogs we'll explore our data gaps to identify any patterns we can find within the context of the TCdata360 platform[1] — which countries and regions throw up surprises, which topics are better covered than others, which datasets and indicators grow more 'fashionable' when, and the like. In this first blog, we’ll look at data availability at the country level.

Tracing the roots of TCdata360 datasets: an interactive network graph

Reg Onglao's picture

When doing data analysis, it's common for indicators to take the spotlight whereas datasets usually take the backseat as an attribution footnote or as a metadata popup.

However, we often forget how intertwined dataset sources are and how this affects data analysis. For instance, we can never assume that indicators from different datasets are mutually exclusive – they may be the same indicator, or one may influence the other as a component weight in an index, if one dataset was used as a source for the other.

In this blog, we want to see whether this applies to TCdata360 by taking a deeper look at its "dataset genealogy" and answering questions such as: Is it safe to do cross-dataset analysis using TCdata360 datasets? Are there interesting patterns in the relationships between TCdata360 datasets?

Quick introduction to network graphs

We call a dataset that serves as a data source for another dataset a "source", and a dataset that pulls indicator data from another a "target". Collectively, all of these are called "nodes".

To see the relationships between TCdata360 datasets, we mapped these in a directed network graph wherein each dataset is a node. By directed, we mean that source nodes are connected to their target nodes through an arrow, since direction is important to identify source from target nodes. For the purposes of this blog, we restricted the network graph to contain datasets within TCdata360 only; thus, all data sources and targets external to TCdata360 will not be included in our analysis.

Here's what the network graph looks like.

Each dataset is represented by a circle (aka "node") and is grouped and color-coded by data owner or institution. The direction of each connection is clearer in the interactive version, where a small arrow on the connecting line shows the direction from source to target.
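To make the source and target relationship described above more concrete, here is a minimal Python sketch of a directed graph stored as an adjacency mapping. The dataset names and links are placeholders, not the actual TCdata360 relationships, and the helper names are our own. It also mirrors the restriction mentioned earlier by dropping any link whose source or target lies outside the platform.

```python
from collections import defaultdict

# Hypothetical (source, target) pairs: the target dataset pulls indicator
# data from the source dataset. All names are placeholders.
edges = [
    ("Dataset A", "Dataset B"),
    ("Dataset A", "Dataset C"),
    ("Dataset B", "Dataset C"),
    ("External X", "Dataset C"),  # external source, excluded below
]
platform_datasets = {"Dataset A", "Dataset B", "Dataset C"}

def build_graph(edges, allowed):
    """Map each source node to the set of targets it feeds, keeping only
    links where both ends are in the allowed set of datasets."""
    graph = defaultdict(set)
    for source, target in edges:
        if source in allowed and target in allowed:
            graph[source].add(target)
    return graph

def sources_of(graph, dataset):
    """Return every dataset that directly feeds the given dataset."""
    return {source for source, targets in graph.items() if dataset in targets}

graph = build_graph(edges, platform_datasets)
print(sorted(sources_of(graph, "Dataset C")))
```

Because direction matters, the mapping is stored one way only (source to targets); looking up the sources of a node means scanning for arrows that point at it.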

Interactive product export streamgraphs with data360r (now in CRAN!)

Reg Onglao's picture

Building beautiful, interactive charts is becoming easier nowadays in R, especially with open source packages such as plot.ly, ggplot2 and leaflet. But behind the scenes, there is an often untold, gruesome part of creating data visualizations -- downloading, cleaning, and processing data into the correct format.

Making data access and download easier is one of the reasons we developed data360r, recently available on CRAN and the newest addition to the TCdata360 Data Science Corner.

Data360r is a nifty R wrapper for the TCdata360 API, where R users ranging from beginners to experts can easily download trade and competitiveness data, metadata, and resources found in TCdata360 using single-line R functions.

In an earlier blog, we outlined some benefits of using data360r. In this blog, we’ll show you how to make an interactive streamgraph using the data360r and streamgraph packages in just a few lines of code! For more use cases and tips, go to https://tcdata360.worldbank.org/tools/data360r.

Introducing Data360R — data to the power of R

Reg Onglao's picture
 

In January 2017, the World Bank launched TCdata360 (tcdata360.worldbank.org/), a new open data platform that features more than 2,000 trade and competitiveness indicators from 40+ data sources inside and outside the World Bank Group. Users of the website can compare countries, download raw data, create and share data visualizations on social media, get country snapshots and thematic reports, read data stories, connect through an application programming interface (API), and more.

The 2017 Atlas of Sustainable Development Goals: a new visual guide to data and development

World Bank Data Team's picture

The World Bank is pleased to release the 2017 Atlas of Sustainable Development Goals. With over 150 maps and data visualizations, the new publication charts the progress societies are making towards the 17 SDGs.

The Atlas is part of the World Development Indicators (WDI) family of products that offer high-quality, cross-country comparable statistics about development and people’s lives around the globe. You can:

The 17 Sustainable Development Goals and their associated 169 targets are ambitious. They will be challenging to implement, and challenging to measure. The Atlas offers the perspective of experts in the World Bank on each of the SDGs.

Trends, comparisons + country-level analysis for 17 SDGs

For example, the interactive treemap below illustrates how the number and distribution of people living in extreme poverty changed between 1990 and 2013. The reduction in the number of poor in East Asia and Pacific is dramatic. Despite the decline in Sub-Saharan Africa’s extreme poverty rate to 41 percent in 2013, the region’s population growth means that 389 million people lived on less than $1.90/day in 2013, 113 million more than in 1990.

Note: the light shaded areas in the treemap above represent the largest number of people living in extreme poverty in that country, in a single year, over the period 1990-2013.

Newly published data, methods and approaches for measuring development

Interactive chord diagram to visualize trade

Siddhesh Kaushik's picture

What comes to mind when we think of trade? Quite possibly, exports, imports and trade balance. Is there a quick way to get this information without having to look at tables? Most of us would like to see how much a country imports and exports, who its major trade partners are, and what its trade balance is. We have introduced a d3.js-based interactive chord diagram to quickly visualize this information.

For example, here is a visual of Australia’s exports and imports for 2015. The chart shows the top countries that Australia exported to or imported from that year; the rest are bundled as “others”. Here is how you can interpret the diagram.

Each country has a different color. The length of the arc for Australia represents Australia’s total imports and the other parts of the arc show Australia’s exports to various countries. We can see the Import arc is slightly bigger than the Export arc and hence Australia has an overall negative trade balance.
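The comparison of the two arcs above boils down to simple arithmetic: the trade balance is total exports minus total imports. The Python sketch below shows that calculation with invented partner names and values (in arbitrary currency units); none of the numbers are Australia's actual 2015 figures.

```python
# Hypothetical bilateral flows for one reporting country, in arbitrary units.
# The partners and values are invented for illustration.
exports = {"Partner A": 60.0, "Partner B": 30.0, "Others": 90.0}
imports = {"Partner A": 45.0, "Partner B": 50.0, "Others": 100.0}

def trade_balance(exports, imports):
    """Overall balance: total exports minus total imports."""
    return sum(exports.values()) - sum(imports.values())

def bilateral_balances(exports, imports):
    """Balance with each partner; a missing flow counts as zero."""
    partners = set(exports) | set(imports)
    return {p: exports.get(p, 0.0) - imports.get(p, 0.0) for p in partners}

balance = trade_balance(exports, imports)
print("Overall balance:", balance)  # negative means imports exceed exports
print(bilateral_balances(exports, imports))
```

A negative overall balance corresponds to the case described above, where the import arc is longer than the export arc.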

From data blur to slow-mo clarity: big data in trade and competitiveness

Prasanna Lal Das's picture

Tolstoy's War and Peace was the big data of its time. A memorable moment from the epic novel occurs when Prince Andrei awakens following a severe injury on the battlefield. He fears the worst but, "above him there was nothing but the sky, the lofty heavens, not clear, yet immeasurably lofty, with gray clouds slowly drifting across them. 'How quiet, solemn, and serene, not at all as it was when I was running.'" Time appears to slow down and the Prince sees life more lucidly than ever before as he discovers the potential for happiness within him.

In many ways the scene captures what we demand of big data—not the bustle of zillions of data points as confusing as the fog of war, but sharp, clear insights that bring the right information into relief and help us connect strands previously unseen. The question of whether this idea is achievable is the starting point of a paper about big data on trade and competitiveness just published by the World Bank Group. In it, we asked—can big data help policy makers see the world in ways they haven't before? Are decisions that are informed by the vast amounts of data that envelop us better than decisions based on traditional tools? We didn't want a story trumpeting the miracles of big data; we wanted instead to see the reality of big data in action, in its messiness and its splendor.

Things to do with Trade and Competitiveness Data… thank you API

Alberto Sanchez Rodelgo's picture

Who are Spain's neighbors? Is Canada closer to Spain than Portugal? What about Estonia or Greece? The answer? Depends on the data you are looking at!

Earlier this week I crunched data based on a selected list of indicators from the new Open Trade and Competitiveness platform from the World Bank (TCdata360) and found some interesting trends[1]. In 2009 Spain was closer to economies like Estonia, Belgium, France and Canada while 6 years later in 2015, Spain's closest neighbors were Greece and Portugal. How and when did this shift happen?
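One way to make "closer" concrete is to treat each economy as a vector of indicator values and rank the others by distance. The Python sketch below uses Euclidean distance on made-up, already-standardized values; the post does not say which distance measure or indicators were used, so the measure, the economy names, and the numbers here are all assumptions for illustration.

```python
import math

# Hypothetical, already-standardized indicator values per economy.
# The names and numbers are invented; they are not TCdata360 data.
indicators = {
    "Economy A": [0.2, 1.1, -0.5],
    "Economy B": [0.3, 1.0, -0.4],
    "Economy C": [-1.5, 0.1, 2.0],
    "Economy D": [0.9, 0.2, 0.1],
}

def nearest_neighbors(data, target, k=2):
    """Return the k economies closest to `target` by Euclidean distance."""
    distances = [
        (math.dist(data[target], values), name)
        for name, values in data.items()
        if name != target
    ]
    return [name for _, name in sorted(distances)[:k]]

print(nearest_neighbors(indicators, "Economy A"))
```

Running the same ranking on data from two different years would show how an economy's set of "neighbors" shifts over time, which is the kind of change described above.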

Other trends I spotted using the same data? It seems the Sub-Saharan region ranks the lowest in Ease of Doing Business, that in 2007 Israel held the record for R&D expenditure as % of GDP, while in the same year Malta topped FDI net inflows as % GDP, and that the largest annual GDP growth in the last 20 years occurred in Equatorial Guinea in 1997.

Figure 1: Dots represent values for an economy at a given point in time for years 1996 to 2016 overlaying their box-plot distributions. Colors correspond to geographical regions.
