Interactive product export streamgraphs with data360r (now in CRAN!)

Reg Onglao

Building beautiful, interactive charts in R keeps getting easier, especially with open source packages such as plotly, ggplot2 and leaflet. But behind the scenes there is an often untold, gruesome part of creating data visualizations: downloading, cleaning, and processing data into the correct format.

Making data access and download easier is one of the reasons we developed data360r, recently released on CRAN and the newest addition to the TCdata360 Data Science Corner.

Data360r is a nifty R wrapper for the TCdata360 API, where R users ranging from beginners to experts can easily download trade and competitiveness data, metadata, and resources found in TCdata360 using single-line R functions.

In an earlier blog, we outlined some benefits of using data360r. In this blog, we’ll show you how to make an interactive streamgraph using the data360r and streamgraph packages in just a few lines of code! For more use cases and tips, go to https://tcdata360.worldbank.org/tools/data360r.

Step 1: Install required libraries

We first install data360r from CRAN, along with the devtools, dplyr, and bit64 packages for downloading and processing data. We also install the development version of the streamgraph package from GitHub.

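# Install the CRAN packages first; devtools is needed for install_github().
install.packages(c('devtools','data360r','bit64', 'dplyr'))
library(devtools)
library(data360r)
library(dplyr)

# Install the development version of streamgraph from GitHub.
devtools::install_github("hrbrmstr/streamgraph")
library(streamgraph)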
TIP: devtools::install_github also installs or updates the packages that streamgraph depends on, namely htmlwidgets, htmltools, magrittr, xts, tidyr, and dplyr. This can be a headache if you rely on the versions of these packages that you already have installed.

If you want to skip the automatic handling of dependencies, use
devtools::install_github("hrbrmstr/streamgraph", dependencies = FALSE)
Note, however, that you will then need to install any dependencies that are still missing yourself.
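To see which of the dependencies named in the tip are still missing before installing anything, a minimal sketch along these lines may help. The helper name missing_packages is hypothetical (it is not part of devtools or data360r), and the install call is left commented out so the snippet has no side effects.

```r
# Hypothetical helper: return the packages in `needed` that are not yet installed.
missing_packages <- function(needed,
                             installed = rownames(installed.packages())) {
  setdiff(needed, installed)
}

# streamgraph dependencies as listed in the tip above
deps <- c("htmlwidgets", "htmltools", "magrittr", "xts", "tidyr", "dplyr")
to_install <- missing_packages(deps)
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Passing the installed set as an argument keeps the helper easy to check without touching your library.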

Step 2: Download and process data with data360r

With a single line of code, we can use the get_data360 function of data360r to download product-level export data (in US$ thousand) of all countries in the TCdata360 platform (data source: World Integrated Trade Solution).

df_exp <- get_data360(indicator_id = c(2369), output_type = "long")
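Before processing, it can be worth confirming that the download contains the columns used in the next step (Product, Period, and Observation). The following is only a sketch in base R: has_columns is a hypothetical helper, and the one-row data frame with a made-up value stands in for df_exp so that the snippet runs offline.

```r
# Hypothetical helper: TRUE if every name in `cols` is a column of `df`.
has_columns <- function(df, cols) {
  all(cols %in% names(df))
}

# Stand-in for the downloaded data (made-up value, not a real TCdata360 figure).
stand_in <- data.frame(
  Product = "Capital Goods",
  Period = "2000",
  Observation = 1,
  stringsAsFactors = FALSE
)

required <- c("Product", "Period", "Observation")
print(has_columns(stand_in, required))

# With the real download, the equivalent check would be:
# stopifnot(has_columns(df_exp, required))
```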

We then compute total world exports by summing all observations by year (Period) and by product classification, excluding the "All Products" aggregate. For better interpretability, we rescale the data from US$ thousands to US$ trillions (dividing by 1,000,000,000) and convert the year (Period) variable to a number.

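# Keep only the columns needed for the chart (select_() is deprecated in dplyr).
df_exp_sum <- select(df_exp, Product, Period, Observation) %>%
  group_by(Period, Product) %>%
  summarise(Observation = sum(Observation)) %>%
  ungroup() %>%
  # Drop the aggregate row so products are not double-counted.
  filter(Product != "All Products") %>%
  # Rescale from US$ thousands to US$ trillions; store the year as a number.
  mutate(Obs_scaled = Observation / 1000000000,
         Period = as.numeric(as.character(Period)))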
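The unit conversion is easy to get wrong, so here is a small check of the same steps on made-up numbers. It is a sketch only: it uses base R's aggregate instead of dplyr so that it runs without extra packages, and the values in toy are invented for illustration rather than taken from TCdata360.

```r
# Toy data with made-up values in US$ thousands.
toy <- data.frame(
  Product = c("Capital Goods", "Capital Goods", "Consumer Goods", "All Products"),
  Period = c("2000", "2000", "2000", "2000"),
  Observation = c(2e9, 3e9, 1e9, 6e9),
  stringsAsFactors = FALSE
)

# Drop the aggregate row, then sum by year and product.
toy <- toy[toy$Product != "All Products", ]
toy_sum <- aggregate(Observation ~ Period + Product, data = toy, FUN = sum)

# US$ thousands -> US$ (multiply by 1e3), then US$ -> US$ trillions (divide by 1e12).
# Together this is the same as dividing by 1e9.
toy_sum$Obs_scaled <- toy_sum$Observation * 1e3 / 1e12
toy_sum$Period <- as.numeric(toy_sum$Period)

print(toy_sum)
```

In this toy case, 5e9 thousand US$ for Capital Goods is 5e12 US$, that is, US$ 5 trillion.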

Step 3: Create interactive streamgraphs using streamgraph

Using the processed data, we can now create three different streamgraphs using the streamgraph library: (1) normal streamgraph, (2) zero-baseline streamgraph (similar to an area chart), and (3) 100% streamgraph (similar to a 100% area chart).

For a better user experience, we add a dropdown box for switching between product classifications, pick a good Tableau color palette ("cyclic"), and set the x-axis ticks at 2-year intervals.

To interact with the charts below, hover over a selected "stream" to see its product classification and associated value by looking at the text annotation in the upper left corner. You can also select the product classification using the dropdown box at the bottom of the chart.

Total World Exports (US$ Trillion), by Product Classification

streamgraph(df_exp_sum, key="Product", value="Obs_scaled", date="Period") %>%
  sg_axis_x(2, "year", "%Y") %>%
  sg_fill_tableau("cyclic") %>%
  sg_legend(show=TRUE, label= "Product Classification: ")

Total World Exports (US$ Trillion), by Product Classification

streamgraph(df_exp_sum, key="Product", value="Obs_scaled", date="Period",
            offset = "zero") %>%
  sg_axis_x(2, "year", "%Y") %>%
  sg_fill_tableau("cyclic") %>%
  sg_legend(show=TRUE, label= "Product Classification: ")

Percent of Total World Exports, by Product Classification

streamgraph(df_exp_sum, key="Product", value="Obs_scaled", date="Period",
            offset = "expand") %>%
  sg_axis_x(2, "year", "%Y") %>%
  sg_fill_tableau("cyclic") %>%
  sg_legend(show=TRUE, label= "Product Classification: ") %>%
  sg_annotate(label="Transportation", x=as.Date("1989-01-01"), y=0.91, color="#ffffff", size=18) %>%
  sg_annotate(label="Mach and Elec", x=as.Date("1989-01-01"), y=0.6, color="#ffffff", size=18) %>%
  sg_annotate(label="Intermediate Goods", x=as.Date("1989-01-01"), y=0.48, color="#ffffff", size=18) %>%
  sg_annotate(label="Consumer Goods", x=as.Date("1989-01-01"), y=0.29, color="#ffffff", size=18) %>%
  sg_annotate(label="Capital Goods", x=as.Date("1989-01-01"), y=0.09, color="#ffffff", size=18)
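In the 100% streamgraph, each year's values are rescaled to shares of that year's total, which is why the annotation y positions above lie between 0 and 1. The sketch below reproduces that normalization in base R on made-up numbers; the toy data frame and the share column are illustrative assumptions, not output of the streamgraph package.

```r
# Toy data with made-up values (two products, two years).
toy <- data.frame(
  Period = c(2000, 2000, 2001, 2001),
  Product = c("A", "B", "A", "B"),
  Obs_scaled = c(1, 3, 2, 2),
  stringsAsFactors = FALSE
)

# Divide each value by the total of its year so the shares sum to 1 per year.
toy$share <- toy$Obs_scaled / ave(toy$Obs_scaled, toy$Period, FUN = sum)

print(toy)
```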

And that's all there is to it. Happy charting with data360r and these other R packages!

If you have ideas or questions for TCdata360, or if you’ve used the data360r package yourself, we’d be more than happy to see your work and get your feedback! Drop us a message at tcdata360@worldbank.org, or tweet with the hashtag #tcdata360.

The conclusions and opinions expressed in this blog do not represent the views of the World Bank Group.