During the past few years, interest in high-frequency price data has grown steadily. Recent major economic events, including the food crisis and the energy price surge, have increased the need for timely high-frequency data that is openly available to all users. Standard survey methods lag behind in meeting this demand because of the high cost of collecting detailed sub-national data, the time delay usually associated with publishing the results, and the limitations on publishing detailed data. For example, although national consumer price indices (CPIs) are published on a monthly basis in most countries, national statistical offices do not release the underlying price data.
"How can I unpivot or transpose my tabular data so that there's only one record per row?"
I see this question a lot and I thought it was worth a quick Friday blog post.
Data often aren’t quite in the format that you want. We usually provide CSV / XLS access to our data in “pivoted” or “normalized” form so they look like this:
But for a lot of analyses and applications, particularly data visualisation tools like D3, ggplot2 and Tableau, it’s more convenient to have your data “unpivoted” or “denormalized” so it looks like this:
Although this is less space efficient, space is cheap, and it means there’s always only one record per row, so you can use simple functions to access and filter data.
Here are three ways to “unpivot” or “denormalize” your data - in effect, to transpose columns to rows and have one complete record per row.
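As a rough illustration of the idea (not necessarily one of the methods from the original post), here is a minimal Python sketch that uses only the standard library. The CSV extract, the column names and the `unpivot` helper are all hypothetical, and the values are placeholders rather than real figures.

```python
import csv
import io

# Hypothetical "pivoted" extract: one row per country, one column per year.
# The values are placeholders for illustration, not real World Bank figures.
WIDE_CSV = """country,indicator,2010,2011,2012
AAA,EXAMPLE.IND,1.0,1.5,2.0
BBB,EXAMPLE.IND,3.0,,3.5
"""


def unpivot(rows, id_columns, var_name="year", value_name="value"):
    """Turn every non-id column of every row into its own record."""
    records = []
    for row in rows:
        ids = {key: row[key] for key in id_columns}
        for column, cell in row.items():
            if column in id_columns:
                continue  # identifier columns are repeated on each record
            if cell == "":
                continue  # skip missing observations
            records.append({**ids, var_name: column, value_name: float(cell)})
    return records


wide_rows = list(csv.DictReader(io.StringIO(WIDE_CSV)))
long_rows = unpivot(wide_rows, id_columns=("country", "indicator"))

for record in long_rows:
    print(record)
```

Each output record carries the country, the indicator, one year and one value, so filtering by year or country becomes a simple comparison. If you already work in pandas, `DataFrame.melt` performs the same reshaping.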
A year ago we had a post on launching a new version of the World Bank’s data query system, DataBank, offering over 9,000 indicators with which users can create custom reports with tables, charts, or maps. These live reports can then be saved, shared between users, and embedded as widgets on websites or blogs. A year later, DataBank is multilingual, with an interface available in several languages across the different databases and fully translated data from the World Development Indicators. We’ve asked one of the founding fathers of DataBank and Open Data’s Lead Information Officer, Reza Farivari, to tell us about the tool and what to expect in the future.
This is the first of a two-part blog series on offline open data pilots recently conducted in Indonesia and Kenya. Part one focuses on Indonesia, while the subsequent blog post will describe our findings in Kenya. This series is part of a larger project on the demand for open financial data being conducted by the World Bank Group Open Finances program and World Bank Institute’s Open Contracting Partnership.
Meet Gede Darmawan and Gede Sudiadnya, who live in the village of Desa Ban in Indonesia. These two young men were part of a story of transformation, one that saw them turn from passive recipients of information into active participants. It was a remarkable display of the potential power of open financial data.
Gede Darmawan (age 17), Gede Sudiadnya (age 22)
Developers, analysts and researchers often use our data through the APIs we provide. We’ve written about accessing World Bank data in Stata in the past, but I’m going to take a moment to survey the other language-specific libraries that I know of. From now on, unless I state otherwise, by “API”, I’m referring to our development indicators API.
I’ll list the libraries first, and then show some examples with a couple of them:
Python: The wbdata module by Oliver Sherouse offers easy access to all the data in our APIs. It also plays nicely with Wes McKinney’s superb ‘pandas’ analysis library. I’m less familiar with Matthew Duck’s wbpy module but it appears to offer similar functionality and also provides access to the Climate Data API.
R: The WDI package by Vincent Arel-Bundock offers convenient access to the data in our API and opens the door to using it with the awesome ggplot2 graphing library. You can also access the Climate Data API in R with rWBclimate.
Ruby: The world_bank_ruby gem by Justin Stoller has some nice features for bringing our data into Ruby.
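To give a feel for what these libraries wrap, here is a minimal Python sketch that builds a request URL and parses a response using only the standard library. It assumes the `v2` endpoint layout and the two-element JSON response (paging metadata followed by a list of records). The helper names are hypothetical, and the sample payload is hand-written with a placeholder value, not real data.

```python
import json
from urllib.parse import urlencode

# Assumption: the v2 layout of the development indicators API.
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country, indicator, start_year, end_year):
    """Build the request URL for one country and one indicator."""
    query = urlencode(
        {"format": "json", "date": f"{start_year}:{end_year}", "per_page": 100},
        safe=":",  # keep the year range readable instead of percent-encoding it
    )
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_indicator_response(payload):
    """Return (country name, year, value) tuples, skipping missing values."""
    _metadata, records = json.loads(payload)
    return [
        (record["country"]["value"], int(record["date"]), record["value"])
        for record in records or []
        if record["value"] is not None
    ]


# Hand-written sample in the assumed response shape; the value is a placeholder.
SAMPLE_PAYLOAD = json.dumps([
    {"page": 1, "pages": 1, "per_page": 100, "total": 2},
    [
        {"country": {"id": "BR", "value": "Brazil"}, "date": "2011", "value": None},
        {"country": {"id": "BR", "value": "Brazil"}, "date": "2010", "value": 123.0},
    ],
])

url = build_indicator_url("BRA", "NY.GDP.MKTP.CD", 2010, 2011)
observations = parse_indicator_response(SAMPLE_PAYLOAD)
print(url)
print(observations)
```

In practice you would fetch `url` with an HTTP client and pass the response body to `parse_indicator_response`. The libraries listed above handle paging, caching and conversion to data frames for you.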
Merchandise trade has become an increasingly important contributor to a country’s gross domestic product (GDP), particularly for developing countries. Before the global financial crisis hit in 2008, merchandise trade as a percent of GDP for low- and middle-income economies was 57 percent, about 5 percent higher than for high-income economies. This is very evident in Europe and Central Asia (ECA), where merchandise trade accounts for 73 percent of the developing region’s GDP. Many ECA countries, including Hungary, Belarus, and Bulgaria, have merchandise trade to GDP ratios above 100 percent (155, 136, and 114 percent respectively in 2011), meaning merchandise exports are a large contributor to their overall economy.
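A ratio above 100 percent is possible because the indicator, as defined in the World Development Indicators, adds merchandise exports and imports together before dividing by GDP. The short Python sketch below shows the arithmetic. The function name is hypothetical, and the inputs are made-up numbers in arbitrary units, not country data.

```python
def merchandise_trade_share(exports, imports, gdp):
    """Merchandise trade as a percent of GDP: (exports + imports) / GDP * 100."""
    return 100.0 * (exports + imports) / gdp


# Made-up inputs: exports and imports are each smaller than GDP,
# yet their sum is larger, so the ratio exceeds 100 percent.
share = merchandise_trade_share(exports=60.0, imports=55.0, gdp=100.0)
print(f"{share:.0f} percent of GDP")
```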
Data openness has received considerable interest globally in recent years. Several countries and organizations are engaged in global discussions in this area. The International Budget Partnership (IBP) is one of the largest forums for these discussions.
In April 2010, the World Bank made its development data available for download free of charge.(2) The Open Development Technology Alliance(3) (also known as the ICT Knowledge Platform) was created to enhance accountability and improve the delivery and quality of public services through technology-enabled citizen engagement (e.g. using mobile phones, interactive-mapping and social media). The World Bank is also one of the international financial institutions taking the lead in the Global Initiative for Fiscal Transparency (GIFT) - an initiative that promotes budget transparency, public participation, and accountability globally.(4) BOOST is another useful tool developed by the World Bank for transforming detailed government expenditure data from FMIS databases into an easy-to-understand data set (XLS) for detailed analysis through pivot tables and geo-mapping tools.
On July 1, we updated the analytical country classification, which groups economies of the world into four categories based on 2012 GNI per capita estimates: low income, lower-middle income, upper-middle income, and high income. This has prompted some questions related to the review of this classification scheme, which we announced late last year and for which we solicited and received your feedback. I thought it would be useful to post an update.
Does open data have economic value beyond the benefits of transparency and accountability? Does it have the power to fuel new businesses and create new jobs? Does it have the potential to improve people's lives by powering new services and products? If so, what should the World Bank be doing to help this along? These were questions we had in mind as we set out to bring together open data entrepreneurs from across Latin America for an Open Data Business Models workshop in Montevideo, Uruguay.
Data on Millennium Development Goals (MDG) indicator trends for developing countries and for different groups of countries are curated in the World Development Indicators (WDI) database. Each year we use these data in the Global Monitoring Report (GMR) to track progress on the MDGs. Many colleagues, as well as non-Bank staff, approach us on a weekly basis with questions about where their region, country, or sector stands in achieving the core MDGs. Often in the same breath, they also ask whether or when we expect a particular country or region to meet a certain MDG.
With fewer than 1,000 days remaining to the MDG deadline, work on the Post-2015 agenda is in full swing. In response to the growing demand for additional information about GMR analytics and the underlying data, we developed a suite of open and interactive data diagnostics dashboards, available at http://data.worldbank.org/mdgs. Below is an extract that summarizes the progress status towards meeting various MDGs among countries in various regions, income groups and other groups. Select different indicators and highlight categories of progress status to interact with the visualization.