
The Future of the Open Data Catalog

Alison J. Kwong's picture

The World Bank launched data.worldbank.org in April 2010 with over 2,000 World Development Indicators. More than two years later, the site has grown to include datasets on Climate, Projects & Operations, Microdata, Finances and many other topics. As more data become available and a growing, diverse audience begins to use them, we have identified a set of priorities to manage this growth sustainably.

One essential piece of this strategy involves the World Bank’s open data catalog. What began as a modest collection of data resources has expanded to include over 90 data sets, APIs, over 8,000 indicators, and even other data catalogs. As the catalog has grown, it has become increasingly important to make it not only more technologically capable but also more user friendly, readable and understandable.

What are we proposing?

Fully searchable. The new data catalog will be fully searchable, both through full text search (like conventional search engines) and faceted search. This will let users find datasets much more quickly and filter results by categories such as data type, format, or coverage.

Federated. The current data catalog includes many different World Bank open data catalogs, such as Finances and Projects, but not the data sets they contain. The new data catalog will allow you to search multiple World Bank open data catalogs from a single location.

User friendly. We are looking at several changes to the catalog’s user interface to make it easier to read and navigate, such as sortable columns and shortcuts to data query tools.

Better metadata. More comprehensive and easier-to-read metadata, including a data catalog API and support for multiple languages, which will be phased in over time.

Expandable and scalable.  Support for additional data types, larger datasets, and greater numbers of datasets.
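To make the search proposal above more concrete, here is a minimal sketch of how full text search and faceted filtering could work over a small in-memory catalog. It is an illustration only, not the World Bank's implementation: the entries, their field values (type, format, coverage), and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical catalog entries; titles and field values are illustrative only.
CATALOG = [
    {"title": "World Development Indicators", "type": "time series", "format": "CSV", "coverage": "global"},
    {"title": "Climate Change Data", "type": "time series", "format": "XLS", "coverage": "global"},
    {"title": "Household Survey Microdata", "type": "microdata", "format": "CSV", "coverage": "national"},
    {"title": "Projects & Operations", "type": "records", "format": "CSV", "coverage": "global"},
]


def full_text_search(entries, query):
    """Keep entries whose title contains every query word as a substring (case-insensitive)."""
    words = query.lower().split()
    return [e for e in entries if all(w in e["title"].lower() for w in words)]


def filter_by_facets(entries, **facets):
    """Keep entries whose fields equal every requested facet value."""
    return [e for e in entries if all(e.get(k) == v for k, v in facets.items())]


def facet_counts(entries, field):
    """Count how many entries fall under each value of one facet."""
    return Counter(e[field] for e in entries)


# Narrow the catalog by two facets at once, then show the counts a
# faceted interface would display next to each format filter.
csv_global = filter_by_facets(CATALOG, format="CSV", coverage="global")
print([e["title"] for e in csv_global])
print(facet_counts(CATALOG, "format"))
print([e["title"] for e in full_text_search(CATALOG, "data")])
```

A production catalog would more likely delegate both kinds of search to a dedicated search engine, but the idea is the same: a text query narrows the results, and facet counts show users which filters remain available.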

Moving forward

We would really value input from users as we take this work forward. What works well in the current catalog, and what doesn’t? What could be improved? Have we missed any features that are important to you? Please let us know what you think by leaving a comment or emailing us directly at data@worldbank.org.

 

*Edit: Bruno Sanchez A Nuno shares his thoughts on our plans on his blog.

 

Comments

Submitted by ramiro on
It would be great to have permalinks for data selections. To give an example: I select all countries, 10 indicators and the past 10 years and then download the data. At a later point I notice that I want to dismiss some of the selected indicators and instead add others. Right now I have to start over making my selections, if I closed the browser or tab. With a permalink I could go back to a previously selected state and make the necessary changes. Apart from that your plans sound great, especially the full text search.

Dear Ramiro, Thank you for your comment. You can save the queries you create in the DataBank system and return to them later using a permanent link. To access an existing query, click the "Share this Query" icon and copy the link into a browser. The link will look similar to databank.worldbank.org/data/BSEC_People/id/5a244b13. Alternatively, you can find your query links in the "Saved Report" section in the left navigation of the DataBank homepage. We hope this helps. Regards.

Submitted by Amparo on
Dear Tim and Alison, Great work, congratulations. Interoperability of catalogs produced by other organizations and data portability will make data more useful: "more people can use it". To help achieve that goal we need standards. Socrata is proposing an approach to develop these standards collaboratively (for catalogs, APIs, search engines, etc.). I recommend you take a look at http://open-data-standards.github.com/.

Submitted by Gérard Chenais on
Opening World Bank data is fine; these data have been produced thanks to public money, and providing equal access to statistics for all users is a good practice endorsed by many for national as well as international statistics. World Bank data are also one element of a more global information system, so some links to other international data catalogues would be useful; otherwise users might think that data not on data.worldbank.org do not exist. Read about data on Wikipedia: "Data is the lowest level of abstraction, information is the next level, and finally, knowledge is the highest level among all three. Data on its own carries no meaning." Here, what meaning is given to data compared to statistics? Sincerely, Gérard Chenais

Submitted by Tim Herzog on

Hi Gérard,

Many thanks for your comments. There are two ways in which we are working to integrate data and statistics from other organizations. Many data series that are in the current catalog, particularly the World Development Indicators, are in fact sourced from organizations such as the UNSD or FAO, and we make special efforts to document these sources. For individual indicators such as Forest Area the citation is just under the description at the top of the page. For datasets, the "Source/Citation" metadata field provides this information where appropriate.

The second point of integration is to link our country pages to country-level open data portals where possible. This is still a work in progress, something we hope to roll out later this year.

