As December 4th approaches, I’m getting excited for the International Open Data Hackathon and even more excited to see World Bank challenges and data featured in an event that will span 50 cities (and counting) over 6 continents. It’s thrilling to consider what hackers and users working together might mash up and what role we (as data providers) can play in giving people access to clean and interoperable data sets for their own use. Let a thousand flowers bloom.
Having recently traveled in India and met development folks of various stripes, from economists in Delhi to social entrepreneurs in Hyderabad to geeks in Bangalore, I’m struck again by how important local data remains. It’s one thing to talk about global economic trends and macro indicators but quite another to understand what’s happening in one Indian state, say Andhra Pradesh, compared to its neighbors. Imagine a citizen group comparing district-level rainfall data across states against crop yields over two decades. That’s when things get interesting and potentially useful to users.
Much of the data we’ve “liberated” at the World Bank will be helpful to policy wonks, researchers, and perhaps students of global development, but for most media organizations, civil society groups, and citizens, what matters most remains local or even hyper-local. So where’s that data, and how do we make it more widely available? Where’s the micro-data upon which national statistics are built and global indicators published? Why is it so seldom available? When will governments (rather than just international organizations) embrace Open Data and all the possibilities it affords?
Let’s start by acknowledging that data is political. Making information available on differential rates of development within a country is bound to create tension. Why are some parts of a country developing faster than others? How are rates of “development” correlated with ethnic groups or faith communities? The truth is seldom clear, but people regularly make claims based on anecdotal data or false information. Making information available to more people more regularly means more opportunities to cross-check facts and allows a system of checks and balances to emerge.
It takes an ecosystem to build a well-functioning democratic order. That ecosystem needs to provide reliable and clean fuel to the many actors needed to govern effectively. Data is fuel, and reliable data in usable formats is refined fuel. I’m hoping the Bank’s Apps for Development Global Competition surfaces many national and local data sets that can be combined with global indicators to understand how things at the local level compare with global trends. I hope we also pinpoint where important gaps in data exist and where we need to work harder to find reliable information to solve important problems. In a world of data deluge, there is no shortage of information collected, but parsing it, sorting it, and making it relevant locally is the real challenge. We want to hear from you, see your ideas, and work with you to make available the raw material to power the next economic or scientific revolution.