The final report from the Big Data for International Development DataDive came out a few days ago (see below), and the obvious question is: what's next? Sure, the DataDive was a success in terms of the number and caliber of people who participated, the ambition and scope of the problems they worked on (mostly around better/faster/cheaper poverty measurement and a more effective/proactive fight against fraud and corruption), and the results that were achieved in a very short span of time (showing fairly conclusively that big-data-based approaches can be effectively applied in the context of international development). The report itself points out a few next steps (a data competition, specific action items for each project that the teams worked on, the need to embrace new types of data skills and techniques, and a continued effort to open new and more diverse data from both private and public sources), but here is a look at some other themes that emerged during the dive that are probably also worth thinking about:
- Get serious about real-time decision-making and predictive analytics - the data dive provided further evidence that plenty of real-time data already exists that international development organizations don't harness effectively. Sure, the data isn't perfect, and it isn't always comprehensive or even especially easy to access, but it is still usable and useful (as the price data project, for instance, showed) - especially for making short-term decisions and for listening better.
- Work together and create partnerships - possibly the single most important reason the DataDive succeeded was that many groups came together in an informal partnership to conceptualize and run the series of associated events. Thank you very much to UNDP (which organized its own DataDive in Vienna), QCRI, Global Pulse, and UNDB. Within the Bank itself, there was a remarkable level of support and cooperation from many different groups - which is rarer than it should be. Essential lesson - it's going to be better if you let more people in.
- Think beyond your organization - the data dive clearly demonstrated that internal, organizational data is useful only up to a point - even for understanding internal aspects of the business. The debarred list exercise was a case in point - it was almost easier for the Bank to produce better 'internal' data using external sources. Given how internal/external boundaries are blurring in organizations, and how closely groups inside and outside organizations like the Bank increasingly work together, it may be worth looking harder outside to understand the what and why of what goes on inside.
- Stop obsessing over data quality - we all know that data standards are the greatest thing since... well, the last greatest thing, and all of us can point to the millions of things we can and should do to improve data quality. And yes, somebody should do all of that. Don't wait, though - data imperfections reflect the real world in some ways; recognize them and build them into your models (sometimes they might even improve them!) - use what you have, rather than waiting for what you wish you had. Organizations like the Bank need to stop agonizing about the imperfections of data and start using some of it.
- Encourage and support data publishers - the dive demonstrated once again that much of the data universe is still 'closed'. And sometimes for good reason - some data confers competitive advantage, other data raises the specter of privacy and regulatory issues, sometimes it is just hard and expensive to publish data, and often it is unclear what the value of publishing data might be. There is a strong case for the Bank and its partners (UN Global Pulse in particular has done pioneering work in this space) to help create an environment (and perhaps even infrastructure) that helps open up more, and more diverse, data.
This is just a starting list. Where do you propose the Bank and its partners go from here?
If you missed the action, here is some more information --
- My recent blog on why Open data for business is suddenly the rage.
- Short recap blog - with links to raw project hackpads
- Chris Kreutz's recap of the data dive in Vienna
- Max Richman on scraping pricing data to measure poverty
- Francis Gagnon on better data and the power of data visualization
- Ben Ranoust on using visual analytics to probe risk factors influencing project outcomes
- Marc Maxson on auditing the world - the sequel
- Dennis McDonald on learning from data explorations
- Giulio Quaggiotto and yours truly on personal data philanthropy
- Milica Begovic, Giulio Quaggiotto, and Ben Ranoust on social networking analysis for development
DC Big Data Exploration Final Report by World Bank Publications