Freely available data: The public good that keeps on giving?


David Evans blogged last week on some interesting impact evaluation work presented at the annual conference of the Centre for the Study of African Economies (CSAE), in Oxford, UK. We were at the conference too, and enjoyed it at least as much as David did. We also recommend browsing the sessions for your areas of interest. One aspect of the conference that struck us was the geographic distribution of the papers, and how that appears to be related to the (public) availability of good data. So we started thinking about what it is about certain types of data that might lead to a stream of research.
 
One case in point stood out. We are both associated in different ways with the World Bank’s Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA) program, and we were particularly struck by the number of papers that were using these data – together we counted 23. For those of you who don’t know it, the LSMS-ISA project works with national statistics offices in eight partner countries across Sub-Saharan Africa to design and implement systems of multi-topic, nationally representative panel household surveys with a strong focus on agriculture. To date the program has started disseminating data for 6 countries, and at least two waves of panel data are available for 5 of them (including the second wave of Ethiopia data, which came out last week).
 
Given these features, one can see a number of attributes that might help increase the research spillovers of a given dataset. First, a bunch of them are panels, and the joy of fixed effects and trends lets us say a whole bunch more. Second, the statistical offices, with a little help from their friends, go the extra mile to make sure these are freely available on the web and (this is key) are well documented. For those of you who have done this with datasets, you know it’s not a trivial amount of work. Finally, these are nationally representative – little of what we as researchers collect these days actually lets us say something about the whole country.
 
OK, so national representativeness is good. What about continental representativeness? At the CSAE conference, we are looking at probably the largest collection of emerging research on economics in Africa in one place. We took a deeper look at the agriculture and poverty streams of sessions to see what things looked like (yes, these are arguably important topics for African development, but this is not a scientific sample – to get a more scientific view you might want to take a look at Das et al.’s paper on the distribution of papers across the world).
 
At the CSAE conference, in the agriculture and poverty sessions, there were papers on 18 countries (out of about 50 in SSA in total). Perhaps more telling is that Ghana and Kenya were the only two countries with more than two papers presented at these sessions, besides the LSMS-ISA ones. If one were to draw a Lorenz curve of the papers per country, it would look more or less like this.

[Figure: Lorenz curve of papers per country]
 
Surely one would not expect the papers to be distributed evenly (particularly given our sample of one conference), but the impression we get is that work tends to be done where there are good, possibly panel, data, or where RCT or lab-in-the-field projects are being set up. Of the non-ISA countries, Ghana was the most represented, with 7 papers. Ethiopia had 5 papers, all non-ISA. Both Ethiopia and Ghana have long been relatively well endowed with longitudinal data covering agriculture. Kenya has a well-known panel dataset as well, and was represented with 3 papers (possibly fewer than other countries because its use is restricted). In fact, the two ISA datasets for which a panel was not yet available at the time of submission (Ethiopia and Niger) were only used by participants at the conference for cross-country analyses. Nigeria matters a lot for poverty in Africa, given its size. The conference hosted 4 papers on poverty or agriculture in Nigeria, which is better than nothing, but not great. Three of these used ISA data; the fourth used a sample of 300 contract farms. As the Nigeria panel ages and more waves come to fruition, one expects more authors to use it, but donors who are serious about funding public goods may want to look to Nigeria for good (public) returns to their money.
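For readers who want to build the kind of Lorenz curve mentioned above from their own tally of papers per country, here is a minimal sketch in Python. The function names `lorenz_points` and `gini` are ours, and the counts in the example are made up purely for illustration; they are not the conference tallies.

```python
from typing import List, Tuple


def lorenz_points(counts: List[int]) -> List[Tuple[float, float]]:
    """Return (cumulative share of countries, cumulative share of papers).

    Countries are sorted from fewest to most papers, so the curve sags
    below the 45-degree line when papers are concentrated in a few countries.
    """
    total = sum(counts)
    if not counts or total <= 0:
        raise ValueError("need at least one country and one paper")
    ordered = sorted(counts)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, c in enumerate(ordered, start=1):
        running += c
        points.append((i / n, running / total))
    return points


def gini(counts: List[int]) -> float:
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_points(counts)
    # Trapezoid rule over consecutive Lorenz points.
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1 - 2 * area


if __name__ == "__main__":
    # Hypothetical papers-per-country counts, NOT the actual CSAE numbers.
    example_counts = [1, 1, 1, 2, 2, 4, 8]
    for share_countries, share_papers in lorenz_points(example_counts):
        print(f"{share_countries:.2f}\t{share_papers:.2f}")
    print(f"Gini: {gini(example_counts):.2f}")
```

A Gini coefficient of 0 would mean every country had the same number of papers; values closer to 1 mean the papers are concentrated in a handful of countries.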
 
Obviously, the LSMS-ISA and other national panel datasets are geared towards answering certain kinds of questions. What struck us both was the diversity of questions being asked, as well as the number of papers per dataset by people who didn’t collect the data, including many African (and Africa-based) researchers. With spillovers like these, here’s hoping we start seeing the same kind of data for more countries. The forthcoming LSMS-ISAs in Burkina Faso and Mali are a good start, but we’ve got 41 more countries to go.
 

Authors

Alberto Zezza

Senior Economist, Development Data Group, World Bank

Join the Conversation

Anonymous
April 01, 2015

Thanks Markus for sharing. The good ground work you have been doing deserves an appreciation from all of us.