
data collection

Electronic versus paper-based data collection: reviewing the debate

This post was co-authored by Sacha Dray, Felipe Dunsch, and Marcus Holmlund.

Impact evaluation needs data, and research teams often collect it from scratch. Raw data fresh from the field is a bit like dirty laundry: it needs cleaning. Some stains are unavoidable – we all spill wine, sauce, or coffee on ourselves from time to time, which is mildly frustrating but easily dismissed as a fact of life, a random occurrence. But as these occurrences become regular, we might begin to ask ourselves whether something is systematically wrong.

Issues of data collection and measurement

Berk Ozler's picture
About five years ago, soon after we started this blog, I wrote a blog post titled “Economists have experiments figured out. What’s next? (Hint: It’s Measurement)”. Soon after the post, I had folks from IPA email me saying we should experiment with some important measurement issues, making use of IPA’s network of studies around the world.

A curated list of our postings on Measurement and Survey Design

David McKenzie's picture
This list is a companion to our curated list on technical topics. It puts together our posts on issues of measurement, survey design, sampling, survey checks, managing survey teams, reducing attrition, and all the behind-the-scenes work required to get the data needed for impact evaluations.

Taming the Terra Incognita of PPPs: The case for data as an exploration tool

Fernanda Ruiz Nunez's picture
The PPP territory spans the globe, and the debate over its effectiveness as a financing tool to achieve development goals reaches equally far and wide.

Most recently, the Financing for Development Conference in Addis Ababa, Ethiopia sparked even more discussion about the role of public-private partnerships. The official line, spoken in a multitude of tongues, is that PPPs have an important role to play, and that results depend on projects being procured, managed and regulated well. But one thing is clear in every language: “results” are based mainly on anecdotal evidence and case studies where attribution remains dubious, and findings cannot be generalized as they depend on the particular characteristics of the specific projects.
We can do better. As economists, development professionals, finance experts, and explorers of new and creative solutions to solve the problem of poverty, we must do better. And we will – with better data.
Lack of data has constrained the empirical literature on PPPs, in turn constraining our ability to tap the territory of PPPs and its potential to transform markets. After all, what do we really know about the economic impact of PPPs? Our first-ever literature review, underway now, has laid an initial foundation for knowledge, and we have made the first draft available so that colleagues and interested individuals and organizations can contribute their data.

Can our parents collect reliable and timely price data?

Nada Hamadeh's picture

During the past few years, interest in high-frequency price data has grown steadily. Recent major economic events, including the food crisis and the energy price surge, have increased the need for timely high-frequency data, openly available to all users. Standard survey methods lag behind in meeting this demand, due to the high cost of collecting detailed sub-national data, the time delay usually associated with publishing the results, and the limitations on publishing detailed data. For example, although national consumer price indices (CPIs) are published on a monthly basis in most countries, national statistical offices do not release the underlying price data.

Crowd sourced price data

The many faces of corruption: The importance of digging deeper

Francesca Recanatini's picture

About a month ago two colleagues (Greg Kisunko and Steve Knack) posted a blog on “The many faces of corruption in the Russian Federation”. Their post, based on an elegant analysis of the 2011/2012 Russian BEEPS, underscores a point that many practitioners and researchers are now beginning to appreciate because of the availability of new, disaggregated data: corruption is not a homogeneous phenomenon, but rather a term that encompasses many diverse phenomena that can have profoundly different impacts on the growth and development of a country. If we delve deeper into this disaggregated data, we observe that significantly different sub-national realities can coexist within the same country when it comes to the phenomenon we label “corruption”.

What can marketing experiments teach us about doing development research?

David McKenzie's picture

The March 2011 issue of the Harvard Business Review has “a step-by-step guide to smart business experiments” by Eric Anderson and Duncan Simester, two marketing professors who have done a number of experiments with large firms in the U.S. Their bottom line message for businesses is:

The 10-Cent GPS

Holly Krambeck's picture

We know that technology is not a panacea, that gadgetry and software are not always the right solutions for our transport problems. But how do we know – really know – when technology is truly the wrong way to go – when, say, using an old-fashioned compass is genuinely better than a GPS?

Thanks to blogger Sebastiao Ferreira, writing for MIT’s CoLab Radio, I have learned about an intriguing phenomenon in Lima, where entrepreneurial data collectors, called dateros, stand with clipboards along frequented informal microbus routes, collecting data on headways, passenger counts, and vehicle occupancy levels. The microbus drivers pay dateros about 10 cents per instant update, and they use the information to adjust their driving speed. For example, if there is a full bus only a minute ahead of the driver’s vehicle, the driver will slow down, hoping to collect more passengers further down the route. In informal transit systems, where drivers’ incomes are directly tied to passenger counts, paying dateros is a good investment (Photo from MIT CoLab Radio).
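To make the driver's decision concrete, here is a minimal sketch in Python of the rule described above. The field names, the thresholds, and the "hold speed" fallback are illustrative assumptions on my part, not anything reported from Lima.

```python
from dataclasses import dataclass


@dataclass
class DateroUpdate:
    """One update a datero gives a passing driver (hypothetical fields)."""
    headway_min: float      # minutes since the previous bus on the route passed
    occupancy_ahead: float  # share of that bus's capacity in use, 0.0 to 1.0


def advise_driver(update, full_threshold=0.9, short_headway_min=2.0):
    """Return 'slow down' or 'hold speed' for a single update.

    Both thresholds are assumed values chosen for illustration only.
    """
    close_behind = update.headway_min <= short_headway_min
    bus_ahead_full = update.occupancy_ahead >= full_threshold
    if close_behind and bus_ahead_full:
        # A full bus just ahead has cleared the stops; slowing down gives
        # new passengers time to gather further along the route.
        return "slow down"
    return "hold speed"


# The example from the text: a full bus only a minute ahead.
print(advise_driver(DateroUpdate(headway_min=1.0, occupancy_ahead=1.0)))
```

A real driver presumably weighs more than two numbers, but the sketch shows why a single cheap update can be worth paying for.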

If you think about it, use of dateros could be more efficient than traditional schedule or GPS-based dispatch, because the headways are dynamically and continuously updated to optimize the number of passengers transported at any given time of day.  According to Jeff Warren (a DIY cartography pioneer), the dateros have been praised as the “natural database, an ‘informal bank’ of transportation optimization data.”

Does this little-known practice call into question our traditional prescription for high-tech solutions to bus dispatch?

“They are sitting on a gold mine and don’t even know it….”

Holly Krambeck's picture

The other day, my colleague Roger Gorham, a transport economist working in Africa, shared with me an interesting story. He was in Lagos, meeting with stakeholders about setting up public-private partnerships for transport initiatives. One meeting revealed that, in an effort to improve service, a private entity had invested in new taxis for Lagos and in each had installed a GPS unit. This little revelation may not seem interesting, but it was very exciting to Roger, who also learned that the company has amassed more than 3 years of GPS tracking data for these taxis (which, incidentally, troll the city like perfect probes, nearly 24 hours a day, 7 days a week) and that this data could be made available to him, if he thought he might make some use of it.

Now, if you are reading this blog, chances are that you realize that with this kind of data and a little analysis, we can quickly and easily reveal powerful insights about a city’s transport network – when and where congestion occurs, average traffic volumes, key traffic generators (from taxi pick-up point data), occurrence of accidents and traffic blockages in real time, and even the estimated effects of congestion and drive cycle on fuel efficiency.

As Roger said, “They are sitting on a gold mine and don’t even know it….”
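
As a rough illustration of the "little analysis" mentioned above, the Python sketch below estimates average taxi speed by hour of day from raw GPS pings, one simple way to see when congestion occurs. The record layout (taxi id, ISO timestamp, latitude, longitude) and the sample pings are assumptions made up for this example; the actual format of the Lagos data is not described here.

```python
import math
from collections import defaultdict
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def mean_speed_by_hour(pings):
    """Map hour of day -> average speed in km/h.

    pings: iterable of (taxi_id, iso_timestamp, lat, lon) tuples
    (an assumed layout). Each leg between consecutive pings of the same
    taxi is credited to the hour in which the leg starts.
    """
    by_taxi = defaultdict(list)
    for taxi_id, ts, lat, lon in pings:
        by_taxi[taxi_id].append((datetime.fromisoformat(ts), lat, lon))

    dist_km = defaultdict(float)
    hours = defaultdict(float)
    for track in by_taxi.values():
        track.sort()  # order each taxi's pings by time
        for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
            dt_h = (t1 - t0).total_seconds() / 3600
            if dt_h <= 0:
                continue  # skip duplicate timestamps
            dist_km[t0.hour] += haversine_km(la0, lo0, la1, lo1)
            hours[t0.hour] += dt_h
    # Total distance over total time, so longer legs weigh more.
    return {h: dist_km[h] / hours[h] for h in hours}


# Made-up pings for one taxi, for illustration only.
sample = [
    ("taxi-1", "2020-01-01T08:00:00", 6.4500, 3.3900),
    ("taxi-1", "2020-01-01T08:30:00", 6.4500, 3.4400),
    ("taxi-1", "2020-01-01T17:00:00", 6.4500, 3.4400),
    ("taxi-1", "2020-01-01T17:30:00", 6.4500, 3.4500),
]
for hour, speed in sorted(mean_speed_by_hour(sample).items()):
    print(f"{hour:02d}:00  {speed:.1f} km/h")
```

Low speeds can also mean a taxi is parked or waiting for a fare, so a real analysis would need to filter idle periods and match pings to road segments before reading the result as congestion.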