World Bank Blogs

evaluation

So What do I take Away from The Great Evidence Debate? Final Thoughts (for now)

Duncan Green's picture

The trouble with hosting a massive argument, as this blog recently did on the results agenda (the most-read debate ever on this blog), is that I then have to make sense of it all, if only for my own peace of mind. So I’ve spent a happy few hours digesting 10 pages of original posts and 20 pages of top quality comments (I couldn’t face adding the Twitter traffic).

(For those of you that missed the wonk-war, we had an initial critique of the results agenda from Chris Roche and Rosalind Eyben, a take-no-prisoners response from Chris Whitty and Stefan Dercon, then a final salvo from Roche and Eyben + lots of comments and an online poll. Epic.)

On the debate itself, I had a strong sense that it was unhelpfully entrenched throughout – the two sides were largely talking past each other, accusing each other of ‘straw manism’ (with some justification) and lobbing in the odd cheap shot (my favourite, from Chris and Stefan: ‘Please complete the sentence “More biased research is better because…”’ – debaters take note). Commenter Marcus Jenal summed it up perfectly:

Numbers Are Never Enough (especially when dealing with Big Data)

Susan Moeller's picture

The newest trend in Big Data is the personal touch.  When both the New York Times and Fast Company have headlines that trumpet: “Sure, Big Data Is Great. But So Is Intuition.” (The Times) and “Without Human Insight, Big Data Is Just A Bunch Of Numbers.” (Fast Company) you know that a major trend is afoot.

So what’s up?

The claims for what Big Data can do have been extraordinary, witness Andrew McAfee and Erik Brynjolfsson’s seminal article in October in the Harvard Business Review: “Big Data: The Management Revolution,” which began with the showstopper:  “‘You can’t manage what you don’t measure.’”  It’s hard not to feel that Big Data will provide the solutions to everything after that statement.  As the HBR article noted:  “…the recent explosion of digital data is so important. Simply put, because of big data, managers can measure, and hence know, radically more about their businesses, and directly translate that knowledge into improved decision making and performance.”

Lant Pritchett v the Randomistas on the Nature of Evidence - Is a Wonkwar Brewing?

Duncan Green's picture

Recently I had a lot of conversations about evidence. First, one of the periodic retreats of Oxfam senior managers reviewed our work on livelihoods, humanitarian partnership and gender rights. The talk combined some quantitative work (for example the findings of our new ‘effectiveness reviews’), case studies, and the accumulated wisdom of our big cheeses. But the tacit hierarchy of these different kinds of knowledge worried me – anything with a number attached had a privileged position, however partial the number or questionable the process for arriving at it. In contrast, decades of experience were not even credited as ‘evidence’, but often written off as ‘opinion’. It felt like we were in danger of discounting our richest source of insight – gut feeling.

In this state of discomfort, I went off for lunch with Lant Pritchett (right – he seems to have forgiven me for my screw-up of a couple of years ago). He’s a brilliant and original thinker and speaker on any number of development issues, but I was most struck by the vehemence of his critique of the RCT randomistas and the quest for experimental certainty. Don’t get me (or him) wrong, he thinks the results agenda is crucial in ‘moving from an input orientation to a performance orientation’ and set out his views as long ago as 2002 in a paper called ‘It pays to be ignorant’, but he sees the current emphasis on RCTs as an example of the failings of ‘thin accountability’ compared to the thick version.

How to Evaluate Bias and the Messages in Photos

Susan Moeller's picture

Can you tell if a news outlet, an NGO or a government is picturing a person, an event or an issue fairly?  It can be very hard to assess visual “balance” when photos are scattered across a website, and appear sporadically over a span of time.  There may be an anecdotal impression that there is bias, but visual bias has been very difficult to document.

The social media site Pinterest is now making documentation possible.

Have you heard of Pinterest?  According to the site itself, it’s “a Virtual Pinboard” that lets users “organize and share all the beautiful things you find on the web. People use pinboards to plan their weddings, decorate their homes, and organize their favorite recipes.”  Doesn’t exactly sound like the kind of site that would help journalists or academics, governments or NGOs, does it?  But Pinterest is turning out to be a stealth tool for researchers.

Getting Evaluation Right: A Five Point Plan

Duncan Green's picture

Final (for now) evaluationtastic installment on Oxfam’s attempts to do public warts-and-all evaluations of randomly selected projects. This commentary comes from Dr Jyotsna Puri, Deputy Executive Director and Head of Evaluation of the International Initiative for Impact Evaluation (3ie).

Oxfam’s emphasis on quality evaluations is a step in the right direction. Implementing agencies rarely make an impassioned plea for evidence and rigor in their evidence collection, and worse, they hardly ever publish negative evaluations.  The internal wrangling and pressure to not publish these must have been so high:

  • ‘What will our donors say? How will we justify poor results to our funders and contributors?’
  • ‘It’s suicidal. Our competitors will flaunt these results and donors will flee.’
  • ‘Why must we put these online and why ‘traffic light’ them? Why not just publish the reports, let people wade through them and take away their own messages?’
  • ‘Our field managers will get upset, angry and discouraged when they read these.’
  • ‘These field managers on the ground are our colleagues. We can’t criticize them publicly… where’s the team spirit?’
  • ‘There are so many nuances on the ground. Detractors will mis-use these scores and ignore these ground realities.’

The zeitgeist may indeed be transparency, but few organizations are actually doing it.

What Do DFID Wonks Think of Oxfam’s Attempt to Measure its Effectiveness?

Duncan Green's picture

More DFIDistas on the blog: this time Nick York, DFID’s top evaluator and Caroline Hoy, who covers NGO evaluation, comment on Oxfam’s publication of a set of 26 warts-and-all programme effectiveness reviews.

Having seen Karl Hughes’s 3ie working paper on process tracing and talked to the team in Oxfam about evaluation approaches, Caroline Hoy (our lead on evaluation for NGOs) and I have been reading with considerable interest the set of papers that Jennie Richmond has shared with us on ‘Tackling the evaluation challenge – how do we know we are effective?’.

From DFID’s perspective, and now 2 years into the challenges of ‘embedding evaluation’ in a serious way into our own work, we know how difficult it often is to find reliable methods to identify what works and measure impact for complex development interventions.  Although it is relatively well understood how to apply standard techniques in some areas – such as health, social protection, water and sanitation and microfinance – there are whole swathes of development where we need to be quite innovative and creative in finding approaches to evaluation that can deal with the complexity of the issues and the nature of the programmes.  Many of these areas are where NGOs such as Oxfam do their best work.

When We (Rigorously) Measure Effectiveness, What Do We Find? Initial Results from an Oxfam Experiment

Duncan Green's picture

Guest post from ace evaluator Dr Karl Hughes (right, in the field. Literally.)

Just over a year ago now, I wrote a blog featured on FP2P – Can we demonstrate effectiveness without bankrupting our NGO and/or becoming a randomista? – about Oxfam’s attempt to up its game in understanding and demonstrating its effectiveness.  Here, I outlined our ambitious plan of ‘randomly selecting and then evaluating, using relatively rigorous methods by NGO standards, 40-ish mature interventions in various thematic areas’.  We have dubbed these ‘effectiveness reviews’.  Given that most NGOs are currently grappling with how to credibly demonstrate their effectiveness, our ‘global experiment’ has grabbed the attention of some eminent bloggers (see William Savedoff’s post for a recent example).  Now I’m back with an update.

ICT and rural education in China

Michael Trucano's picture

Last year on this blog, I asked a few questions (eLearning, Africa and ... China?) as a result of my participation in a related event in Dar es Salaam where lots of my African colleagues were ‘talking about China’, but where few Chinese (researchers, practitioners, firms, officials) were present. This year's eLearning Africa event in Benin, in contrast, featured for the first time a delegation of researchers from China, a visit organized by the International Research and Training Centre for Rural Education (INRULED), a UNESCO research center headquartered at Beijing Normal University (with additional outposts at Baoding, Nanjing and Gansu). Hopefully this is just the beginning of a positive trend to open up access to knowledge about what is working (and isn’t working) related to ICT use in education in places in rural China that might more resemble certain situations and contexts in many developing countries than those drawn from experiences in, for example, Boston or Singapore (or from Shanghai and Beijing, for that matter). Establishing working level linkages between researchers and practitioners (and affiliated institutions) in China and Africa can be vital to helping encourage such knowledge exchanges.

Let them eat laptops?*

Michael Trucano's picture

As a result of reading the recent IDB study on the impact of the One Laptop Per Child project in Peru, my World Bank colleague Berk Ozler recently published a great post on the World Bank's Development Impact blog asking "One Laptop Per Child is not improving reading or math. But, are we learning enough from these evaluations?"

Drawing insights from his readings of a few evaluations of technology use (one in Nepal [PDF] and one in Romania) he notes that, at quick glance, some large scale implementations of educational technologies are, for lack of a more technical term, rather a 'mess':

"The reason I call this a mess is because I am not sure (a) how the governments (and the organizations that help them) purchased a whole lot of these laptops to begin with and (b) why their evaluations have not been designed differently – to learn as much as we can from them on the potential of particular technologies in building human capital."

Three members of the team at IDB that led the OLPC Peru evaluation have responded ("One Laptop per Child revisited") in part to question (b) in the portion of Berk's informative and engaging post excerpted above.  I thought I'd try to help address question (a).

First let me say: I have no firsthand knowledge of the background to the OLPC Peru project specifically, nor of the motivations of various key actors instrumental in helping to decide to implement the program there as it was implemented, beyond what I have read about it online. (There is quite a lot written about this on the web; I won't attempt to summarize the many vibrant commentaries on this subject, but, for those who speak Spanish or who are handy with online translation tools, some time with your favorite search engine should unearth some related facts and a lot of opinions -- which I don't feel well-placed to evaluate in their specifics.) I have never worked in Peru, and have had only informal contact with some of the key people working on the project there.  The World Bank, while maintaining a regular dialogue with the Ministry of Education in Peru, was not to my knowledge involved in the OLPC project there in any substantive way. The World Bank itself is helping to evaluate a small OLPC pilot in Sri Lanka; a draft set of findings from that research is currently circulating and hopefully it will be released in the not too distant future.

That said, I *have* been involved in various capacities with *lots* of other large scale initiatives in other countries where lots of computers were purchased for use in schools and/or by students and/or teachers, and so I do feel I can offer some general comments based on this experience, in case it might be of interest to anyone.

Ten things about computer use in schools that you don't want to hear (but I'll say them anyway)

Michael Trucano's picture

At an event last year in Uruguay for policymakers from around the world, a few experts who have worked in the field of technology use in education for a long time commented that there was, in their opinion and in contrast to their experiences even a few years ago, a surprising amount of consensus among the people gathered together on what was really important, what wasn't, and on ways to proceed (and not to proceed).  Over the past two years, I have increasingly made the same comment to myself when involved in similar discussions in other parts of the world.  At one level, this has been a welcome development.  People who work with the use of ICTs in education tend to be a highly connected bunch, and the diffusion of better (cheaper, faster) connectivity has helped to ensure that 'good practices and ideas' are shared with greater velocity than perhaps ever before.  Even some groups and people associated with the 'give kids computers, expect magic to happen' philosophy appear to have had some of their more extreme views tempered in recent years by the reality of actually trying to put this philosophy into practice.

That said, the fact that "everyone agrees about most everything" isn't always such a good thing.  Divergent opinions and voices are important, if only to help us reconsider why we believe what we believe. (They are also important because they might actually be right, of course, and all of the rest of us wrong, but that's another matter!) Even where there is an emerging consensus among leading thinkers and practitioners about what is critically important, this doesn't mean that what is actually being done reflects this consensus -- or indeed, that this consensus 'expert' opinion is relevant in all contexts.

Call for Applications: 2012 Summer Institute in Communication and Governance Reform

Johanna Martinsson's picture

The World Bank’s External Affairs Operational Communications Department, the World Bank Institute’s Governance Practice, the Annenberg School for Communication at the University of Pennsylvania, and the Annenberg School for Communication and Journalism at the University of Southern California are currently accepting applications for the 2012 Summer Institute in Communication and Governance Reform, to be held from June 16 to 27, 2012, at the University of Southern California in Los Angeles.

The 12-day course will equip participants with knowledge about the most recent advances in communication and proven techniques in reform implementation. Participants will develop core competencies essential to bringing about real change, leading to development results in a wide range of sectors.  The course seeks to impart critical skills in the following key areas:

Evaluating One Laptop Per Child (OLPC) in Peru

Michael Trucano's picture

Few would argue against the notion that the One Laptop Per Child project (OLPC, originally referred to by many as the '$100 laptop project') has been the most high profile educational technology initiative for developing countries over the past half-decade or so. It has garnered more media attention, and incited more passions (pro and con), than any other program of its kind. What was 'new' when OLPC was announced back in 2005 has become part of mainstream discussions in many places today (although it is perhaps interesting to note that, to some extent, the media attention around the Khan Academy is crowding into the space in the popular consciousness that OLPC used to occupy), and debates around its model have animated policymakers, educators, academics, and the general public in a way that perhaps no other educational technology initiative has ever done. Given that there is no shortage of places to find information and debate about OLPC, this blog has discussed it only a few times, usually in the context of talking about Plan Ceibal in Uruguay, where the small green and white OLPC XO laptops are potent symbols of the ambitious program that has made that small South American country a destination for many around the world seeking insight into how to roll out so-called 1-to-1 computing initiatives in schools very quickly, and to see what the results of such ambition might be.

The largest OLPC program to date, however, has not been in Uruguay, but rather in Peru, and many OLPC supporters have argued that the true test of the OLPC approach is perhaps best studied there, given its greater fealty to the underlying pedagogical philosophies at the heart of OLPC and its focus on rural, less advantaged communities. Close to a million laptops are meant to have been distributed there to students to date (902,000 is the commonly reported figure, although I am not sure if this includes the tens of thousands of laptops that were destroyed in the recent fire at a Ministry of Education warehouse). What do we know about the impact of this ambitious program?

Surveying ICT use in education in Brazil

Michael Trucano's picture

An on-going series in the New York Times ('Grading the digital school') is exploring the impact of educational technology programs in U.S. schools. One recent article in this series noted that "Hope and enthusiasm are soaring here. But not test scores." This phenomenon is not limited to schools in rich countries like the United States, of course:

"Although the government has invested resources in ensuring the broad use of ICT in education, the results of this use in meeting the goals and targets of educational programs are, however, virtually unknown."

This statement, which could apply to scores of countries around the world, can be found near the very start of TIC Educação 2010 ("ICT Education 2010"), a fascinating new survey on the use of ICTs in Brazilian schools.

The perfectionists versus the reductionists

Markus Goldstein's picture

coauthored with Jishnu Das

Women perform 66 percent of the world’s work, and produce 50 percent of the food, yet earn only 10 percent of the income…. 

--Former President Bill Clinton addressing the annual meeting of the Clinton Global Initiative (September 2009)

Impressive, heart-wrenching, charity-inducing, get off your sofa and go do something heartbreaking.

But Wrong.

More on Indices: Evaluating the Evaluators

Shanthi Kalathil's picture

Building partly on a previous post on the value of indices, I'm highlighting this week a new edited volume published by Peter Lang Press, entitled Measures of Press Freedom and Media Contributions to Development: Evaluating the Evaluators. This rich and informative collection of essays, edited by Monroe Price, Susan Abbott and Libby Morgan, focuses a spotlight on well known indices in the area of press freedom and media independence, raising valuable questions about what the indices are measuring, what they are not measuring, and the linkage between assistance to independent media and democratization. I've contributed a chapter to this volume, as have expert colleagues such as Guobin Yang, Andrew Puddephatt, Lee Becker and Tudor Vlad, Craig LaMay, fellow CommGAP blogger Silvio Waisbord, and many others.

