evaluation

Will Midterm Evaluations Become the Dinosaurs of Development?

Milica Begovic

I argued a few months back that the information we get from story-telling is fundamentally different to what we get from polls and surveys. If we can’t predict what’s coming next, then we have to work continuously to understand what has happened and what is happening today. (See: Patterns of voices from the Balkans – working with UNDP)

The methods we’re all used to (surveys, mid-term evaluations) are ill-equipped to do that for us and increasingly act as blindfolds.

Why stories?

As I have worked through the stories we collected, this question has become even more pressing.

To give you some background, we started testing whether stories could help us:

So What Do I Take Away from the Great Evidence Debate? Final Thoughts (for now)

Duncan Green

The trouble with hosting a massive argument, as this blog recently did on the results agenda (the most-read debate ever on this blog), is that I then have to make sense of it all, if only for my own peace of mind. So I’ve spent a happy few hours digesting 10 pages of original posts and 20 pages of top-quality comments (I couldn’t face adding the Twitter traffic).

(For those of you that missed the wonk-war, we had an initial critique of the results agenda from Chris Roche and Rosalind Eyben, a take-no-prisoners response from Chris Whitty and Stefan Dercon, then a final salvo from Roche and Eyben + lots of comments and an online poll. Epic.)

On the debate itself, I had a strong sense that it was unhelpfully entrenched throughout – the two sides were largely talking past each other, accusing each other of ‘straw manism’ (with some justification) and lobbing in the odd cheap shot (my favourite, from Chris and Stefan: ‘Please complete the sentence “More biased research is better because…”’ – debaters take note). Commenter Marcus Jenal summed it up perfectly:

Lant Pritchett v the Randomistas on the Nature of Evidence - Is a Wonkwar Brewing?

Duncan Green

Recently I had a lot of conversations about evidence. First, one of the periodic retreats of Oxfam senior managers reviewed our work on livelihoods, humanitarian partnership and gender rights. The talk combined some quantitative work (for example the findings of our new ‘effectiveness reviews’), case studies, and the accumulated wisdom of our big cheeses. But the tacit hierarchy of these different kinds of knowledge worried me – anything with a number attached had a privileged position, however partial the number or questionable the process for arriving at it. In contrast, decades of experience were not even credited as ‘evidence’, but often written off as ‘opinion’. It felt like we were in danger of discounting our richest source of insight – gut feeling.

In this state of discomfort, I went off for lunch with Lant Pritchett (he seems to have forgiven me for my screw-up of a couple of years ago). He’s a brilliant and original thinker and speaker on any number of development issues, but I was most struck by the vehemence of his critique of the RCT randomistas and the quest for experimental certainty. Don’t get me (or him) wrong: he thinks the results agenda is crucial in ‘moving from an input orientation to a performance orientation’ and set out his views as long ago as 2002 in a paper called ‘It pays to be ignorant’, but he sees the current emphasis on RCTs as an example of the failings of ‘thin accountability’ compared to the thick version.

How to Evaluate Bias and the Messages in Photos

Susan Moeller

Can you tell if a news outlet, an NGO or a government is picturing a person, an event or an issue fairly?  It can be very hard to assess visual “balance” when photos are scattered across a website, and appear sporadically over a span of time.  There may be an anecdotal impression that there is bias, but visual bias has been very difficult to document.

The social media site Pinterest is now making documentation possible.

Have you heard of Pinterest?  According to the site itself, it’s “a Virtual Pinboard” that lets users “organize and share all the beautiful things you find on the web. People use pinboards to plan their weddings, decorate their homes, and organize their favorite recipes.”  Doesn’t exactly sound like the kind of site that would help journalists or academics, governments or NGOs, does it?  But Pinterest is turning out to be a stealth tool for researchers.

Getting Evaluation Right: A Five Point Plan

Duncan Green

Final (for now) evaluationtastic installment on Oxfam’s attempts to do public, warts-and-all evaluations of randomly selected projects. This commentary comes from Dr Jyotsna Puri, Deputy Executive Director and Head of Evaluation of the International Initiative for Impact Evaluation (3ie).

Oxfam’s emphasis on quality evaluations is a step in the right direction. Implementing agencies rarely make an impassioned plea for rigor in their evidence collection, and worse, they hardly ever publish negative evaluations. The internal wrangling and the pressure not to publish these must have been intense:

  • ‘What will our donors say? How will we justify poor results to our funders and contributors?’
  • ‘It’s suicidal. Our competitors will flaunt these results and donors will flee.’
  • ‘Why must we put these online and why ‘traffic light’ them? Why not just publish the reports, let people wade through them and take away their own messages?’
  • ‘Our field managers will get upset, angry and discouraged when they read these.’
  • ‘These field managers on the ground are our colleagues. We can’t criticize them publicly… where’s the team spirit?’
  • ‘There are so many nuances on the ground. Detractors will misuse these scores and ignore these ground realities.’

The zeitgeist may indeed be transparency, but few organizations are actually practicing it.

What Do DFID Wonks Think of Oxfam’s Attempt to Measure its Effectiveness?

Duncan Green

More DFIDistas on the blog: this time Nick York, DFID’s top evaluator, and Caroline Hoy, who covers NGO evaluation, comment on Oxfam’s publication of a set of 26 warts-and-all programme effectiveness reviews.

Having seen Karl Hughes’s 3ie working paper on process tracing and talked to the team in Oxfam about evaluation approaches, Caroline Hoy (our lead on evaluation for NGOs) and I have been reading with considerable interest the set of papers that Jennie Richmond has shared with us on ‘Tackling the evaluation challenge – how do we know we are effective?’.

From DFID’s perspective, and now two years into the challenges of ‘embedding evaluation’ in a serious way into our own work, we know how difficult it often is to find reliable methods to identify what works and measure impact for complex development interventions. Although it is relatively well understood how to apply standard techniques in some areas – such as health, social protection, water and sanitation, and microfinance – there are whole swathes of development where we need to be quite innovative and creative in finding approaches to evaluation that can deal with the complexity of the issues and the nature of the programmes. Many of these areas are where NGOs such as Oxfam do their best work.

When We (Rigorously) Measure Effectiveness, What Do We Find? Initial Results from an Oxfam Experiment

Duncan Green

Guest post from ace evaluator Dr Karl Hughes.

Just over a year ago, I wrote a blog post featured on FP2P – Can we demonstrate effectiveness without bankrupting our NGO and/or becoming a randomista? – about Oxfam’s attempt to up its game in understanding and demonstrating its effectiveness. In it, I outlined our ambitious plan of ‘randomly selecting and then evaluating, using relatively rigorous methods by NGO standards, 40-ish mature interventions in various thematic areas’. We have dubbed these ‘effectiveness reviews’. Given that most NGOs are currently grappling with how to credibly demonstrate their effectiveness, our ‘global experiment’ has grabbed the attention of some eminent bloggers (see William Savedoff’s post for a recent example). Now I’m back with an update.
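To make the idea of a random draw concrete, here is a minimal sketch (in Python) of one way such a selection could work: a stratified sample by thematic area with a simple maturity cut-off. The project list, area labels, quota and threshold are invented for illustration; the post does not spell out Oxfam's actual sampling procedure.

    import random

    # Hypothetical sketch only: a stratified random draw of "mature" projects
    # per thematic area, in the spirit of the ~40 effectiveness reviews
    # described above. Names, areas, quota and threshold are invented.
    projects = [
        {"name": "Project A", "area": "livelihoods",   "years_running": 4},
        {"name": "Project B", "area": "livelihoods",   "years_running": 1},
        {"name": "Project C", "area": "humanitarian",  "years_running": 5},
        {"name": "Project D", "area": "gender_rights", "years_running": 3},
        # ...a real portfolio would list hundreds of candidate projects
    ]

    REVIEWS_PER_AREA = 2     # illustrative quota per thematic area
    MATURITY_THRESHOLD = 3   # e.g. only projects running 3+ years are eligible

    random.seed(42)  # fixed seed so the draw can be re-run and audited

    selected = []
    for area in sorted({p["area"] for p in projects}):
        eligible = [p for p in projects
                    if p["area"] == area and p["years_running"] >= MATURITY_THRESHOLD]
        # draw up to the quota; take everything if fewer projects are eligible
        selected += random.sample(eligible, min(REVIEWS_PER_AREA, len(eligible)))

    for p in selected:
        print(p["area"], "->", p["name"])

Fixing the random seed (and sorting the areas before drawing) keeps the selection reproducible, which matters when you are committing to publish warts-and-all results from whichever projects the lottery picks.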

ICT and rural education in China

Michael Trucano

Last year on this blog, I asked a few questions (eLearning, Africa and ... China?) as a result of my participation in a related event in Dar es Salaam where lots of my African colleagues were ‘talking about China’, but where few Chinese (researchers, practitioners, firms, officials) were present. This year's eLearning Africa event in Benin, in contrast, featured for the first time a delegation of researchers from China, a visit organized by the International Research and Training Centre for Rural Education (INRULED), a UNESCO research center headquartered at Beijing Normal University (with additional outposts in Baoding, Nanjing and Gansu). Hopefully this is just the beginning of a positive trend of opening up access to knowledge about what is working (and what isn't) in ICT use in education in rural China, in places that may more closely resemble the situations and contexts of many developing countries than experiences drawn from, for example, Boston or Singapore (or from Shanghai and Beijing, for that matter). Establishing working-level linkages between researchers and practitioners (and affiliated institutions) in China and Africa can be vital to helping encourage such knowledge exchanges.

Let them eat laptops?*

Michael Trucano

As a result of reading the recent IDB study on the impact of the One Laptop Per Child project in Peru, my World Bank colleague Berk Ozler recently published a great post on the World Bank's Development Impact blog asking "One Laptop Per Child is not improving reading or math. But, are we learning enough from these evaluations?"

Drawing insights from his reading of a few evaluations of technology use (one in Nepal [PDF] and one in Romania), he notes that, at a quick glance, some large-scale implementations of educational technologies are, for lack of a more technical term, rather a 'mess':

"The reason I call this a mess is because I am not sure (a) how the governments (and the organizations that help them) purchased a whole lot of these laptops to begin with and (b) why their evaluations have not been designed differently – to learn as much as we can from them on the potential of particular technologies in building human capital."

Three members of the IDB team that led the OLPC Peru evaluation have responded ("One Laptop per Child revisited"), in part to question (b), in the portion of Berk's informative and engaging post excerpted above. I thought I'd try to help address question (a).

First let me say: I have no firsthand knowledge of the background to the OLPC Peru project specifically, nor of the motivations of various key actors instrumental in helping to decide to implement the program there as it was implemented, beyond what I have read about it online. (There is quite a lot written about this on the web; I won't attempt to summarize the many vibrant commentaries on this subject, but, for those who speak Spanish or who are handy with online translation tools, some time with your favorite search engine should unearth some related facts and a lot of opinions -- which I don't feel well-placed to evaluate in their specifics.) I have never worked in Peru, and have had only informal contact with some of the key people working on the project there.  The World Bank, while maintaining a regular dialogue with the Ministry of Education in Peru, was not to my knowledge involved in the OLPC project there in any substantive way. The World Bank itself is helping to evaluate a small OLPC pilot in Sri Lanka; a draft set of findings from that research is currently circulating and hopefully it will be released in the not too distant future.

That said, I *have* been involved in various capacities with *lots* of other large-scale initiatives in other countries where lots of computers were purchased for use in schools and/or by students and/or teachers, and so I do feel I can offer some general comments based on this experience, in case it might be of interest to anyone.

Ten things about computer use in schools that you don't want to hear (but I'll say them anyway)

Michael Trucano

At an event last year in Uruguay for policymakers from around the world, a few experts who have worked in the field of technology use in education for a long time commented that there was, in their opinion and in contrast to their experiences even a few years ago, a surprising amount of consensus among the people gathered together on what was really important, what wasn't, and on ways to proceed (and not to proceed). Over the past two years, I have increasingly made the same comment to myself when involved in similar discussions in other parts of the world. At one level, this has been a welcome development. People who work with the use of ICTs in education tend to be a highly connected bunch, and the diffusion of better (cheaper, faster) connectivity has helped to ensure that 'good practices and ideas' are shared with greater velocity than perhaps ever before. Even some groups and people associated with the 'give kids computers, expect magic to happen' philosophy appear to have had some of their more extreme views tempered in recent years by the reality of actually trying to put this philosophy into practice.

That said, the fact that "everyone agrees about most everything" isn't always such a good thing.  Divergent opinions and voices are important, if only to help us reconsider why we believe what we believe. (They are also important because they might actually be right, of course, and all of the rest of us wrong, but that's another matter!) Even where there is an emerging consensus among leading thinkers and practitioners about what is critically important, this doesn't mean that what is actually being done reflects this consensus -- or indeed, that this consensus 'expert' opinion is relevant in all contexts.

