Collecting data in education can be a tricky business. After spending considerable resources to design a representative study, enlist and train data collectors, and organize the logistics of data collection, we want to ensure that we capture as true a picture of the situation on the ground as possible. This can be particularly challenging when we attempt to measure complex concepts, such as child development, learning outcomes, or the quality of an educational environment.
Data can be biased by many factors. For example, the very act of observation by itself can influence behavior. How can we expect a teacher to behave “normally” when outsiders sit in her or his classroom taking detailed notes about everything she or he does? Social desirability bias, where subjects seek to represent themselves in the most positive light, is another common challenge. Asking a teacher, “Do you hit children in your classroom?” may elicit an intense denial, even if the teacher still has a cane in one hand and the ear of a misbehaving child in the other.
Assessments make a lot of people nervous, and I’m not just talking about the students who have to take them. As a psychometrician (assessment expert) and World Bank staffer, I’ve worked on assessment projects in more than 30 countries around the world over the past 10 years. Time and again, I’ve found great interest in student assessment as a tool for monitoring and supporting student learning coupled with great unease over how exactly to go about ‘doing’ an assessment.
Testing: For better and/or for worse, many education systems are built around it (although some perhaps not as much as others). In numerous countries, a long-standing joke asks whether 'MOE' stands for 'Ministry of Education' or 'Ministry of Examination'. (This joke is not meant to be funny, of course.)
'Testing' is a source of and trigger for controversies of all sorts, in different places around the world. The word 'standardized' is considered negative by many people when it is used to modify 'testing', but it is perhaps worth noting that a lack of standardized tests can have important implications for equity. Within the U.S., the Obama administration recently declared that students should spend no more than 2% of classroom instruction time taking tests. One can question the wisdom of setting such hard targets: it is arguably not 'testing' per se that is the problem, to the extent that it is indeed a problem, but rather 'bad testing'. Various types of time-accounting shenanigans might also predictably emerge so that the letter but not the spirit of such declarations is met (a school could be creative about what it feels constitutes a 'test' or 'instruction time', for example). However one feels about all of this, there is no denying the centrality of testing to approaches to education in schools around the world.
'Testing' means different things to different people. There are important distinctions between assessments that are formative (i.e. low-stakes means to provide feedback to teachers and students on how much students are learning, as a way to identify strengths and weaknesses and act accordingly) and those that are summative (e.g. high-stakes final exams).
It's also potentially worth noting that tests can be utilized not only as means of assessment, but explicitly as tools to help learning as well (an approach sometimes called 'studying by testing'; here's an interesting related paper: When Does Testing Enhance Retention? A Distribution-Based Interpretation of Retrieval as a Memory Modifier [pdf]).
The point here is not to get into a debate about testing, as illuminating and energetic (or frustrating and political) as such a debate might be. Rather, it is to shine a light on some related things happening at the frontier of activities and experiences in this area that are comparatively little known in most of the world but which may be increasingly relevant to many education systems in the coming years.
The nature of tests and testing is changing, enabled in large part by new technologies. (Side note: One way to predict where there are going to be large upcoming public sector procurement activities to provide computing equipment and connectivity to schools is to identify places where big reforms around standardized testing are underway.) While there continues to be growing interest (and hype, and discussion, and confusion) surrounding the potential for technology to enable more 'personalized learning', less remarked on in many quarters is the potential rise in more personalized testing.
The science fiction author William Gibson has famously observed that "the future is already here, it's just not evenly distributed." When it comes to educational technology use around the world, there are lots of interesting 'innovations at the edges' that are happening far away from the spots where one might reflexively look (like Seoul, Silicon Valley or Shanghai, to cite just a few examples that begin with the letter 'S') to learn practical lessons about what might be coming next, and how this may come to pass.
When it comes to testing, one such place is ... Georgia. This is not the Georgia in the southern United States (capital = Atlanta, where people play football while wearing helmets), but rather the small, mountainous country bordering the Black Sea that emerged from the breakup of the Soviet Union (capital = Tbilisi, where people play football with their feet).
Georgia is the first country in the world to utilize computer adaptive testing for all of its school leaving examinations.
What does this mean, what does it look like in practice, what is there to learn from this experience, and why should we care?
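To make the idea of computer adaptive testing concrete: rather than giving every student the same fixed set of questions, an adaptive test re-estimates the test-taker's ability after each response and picks the next item accordingly. Below is a toy sketch of this loop under the simple Rasch (1PL) item response model. This is purely illustrative: the function names, the item bank, and the grid-based ability estimate are my own assumptions, not a description of Georgia's actual examination system.

```python
import math
import random

def p_correct(theta, b):
    """Rasch (1PL) model: probability of a correct answer for a
    test-taker of ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, item_bank, used):
    """Under the 1PL model, an item is most informative when its
    difficulty is closest to the current ability estimate, so we
    pick the unadministered item nearest to theta."""
    candidates = [i for i in range(len(item_bank)) if i not in used]
    return min(candidates, key=lambda i: abs(item_bank[i] - theta))

def estimate_theta(responses):
    """Crude maximum-likelihood ability estimate over a coarse grid
    from -4.0 to 4.0. responses: list of (difficulty, correct) pairs."""
    grid = [g / 10.0 for g in range(-40, 41)]
    def log_lik(theta):
        ll = 0.0
        for b, correct in responses:
            p = p_correct(theta, b)
            ll += math.log(p if correct else 1.0 - p)
        return ll
    return max(grid, key=log_lik)

def run_cat(item_bank, true_theta, n_items=10, seed=0):
    """Administer n_items adaptively to a simulated test-taker and
    return the final ability estimate."""
    rng = random.Random(seed)
    theta, used, responses = 0.0, set(), []
    for _ in range(n_items):
        i = next_item(theta, item_bank, used)
        used.add(i)
        # Simulate the response: correct with model probability.
        correct = rng.random() < p_correct(true_theta, item_bank[i])
        responses.append((item_bank[i], correct))
        theta = estimate_theta(responses)
    return theta
```

The key consequence for test-takers is that each student can see a different sequence of questions of different difficulty, yet scores remain comparable because they are reported on the common ability scale rather than as a raw count of correct answers. Real systems add item exposure controls, content balancing, and more sophisticated estimation, but the adaptive loop above is the core idea.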
Are assessments and standardized tests critical to measuring the effectiveness of educational systems? How can communities demand accountability from local schools? Suvojit Chattopadhyay argues that assessments can serve as a lever for improving education.
In our last two blogs, we spoke about why measurement is key for development professionals and what we should measure, and about some take-aways from the medical profession on the measurement of competencies and performance. In this blog, we discuss specific ways we can use those lessons and apply them to the development sector.
As we discussed, in the medical world, lessons learned in competency and performance measurement relate to:
The focus on competencies, performance, and the space in between
Competence being specific to situations and existing on a continuum
Assessment as a program of activity that uses multi-source qualitative and quantitative information
The importance of the reproducibility of assessments
Encouraging the use of a portfolio.
But how can the above be specifically applied to development? Development practitioners can certainly take a page from the medical profession, as the stakes for getting measurement right are no less than bettering the lives of those who live on less than a dollar a day.
Over the past several decades, developing countries have made remarkable progress in achieving quantitative education targets. Since the turn of the millennium, almost 50 million children around the world have gained access to basic education – and most are reaching completion. But as recent PISA data shows, this is not typically the case for qualitative improvements in education. A persistent learning gap remains for an estimated 250 million children who are unable to read and do math, even after spending three or more years in the classroom.
The headlines started to stream as soon as the PISA results were in: “Asian countries top OECD's latest PISA survey.” “Poor academic standards.” “Students score below international averages.” It depends on the country, of course. A time to celebrate for some, a time to lament for others.
I recently came across a report card from my secondary school days in Ireland. It was an interesting read. My progress in areas as diverse as mathematics, singing, Irish language, and physical education was reported on in the form of marks, grades, and narrative feedback. Some teachers provided little information on my learning. Others went into detail. I was impressed by the number of areas in which my progress had been assessed (less so by my lack of singing ability, which, evidently, had been spotted early on!).
Flash forward to 2013, and there is a conversation raging in the development community about how to measure and report on learning globally. A huge concern is the fact that too often children leave school without acquiring the basic knowledge and skills they need to lead productive lives. To make matters worse, there is a global data gap on learning that is impeding efforts to better understand this crisis and how to achieve learning for all.
Most educational interventions are widely considered successful if they increase test scores, which indicate cognitive ability. Presumably, this is because higher test scores in school imply gains such as higher wages later on.
However, non-cognitive outcomes also matter, a lot.
All eyes are focused on South Africa this year: it both hosts the World Cup and marks 20 years since the end of apartheid, when Nelson Mandela walked those historic steps to freedom. In post-apartheid South Africa, education promised to hold part of the answer to creating a fairer society: development through education would lead to freedom. The burning question remains: has this been achieved?
In a 2007 World Bank publication, Shafika Isaacs summarized the desired changes South Africa hoped to undertake: