You Can’t Manage What You Don’t Measure

When it comes to measuring student learning outcomes, you often hear the critics’ refrain, “you can’t fatten a cow by weighing him all the time,” meaning that you cannot truly educate students by spending all your time preparing for tests and recording test scores. Of course not. But as the management guru Peter Drucker famously said, “If you can’t measure it, you can’t manage it.”

If you don’t measure, then how do you know how you are doing? How do you know if you are doing well? Or poorly? Without adequate information about learning outcomes, students, families, and teachers cannot properly decide what actions to take to improve them. And improving cognitive skills is important for economic development.

In my experience visiting schools over the years, I have spoken to teachers who use assessment results to gauge their teaching and decide on allocating inputs. I have also been to schools where teachers have no test scores. How are they expected to make choices about what they do in the classroom without any information?

I suppose the criticism of measurement is really directed at high-stakes testing and over-reliance on assessments.

But high-stakes testing is one way of using information to improve outcomes. There is some evidence that in some states in the United States, such as Florida and New York, school actors respond to accountability systems. Another high-stakes example has come to be known as “naming and shaming” school leaders. This approach has been used in Great Britain, where a policy change led to a unique natural experiment in England and Wales. Prior to devolution in 1999, the governance of schools (and hospitals) in England and Wales was similar. After devolution, the funding and organization continued to be similar, but the two governments adopted different policies in pursuit of common objectives. A study of these two “natural experiments” compared outcomes in the two countries before and after the policy changes. The governance model of “trust and altruism” resulted in worse reported performance in Wales than in England on what were each government’s key objectives. “Naming and shaming” school leaders worked in England, where examination performance improved relative to Wales.

In school systems where parents choose schools, information is vital for the decision. But it can also be used as an accountability measure to prod providers into improving outcomes. In the Netherlands, school quality scores not only improve school choice, they also lead to school improvement. Both average grades and the number of diplomas awarded increase after a school receives a negative score, and these responses cannot be attributed to gaming by the school. For schools that receive the most negative ranking, the short-term effect of quality transparency on final exam grades (one year after a change in the ranking of schools) is an increase of 10 to 30 percent of a standard deviation.

But what about simply providing information on learning outcomes? Would that be enough to improve what is going on?

In Jordan, the use of information from international student assessments helped reform the education system. It wasn’t that there was no interest in learning. It was a case of lack of information. Over the past two decades, the Jordanian education system has made significant advances. Net enrollment in basic education increased from 89 percent in 2000 to 97 percent in 2012. Transition rates to secondary education increased from 63 to 79 percent during the same period. At the same time, Jordan made significant gains on international surveys of student achievement, with a particularly impressive gain of almost 30 points on the science portion of the Third International Mathematics and Science Study (TIMSS). Benchmarking its education system and maintaining constant feedback between researchers and policymakers contributed to this achievement.

Jordan was the first Arab country to participate in an international student assessment. This took place at the same time that the country launched its comprehensive system reform. The assessment results were alarming, as performance was extremely poor. In response, Jordan sped up its efforts to reform the education system. The curriculum was targeted and reviewed, and new textbooks were developed. Teacher qualifications were reviewed and evaluated, and teacher upgrading through a university bridging program was implemented. Benchmarks for 13-year-olds’ achievement were established. Jordanian authorities developed a feedback loop between those researching the education system and those implementing change, all the way through to teachers. Teachers were supported with guides and feedback. In fact, teacher confidence was one of the factors associated with improvements in learning outcomes.

But even just information, even low-stakes testing, can lead to improvements. This was the case in Mexico prior to the introduction of national and universal student assessments. Holding everything else constant, states with tests and accountability systems performed significantly better than states without tests. Furthermore, such simple accountability measures are demonstrably cost-effective ways of improving outcomes. Even in Finland, where there is no high-stakes testing until the end of secondary school, assessments are used to improve learning and they are “encouraging and supportive by nature.”

This conforms to international evidence. Differences in educational institutions explain the large international differences in student performance in cognitive achievement tests.

Moreover, test-based accountability, whether high stakes, low stakes, or simply information, is cost-effective. “Even if accountability costs were 10 times as large as they are, they would still not amount to 1 percent of the cost of public education!” argues Caroline Hoxby in an influential paper. According to the Association of American Publishers, total revenues from the sales of tests, related teaching materials, and services amounted to $234 million in 2000. Hoxby calculates that the revenues amount to less than $5 a student. In relation to the overall average cost of educating a child, payments to all test makers represented just 0.07 percent (seven-hundredths of 1 percent) of the cost of basic education in the United States.
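For readers who want to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. It uses only the three figures quoted above ($234 million, less than $5 a student, 0.07 percent). The variable names are mine, and the student count and per-student spending are merely what those rounded figures imply, not numbers taken from Hoxby's paper. It also assumes, as the passage does, that payments to test makers stand in for accountability costs.

```python
# Figures quoted in the text (Hoxby; Association of American Publishers).
test_revenue_usd = 234_000_000   # total test-related sales in 2000
per_student_cap_usd = 5.0        # "less than $5 a student"
share_of_cost_pct = 0.07         # percent of the cost of educating a child

# If revenue per student is below $5, the number of students covered
# must exceed revenue / 5 (an implied lower bound, not a quoted figure).
implied_min_students = test_revenue_usd / per_student_cap_usd

# Hoxby's "10 times as large" scenario: the share scales linearly.
tenfold_share_pct = share_of_cost_pct * 10

# Implied ceiling on average spending per student, given that
# under $5 corresponds to roughly 0.07 percent of that spending.
implied_max_spending_usd = per_student_cap_usd / (share_of_cost_pct / 100)

print(f"Implied students covered: more than {implied_min_students:,.0f}")
print(f"Share if costs were 10x: {tenfold_share_pct:.2f} percent")
print(f"Implied spending per student: up to about ${implied_max_spending_usd:,.0f}")
```

The point of the exercise is simply that the quoted numbers hang together: ten times 0.07 percent is still under 1 percent, which is exactly Hoxby's claim.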

Globally, it has been shown that testing is among the least expensive innovations in education reform. In fact, in no country does testing cost more than 0.3 percent of the national education budget at the basic education level.

But measure what is important. That is, tests should inform teachers about how their students are progressing and this feedback should be timely and useful. In other words, avoid teaching to the “bad” test. Policymakers have a role to play, too, as Hoxby points out, as they should “encourage teaching the curriculum, but they should discourage teaching the test.”

So, test and be tested; manage what you measure; and let’s improve learning outcomes for all children.

Harry A. Patrinos

Senior Adviser, Education

