In recent weeks, several articles have appeared in major U.S. newspapers – including the Washington Post and the New York Times – discussing the potential benefits and pitfalls of the Los Angeles Times’ decision to publish performance data on individual teachers. Together with an economist, LA Times reporters used long-existing data on student test scores, by teacher, over time to estimate individual teachers’ “value-added”: the change in a student’s test score during the year that the student had a specific teacher, with that change attributed to the teacher’s effectiveness. They found enormous variation in the score changes of particular teachers’ students, and published the names of some teachers – both the “best” and the “worst”. Further, the paper announced that it will soon release approximate rankings of all individual teachers in LA.
Will public accountability for individual teacher performance help improve education quality in Los Angeles? Is this something that other education systems around the world – many of them struggling to raise teaching quality and student learning outcomes – should consider?
Public Accountability of Individual Teacher Performance – The Case
Unfortunately, we do not yet have an empirical evidence base to answer these questions. But there are some legitimate reasons, and a growing body of research, suggesting that public accountability of individual teacher performance may not be conducive to improving teaching and learning.
First, there are concerns that student test scores may not be a fair measure with which to judge teacher effectiveness. Importantly, research evidence suggests that changes in a student’s test scores during one school year may not accurately reflect the effectiveness of that student’s teacher. While what a teacher knows and is able to do in the classroom undoubtedly has a strong impact on how much her students learn in a year, student learning also results from other factors, including: the student’s previous school experience (did she have a strong or weak teacher in the previous grade, or strong or weak teachers in all her previous years in school?); the support she receives at home (do her parents encourage school work and support her by reading with her, helping with homework, and so on?); and the other students in the same classroom (are they high-achieving, or do they have discipline issues?).
Moreover, even if the change in a student’s test scores during one year could be accurately attributed to the effort of her teacher, research has shown that a substantial part of the variation in test scores does not reflect what the student truly knows and is able to do, but is affected by random factors, such as construction noise outside the school on the day of the assessment. A measure is not considered very reliable when the proportion of “true” variation to “error” variation is relatively low. When this is the case, economists speak of the indicator as not providing a sufficiently strong “signal” of the underlying variable being measured (in this case, teacher effectiveness), or say that the indicator is “noisy” (has a low signal-to-noise ratio). Kane and Staiger, in their 2002 paper “The Promise and Pitfalls of Using Imprecise School Accountability Measures”, showed that measures of learning gains, or value-added measures of learning, such as those estimated by the LA Times, are even noisier than average test scores in one year, because the measurement problems described above are compounded when taking the difference between two noisy measures.
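To see why differencing compounds the noise, here is a minimal toy simulation (all numbers are invented for illustration and have nothing to do with the actual LA Times data): each observed score is modeled as a “true” achievement level plus independent test-day error, and the one-year gain subtracts two such scores, so the error variance doubles while the “teacher effect” part of the signal is comparatively small.

```python
import random

random.seed(0)

N = 100_000          # simulated students
TRUE_SD = 10.0       # spread of "true" achievement (signal) -- hypothetical
NOISE_SD = 5.0       # test-day measurement error on any single score -- hypothetical
TEACHER_SD = 3.0     # spread of true one-year teacher effects -- hypothetical

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# A single year's observed score = true achievement + test-day noise.
true_start = [random.gauss(0, TRUE_SD) for _ in range(N)]
score_y1 = [t + random.gauss(0, NOISE_SD) for t in true_start]

# Next year: true achievement grows by a teacher effect (the signal we want),
# and the test again adds fresh, independent noise.
gain_true = [random.gauss(0, TEACHER_SD) for _ in range(N)]
score_y2 = [t + g + random.gauss(0, NOISE_SD)
            for t, g in zip(true_start, gain_true)]

# The "value-added" style gain subtracts two noisy measures,
# so the error variances add (5^2 + 5^2) while the signal is only 3^2.
gain_observed = [y2 - y1 for y1, y2 in zip(score_y1, score_y2)]

# Reliability = share of observed variance that is "true" variance.
level_reliability = TRUE_SD**2 / variance(score_y1)        # ~ 100 / 125
gain_reliability = TEACHER_SD**2 / variance(gain_observed)  # ~ 9 / 59

print(f"reliability of a single year's score: {level_reliability:.2f}")
print(f"reliability of the one-year gain:     {gain_reliability:.2f}")
```

Under these assumed parameters, roughly 80% of the variance in a single score is signal, but only about 15% of the variance in the one-year gain is, which is the compounding Kane and Staiger describe.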
In addition, research evidence indicates that because the classroom composition of individual teachers varies from year to year, measured teacher effectiveness can vary from year to year as well. Anyone who has been a teacher can probably recall that year in which a student, let’s call him Johnny (they are usually boys), was so disruptive that the teacher could not deliver as much instructional time as she did in other years.
Does this mean that policy makers should not use measures of student learning to hold teachers accountable? I don’t think so. But I do think that caution is warranted. Because of the high noise in student test score levels and gains, they tend to be most useful for identifying the very worst and very best performing teachers. As Joel Klein, the chancellor of New York City’s public schools, wisely put it in a recent NY Times Magazine article, “I wouldn’t try to make big distinctions between the 47th and 55th percentiles.”
But as in most professions, the great majority of teachers are neither top performers nor bottom performers. And these are the teachers who are with our children every day, in most classrooms throughout the globe. They are the ones who can really make a massive difference in improving the learning outcomes of the majority of students. Researchers argue that using test score gains from several years – instead of only the gains in one year – may help identify the “true” part of student learning that is attributable to one teacher. Also, making information on school test scores public can – but does not always – act as an incentive for teachers and lead to learning gains.
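The multi-year argument can be illustrated with the same kind of toy simulation (again, all parameters are hypothetical): a teacher’s true effect persists across years while the yearly noise is drawn fresh each year, so averaging over more years shrinks the error variance and raises the reliability of the estimate.

```python
import random

random.seed(1)

TEACHERS = 50_000
TEACHER_SD = 3.0   # spread of true teacher effects (signal) -- hypothetical
NOISE_SD = 7.0     # year-to-year noise in an estimated gain -- hypothetical
                   # (test error, classroom composition, the occasional Johnny)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

true_effect = [random.gauss(0, TEACHER_SD) for _ in range(TEACHERS)]

def observed_average(years):
    """Average each teacher's noisy yearly gain estimates over `years` years."""
    return [
        t + sum(random.gauss(0, NOISE_SD) for _ in range(years)) / years
        for t in true_effect
    ]

# Averaging K years divides the error variance by K, so reliability
# (signal variance / total variance) rises with more years of data.
for years in (1, 3, 5):
    est = observed_average(years)
    reliability = TEACHER_SD**2 / variance(est)
    print(f"{years} year(s) of data -> reliability {reliability:.2f}")
```

With these assumed numbers, reliability roughly triples when moving from one year of gains to five, which is the intuition behind the researchers’ recommendation.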
Further, if the policy goal is ultimately to improve teaching and learning, then in most education systems it may be more effective to use student assessment results to give teachers deep and comprehensive information on how each of their students performed on the various content areas of a test, and to support them in analyzing those data and in acquiring the specific content knowledge and pedagogical skills needed to help their students improve. But this, no doubt, is a much more difficult task than ranking teachers based on students’ test scores.
Photo credit: Scott Wallace/World Bank