If you pay teachers based on performance, do you get better teachers, or do they work harder?


If a country announces a policy that it will pay teachers based on some measure of performance, two main things could happen. One, it could get different (maybe better, maybe worse) people applying for teaching jobs. And two, the teachers it hires could work harder. Either or both of these could translate into different learning outcomes for students.

These questions are at the heart of a neat new paper by Claire Leaver, Owen Ozier, Pieter Serneels and Andrew Zeitlin. They set up an experiment in partnership with the Rwanda Education Board and the Ministry of Education. It’s a two-tier experiment. In stage 1, different labor markets for teachers (district-subject combinations) are randomized into either a fixed wage contract (a 20,000 RWF top up) or a pay for performance contract (P4P). The P4P contract gives a bonus of 100,000 RWF (or 15 percent of the annual salary) to teachers who score in the top 20 percent based on presence in the classroom, preparation (lesson plans), pedagogy (captured in classroom observation) and performance (measured through test scores of their students). The first 3 criteria together get 50 percent weight, and performance gets the other half.
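To make the contract arithmetic concrete, here is a minimal sketch of how a composite score and the top-20-percent bonus could be computed. It is an illustration under stated assumptions, not the study's actual scoring rule: the function names are hypothetical, every measure is assumed to be on a common 0 to 1 scale, and the three input measures are assumed to split their 50 percent weight equally (the post only says they share it).

```python
BONUS_RWF = 100_000   # bonus size reported in the post
TOP_SHARE = 0.20      # share of teachers who receive the bonus


def composite_score(presence, preparation, pedagogy, performance):
    """Combine the four measures, each assumed to lie on a 0-1 scale."""
    # Assumption: the three input measures share their half of the weight equally.
    inputs = (presence + preparation + pedagogy) / 3
    # Inputs get 50 percent weight; student performance gets the other half.
    return 0.5 * inputs + 0.5 * performance


def award_bonuses(scores):
    """Map each teacher to a bonus in RWF; only the top 20 percent are paid."""
    n_winners = int(len(scores) * TOP_SHARE)
    ranked = sorted(scores, key=scores.get, reverse=True)
    winners = set(ranked[:n_winners])
    return {teacher: (BONUS_RWF if teacher in winners else 0) for teacher in scores}


# Made-up teachers and measures, purely for illustration.
scores = {
    "A": composite_score(0.95, 0.50, 0.60, 0.70),
    "B": composite_score(0.90, 0.80, 0.70, 0.40),
    "C": composite_score(0.97, 0.55, 0.65, 0.90),
    "D": composite_score(0.85, 0.40, 0.50, 0.30),
    "E": composite_score(0.99, 0.60, 0.55, 0.50),
}
bonuses = award_bonuses(scores)
```

With five teachers, exactly one receives the bonus, which is the tournament feature of the contract: the payout depends on rank rather than on clearing a fixed threshold.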

In stage 2, Leaver and co. take all of the schools where an upper-primary teacher recruited under the first stage was placed and then rerandomize each school into either a P4P contract or a fixed wage contract. This lets them separate out the composition effects of recruitment from the effort of folks on the job. And lest you be worried that some people get the short end of the stick in this rerandomization, they provide a retention bonus (promised ex ante) so that all folks (no matter their initial assignment and beliefs on probabilities) are made whole.
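The logic of the two-stage design can be sketched as a 2x2 comparison: the advertised contract varies across markets, the experienced contract varies across schools, and each effect is read off by holding the other dimension fixed. The sketch below is only an illustration. It uses simple complete randomization and a plain difference in cell means, both of which are simplifying assumptions rather than the paper's estimator, and the outcome numbers are invented.

```python
import random
from statistics import mean

ARMS = ("FW", "P4P")  # fixed wage, pay for performance


def randomize(units, rng):
    """Assign half of the units to P4P and the rest to FW."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {u: ("P4P" if i < half else "FW") for i, u in enumerate(shuffled)}


def effects(records):
    """records: (advertised, experienced, outcome) tuples.

    Returns (composition_effect, effort_effect) as averaged differences
    in cell means across the 2x2 design.
    """
    def cell(adv, exp):
        return mean(y for a, e, y in records if a == adv and e == exp)

    # Composition: change the advertised contract, hold the experienced one fixed.
    composition = mean(cell("P4P", e) - cell("FW", e) for e in ARMS)
    # Effort: change the experienced contract, hold the advertised one fixed.
    effort = mean(cell(a, "P4P") - cell(a, "FW") for a in ARMS)
    return composition, effort


rng = random.Random(0)
# Stage 1: labor markets (district-subject pairs) get an advertised contract.
advertised = randomize([f"market_{i}" for i in range(6)], rng)
# Stage 2: schools are rerandomized into an experienced contract.
experienced = randomize([f"school_{i}" for i in range(8)], rng)

# Invented outcomes: no recruitment effect, a 0.1 on-the-job effect.
records = [
    ("FW", "FW", 0.0), ("FW", "P4P", 0.1),
    ("P4P", "FW", 0.0), ("P4P", "P4P", 0.1),
]
composition_effect, effort_effect = effects(records)
```

Because the second randomization is independent of the first, teachers recruited under either advertisement end up under both contract types, which is what allows the two effects to be told apart.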

Before turning to the results, it’s worth discussing two methodological dimensions that are important here. First, because Leaver and co. are randomizing at the level of the market (in the first stage), their ex ante power will be low. Part of their insurance for this is a very careful, thoughtful and actually instructive pre-analysis plan (indeed, they blogged about it last year).

Second, Leaver and co. collect a ton of data. They start with data on teachers, where they have the teacher training college exam data, as well as the exams teachers have to sit at the district level in order to get these jobs. Once teachers show up on the job, Leaver and co. collect school surveys (where they get a range of administrative data on teachers), teacher surveys (which include not only demographics and background, but also personality traits, self-esteem and other attributes), and a set of lab-in-the-field games that they play with teachers (more on these later).

That covers the teachers’ attributes. For the students, Leaver and co. developed subject and grade-based tests based on the national curricula. These are given to a random subset of students over the 3 rounds that Leaver and co. are collecting data.

For the teacher-on-the-job data, Leaver and co. (via the government and IPA) conduct spot checks to measure teacher presence, review a set of lesson plans, and have an observer sit through and measure different activities during a 45-minute lesson. In the summary statistics, it’s interesting to see that in year 1 (which comprised two rounds of data collection) the teachers are there 96-97 percent of the time, but they only have lesson plans 54 percent of the time. That shifts in year two (which only has one round) when teachers are present 90 percent of the time, but have lesson plans 79 percent of the time.

So what do they find? The best way to describe their results is by their hypotheses (in many cases they use more than one test and data source – I am going to focus on the top-level results here so that this post isn’t as long as the paper):

1. Advertised P4P induces differential application qualities. Based on the training college final exam scores the answer seems to be no. This also seems to hold for some other measures. And, moreover, there is no difference in the volume of applications (although this is less precise).

2. Advertised P4P affects the observable skills of recruits placed in schools. Using the teacher skill assessment, they find no significant difference.

3. Advertised P4P induces differentially ‘intrinsically’ motivated recruits to be placed in schools. This is where the lab-in-the-field games come into play. In particular, they play a framed version of the dictator game, where teachers have to choose between keeping money and allocating it towards providing school supply packets to students. And here they find something: the teachers recruited under the P4P offer provide 10 percentage points less of the pot to the students.

4. Advertised P4P induces the selection of higher (or lower) value-added teachers. Now Leaver and co. turn to student outcomes, using the tests that they gave them. And they can’t reject the null that there is no impact of advertised P4P on the student outcomes. 

So, looking at recruitment effects as a whole, they don’t find anything in terms of ability, but do find teachers who are more likely to keep money rather than handing it over to their students.

5. Experienced P4P creates incentives which contribute to higher (or lower) teacher value-added. Recall that once teachers were on the job, they might have gotten a different contract type – this will let Leaver and co. separate out the recruitment from on-the-job effects. Here again, Leaver and co. turn to the student test results. And boom, students with teachers getting a pay for performance contract perform better. This effect turns out to be small in year one and larger in year two. In the second year, the impact is equivalent to moving a student from the 50th to 56th percentile of test scores: “a modest but certainly economically meaningful result.”

6. Selection and incentive effects are apparent in the teacher performance metrics. Here Leaver and co. see significant increases in teacher presence and classroom practices in the group that got the P4P contracts.
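The percentile comparison under hypothesis 5 can be translated into standard deviation units with a quick calculation. This is a back-of-the-envelope sketch that assumes test scores are normally distributed, which the post does not state; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_shift_to_sd(start, end):
    """Convert a move between two percentiles (as fractions) into SD units,
    assuming a standard normal score distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.inv_cdf(end) - std_normal.inv_cdf(start)


# Moving a student from the 50th to the 56th percentile.
effect_sd = percentile_shift_to_sd(0.50, 0.56)
```

Under that assumption the move from the 50th to the 56th percentile works out to roughly 0.15 standard deviations.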

Leaver and co. then depart from the pre-analysis plan and look at the dynamics of teacher retention and composition over the two years for which they have data. First of all, there is no significant difference in the retention rate across P4P versus fixed wage schools: both lose about 20 percent of their teachers. Beyond the average rate of attrition, there is no clear evidence that the P4P contract induces teachers of different skill or different intrinsic motivation to stick around.

So, to sum it all up: pay for performance doesn’t get you differently skilled teachers at the start. The only big difference is that the teachers it attracts are more likely to keep money for themselves versus giving it to their students – so potentially some kind of different internal motivation. Once on the job, though, pay for performance seems to induce somewhat higher effort (in some dimensions) from teachers, with resulting gains in learning outcomes for students. How these will play out over a longer time horizon, as this new form of contract sticks around, is an interesting question.



Markus Goldstein

Lead Economist, Africa Gender Innovation Lab and Chief Economists Office

Join the Conversation

Nic Spaull
July 03, 2019

I saw Andrew Zeitlin present this at the RISE conference in DC earlier this month and one of the comments (by Rafa De Hoyos) was that this type of a system requires accurate value-added scores on which to base teacher incentives. There is nowhere in Africa that currently has national or even regional value-added scores which could be used for pay-for-performance. Implementing such a national testing system (which may be a good idea for lots of other reasons), and specifically one that can do VA (no small feat) is so unrealistic as to make a policy based on it essentially unworkable. Novel research but unclear how this could/would ever be implemented in the real world.

Owen Ozier
July 05, 2019

Dear Nic,
Thanks for raising those important points, and thanks for attending the RISE presentation. Two important points of context. One: repeated annual assessments can be used as the basis for calculating value added in the incentive system we studied; indeed, we used repeated annual assessments in just this way in the study in Rwanda. Two: just last month, the Rwandan Ministry of Education announced plans to build a system of “comprehensive assessment” that will include repeated annual assessments prepared and conducted at national level by the Rwanda Education Board. We think this timing is illustrative: it is helpful to have this kind of research finding in hand at the time when such systems are being established.
Best regards,

September 10, 2019

Good idea for our country

Abdourahamane koulibaly
September 10, 2019

Education is our hope for sustainable development
Well-educated people, to avoid needless depravity
Promote education to keep all people equal
Pay teachers very well

September 11, 2019

Teachers are the instrument of change in this world filled with a lot of uncertainty...
Life can again be better if people are taught the ideal knowledge
Teachers should be cared for just the way we care for politicians

Iniobong Akpan
September 11, 2019

Paying teachers based on performance does not in any way help improve the performance of students or the educational system. Rather, paying teachers equally and giving them incentives from time to time can motivate them to perform better, and in so doing the educational system can improve. Teachers are the pivot on which every educational system turns. To avoid sentiment and prejudice in the performance of their duty, let every teacher be given equal opportunity and payment.

Take the case of Nigerian teachers: because the government underpays teachers, has not provided any good incentives for them, and their work is not well appreciated by the public, professionals are leaving the job to join politics, and some have to travel abroad and join the profession there, where their efforts are well received, recognized and appreciated by the government and parents. A country that does not regard its teachers and teaching profession as anything is bound to lose everything, even its brighter future.

Underpaid teachers cannot give their full attention and energy to the work but rather will channel it to something else that is more productive and beneficial to them. Give them a house, a car and a good environment to live in as part of the incentives and you will see the miraculous outcome.


Tarfa Awai
September 11, 2019

The points raised here are very important, most especially in underdeveloped countries.

In Africa, there are a few passionate teachers (that is, people who teach for the love of teaching). For many teachers in Africa, teaching is a profession as well as a major source of income.

Most teachers tend not to improve or develop their teaching skills and are inactive because, even if they do, the hard work and dedication will add nothing to their take-home pay.

If payment of teachers is based on performance, most teachers will work harder, improve constantly on their teaching skills and most will dedicate more time to teaching.

If you pay teachers based on performance, they will work harder.