I’m working on an impact evaluation in Colombia right now, and we are in the process of looking at baseline data from firms. The data are a bit noisy at the moment, and part of what makes them hard for me to read is that many of the costs are in the millions (e.g. an energy bill of 1,442,990). The exchange rate is currently 1 USD = 2,055 Colombian pesos.
This is not the first country I’ve worked in where there are way too many zeros in the currency, but it is an order of magnitude higher than the 131 Sri Lankan Rupees to the dollar or 167 Nigerian Naira to the dollar. I have a hypothesis that data entry errors are more common when the amounts to be written have a lot of zeros in them, and thought I’d float this idea here, discuss what I’ve been trying to do to deal with it, and see whether anyone else has come up with good solutions for dealing with this issue.
A lot of the surveys I am working with at the moment are surveys of firms. These firms are pretty heterogeneous, so it is not uncommon for one of them to have sales of 1,000,000 while another has sales of only 100,000. As a result, if a zero gets left off or accidentally added to a number while it is being recorded in a survey, it is not always easy to spot.
The solution I’ve come up with and am using in Nigeria is to not only have the survey enumerator record the amount, but then also record the range this amount is in and also write the amount in words. This provides checks which can be used to identify mis-entered data. Here’s an example for sales:
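On the data-cleaning side, the recorded amount can be compared against the recorded range automatically. Here is a minimal Python sketch of that comparison. The bracket boundaries and function names are hypothetical, chosen only for illustration; they are not taken from the actual questionnaire.

```python
import math

# Hypothetical range brackets in local currency
# (lower bound inclusive, upper bound exclusive).
BRACKETS = [
    (0, 100_000),
    (100_000, 1_000_000),
    (1_000_000, 10_000_000),
    (10_000_000, math.inf),
]


def bracket_of(amount):
    """Return the index of the bracket that contains the amount."""
    for index, (low, high) in enumerate(BRACKETS):
        if low <= amount < high:
            return index
    raise ValueError(f"amount outside all brackets: {amount}")


def amount_matches_range(amount, recorded_bracket):
    """True if the written amount falls in the bracket the enumerator ticked."""
    return bracket_of(amount) == recorded_bracket


# An extra zero moves the amount into a different bracket, so it gets flagged.
print(amount_matches_range(1_442_990, 2))   # amount and ticked range agree
print(amount_matches_range(14_429_900, 2))  # mismatch: send back for review
```

Brackets that each span one power of ten are a natural choice here, since a dropped or added zero then always lands in a different bracket.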
If you are using electronic surveys, then you can try to employ a number of other consistency checks (e.g. cross-referencing sales against expenses, profits, and perhaps the last round’s sales to make sure the numbers aren’t out of line). These have the advantage of being less time-consuming than the above approach – which I only use for key outcomes, not for every single monetary answer.
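One way such a cross-check could look is sketched below in Python: it flags an answer when it differs from a reference figure (say, last round’s sales) by roughly a factor of ten or more. The function names and the 0.9 threshold are assumptions made for illustration, not part of any particular survey package.

```python
import math


def magnitude_gap(value, reference):
    """Absolute difference in log10 between two positive amounts."""
    return abs(math.log10(value) - math.log10(reference))


def flag_possible_zero_error(value, reference, threshold=0.9):
    """Flag values about one order of magnitude away from the reference.

    The threshold (in log10 units) is an assumed default; 1.0 would be
    exactly a factor of ten.
    """
    if value <= 0 or reference <= 0:
        return False  # logs undefined; leave these cases to other checks
    return magnitude_gap(value, reference) >= threshold


# Sales this round versus sales last round for the same firm.
print(flag_possible_zero_error(1_000_000, 100_000))    # ten times larger
print(flag_possible_zero_error(1_200_000, 1_000_000))  # ordinary growth
```

A flag like this is only a prompt to re-ask the question, since some firms really do grow or shrink tenfold between rounds.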
Of course my ideal solution is for countries to just have fewer zeros in their currency. But I guess I should be careful what I wish for, since changing from one currency to another can yield a whole lot of extra confusion – we experienced this in Ghana, where the currency was changed such that 10,000 old cedis equals one new cedi. The problem then was that many of the firm owners and enumerators would still think in terms of old cedis, and think in units of 1,000s of these – so when you asked them their sales and they said 40, you weren’t sure whether they meant 40 new cedis or 40,000 old cedis, which aren’t the same thing (40,000 old cedis is only 4 new cedis).
Has anyone else experienced these issues and found better ways of dealing with the problem?