Capturing cost data: a first-mile problem


Before we bought our house, my husband and I knew the price. The real estate agent couldn't just give us a back-of-the-envelope estimate at the end of the process, and she wasn't allowed to simply declare that the house was low cost, affordable, or sustainable for our budget. We knew the exact price of each house she showed us, not just how many bedrooms each had, what the walls were made of, or the quality of the public schools in the neighborhood. There are even laws that stipulate how transparent she and the bank offering us credit had to be about the total cost of buying a home.

Somehow we don't hold ourselves to the same standard when we're asking a government to invest in a particular program or policy (which, by the way, is far more expensive than a house, even a house in the DC metro area). We spend a great deal of effort making sure our estimates of benefits are valid and precise; then we either present some quick estimate of costs in the final section of our papers or reports, or we avoid this altogether and simply declare from the outset that what we are testing is low-cost or scalable. Then, for some mysterious reason, we feel we're in a position to offer policy advice.

I have blogged about the absence of cost estimates before, and Dave Evans has also offered some hypotheses for why we don't regularly feature costing analyses in our impact evaluations. J-PAL provides guidance for conducting cost-effectiveness analysis, and USAID has cost reporting guidance for its education programs. At the Strategic Impact Evaluation Fund (SIEF), we require that teams receiving funding collect cost data and present a costing analysis in their endline reports.

Despite all of this, we still find ourselves in a situation where we have sloppy estimates of costs (if we have them at all) that play a very minor role in our overall assessment of whether a program or policy was successful. What's going on? I believe we have a first-mile problem. We don't know how to get started. While there's plenty of guidance on how to conduct a costing analysis, all of it assumes that you have already collected the cost data. The cost data you need, however, isn't just lying around in one place waiting for you to pick it up; you somehow have to go capture it, and there's very limited guidance on how to do this. There are some templates in the form of complicated spreadsheets with macros that are sure to give you an awful headache, but again, they assume you know how to find the data that must be filled in.

To remedy this, we at SIEF – in coordination with other groups interested in supporting greater use of cost evidence – are developing guidance and case studies on capturing cost data. We want to describe the process that takes you from a blank spreadsheet to a final estimate of total cost. With the International Rescue Committee, we have already published a brief note that outlines some of the basic ideas. The first point is that cost data needs to be disaggregated. It's not enough to know how much was spent on salaries in total, for example. You want to know how many people were needed to make an intervention happen and what their rates were. This isn't important just for verifying the accuracy of more aggregate data; it's also necessary for estimating costs for a scaled-up version of the program or for implementation in another geographic area where labor costs might differ. Disaggregation can also help identify opportunities to save money in future iterations of the program.
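The disaggregation point can be sketched with a toy "ingredients" list; all item names, quantities, and prices below are hypothetical, chosen only to illustrate the bookkeeping:

```python
# Each ingredient records (quantity, unit price), so the total can be rebuilt
# from the bottom up and re-estimated when unit prices differ elsewhere.

def total_cost(ingredients, price_overrides=None):
    """Sum quantity * unit price, optionally swapping in local prices."""
    overrides = price_overrides or {}
    return sum(
        qty * overrides.get(name, unit_price)
        for name, (qty, unit_price) in ingredients.items()
    )

# Hypothetical intervention inputs
ingredients = {
    "trainer_months": (60, 500.0),     # 10 trainers for 6 months at $500/month
    "training_kits": (1200, 4.0),      # one kit per participant
    "vehicle_rental_days": (90, 35.0),
}

print(total_cost(ingredients))                             # 37950.0
print(total_cost(ingredients, {"trainer_months": 800.0}))  # 55950.0 with higher-wage trainers
```

Because each line item keeps its quantity and unit price, re-costing a scaled-up version or another region is a matter of changing a few numbers rather than redoing data collection – exactly what a single aggregate salary total cannot support.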

Second, cost data should be intervention-specific. The data must include the costs of all inputs that were required to make an intervention happen and must not include the costs of inputs that served other related programs. This turns out to be complicated for inputs like management and labor that are often shared across multiple programs. You want to capture the time and effort people have spent on the program you're costing and not the time and effort they spent on other programs. You also don't want to say, "Oh, these are fixed costs; these people would be paid anyway in the absence of my program, so I'm not going to count the costs of their time." When they're working on your program, they're not working on something else. Unless you assume they would otherwise do nothing in the time they're working on your program, it's important to count their time as a cost. Similarly, if delivery of your program requires vehicles and fuel and other programs also use these inputs, it will be important to apportion a fraction of these costs to the cost of your program.
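A minimal sketch of the apportionment arithmetic for shared inputs (the salary, mileage share, and other figures are invented for illustration):

```python
# Apportion a shared input's cost by the fraction of its use that served
# the program being costed (time share for staff, mileage share for vehicles).

def apportioned_cost(total_input_cost, fraction_on_program):
    if not 0.0 <= fraction_on_program <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return total_input_cost * fraction_on_program

# A district manager earning $24,000/year spends a quarter of her time
# on the program; a shared vehicle devotes half its mileage to it.
manager_share = apportioned_cost(24_000, 0.25)  # 6000.0
vehicle_share = apportioned_cost(9_000, 0.5)    # 4500.0

print(manager_share + vehicle_share)            # 10500.0
```

The hard part, of course, is not the multiplication but measuring the fractions – which is why time-and-effort data comes up again below.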

Therefore, a third point we make is that cost data cannot be limited to financial data. To get disaggregated and intervention-specific data, you may need to interview people, to get out and see the program in action, and to rely on regular monitoring and evaluation data. You might even need to collect time and effort data to accurately apportion expensive inputs like management and labor to your program. Rarely is this kind of information waiting for you neatly in the budget or financial records of program implementers. My colleague Sam Fishman blogged about his experience in Bangladesh costing an additional year of preschool. He first asked for the implementer's financial records, and by the time he stepped everyone through the process of disaggregating the data and making it intervention-specific, the estimated total cost of the program had quintupled.

To do all of this, we argue, planning for cost data collection ideally starts before implementation of an intervention, and data gets collected in real time. Some expensive inputs like staff and management time suffer from significant recall bias, so asking people a year or more later to remember how much they worked on something is not a good idea (see Kathleen Beegle's illuminating blog on the challenges of measuring labor in surveys). Collecting data in real time also allows you to take advantage of other data collection efforts that might accompany a project, like regular monitoring. It's much easier to add some fields to a monitoring instrument than to conduct an entirely separate survey just on costs.

We hope to show by example in our guidance. If you have good case studies in mind (that show what can go right and what can go wrong when trying to estimate costs), please send them our way (write to and use the word "Costs" in the subject line). Please also get in touch to let us know what challenges you have faced when trying to capture cost data or when trying to get started.

As a starting point, here are a few challenges that researchers have noted can come up when costing the programs they are evaluating, along with our suggested ways of dealing with them:

  1. Government officials and program officials may be reluctant to share their salaries, or to have them made public.  
    Suggested solution: Government salaries, as well as the salaries of program officials paid by aid agencies, are often public information, so there's no need to ask people themselves to reveal their salaries. If an individual's salary is not publicly available, then a schedule of average salaries by grade could provide a reasonable substitute. Moreover, as with any data, costing data when published typically does not have names attached but rather roles – for example, senior trainer – and this confidentiality should be emphasized at the time of data collection.
  2. How should the time of researchers designing and monitoring a program be counted? Many impact evaluations involve substantial time from field coordinators, university professors, and other researchers. Some of this time concerns program design and implementation (e.g. how to ensure take-up, actual supervision and monitoring in the field), while some relates to the impact evaluation. Sometimes all of these costs are paid by one source, but sometimes they are the responsibility of different funding streams. It is often unclear how much to include in program costs. 
    Suggested solution: This is a more general problem related to allocating costs when resources are shared – in this case, researcher time is shared across program design, program supervision, and evaluation. Typically, we leave out the costs of evaluation when costing programs because most interventions do not come with an evaluation (particularly when they are scaled). Thus, we need to identify how much researcher time has gone into the design and implementation of the intervention itself. To do this, researchers will need to keep track of their time, ideally monthly. For example, during the design phase, we might spend 80 percent of our working hours on the project, with half of that time spent on the design of the program and half on the design of the evaluation. During other phases like implementation, we might spend considerably less time on the project overall, and most of this might be on supervision tasks. Keeping track of the activities, the overall percentage of time, and the allocation of time across activities would not only help with an accurate estimation of costs but would also help with costing a version of the program in which non-researchers take on some of the activities.
  3. Concerns about how the cost data will be used. Many funders have budgeting rules that could make it much harder to give a $20 take-up subsidy to a program participant or to buy a $1,000 laptop than to spend $300,000 on salaries, or they might have other limits on how funds can be spent that don't match the needs of program implementation. Bundling these items into lump-sum delivery contracts allows programs to be implemented efficiently, but researchers may be concerned that very disaggregated cost data will leave them vulnerable to line-item audits.
    Suggested solution: The good news (which is really bad news for the costing agenda) is that typical financial reporting to donors is not that helpful for costing a program. Costing data that includes all of the ingredients required to make an intervention happen, their quantities and prices, and their timing usually must be tracked separately. Since we are advocating starting the costing process early – before program implementation even begins – this could be a good way to start a conversation with a donor about the financial reporting you will have to do and whether the budgeting rules are realistic for the intervention that is being implemented. Moreover, in the guidance, we hope to include messages to donors so that their financial reporting does not just aid their fiduciary oversight but also contributes to global public goods like a set of costing estimates and necessary ingredients for an intervention implemented in different contexts.
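The researcher-time arithmetic in point 2 can be made concrete with a small worked example (the monthly salary and phase length are hypothetical; the 80 percent / one-half split comes from the text above):

```python
# Only the program-design share of researcher time counts toward program
# costs; the evaluation-design share is excluded, as argued above.

monthly_salary = 10_000        # hypothetical researcher salary
share_on_project = 0.80        # fraction of working hours on the project
share_program_design = 0.50    # half of project time on program (not evaluation) design
design_phase_months = 6        # hypothetical length of the design phase

program_cost_of_researcher = (
    monthly_salary * share_on_project * share_program_design * design_phase_months
)
print(program_cost_of_researcher)  # about $24,000 attributable to the program
```

Repeating this per phase, with phase-specific shares, yields the researcher-time line item of the program's total cost.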


Alaka Holla

SIEF Program Manager

April 29, 2019

Alaka, this looks promising. To be honest though, I am disappointed that you chose not to continue the collaboration with Brookings that we had started with Joost with identical objectives and plans. And you make no mention of the work done and the case studies already completed.

Willyanne DeCormier Plosky
April 30, 2019

The Global Health Cost Consortium (GHCC) certainly agrees that sound global health policy depends on high quality, standardized, and accessible cost information. The GHCC was established to provide decision-makers with improved resources to estimate the costs of HIV and tuberculosis (TB) programs. A Reference Case for Estimating the Costs of Global Health Services and Interventions can be found on the GHCC website at: . The Reference Case provides a set of seventeen principles and methodological guidance to help ensure that the process of cost estimation is clearly conveyed and reflects best practices. The Reference Case is targeted to both producers of cost data and users of cost data. A learning module building on the Reference Case principles entitled "Essential Component of Health Priority Setting: Best Practices in Understanding and Interpreting Cost Data" was developed for the World Bank 2018 Skills Building Program on Big Data, Artificial Intelligence and Decision Science in Health and Nutrition. It is available at: . The learning module also references another GHCC resource, the Unit Cost Study Repository, which is a centralized source of standardized and context-specific HIV and TB intervention cost data extracted from published and grey literature. The UCSR is accessible at:

David Harold Chester
July 15, 2019

The problems of accountancy in the national economics of our social system begin when we assume that we have good knowledge of what it comprises. In fact, we fail to properly understand it and how it works, and it is necessary to view the whole situation as broadly as possible before getting into details of the kinds given in this article. The following essay outlines how this more general knowledge can be obtained.

Making Macroeconomics a Much More Exact Science

Today macroeconomics is treated as an inexact subject within the humanities, because at first look it appears to be a very complex and easily confused matter. But this attitude does not do it justice – we should be trying to find a better way to approach and examine the topic, one that avoids these problems of complexity and confusion. Suppose we ask ourselves the question: "how many different KINDS of financial transactions can occur within our society?" Then the simple and direct answer shows that only a limited number of them are possible.

Although our social system comprises many millions of participants, to answer this question properly we should be ready to consider the aggregates of all the various kinds of functions (no matter who performs them), and then to idealize these activities so that they fall into more general terms, reducing the different types of social transactions to a relatively small number. Here, each activity is found to apply between a particular pair of agents or entities – with each entity having its individual properties. Then, to cover the whole social system of a country, the author finds that it takes only 19 kinds of flows of money for the mutual activity in the transfer of goods, services, access rights, taxes, credit, investment, use of valuable legal documents, etc. These flows pass between only 6 different representative agents or entities.

The analysis that led to this initially unexpected result was prepared by the author and may be found in his working paper (on the internet) as SSRN 2865571, "Einstein's Criterion Applied to Logical Macroeconomics Modeling". In this model these double flows of money versus goods, etc., are shown to pass between only 6 kinds of role-playing entities. Of course, a number of different configurations are possible for this type of simplification, but if one eliminates all the unnecessary complications and sticks to the more basic activities, then these particular quantities and flows provide the most concise result, and yet it is presentable in a fully comprehensive, seamless manner that is suitable for further analysis.

Surprisingly, past representation of our social system by this kind of interpretation model has not been properly examined, nor even presented, before. Previously, other partial versions have been modeled (using 4 entities, as by Professor Hudson), but they are inexact due to being over-simplified. Alternatively, in the case of econometrics, the representations are far too complicated and almost impossible to follow. These two reasons – over-simplification and over-complexity – are why there is this non-scientific confusion among many economists and their failure to obtain a good understanding of how the whole system works.

The model described here is unique in being the first to include, along with some additional aspects, all 3 factors of production from Adam Smith's "Wealth of Nations" of 1776. These factors of production are Land, Labor, and Capital, along with their respective returns of Ground-Rent, Wages, and Interest/Dividends. All of them are included in this presentation diagram.

(Economics’ historians will recall, as originally explained by Adam Smith and David Ricardo, the prescribed independent functions of landlords and capitalists. The former persons rent and speculate in land values whilst the latter are owners of the durable capital goods in industry, which may be hired out. Regrettably these different functions were deliberately combined for political reasons, by John Bates Clark and company about 1900, resulting in the neglect of their different influences on our social system.)

The diagram of this model is in my paper (noted above). A mention of the related teaching process is also provided in my short working paper SSRN 2600103, "A Mechanical Model for Teaching Macroeconomics". With this model in its different forms, the various parts and activities of the Big Picture of our social system can be properly identified and defined. Subsequently, by analysis, the way our social system works can be properly calculated and illustrated.

This analysis is introduced by the mathematics and logic devised by Nobel Laureate Wassily W. Leontief when he invented the important "Input-Output" matrix methodology (which he applied to the production sector only). This short-hand method of modeling the whole system replaces the above-mentioned block-and-flow diagram. It enables one to really get to grips with what is going on within our social system. Subsequently it will be found that it is the topology of the matrix which actually provides the key to this. The logic and math are not hard and are suitable for high-school students who have been shown the basic properties of square matrices.
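For readers unfamiliar with the method, the standard Leontief computation invoked here can be sketched in a few lines (a textbook two-sector example with invented coefficients, not the author's 19-flow, 6-entity model):

```python
import numpy as np

# Technical coefficients: A[i, j] is the input from sector i needed
# per unit of sector j's output (invented numbers for illustration).
A = np.array([[0.2, 0.3],
              [0.4, 0.1]])
d = np.array([100.0, 50.0])  # final demand on each sector

# Gross output x must satisfy x = A @ x + d, i.e. x = (I - A)^{-1} d.
x = np.linalg.solve(np.eye(2) - A, d)
print(x)  # gross outputs consistent with the final demand
```

A change in demand or in the coefficients (a "policy shock") is then just a new solve, which is the sense in which the matrix form makes such simulations routine.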

By this technique it is comparatively easy to introduce a change to a pre-set social system that is theoretically in equilibrium (even though we know that this ideal is never actually attained – it being a convenient way to begin the study). This change will then create an imbalance, and the system needs to regain equilibrium. Thus, sudden changes or policy decisions may be simulated and their effects determined, which will point the way to what policy is best. In my book about it (see below), 3 changes associated with taxation are investigated in hand-worked numerical examples. In fact, when I first worked it out, the irrefutable logical results were a surprise, even to me!

Developments of these ideas about making our subject more truly scientific (thereby avoiding the past pseudo-science being taught at universities), may be found in my recent book: “Consequential Macroeconomics—Rationalizing About How Our Social System Works”. Please write to me at for a free e-copy of this 310 page book and for additional information.