A snowstorm was barreling toward New York City, and the attendees at the UN Statistical Committee meeting, myself included, fully expected that all flights would be canceled. Fifty statisticians made the same calculation: find the closest bar. I headed to the Vienna Café in the UN headquarters building, a place that affords one the rarefied opportunity to socialize with high-level government officials from around the world. On my way in, I recognized the Director-General of a statistics office from an African country, and we spoke. I mentioned several statistical programs that donors were planning to finance in his country. He expressed enthusiasm about these projects but voiced an increasingly familiar note of concern about the long-term sustainability of his agency in general. He fretted that his entire statistical office would collapse without donor support. He admitted that most of the demand for data was coming from the donors themselves, as indicators for their own reporting and planning; the country’s own government had much less interest in data or statistics.
True Demand for Data
The complexity of a system determines the amount of data that system needs to function. Simple organisms have few neurons, while the nervous systems of higher mammals contain tens of billions of neurons. Economic efficiency improves as people and firms specialize in an increasing number of different activities, suggesting that economic development is associated with the complexity that emerges from the interactions between economic actors (Hidalgo and Hausmann 2009). Hunter-gatherer tribes survived for millennia without numbers. Mathematics and statistics originated in the agricultural economies of Mesopotamia and ancient Egypt, largely in response to the bureaucratic needs of measuring plots of land and taxing individuals. With occasional pullbacks, the amount of data and its use have been growing at an accelerating rate as world economies become more complex and intertwined. The data generated in 2017 exceeded the amount of all data generated over the last 5,000 years (Jones 2018).
If data is one of the inputs into the aggregate production function of an economy, the demand for data in a country would depend on the levels of the other inputs. This means that there should be an optimal amount of data that puts a country’s economy on its production possibility frontier. Put another way, this is the true demand for data. The true demand for data could come from three main consumers (IDB 2018). The state needs data to develop policies and programs and to monitor their execution; society and business need data for decision making; non-government entities use data to hold governments accountable and to evaluate government performance.
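As an illustration only, the idea of an optimal amount of data can be sketched with a stylized production function. The Cobb-Douglas form, the symbols $A$, $K$, $L$, and $D$, the exponents, and the unit cost of data $c$ are all assumptions introduced here for exposition; none of them come from the cited sources.

```latex
% Assumed form: output Y depends on capital K, labor L, and data D.
\[ Y = A K^{\alpha} L^{\beta} D^{\gamma}, \qquad 0 < \gamma < 1 \]

% Assumed objective: choose D to maximize output net of the cost of data.
\[ \max_{D} \; A K^{\alpha} L^{\beta} D^{\gamma} - cD \]

% First-order condition and the resulting optimal amount of data D*.
\[ \gamma A K^{\alpha} L^{\beta} D^{\gamma - 1} = c
   \quad\Longrightarrow\quad
   D^{*} = \left( \frac{\gamma A K^{\alpha} L^{\beta}}{c} \right)^{\frac{1}{1-\gamma}} \]
```

Under these assumptions, $D^{*}$ rises with the levels of the other inputs and falls as the unit cost of data rises, which is one way to read the claim that a country's demand for data depends on the levels of its other inputs.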
At any point in time, the true demand for data differs in both relative and absolute terms across countries with different levels of development and, correspondingly, complexity. The Economic Complexity Index (ECI) measures the relative knowledge intensity of an economy (MIT 2018).
For example, the amount of data demanded in New Zealand (ECI rank of 47; population 4.7M; per capita GDP $39,000 PPP) could be drastically different from the amount demanded in Mauritania (ECI rank of 121; population 4.5M; per capita GDP $1,100 PPP) (WDI 2018).
Distorted Markets for Data
Development practitioners around the world share a common refrain: many low- and middle-income countries lack much-needed data. A range of international development organizations, donors, and agencies thus push for more and better-quality data across the developing world. This includes allocating resources for statistical capacity-building programs and financing surveys, among a range of other activities. But why do market forces in developing countries fail to generate the quantity of data required to maximize production?
Why could data be underprovisioned? Some governments are threatened by data. Voting numbers, rates of economic growth, and the number of people living in poverty are all harder to obscure when good-quality data is publicly available (Dargent et al. 2018). Other reasons may be less obvious: simply put, many decision-makers and officials fail to understand the utility of data, and so resources flow elsewhere. Even if the usefulness of data is well understood, political pressures might divert investments from data to more immediate priorities.
If the state is underperforming or lacks the requisite technical strength, international organizations might act as partial substitutes for government institutions in developing and monitoring policies and programs. In the absence of a strong civil society, international agencies could stand in for independent NGOs as proxies for transparency and accountability. To perform these functions, they need data, and such “donor demand for data” could temporarily replace domestic demand for data. The rationale for these interventions is that, in response to donor demand, local statistical capacity will grow to a level at which national statistical systems are sustainable in the long run.
Markets for data could suffer from a statistical “Dutch disease”: donors and international actors often collect data to meet their own needs or as a global public good. When this happens, it does so at the expense of data and systems for which there are true, domestic needs. Donors are willing to pay a higher price than local governments are for the same data. Competition among agencies for scarce statistical expertise within the developing country itself could inflate the cost of data collection to levels that are difficult to reduce. For example, per-household costs in recent official surveys in some African countries exceeded $400 (Kilic et al. 2017). It is questionable whether the statistical agencies in developing countries could continue operating at these prices without external funding.
Data might also be overprovisioned. The growing number of actors all working in the same context imposes high transaction costs on country officials and might subordinate the true demand for data to successive waves of external priorities (OECD 2017). Uncoordinated donor efforts could lead to a multitude of often similar surveys conducted at the same time, a proliferation of training workshops, and overinvestment in hardware and software for statistical offices (Sanna and Mc Donnell 2017). While individual, short-term policy and advocacy needs may be met this way, doing so probably creates structures that are not sustainable, or even necessary, as governments and civil society increase their capabilities to perform their rightful roles and donors step back.
A country’s economy suffers from the lack of data. Too much data for a given level of development might not help either. Should Mauritania aim at New Zealand’s levels of data production, or would much more modest levels suffice to meet its current needs? What is Mauritania’s true demand for data?
I do not have answers to these questions. While the central premise of donors and international agencies (that most developing countries fail to produce sufficient data on their own and need support) is accurate, I believe that the heterogeneity of demands for data across different countries should be acknowledged and planned for. The development of methodologies that help to assess a country’s true demand for data could improve the effectiveness and sustainability of data policies, technical support, and financing, helping each country to reach its optimal level of data. Priority should be given to programs that help to generate local demand for data, for example, data literacy programs and initiatives designed to remove the barriers that keep wider groups of the population from understanding and using data. However, in doing so, a much higher level of donor coordination is required to minimize the distortions of local markets and to ensure the long-term sustainability of local statistical systems.