“Public revenue is frequently earmarked by governments for … a specific … expenditure. This paper examines whether [earmarking] appears to be effective in practice in achieving its objective of increasing expenditure. Cross-section analysis shows that there is no statistically-significant difference in the proportion of road expenditure in gross investment between earmarking and non-earmarking countries. Time series analysis lend[s] support that earmarking is associated with a somewhat higher degree of road expenditure.”
Eklund, Per (1967). “Earmarking of Taxes for Highways in Developing Countries”
International Bank for Reconstruction and Development Economics Department Working Paper No. 1
“This paper studies the impacts of the large-scale Road Sector Development Program in Ethiopia on local economic activity. It exploits spatial and temporal variation in road upgrades across Ethiopia. The findings show that road upgrades contributed to increases in local economic activity. However … gains from road upgrades are concentrated in areas with moderate-to-high initial levels of economic activity. The results suggest that Ethiopia’s ambitious road infrastructure development program overall increased local economic activity and urbanization, but that it also had important distributional implications…”
Alder, Simon, Kevin Croke, Alice Duhaut, Robert Marty and Ariana Vaisey (2022). “The Impact of Ethiopia’s Road Investment Program on Economic Development and Land Use: Evidence from Satellite Data”
World Bank Policy Research Working Paper No. 10,000
The two working papers¹ quoted above serve as milestones along a path of over 50 years of economic policy research at the World Bank. The first quotation is paraphrased from the introduction of the very first entry in the first official working paper series of the World Bank, the Economics Department Working Paper Series, which began in 1967 and published 741 working papers through 1986. The second is from the abstract of the 10,000th entry in the World Bank’s Policy Research Working Paper (PRWP) series, which was published earlier this month.
The PRWP series began in 1988 following the consolidation of several earlier specialized working paper series at the World Bank (see Figure 1). The PRWP series publishes research by World Bank staff and consultants with a stated mission to “disseminate the findings of work in progress and to encourage the exchange of ideas among Bank staff and others interested in development issues”. Over 8,000 individuals have authored or co-authored PRWPs, a list that includes Nobel Laureates Douglass North, James Heckman, Elinor Ostrom, Joseph Stiglitz, Angus Deaton, Paul Romer, Michael Kremer, Esther Duflo, and Abhijit Banerjee. The most prolific PRWP authors, measured by number of papers written, include Martin Ravallion, Asli Demirguc-Kunt, and David McKenzie, all current or former staff of the research department of the World Bank, with over 100 PRWPs each.
Although separated by five-and-a-half decades of evolution in economic thinking, empirical techniques, data availability, and computational tools, the two papers excerpted above share features that are hallmarks of World Bank research. They address important practical questions facing development policymakers: how to finance road expenditures, and how roads affect local economic development. They both marshal empirical evidence based on modern techniques: both papers use difference-in-differences estimators (although the terminology was probably not in use when the earlier paper was written), among other empirical tools. The conclusions are nuanced: in the former, earmarking seems to matter within countries over time, but not across countries, while in the latter the effects of roads depend on initial conditions. The findings are caveated where needed: the earmarking study notes the importance of influential observations in generating significant results, contains a discussion of causal interpretation, and has an annex discussion of spurious correlation, while the Ethiopia roads paper features a careful discussion of the limitations of each identification strategy. Both papers take data access and replicability seriously: the full dataset is reproduced in an annex table in the earmarking paper, while the replication data and code for the Ethiopia roads paper are available on a World Bank GitHub repository. Finally, both papers are connected to World Bank operations: the earmarking paper notes how the Bank’s lending in the sector facilitated access to the data used in the study, while the Ethiopia Road Sector Development Program studied in the second paper was supported by no fewer than five World Bank projects approved in 1998, 2003, 2004, 2007, and 2009.
Of course, there are also important differences between these two papers. The earlier one features hand-drawn graphs on graph paper with typewritten labels, and given the costs of implementing the calculations for a single regression equation at the time, it seems unlikely that many specifications were estimated beyond those that appear in the paper. The later one benefits from the full suite of modern tools and data, including detailed geospatial data based on satellite imagery (although it should be noted that even as early as 1971 the World Bank was beginning to explore the potential of satellite data for development policy analysis).
A focus on development issues and developing countries
While the PRWP series is younger and smaller than the NBER Working Paper Series (which began in 1973 and recently crossed the 30,000 paper mark) and the CEPR Discussion Paper Series (which began in 1984 and recently crossed the 17,000 paper mark), it is the largest working paper series focused on development economics and developing countries. The next closest working paper series in terms of development focus is probably the IMF Working Paper Series with nearly 7500 entries since 1986. However, given the IMF’s narrower topical focus (primarily monetary, fiscal, financial and external sector policy issues) and broader geographical mandate (including advanced economies), it does not have the same concentration on development economics and developing economies as the PRWP series.
The PRWP series’ emphasis on development issues and developing countries is readily apparent from Figure 2 and Figure 3, which track the evolution over time of the most frequent topics and countries covered in the PRWPs since 1988.² Topics such as trade, finance, macroeconomics, and fiscal policy featured prominently in the PRWP series in 1988, together with education, health, and agriculture. The concentration of PRWPs across topics was fairly high: trade alone accounted for 15% of the content of PRWPs in 1988. Topics like poverty, the environment, social protection, and governance were relatively less frequent in the early years of the PRWP series, but expanded in importance over time. As of now, trade accounts for around 8% of the cumulative content of the PRWP series, while poverty represents 5% of the cumulative content. Other topics such as governance, gender, and environment have also seen big increases in their coverage in the PRWP series over the past three decades. Another notable pattern is the sharp increase in content related to data and surveys, which mirrors the trend toward more empirical research in the wider economics literature described in Angrist et al. (2017).
In terms of country coverage, in 1988 just four of the top 20 most-mentioned countries in the PRWPs were advanced economies (Japan at number 3, Germany at 15, France at 16, and the United States at 18). Aside from Japan, the top five most-mentioned countries were large developing countries: Mexico, Brazil, Thailand, and Indonesia. By 2022, China and India had joined Mexico, Brazil, and Indonesia to make up the top five most-mentioned countries in the PRWPs, while Japan (at position 13) was the only advanced economy remaining in the top-20 list. Among developing countries there are disparities in country coverage: East Asian, South Asian, and Latin American countries feature prominently in the top-20 list, which also includes six economies in Sub-Saharan Africa; however, no countries from the Middle East and North Africa region make the list. For a more systematic analysis of the frequency of economic research on developing countries, and the emphasis on advanced economies in top journals in economics, see this blog and paper (first published in the PRWP series of course…).
Measuring PRWP influence (imperfectly) with downloads
A 2014 PRWP posed the question “Which World Bank Reports Are Widely Read?” in its title, and answered the question with an analysis of download statistics available at the time for all World Bank reports (not just PRWPs). Inevitably, the paper went on to become one of the most-downloaded PRWPs of all time.
Downloads are a widely used measure of the influence of research papers, because download data are readily accessible and provide a rough indication of whether users are interested enough in a paper to click on it. However, downloads are an imperfect measure: some papers can be highly influential in a specific field or in a specific country without attracting widespread popular interest and associated downloads, while conversely the communications effort devoted to a report can lead to high downloads but not necessarily high influence.
With these caveats in mind, the full list of top-20 most-downloaded PRWPs appears in Table 1. The left side lists top papers by total lifetime downloads. Since older papers have had more time to accumulate downloads, the right side of Table 1 adjusts for this by listing top papers by downloads per year since publication.³
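To make the adjustment concrete, here is a minimal sketch of how the two rankings can diverge. It assumes the denominator is simply the number of calendar years from the publication year through 2021; the exact rule used for Table 1 may differ, and the three papers and their download counts below are invented for illustration.

```python
def downloads_per_year(total_downloads, pub_year, as_of_year=2022):
    """Average annual downloads, assuming one year of exposure per calendar
    year from pub_year through as_of_year - 1 (an assumption, not
    necessarily the rule used for Table 1)."""
    years = as_of_year - pub_year
    if years <= 0:
        raise ValueError("paper must be published before as_of_year")
    return total_downloads / years


# Hypothetical papers: (label, publication year, lifetime downloads).
papers = [
    ("A", 1990, 60_000),
    ("B", 2020, 20_000),
    ("C", 2010, 36_000),
]

# Ranking by lifetime downloads favors older papers.
rank_by_total = [label for label, _, total in
                 sorted(papers, key=lambda p: p[2], reverse=True)]

# Ranking by downloads per year puts recent, heavily read papers on top.
rank_by_rate = [label for label, year, total in
                sorted(papers, key=lambda p: downloads_per_year(p[2], p[1]),
                       reverse=True)]

print(rank_by_total)
print(rank_by_rate)
```

In this made-up example the oldest paper leads on lifetime downloads but comes last once downloads are averaged over its years in circulation.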
At first glance, these highly downloaded papers are an eclectic mix, but a closer look reveals some patterns. Papers that introduce new datasets are disproportionately represented in Table 1, accounting for 7 of the top 20 papers by total downloads, and 4 of the top 20 papers by downloads per year. This is not so surprising: users of popular datasets such as the Global Findex will naturally want to refer to the documentation of the data in the accompanying working paper for details, and users of the World Bank’s ubiquitous classification of low/middle/high-income countries will be interested in how these categories are defined.
Papers that deal with highly topical current events are also frequently represented in Table 1. These include not just seven recent papers dealing with various aspects of the COVID-19 pandemic in the most-downloaded-per-year category, but also a 2008 paper on the spike in global food prices that occurred in the same year. Papers that deal with issues of long-standing interest to the World Bank also appear among the most-downloaded papers. Most notably, five of the papers in the table deal with global poverty and global inequality and thus are tightly linked to the World Bank’s twin goals, while the PRWP with the most total downloads is a survey of the evidence on education and development. And finally, for various reasons some PRWPs have attracted considerable media interest at the time of their publication, which translated into higher downloads as media consumers followed the links to the underlying papers. Examples include the 2014 paper on which World Bank reports are widely read mentioned above, as well as the 2020 paper on elite capture and foreign aid, which tops the list of most-downloaded papers per year of paper life with nearly 45,000 downloads per year on average over 2020 and 2021.
How does the distribution of downloads look beyond the “tail” of such highly downloaded papers? Figure 4 summarizes the data and also provides a comparison with the NBER Working Paper Series, which is the largest working paper series in economics and a highly prestigious one. The figure shows the distribution of downloads per year across all PRWPs (left panel) and NBER Working Papers (right panel).⁴ Starting at the top end of the distribution, around 4 percent of PRWPs have more than 1000 downloads per year. The much larger NBER working paper series has a higher share of “greatest hits”, with 8 percent of papers gathering more than 1000 downloads per year. There are also some differences at the bottom end of the distribution: about 10 percent of PRWPs have 20 or fewer downloads per year, while the corresponding figure for the NBER is 2 percent. Aside from these differences at the top and at the bottom, the distribution of the roughly 90 percent of remaining working papers is quite similar across the two series. This suggests a comparable pattern of influence, subject of course to the caveats around this measure noted above.
Figure 4: Working Paper Downloads
Note: The height of each bar shows the share of working papers with the range of downloads per year indicated on the horizontal axis. Each bar is an increment of 20 downloads. The last bar shows the fraction of papers with more than 1000 downloads per year. Population refers to papers published between 1988 and 2021.
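For readers who want to build a comparable chart from their own download data, the binning described in the note can be sketched as follows. This is an illustration, not the code behind Figure 4: the function name is hypothetical, the sample values are invented, and it assumes each bin includes its upper edge (so that "20 or fewer" falls in the first bar).

```python
import math


def download_shares(rates, width=20, cap=1000):
    """Share of papers in each bin of `width` downloads per year, with a
    final overflow bin for papers above `cap`.

    Assumes right-inclusive bins: [0, 20], (20, 40], ..., (980, 1000],
    then one bin for more than 1000.
    """
    if not rates:
        raise ValueError("need at least one paper")
    n_bins = cap // width
    counts = [0] * (n_bins + 1)  # last slot is the overflow bin
    for r in rates:
        if r > cap:
            counts[-1] += 1
        else:
            # ceil maps 20 -> bin 0 and 21 -> bin 1; a rate of 0 also
            # goes to bin 0.
            counts[max(math.ceil(r / width) - 1, 0)] += 1
    return [c / len(rates) for c in counts]


# Invented downloads-per-year values for eight hypothetical papers.
sample = [5, 20, 21, 40, 999.5, 1000, 1001, 5000]
shares = download_shares(sample)
print(len(shares), shares[0], shares[-1])
```

The first element of the result corresponds to the "20 or fewer" share discussed in the text, and the last element to the "more than 1000" share.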
The next 10,000 working papers
It took 34 years, from 1988, for the PRWP series to publish its first 10,000 working papers. After a slightly slower pace at the beginning, since 2000 the PRWP series has published around 350 papers per year. At this rate, PRWP No. 20,000 will be published somewhere around 2050.
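The back-of-the-envelope projection can be checked in a few lines. The sketch below assumes a constant pace of 350 papers per year starting from paper No. 10,000 in 2022; the function name and the constant-rate assumption are illustrative simplifications.

```python
def projected_year(target_paper, current_paper=10_000, current_year=2022,
                   papers_per_year=350):
    """Year in which `target_paper` would appear at a constant pace."""
    remaining = target_paper - current_paper
    return current_year + remaining / papers_per_year


estimate = projected_year(20_000)
print(round(estimate, 1))
```

Ten thousand further papers at 350 per year take a little under 29 years, which lands in the second half of 2050.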
Much is likely to change in development economics research over the next three decades, just as much has changed since the days of hand-drawn graphs and typewritten manuscripts in IBRD Economics Department Working Paper No. 1 back in 1967. Much is also likely to change in the technology available to empirical researchers in development economics, when one considers that in 1988, when the PRWP series began, the Intel 386 processor was the industry standard for personal computers, laptops were an exotic item, and backup storage was in the form of 3.5-inch floppy disks with 1.4 MB of capacity apiece. The availability of data and the conceptual tools to analyze it are also likely to continue to change drastically, when one considers that empirical analysis in IBRD Economics Department Working Paper No. 1 was based on ordinary least squares performed on a grand total of 47 country-year datapoints, while today it is not uncommon for datasets to run into the millions of observations and sophisticated machine learning algorithms are routinely deployed alongside more traditional econometric techniques.
However, it is likely that many aspects of PRWP No. 20,000 will be quite familiar to us today, just as many features of IBRD Economics Department Working Paper No. 1 resonate today more than 50 years after it was written. These durable features of applied development economics research at the World Bank will continue to be a focus on practical questions relevant to policymakers, carefully nuanced and caveated empirical analysis, a premium on identifying causal effects, and a close connection to World Bank operations.
1. Roula Yazigi provided the same outstanding logistical support in preparing this blog as she has provided to the PRWP series since 2007. Thanks also to Luiza Cardoso, Alice Duhaut, Olivier Dupriez, Elisa Liberatori-Prati, Norman Loayza, Bob Malloy, and Carmen Reinhart for helpful comments.
2. I am very grateful to Olivier Dupriez, Aivin Solatorio, and Kamwoo Lee for the text analytics they performed to generate these graphs. The country-name frequencies are extracted from full-text searches of all PRWPs, and the graph reports the count of mentions of each country across all PRWPs. The topics analysis is based on unsupervised topic analysis of the full text of each PRWP; the top topics were then manually labelled (and in some cases combined) for clarity as shown in the graph. The data reflect the proportion of the content of each PRWP assigned to each topic by the unsupervised analysis. The total proportions sum to less than one since the top 20 topics account for only 60 percent of the total content.
3. The downloads data described here aggregate downloads from three distinct platforms: the World Bank’s “Documents and Reports (DR)” and “Open Knowledge Repository (OKR)”, as well as the “Social Science Research Network (SSRN)”. I am very grateful to Luiza Cardoso for scraping the download data from SSRN, which captures downloads from that platform since the mid-2000s. The download data from the OKR (DR) include downloads since 2012 (2014), and the OKR contains only documents published from 2002 onwards. Downloads per year of publication exclude papers published in 2022. The download data are current as of late January 2022.
4. A couple of notes on the data are important. NBER download data are available since 2011 and cover all NBER papers back to the 1970s. To make this more comparable with the PRWP data, (a) I consider only downloads since 2014 (since the Documents and Reports downloads data start in 2014), and (b) I drop NBER papers before 1988, since this is the year the PRWP series starts. I am very grateful to Dan Feenberg at the NBER for sharing the NBER download statistics.