Published on Development Impact

# Timeless principles for better figure design

The Visual Display of Quantitative Information by Edward Tufte shaped my thinking when I first encountered it as a graduate student (your library may have a copy). Published in 1983, it is old enough to assume that every figure will be printed in ink rather than read on a screen, yet its principles and advice remain helpful to economists. While I feel we have internalized some of Tufte’s best practices, and we regularly avoid the worst practices he criticizes (come on, 1970s newspaper infographic designers!), those who have not read the book could still benefit from several of its insights.

Tufte’s book advocates for information density, within reason. Data should be displayed, and non-data should be omitted. If your goal is to compare two numbers, a sentence will do. If a figure could be a table, it probably should be. When readers need exact numerical values, a table serves them better than, for example, a plot of regression coefficients.

If the underlying data are rich enough to justify a figure, the figure should tell a story, enable comparisons, and show the data. Figure I of Tamma Carleton et al. (2022; QJE) exemplifies Tufte’s advice. While more beautiful figures exist, including in other sections of Carleton et al., Figure I is a concise representation of the book’s principles. It is “graphically excellent”: “that which gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space” (p. 51).

Figure I. Heterogeneity in the Mortality-Temperature Relationship (Age > 64 Mortality Rate). Source: Q J Econ, Volume 137, Issue 4, November 2022.

## Small multiples

Figure I is what Tufte calls a “small multiple”. It has nine subfigures, arranged in three rows and three columns, but because the x and y variables are the same in each subfigure, once the reader learns how to interpret one of them, they can quickly understand the others. Let’s consider a single subfigure. The x-variable is temperature, and the y-variable is mortality among people aged 65 and older. The main object of interest is the estimated relationship between temperature and mortality, displayed as a smooth black line. Light grey shading indicates statistical uncertainty, and a thin red line at y = 0 marks no change in mortality. Non-data elements, such as a background rectangular grid, are kept to a minimum. Finally, textual labels in the upper left corner indicate the fraction of the data used to estimate the temperature-mortality relationship and the projected future data fraction.
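The layout described above can be sketched in code. What follows is a minimal illustration in Python with matplotlib, which is an assumption on my part: the post does not say what software produced Figure I. The curves are made up rather than taken from the paper, and the function name `small_multiples`, the U-shaped response, and the band widths are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def small_multiples(n_rows=3, n_cols=3, seed=0):
    """Draw a grid of panels sharing x and y axes (synthetic data only)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-5, 35, 100)  # placeholder temperature range, not the paper's

    fig, axes = plt.subplots(
        n_rows, n_cols, sharex=True, sharey=True,
        squeeze=False, figsize=(7.5, 5.0),
    )
    for ax in axes.flat:
        # Hypothetical U-shaped response; the curvature differs by panel.
        curvature = rng.uniform(0.002, 0.01)
        y = curvature * (x - 20) ** 2 - 0.5
        band = 0.3 + 0.02 * np.abs(x - 20)  # made-up uncertainty width

        ax.fill_between(x, y - band, y + band, color="0.85", linewidth=0)  # light grey band
        ax.plot(x, y, color="black", linewidth=1.5)  # main estimate
        ax.axhline(0, color="red", linewidth=0.6)    # thin reference line at y = 0
        ax.grid(False)                               # no background grid
        for side in ("top", "right"):
            ax.spines[side].set_visible(False)       # drop non-data frame lines

    return fig, axes


fig, axes = small_multiples()
fig.savefig("small_multiples.png", dpi=150)
```

Because `sharex` and `sharey` force identical scales on every panel, a reader who decodes one panel has decoded all nine, which is the point of a small multiple.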

## Telling a story and enabling comparisons

Figure I tells a clear story about the relationship between temperature, mortality, and adaptation across different income levels and climates. Subfigures are indexed by average income (rows) and climate (columns). Within a row, moving from left to right reveals how the temperature-mortality relationship compares in colder vs. hotter climates. Holding income fixed, the right tail of the temperature-mortality relationship flattens as the climate gets hotter, which tells us that hot days kill more old people in cold countries than in hot countries (because people in countries with hotter climates are better adapted to extreme heat). Similarly, cold days kill fewer people in cold countries than in hot countries. Within a column, moving from bottom to top shows how the temperature-mortality relationship changes as a function of income. Holding climate fixed, the right tail flattens as income rises, which indicates that hot days kill more people in low-income countries than in high-income countries, because people in high-income countries are better adapted to extreme heat.

## Textual labeling

Three layers of labels surround the figure. The first layer is the x- and y-axis ticks and numeric values, which are the same in every subfigure. The second is the x- and y-axis titles. The x-axis title is repeated three times, once underneath each column of subfigures. The y-axis title is repeated six times: to the left and right of each outer column’s subfigures. While this repetition reduces the figure’s “data-ink ratio” – the “proportion of a graphic’s ink devoted to the non-redundant display of data-information” (p. 93) – it is justified because it reduces the reader’s “eye work”: if the unusual right-hand-side y-axis ticks, values, and titles were removed, the reader would have to scan back and forth between the right-hand-side subfigures and the remaining left-hand-side labels. The third, outermost layer indicates the terciles corresponding to each column and row (high income, middle income, and low income for the rows and cold, temperate, and hot for the columns). Economists underutilize textual labeling. Here, the font size is large enough that even a referee born before 1983 can read all labels easily without complaining. And the entire figure takes only one-third of a page!
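The data-ink ratio quoted above can also be written as a fraction. This restates Tufte's definition on p. 93 in symbols, as I recall it from the book; it is not something you would measure precisely in practice.

```latex
\text{data-ink ratio}
  = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
  = 1 - \text{proportion of the graphic that can be erased without loss of data-information}
```

Repeating the y-axis title on the right-hand side adds ink to the denominator without adding any to the numerator, which is why it lowers the ratio even though it helps the reader.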

• To increase your data-ink ratio, ask yourself of each figure element: can I erase this without losing clarity?
• Use color sparingly. Different shades of grey are easily distinguishable and have an ordinal meaning, unlike most colors. When using color, opt for colorblind-friendly palettes.
• Avoid using color to display two dimensions of data simultaneously, as it creates a challenging puzzle for readers to decipher.
• A good aspect ratio is a figure approximately 50% wider than it is tall; our eyes find it easier to detect changes along the horizon.
• Embedding figures and tables in the text, near where they are discussed, is friendlier to your reader than putting them at the end of your working paper.
• Cranky humor abounds: “A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them… Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used” (p. 178).
• Our design goal for the display of information is “the revelation of the complex” (p. 191).
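Two of the tips above, the aspect ratio and the ordered shades of grey, are easy to turn into small helpers. The sketch below is only an illustration in plain Python; the function names `figure_size` and `grey_shades` and the default lightest shade of 0.8 are hypothetical choices, not anything from Tufte's book.

```python
def figure_size(height_in, width_to_height=1.5):
    """Return (width, height) in inches for a figure 50% wider than tall by default."""
    if height_in <= 0 or width_to_height <= 0:
        raise ValueError("height and ratio must be positive")
    return (height_in * width_to_height, height_in)


def grey_shades(n_levels, lightest=0.8, darkest=0.0):
    """Return n_levels grey levels as strings, ordered from light to dark.

    Each string is a number between 0 (black) and 1 (white), so the order
    of the shades carries ordinal meaning.
    """
    if n_levels < 1:
        raise ValueError("need at least one level")
    if n_levels == 1:
        return [f"{darkest:.2f}"]
    step = (lightest - darkest) / (n_levels - 1)
    return [f"{lightest - i * step:.2f}" for i in range(n_levels)]


print(figure_size(3.0))  # (4.5, 3.0)
print(grey_shades(3))    # ['0.80', '0.40', '0.00']
```

If you plot with matplotlib, both results can be passed straight through: the tuple as `figsize` in `plt.subplots`, and each string as the `color` argument of a plotting call, since matplotlib reads a numeric string between 0 and 1 as a grey level.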

### Gabriel Englander

Economist, Development Research Group, World Bank
