Mismeasured weather and the challenges of satellite data

As I’ve written about here before, Earth Observation (EO) data is playing an ever larger role in economics research. From climate adaptation to food security to migration, more and more of our work now relies on EO data: gridded, satellite-based measurements of what is happening on Earth. For researchers working in data-poor settings, EO datasets have become indispensable. They offer geographic coverage, consistency, and regular temporal frequency unmatched by most surface-level observations. The suite of EO-based data products that provide information about weather conditions has been especially important.

But as with any data source, these products come with errors. It’s well known that these datasets contain measurement error that researchers do not always account for. But what does that measurement error amount to in practice? And, specifically, what does it mean when selecting among the many available weather reanalysis products?

That’s the question a new paper from Anna Josephson, Jeffrey Michler, Talip Kilic, and Siobhan Murray seeks to answer. Taking advantage of access to precise location data from LSMS surveys in five countries in Sub-Saharan Africa, they examine the performance of nine of the most commonly used EO weather datasets.

What they find underlines the fact that EO data is not plug-and-play. Depending on which dataset you use, the estimated impact of weather on agricultural outcomes varies, sometimes wildly. In most cases, they find that at least two datasets yield estimated effects with opposite signs. In almost all cases, at least one dataset yields statistically significant results while others yield statistical nulls.

The many sources of weather data

Imagine you’re trying to measure how droughts affect crop yields. The LSMS provides detailed household-level data from rural Malawi, for example, but there is no ground weather station in every sampled village. To overcome this obstacle, you can turn to publicly available EO datasets that cover the whole country.

But which one do you use?

There’s CHIRPS. There’s TAMSAT. There’s ERA5, MERRA-2, GPCC, MSWEP, UDEL, and more. They all report rainfall and temperature over gridded surfaces, which means their weather measures can be extracted for any defined point on the earth’s surface. But they differ in spatial resolution, temporal smoothing, input sources, and statistical modeling techniques.
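
To make that extraction step concrete, here is a minimal sketch in Python using xarray. The file name, variable name, coordinates, and dates are all illustrative assumptions, not the layout of any particular product:

```python
# Minimal sketch: pull a rainfall series for one survey location out of a
# gridded product. Assumes a local NetCDF file with (time, lat, lon) dims
# and a "precip" variable; all names and values below are illustrative.
import xarray as xr

ds = xr.open_dataset("rainfall_grid.nc")   # hypothetical local copy

village_lat, village_lon = -13.95, 33.70   # illustrative coordinates

# Time series from the grid cell nearest to the village
series = ds["precip"].sel(lat=village_lat, lon=village_lon, method="nearest")

# Total rainfall over one (illustrative) growing season
season_total = series.sel(time=slice("2018-11-01", "2019-04-30")).sum().item()
print(f"Seasonal rainfall: {season_total:.1f} mm")
```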

The authors of this paper ask a question that is simple, obvious in hindsight, and crucial: Do these datasets give broadly the same answers when used to assess crop yields?

The answer, in brief, is no. Not only do these data fail to provide the same ultimate result when used for regression analysis, but the authors document that the descriptive statistics they produce can also vary substantially. In general, these differences between datasets are somewhat stable across different locations in Africa: ERA5, for example, often reports much more rainfall than other data sources. But that is not true everywhere. In Niger, for example, ERA5 reports substantially less rainfall than any other dataset.

For temperature, as opposed to precipitation, there is more consistency across the data products. This may reflect the relatively stable temperatures in most of the sampled countries, which sit in the tropics or subtropics. It would be useful to see whether this consistency persists in a sample that includes countries with more seasonal variation in temperature.

Nine Datasets, One Question, Many Answers

To test the impact of these differences on analytical results, the authors use data from Malawi, Ethiopia, Niger, Nigeria, and Tanzania, pairing household-level crop production with weather variables derived from the nine EO datasets. They focus on four commonly used weather measures: total rainfall, number of dry days, mean seasonal temperature, and growing degree days.
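
To make these measures concrete, here is a minimal sketch of how three of them might be computed from a daily series. The 1 mm dry-day cutoff and the 8–30°C growing-degree-day bounds are illustrative thresholds, not the paper’s definitions:

```python
# Sketch: dry days, mean seasonal temperature, and growing degree days (GDD)
# from one fake 180-day season of daily data. All thresholds illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
daily = pd.DataFrame({
    "rain": rng.gamma(0.5, 8.0, 180),   # mm/day
    "tmax": rng.normal(30, 3, 180),     # deg C
    "tmin": rng.normal(18, 3, 180),     # deg C
})

dry_days = int((daily["rain"] < 1.0).sum())      # days with < 1 mm of rain
tmean = (daily["tmax"] + daily["tmin"]) / 2
mean_temp = tmean.mean()
gdd = (tmean.clip(lower=8, upper=30) - 8).sum()  # degree days within [8, 30]

print(f"dry days: {dry_days}, mean temp: {mean_temp:.1f} C, GDD: {gdd:.0f}")
```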

They then run standard regressions: how do changes in these weather variables affect plot-level agricultural output?
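
The paper’s exact specifications are more involved, but the basic shape of such a regression, sketched here with hypothetical file and column names, looks something like this:

```python
# Generic sketch of a weather-on-output regression; not the paper's exact
# specification. File and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("plots_with_weather.csv")   # merged plot + weather panel
df["log_output"] = np.log(df["output_kg"])

model = smf.ols(
    "log_output ~ total_rain + dry_days + mean_temp + gdd + C(district)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["ea_id"]})
print(model.summary())
```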

In theory, if the data are accurately measuring the same underlying reality, the regression results should be broadly consistent. In practice, they’re not.

For example, when regressing output on total seasonal rainfall, changing the EO source can change both the sign and the significance of the estimated coefficient. The same regression can yield positive and significant results, negative and significant results, or statistically insignificant results, depending on the EO source selected. Interestingly, the pattern of results across EO products tends to differ substantially between regressions with and without location fixed effects.

This isn’t an isolated result. Across all five countries and all four weather metrics, the estimated coefficients vary considerably depending on the dataset used.

Why Does This Happen?

There are several reasons that may explain the divergence across data products.

First, EO data are not direct observations. They’re modeled estimates based on a mix of satellite sensors, reanalysis products, and interpolation algorithms. For instance, CHIRPS blends satellite imagery with ground station data. ERA5 is based on climate reanalysis models assimilating multiple sources. Some rely more heavily on satellite input; others on historical station data. The assumptions embedded in these methods matter, and they differ across products.

Second, EO datasets vary in spatial and temporal resolution. Some have finer grids or daily estimates; others offer monthly aggregates. Aggregating to the household level (often by averaging values within a radius of the household GPS location) can introduce further divergence, especially in spatially heterogeneous climates.
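
For instance, here is a rough sketch of that radius-based aggregation, assuming a gridded product loaded with xarray and a crude degrees-based distance (reasonable near the equator, less so elsewhere):

```python
# Rough sketch: average grid cells within ~10 km of a household GPS point.
# Names, coordinates, and the distance approximation are all illustrative.
import numpy as np
import xarray as xr

ds = xr.open_dataset("rainfall_grid.nc")   # hypothetical gridded product
hh_lat, hh_lon = -13.95, 33.70             # illustrative household location
radius_deg = 10 / 111.0                    # ~10 km expressed in degrees

# 2-D distance (in degrees) from each grid-cell center to the household
dist = np.sqrt((ds["lat"] - hh_lat) ** 2 + (ds["lon"] - hh_lon) ** 2)

# Average the rainfall field over cells inside the buffer, per time step
buffer_series = ds["precip"].where(dist <= radius_deg).mean(["lat", "lon"])
```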

Third, ground-truthing is limited in many countries in Sub-Saharan Africa, where the density of weather stations is low. That means EO models are trained on sparse data and validated against imperfect benchmarks.

The combination of these factors yields datasets that provide precise estimates that may nonetheless be inaccurate. Unfortunately, the very feature that makes these data useful, their ability to provide information about information-poor locations, is what makes their accuracy hard to verify.

What Should Researchers Do?

These challenges are not a reason to stop using weather reanalysis data or other kinds of EO data. These data remain some of the best tools we have for studying a variety of questions at scale in data-poor environments. But this paper offers a strong case for more transparency, caution, and replication. Here’s what that might look like in practice:

  1. Justify Your Dataset Choice
    Researchers should explain why they selected a particular EO product—especially if others are available. Was it resolution? Validation studies? Computational ease? Say so.
  2. Run Robustness Checks Across Datasets
    If feasible, re-estimate your key models using multiple EO datasets (see the sketch after this list). If results diverge, say so, and discuss why. This is especially important for policy-relevant findings.
  3. Push for Better Validation in the Global South
    Ultimately, the EO community needs more ground station data to improve model calibration. Investments in weather infrastructure—combined with open data sharing—are critical public goods.
  4. Collaborate Across Disciplines
    Climate scientists, agronomists, and economists often work in silos. But improving EO use in economic research requires input from all three. Interdisciplinary collaboration can help researchers understand the limits and strengths of various data sources.
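
As a minimal sketch of what point 2 might look like in code, assuming the merged dataset holds one rainfall column per EO product (column and file names are hypothetical):

```python
# Sketch: re-estimate the same model with rainfall from each EO product and
# compare coefficients. Column names (rain_chirps, ...) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("plots_with_weather.csv")   # includes a log_output column
products = ["chirps", "tamsat", "era5", "merra2"]

for p in products:
    fit = smf.ols(f"log_output ~ rain_{p} + C(district)", data=df).fit()
    coef, pval = fit.params[f"rain_{p}"], fit.pvalues[f"rain_{p}"]
    print(f"{p:>8}: coef = {coef: .4f}, p = {pval:.3f}")
```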

A great strength of EO data is its ready availability and, increasingly, its ease of use. But this paper is a good reminder that these data products are not, generally, direct observations. They are estimates, and they are subject to the same errors as any other estimate.

Until direct observations improve in many of the places we work, these EO products are the best option. And they will continue to improve. But they should not be used uncritically and, perhaps, should not be used individually.


Patrick Behrer

Economist, Development Research Group
