Measuring Yields from Space
This post is co-authored with Marshall Burke.
One morning last August a number of economists, engineers, Silicon Valley players, donors, and policymakers met on the UC-Berkeley campus to discuss frontier topics in measuring development outcomes. The idea behind the event was not that economists could ask experts to create measurement tools they need, but instead that measurement scientists could tell economists about what was going on at the frontier of measuring development-related outcomes. Instead of waiting for pilot results, we decided to blog about some of these ideas and get inputs from Development Impact readers. In this series, we start with recent progress on measuring (“remote-sensing”) agricultural crop yields from space.
Why satellite-based remote sensing?
The potential ability to use satellites to measure common development outcomes of interest excites researchers and practitioners for a number of reasons, chief among them the amount of time and money we typically have to spend to measure these outcomes the “traditional” way (e.g. conducting surveys of households or firms). Instead of writing large grants, spending days traveling to remote field sites, hiring and training enumerators, and dealing with inevitable survey hiccups, what if instead you could sit at home in your pajamas and, with a few clicks of a mouse, download the data you needed to study the impacts of a particular program or intervention?
The vision of this “remote-sensing” based approach to research is clearly intoxicating, and is being bolstered by the vast amount of high-resolution satellite imagery that is now being acquired and made available. The recent rise of “nano-” or “micro-” satellite technology – basically, fleets of cheap, small satellites that image the earth in high temporal and spatial resolution (e.g. our partner Skybox) – could hold particular promise for measuring the types of outcomes that development folks often care about. This is perhaps most obviously true in agriculture, where unlike in the manufacturing sector, most production takes place outside.
How does it work?
For most agricultural crops – particularly the staple crops grown by African smallholders, such as maize – pretty much anyone can look at a field and see the basic difference between a healthy, highly productive crop and a low-yielding crop that is nutrient or moisture stressed. One main clue is color: healthy vegetation reflects and absorbs different wavelengths of light than less-healthy vegetation, which is why leaves on healthy maize plants look deep green and leaves on stressed or dead plants look brown. Sensors on satellites can discern these differences in the visible wavelengths, but they also measure differences at other wavelengths, and this turns out to be particularly useful for agriculture. Healthy vegetation, it turns out, absorbs light in the visible spectrum and reflects strongly in the near infrared (which the human eye can’t see), and simple ratios of reflectance at these two wavelengths form the basis of most satellite-based measures of vegetative vigor – e.g. the Normalized Difference Vegetation Index, or NDVI, which many have likely heard of. High ratios basically tell you that you’re looking at plants with a lot of big, healthy leaves.
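To make the index concrete, here is a minimal sketch of the standard NDVI formula, (NIR - Red) / (NIR + Red), for a single pixel. The reflectance values are made up for illustration and do not come from any real image.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances (0 to 1) in the near-infrared and red
    bands. The result lies between -1 and 1.
    """
    total = nir + red
    if total == 0:
        # No signal in either band; returning 0.0 is a choice made for this sketch.
        return 0.0
    return (nir - red) / total


# Hypothetical reflectances, chosen only to illustrate the contrast.
healthy = ndvi(nir=0.50, red=0.05)   # dense green canopy: strong NIR, little red
stressed = ndvi(nir=0.25, red=0.15)  # sparse or browning canopy

print(round(healthy, 2), round(stressed, 2))  # 0.82 0.25
```

The healthier canopy reflects more near-infrared light and absorbs more red light, so its index is higher.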
The trick is then to be able to map these satellite-derived vegetation indices into measures of crop yields. There are two basic approaches (see this nice review article by David Lobell for more detail). The first combines satellite vegetation indices with on-farm yield observations as collected from the typical household or agricultural surveys. By regressing the “true” survey-based yield measure on the satellite-based vegetation index, you get an estimated relationship between the two that can then be applied to other agricultural plots that you observe in the satellite data but did not survey on the ground. The second approach combines the satellite data with pre-existing estimates of the relationship between final yield and vegetative vigor under various growing conditions (often as derived from a crop simulation model, which you can think of as an agronomist’s version of a structural model). Applying satellite reflectance measures to these relationships can then be used to estimate yield on a given plot. A nice feature of this second approach is that it is often straightforward to account for the role of other time-varying factors (e.g. weather) that also affect the relationship between vegetation and final yield.
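The first approach can be sketched in a few lines: fit a line on the surveyed plots, then apply it to plots seen only from space. This is a toy illustration with invented numbers and a simple one-variable least-squares fit. Real applications would likely use more covariates and would check the fit out of sample.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical surveyed plots: vegetation index and survey-measured yield (t/ha).
surveyed_vi = [0.30, 0.45, 0.55, 0.60, 0.75]
surveyed_yield = [0.9, 1.7, 1.9, 2.4, 2.7]

intercept, slope = fit_line(surveyed_vi, surveyed_yield)

# Hypothetical plots observed in the imagery but never visited on the ground.
unsurveyed_vi = [0.40, 0.70]
predicted = [intercept + slope * vi for vi in unsurveyed_vi]

for vi, y_hat in zip(unsurveyed_vi, predicted):
    print(f"index {vi:.2f} -> predicted yield {y_hat:.2f} t/ha")
```

The predictions are only as good as the fitted relationship, and they assume the unsurveyed plots resemble the surveyed ones in crop and growing conditions.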
How well does it work?
These approaches have mainly been applied to larger farm plots in the developed and developing world, at least in part because until very recently the available satellite imagery was generally too coarse to resolve the very small plot sizes (e.g. less than half an acre) common in much of Africa. For instance, the resolution of the MODIS sensor is 250m x 250m, meaning one pixel would cover more than 15 one-acre plots. Nevertheless, these approaches have been shown to work surprisingly well on these larger fields. Below is a plot from the work of David Lobell and co-authors, showing the relationship between predicted and observed yields for wheat in Northern Mexico, where average plot size is > 20 hectares, equivalent to about 50 one-acre plots.
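The area comparisons above are easy to check. The short sketch below converts the MODIS pixel and the average Mexican plot into acres. The only inputs not taken from the text are the standard conversion factors.

```python
SQ_M_PER_ACRE = 4046.8564224   # international acre, in square meters
SQ_M_PER_HECTARE = 10_000.0


def acres(square_meters):
    """Convert an area in square meters to acres."""
    return square_meters / SQ_M_PER_ACRE


modis_pixel_acres = acres(250 * 250)                  # one 250m x 250m pixel
mexico_plot_acres = acres(20 * SQ_M_PER_HECTARE)      # a 20-hectare wheat plot

print(f"{modis_pixel_acres:.1f}")  # 15.4
print(f"{mexico_plot_acres:.1f}")  # 49.4
```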
Plot: wheat yields in northern Mexico, from Lobell et al 2005.
Although success is somewhat in the eyes of the beholder here, the fit between observed and satellite-predicted yields is pretty good in both of these cases, with overall R² values of 0.63 in the US case and 0.78 in the Mexico case. And, at least in both of these cases, the “ground truth” yield data was not actually used to construct the yield prediction – i.e. they are using the second approach described above. This was possible in this setting because these were crops and growing conditions for which scientists have a good mechanistic understanding of how final yield relates to vegetative vigor.
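For readers who want to run the same check on their own data, here is a minimal sketch of an R² computed as the squared correlation between observed and predicted yields. That definition is an assumption on our part, since the text does not say how the reported values were calculated, and the numbers below are invented.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical plot-level yields (t/ha): ground measurement vs. satellite prediction.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]

print(round(r_squared(observed, predicted), 2))  # 0.64
```

A squared correlation ignores any systematic over- or under-prediction, so a high value alone does not mean the predicted levels are right.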
From rich to poor, big to small
Applying these approaches in the locations of primary interest to Development Impact readers (e.g. smallholder plots in Africa) has been harder. This is not only because of the much smaller plot sizes, and thus the difficulty (impossibility, often) of resolving them in existing satellite imagery, but also because of a lack of either (i) ground truth data to develop the satellite-based predictions, and/or (ii) a satisfactory mechanistic understanding in these environments of how to map yields to reflectance measures.
New data sources from both the ground and sky are starting to make this possible. Sensors on the new micro-satellites mentioned above often have sub-meter resolutions, meaning smallholder plots are now visible from space (a half-acre plot would be covered by over 2000 pixels). Furthermore, this imagery is being acquired often enough to ensure at least a few cloud-free images during the growing season -- not a small problem in the rainy tropics.
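The pixel count is simple arithmetic. The sketch below counts how many square pixels fit in a half-acre plot at a few resolutions. The 1 m and 0.5 m values are illustrative and are not the specification of any particular satellite.

```python
SQ_M_PER_ACRE = 4046.8564224  # international acre, in square meters


def pixels_per_plot(plot_acres, pixel_size_m):
    """Number of square pixels of a given side length that cover a plot."""
    plot_area_m2 = plot_acres * SQ_M_PER_ACRE
    return plot_area_m2 / pixel_size_m ** 2


for pixel_size_m in (250.0, 1.0, 0.5):
    count = pixels_per_plot(0.5, pixel_size_m)
    print(f"{pixel_size_m:>6} m pixels: {count:,.2f} per half-acre plot")
```

At 250 m the plot fills about 3% of one pixel. At 1 m it spans roughly 2,000 pixels, and at 0.5 m roughly 8,000.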
Working with collaborators in Kenya, Uganda, and Rwanda, we are linking this new imagery with ground-based yield data we are collecting to understand whether the satellite data can capably predict yields on heterogeneous smallholder plots. Below is a map of some of the smallholder maize fields we have mapped and are tracking in Western Kenya, as part of an ongoing experiment with smallholder farmers in the region.
Some of the long-run goals of this work are (i) to allow researchers who already have information on plot boundaries and crop choice to use satellite images to estimate yields, and (ii) to give researchers who do not have plot boundaries but who are interested in broader-scale agricultural performance (e.g. at the village or district level) a way to track yields at that scale. This work is ongoing, but given the experience in developed countries, we are hopeful.
Some challenges
Nevertheless, there are clear challenges to making this approach work at scale, and clear limitations (at least in the near term) to what this technology can provide. Here are a few of the main challenges:
- Which boundaries and which crops. To measure outcomes at the level of the individual farm plot, satellite-based measures will be most easily employable if the researcher already knows the plot boundaries and knows what crop is being grown. As satellite imagery improves and as computer vision algorithms are developed to remotely identify plot boundaries, both of these constraints will likely be relaxed, but the researcher will still need some ground information on which plots belong to whom.
- Measurement error. Even with plot boundaries in hand, the fact that satellite imagery will not be able to perfectly predict yields means that using satellite-predicted yields as an outcome will likely reduce statistical power (although it’s not immediately clear how much noisier satellite estimates will be, given that survey-based measures of these outcomes – e.g. from farmer self-reports – are likely also measured with error). This almost certainly means that this technology will not be equipped to discern small effects in the smaller-sized ag RCTs that often get run.
- Moving beyond yield. Finally, even with plot boundaries in hand and a well-powered study, satellites are going to have a hard time measuring many of the other outcomes we care about – things like profits or consumption expenditure. Satellites might in the near term be able to get at related outcomes such as assets (something we’re also working on), but it’s clearly going to be hard to observe most household expenditures directly.
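To give a rough sense of the measurement-error point, the sketch below computes the minimum detectable effect in a simple two-arm trial as noise is added to the yield measure. It uses the textbook normal approximation and assumes equal arm sizes and classical noise (mean zero, unrelated to true yield or treatment). All numbers are hypothetical.

```python
from statistics import NormalDist


def minimum_detectable_effect(n_per_arm, sd_true, sd_noise, alpha=0.05, power=0.80):
    """Smallest true difference in mean yield a two-arm trial can detect.

    sd_true is the standard deviation of true yields across plots.
    sd_noise is the standard deviation of the measurement error.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Independent noise adds to the variance of the measured outcome.
    sd_measured = (sd_true ** 2 + sd_noise ** 2) ** 0.5
    return multiplier * sd_measured * (2 / n_per_arm) ** 0.5


# Hypothetical trial: 200 plots per arm, true yield SD of 1.0 t/ha.
for sd_noise in (0.0, 0.5, 1.0):
    mde = minimum_detectable_effect(n_per_arm=200, sd_true=1.0, sd_noise=sd_noise)
    print(f"noise SD {sd_noise:.1f} t/ha -> detectable effect {mde:.2f} t/ha")
```

In this toy setup, noise as large as the true yield variation raises the detectable effect by about 40%. The same formula applies to noisy survey measures, so what matters for power is how the two noise levels compare.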
Second, we still have a surprisingly poor understanding of why some farms, and some farmers, appear to be so much more productive than others. Is it the case that relatively time-invariant factors like soil type and farmer ability explain most of the observed variation, or are time-varying factors like weather more important? Satellite data might be particularly useful for this question (the Lobell paper again provides nice examples), because you can assemble huge samples of farm plots that can then be easily followed over time. Satellite data in this setting therefore might afford more power, and you can do it all in your pajamas.
So we are again hopeful that this approach will yield dividends, but would love to hear from DI readers about how they think this or related technologies might be most useful in measuring development impact.
Nice clear description of the problems and approach. One outcome of better agricultural systems can be better water quality due to less erosion. This can be observed remotely and may serve as an indirect measure of nutrient retention. Improved soil conditions such as increased organic matter may themselves create observable signals.
It is a laudable scientific trial, but it is also worth noting that this type of innovation requires several rounds of pilot testing before being brought to the fore. Considering the many flaws that may arise from depending solely on remote sensing, it would be more scientific to compare many remote-sensing results with observed ground-truth statistical data before jumping to conclusions. Nevertheless, even after thorough comparison we may not get 100% reliable results from the technology. We should also not drive the conventional ground survey into total extinction, as it brings the empathy, face-to-face rapport, and first-hand information that researchers develop for the poor by seeing their true living situations in their villages. Since seeing, as they say, is believing, a true account of farmers’ living situations can only be drawn from surveys if we really want to do justice to the welfare of small-scale farmers. In terms of cost-benefit ratio, I think a real ground survey is cost-effective.
From your recent findings it is obvious that remote sensing favours big farmlands, i.e. highly mechanized commercial farms running to a large number of hectares under the close vigilance of technological supervision. Subsistence farmers, by contrast, labour very hard, far from improved seeds and agrochemicals; their evergreen farmland vegetation may not always reflect their yields, as you have said, and using it as a parameter to judge their farm situations may be far from the truth.
In conclusion, we may need to ruminate on so many factors most especially when it is necessary to research on using sophisticated technology for comparison between big and small farms to get same results.
As someone who has spent much of the past 40 years studying smallholder farmers, I am always intrigued by the possibilities for using technology to make the task easier. If this pans out, it could be great for national crop forecasting services. But it may tell you very little about smallholders' agricultural livelihoods. Why? Using the maize example, most farmers interplant or underplant maize with a whole range of crops...groundnuts, pumpkins, squash, etc. Many of these 'secondary' crops have positive interactions with the growing maize: weed suppression, nitrogen availability, etc. Yet they are probably not detected by satellite imagery, however refined. The maize canopy ensures that. So a good satellite forecast of maize yield can be helpful for the famine watchers, but it may at the end of the day tell you very little about household-level food security.