Published on Data Blog

Timely decisions to fight human capital shocks: fusing survey and satellite data for nowcasting food security in Malawi

People at the street market of Namitete, in Malawi / Photo: Shutterstock

When price, income, and climate shocks hit, households often cut meals, delay care, or withdraw children from school, which can cause irreparable damage to productivity and well-being. By the time administrative data confirms the damage, the window for low-cost intervention has often closed. 

New evidence from rural southern Malawi shows how a light, regular household panel, paired with publicly available geospatial data, can inform timely, policy-relevant decisions on food insecurity. This study, a collaboration between the World Bank and researchers from Cornell University, George Washington University, and Swarthmore College, highlights that satellite data can augment the value of surveys but do not replace their ‘ground truth’.

 

What we studied

We use a high-frequency panel tracking roughly 4,500 households across 180 enumeration areas in ten southern districts of rural Malawi from 2020 to 2023. Local enumerators visited the same communities in monthly waves and constructed village-level indicators of food security and human-capital stress: the Food Consumption Score (FCS), the Household Hunger Scale (HHS), self-reported illness (overall and under-5), and schooling withdrawals due to shocks. Each wave was spatially matched to freely available Earth observation features such as rainfall, vegetation, heat stress, and night-time lights. 

We then tested three typical policy questions:

  • Cross-sectional “spatial fill-in.” If you have a recent survey round in some villages, can the satellite data predict into other villages?

  • Short-horizon “nowcasts.” If recent survey data are not available, can a six-month-old survey plus newer satellite data approximate current conditions in sampled villages?

  • Spatio-temporal scaling. If recent survey data are not available, can a six-month-old survey plus newer satellite data approximate current conditions in unsampled villages?

We tried three types of models: ordinary least squares (OLS), Post-LASSO regression (a simple form of machine learning), and XGBoost (a more complex, tree-based form of machine learning). We tested accuracy using cross-validation, that is, by withholding data from model training and then using the withheld data as a benchmark for evaluation.
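To make the Post-LASSO idea concrete, here is a minimal sketch in plain Python. It is not the study's code: the data are synthetic, the helper names (`lasso_cd`, `post_lasso_fit`, and so on) are hypothetical, the penalty `alpha=0.15` is an arbitrary choice, and withholding whole villages is only one assumed way of setting up the hold-out. The two steps are: LASSO selects features, then an unpenalised OLS is refit on the selected features only.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    return (z - t) if z > t else (z + t) if z < -t else 0.0

def lasso_cd(X, y, alpha, n_iter=100):
    """Coordinate-descent LASSO on centred data.
    Minimises (1/2n) * ||y - X b||^2 + alpha * ||b||_1."""
    n, p = len(X), len(X[0])
    beta, resid = [0.0] * p, list(y)
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, alpha) / col_sq[j]
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def ols(X, y):
    """OLS via the normal equations and Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(p)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def post_lasso_fit(X, y, alpha):
    """Step 1: LASSO picks features. Step 2: unpenalised OLS refit on the picks."""
    n, p = len(X), len(X[0])
    mx = [sum(r[j] for r in X) / n for j in range(p)]
    my = sum(y) / n
    Xc = [[r[j] - mx[j] for j in range(p)] for r in X]
    yc = [v - my for v in y]
    selected = [j for j, b in enumerate(lasso_cd(Xc, yc, alpha)) if b != 0.0]
    design = [[1.0] + [r[j] for j in selected] for r in X]  # intercept + picks
    return selected, ols(design, y)

def predict(X, selected, coefs):
    return [coefs[0] + sum(c * r[j] for c, j in zip(coefs[1:], selected)) for r in X]

def r_squared(y, yhat):
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic village-by-wave data (illustrative only, not the Malawi panel).
# Column 0 mimics a lagged outcome, column 1 a rainfall feature, the rest are noise.
rng = random.Random(0)
rows = []
for village in range(60):
    for wave in range(6):
        lag_fcs, rain = rng.gauss(0, 1), rng.gauss(0, 1)
        noise_feats = [rng.gauss(0, 1) for _ in range(6)]
        fcs = 0.7 * lag_fcs + 0.3 * rain + rng.gauss(0, 0.5)
        rows.append((village, [lag_fcs, rain] + noise_feats, fcs))

held_out = set(range(0, 60, 5))  # assumption: withhold every fifth village entirely
X_tr = [x for v, x, _ in rows if v not in held_out]
y_tr = [y for v, _, y in rows if v not in held_out]
X_te = [x for v, x, _ in rows if v in held_out]
y_te = [y for v, _, y in rows if v in held_out]

selected, coefs = post_lasso_fit(X_tr, y_tr, alpha=0.15)
r2 = r_squared(y_te, predict(X_te, selected, coefs))
print("selected columns:", selected, "held-out R^2:", round(r2, 2))
```

In practice the penalty would be tuned by cross-validation rather than fixed, and a tested library implementation would replace these hand-written routines.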

Interviews for the Rapid Feedback Monitoring System / Photo: Joana Upton

 

What we found

  1. Geospatial data did not predict well across space in this setting. 
    When trying to predict across rural districts using only geospatial features, models performed poorly, sometimes collapsing to near-constant predictions. This contrasts sharply with other studies, mostly but not exclusively of wealth and poverty measures, which have found geospatial data to be highly predictive across space. The likely reason is that the sample included only rural villages, and the spatial features used here could not capture differences in outcomes among them.

     

  2. Short-horizon nowcasts of food insecurity are feasible if you feed them with local history. 
    Using lagged village outcomes plus current Earth observation features, six-month nowcasts achieved strong predictive power for FCS (around two-thirds of the variation explained) and solid performance for HHS (about one-half). That is enough to prioritize districts or communities for cash or in-kind top-ups, to adjust school-meal coverage, or to pre-position supplies before the lean season peaks. 

  3. Illness and school disruptions remain hard to predict. 
    Even with lagged local outcomes, models explained only a small share of variation in illness (overall and under-5) and in schooling withdrawals. To become operationally useful, these domains likely require complementary signals, such as facility caseloads, outbreak alerts, WASH (water, sanitation and hygiene) outages, and local closure data. 

  4. Linear models selected using LASSO performed well for predicting in this small, high-frequency panel. 
    Across tasks, Post-LASSO typically edged out tree-based models and was far more stable than ordinary least squares. In a larger sample with more villages, tree-based models such as extreme gradient boosting may generate more accurate predictions than Post-LASSO. 

     

  5. Seasonality is first-order and visible. 
    In southern Malawi, food stress peaks in January–February just before the harvest. Models trained without attention to seasonal structure under-perform precisely when programs need them most. Including monthly fixed effects and ensuring that training data span multiple cycles materially improves reliability.
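The role of lagged local outcomes and monthly fixed effects can be illustrated with a small sketch. Everything here is an assumption made for illustration: the seasonal pattern in `SEASON`, the synthetic village series, and the function names are hypothetical, and the model is a bare-bones regression of the current outcome on its six-month lag. Month fixed effects are estimated by demeaning within each calendar month, which gives the same slope as a regression with month dummies.

```python
import math
import random
from collections import defaultdict

def fit_fixed_effects(records):
    """records: list of (group, lagged_outcome, outcome).
    Fits outcome = effect[group] + slope * lagged_outcome by within-group demeaning."""
    groups = defaultdict(list)
    for g, x, y in records:
        groups[g].append((x, y))
    means = {g: (sum(x for x, _ in v) / len(v), sum(y for _, y in v) / len(v))
             for g, v in groups.items()}
    num = sum((x - means[g][0]) * (y - means[g][1]) for g, x, y in records)
    den = sum((x - means[g][0]) ** 2 for g, x, y in records)
    slope = num / den
    effects = {g: my - slope * mx for g, (mx, my) in means.items()}
    return slope, effects

def rmse(records, slope, effects):
    """Root mean squared error of the fitted model on (group, lag, outcome) records."""
    return math.sqrt(sum((y - effects[g] - slope * x) ** 2
                         for g, x, y in records) / len(records))

# Hypothetical seasonal shifts in a food consumption score (lean season in Jan-Feb).
SEASON = {12: -4.0, 1: -8.0, 2: -8.0, 3: -4.0, 5: 4.0, 6: 4.0}

rng = random.Random(1)
train, holdout = [], []
for village in range(50):
    base = 50.0 + rng.gauss(0, 5)  # persistent village-level difference
    series = [base + SEASON.get(t % 12 + 1, 0.0) + rng.gauss(0, 3) for t in range(36)]
    for t in range(6, 36):
        rec = (t % 12 + 1, series[t - 6], series[t])  # (month, six-month lag, outcome)
        (train if village < 40 else holdout).append(rec)

# Model with month fixed effects versus a pooled model with a single intercept.
slope_fe, month_effects = fit_fixed_effects(train)
slope_pooled, pooled_effect = fit_fixed_effects([(0, x, y) for _, x, y in train])

# Compare errors in the lean season only, on villages not used for fitting.
lean = [r for r in holdout if r[0] in (1, 2)]
rmse_fe = rmse(lean, slope_fe, month_effects)
rmse_pooled = rmse([(0, x, y) for _, x, y in lean], slope_pooled, pooled_effect)
print("lean-season RMSE with month effects:", round(rmse_fe, 2))
print("lean-season RMSE, pooled model:", round(rmse_pooled, 2))
```

In this toy setup the six-month lag for January is a July observation, so a model without month effects has no way of knowing the lean season has arrived. That is the mechanism the sketch is meant to show, not a reproduction of the study's results.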

 

Why the results matter

Early action relies on early, credible signals. Phone-based surveys often miss poor, phone-less households; administrative and survey data come late; and satellite-only proxies can mislead. A minimal in-person panel (brief modules repeated in the same communities) creates a “memory” in the data that satellite data can use and sharpen. Frequent repeat visits also enable the model to learn from persistent changes.

Budgets are limited. In fragile or disaster-prone settings, sustaining a small, regular panel may be more cost-effective than purchasing expensive Earth observation data. The evidence suggests that freely available Earth observation data can add value when recent local survey data exist.

Programs can move from reactive to anticipatory. Village-level nowcasts with uncertainty bands can feed existing decision processes, such as food security analysis, shock-responsive social protection, school-feeding adjustments, and public-health pre-positioning, weeks to months earlier than today’s typical practice.

 

What development policy can do now 

  1. Stand up a “minimum viable monitoring loop.” 
    Revisit the same communities (for example, 100–300) monthly or bi-monthly with a 10–15-minute core module covering food/well-being, recent shocks, and prices. This keeps costs low and preserves the lag structure that powers nowcasts.

  2. Fuse, but do not replace.
    Combine six-month lags of village outcomes with as many freely available Earth observation features as can easily be obtained, selected using machine learning. We are still learning which features predict well in which contexts, and new candidate features and models are being made available at a rapid pace.

  3. Validate the way you will operate. 
    If field-to-dashboard lags are three months, test models with three-month gaps. Report rank correlations and uncertainty, not just R², to support triage decisions rather than point predictions.

  4. First target food security, then focus on health and education.
    Use nowcasts to guide geographic prioritization of food support while investing in richer, context-specific data streams (facility caseloads, prices, attendance, closures) to strengthen illness and schooling predictions.

  5. Align programs to the seasonal clock. 
    Schedule top-ups, school-meal expansions, and social and behavioral change campaigns ahead of the seasonal stress peak, and use nowcasts to adjust in real time when shocks (for example, flooding or heat waves) shift the pattern.

  6. Plan for governance and safeguards. 
    Community-level risk maps can speed up help, but they also risk stigma and targeting errors. Built-in transparency, grievance redress, and periodic third-party review can mitigate these risks.
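The advice to validate with the operational gap and to report rank correlations can be sketched as follows. This is an illustrative example under stated assumptions: the village series are made-up numbers, the function names are hypothetical, and the "forecast" is simply each village's value from `gap` waves earlier. The rank correlation asks whether villages are ordered correctly for triage, which can hold even when point predictions are off.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation (assumes neither input is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

def gap_pairs(series, gap):
    """Pair each observation with the one `gap` waves earlier,
    mimicking the delay between fieldwork and the dashboard."""
    return [(series[t - gap], series[t]) for t in range(gap, len(series))]

# Hypothetical hunger scores for four villages over six monthly waves.
villages = {
    "A": [2, 2, 3, 3, 4, 4],
    "B": [1, 1, 1, 2, 2, 2],
    "C": [4, 4, 5, 5, 5, 6],
    "D": [3, 3, 2, 2, 3, 3],
}

gap = 3  # assumption: data reach decision-makers three months after collection
predicted = [gap_pairs(s, gap)[-1][0] for s in villages.values()]  # value 3 waves ago
actual = [gap_pairs(s, gap)[-1][1] for s in villages.values()]     # latest wave
rho = spearman(predicted, actual)
print("rank correlation at a", gap, "wave gap:", round(rho, 2))
```

Changing `gap` to match the real field-to-dashboard delay keeps the test honest: a model that looks accurate at a one-month gap may rank villages much less reliably at three or six.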

     

Caveat and next steps 

These results come from a specific context, rural southern Malawi, and may not apply elsewhere. Three priorities for learning-by-doing stand out: (i) expand this type of high-frequency data collection and test these methods in other contexts, including urban areas; (ii) integrate market price and health-facility data to strengthen predictions outside food security; and (iii) experiment with newly available features and foundation models to further improve predictive performance.

 

Bottom line

To flag acute risks to human capital and enable timely investments, we can now create ‘smarter surveys’: light, repeated panel surveys supplemented with satellite data. With that foundation, governments and partners can nowcast village-level food insecurity at useful accuracy up to six months after the last survey round and, as richer complementary data become available, work toward doing the same for ill health and disrupted schooling, buying time to act before human capital erodes.


Michael Weber

Senior Economist, Human Capital Project
