Published on Data Blog

geeLite: an R package for tracking remote sensing data locally

Turning satellite data into local insights with geeLite / Source: Kurbucz, M. T., & Andrée, B. P. J. (2025). Building and Managing Local Databases from Google Earth Engine with the geeLite R Package. Policy Research Working Paper; 11115. © World Bank

In the era of big data and cloud computing, Google Earth Engine (GEE) has become one of the most transformative platforms for accessing satellite imagery and performing geospatial analysis. By offering petabytes of Earth observation data and scalable processing power, GEE has empowered researchers, policymakers, and NGOs to monitor global trends, from climate change to deforestation, with unprecedented ease.

Yet, working with GEE can be challenging — especially for users without expertise in cloud computing. Managing large geospatial datasets, extracting time-series indicators, and keeping local records updated over time often requires complex coding and advanced technical skills. As the demand for accessible, reproducible, and offline-compatible workflows grows, so does the need for lightweight tools that simplify the process. geeLite was developed to meet this need. 

 

What is geeLite?

geeLite is an open-source R package designed to simplify the process of building, maintaining, and updating local databases of indicators derived from GEE. Built on top of the rgee package [2], it serves as a bridge between GEE and your local machine—requiring only minimal configuration while automating the rest. 

Whether you’re tracking vegetation trends in drought-prone areas, analyzing land cover change, or monitoring climate risk indicators for development projects, geeLite helps transform cloud-based satellite data into local, actionable insights. It streamlines both preprocessing and postprocessing steps, turning complex Earth observation workflows into reproducible pipelines. 

geeLite uses a hexagonal grid system (Uber’s H3) to generate consistent, multi-resolution spatial features and stores outputs in a lightweight SQLite database. It also supports large-scale data extraction via Google Drive, making it accessible, automatable, and offline-ready for users across disciplines.

 

Key features 

geeLite offers a set of practical features designed to support scalable, reproducible use of satellite data in R workflows. These include: 

  • Minimal configuration: A single JSON file defines your region(s), dataset(s), variables, resolution, and time range — no need for custom scripts or shapefiles. 

  • Authentication flexibility: Supports both browser-based and service account logins — suitable for personal use or unattended server-side workflows. 

  • Offline-ready storage: Outputs are saved in lightweight, portable SQLite databases that can be queried or shared without an internet connection. 

  • Hex-based spatial aggregation: Uses Uber’s H3 spatial grid system to divide regions into consistent hexagonal units, enabling multi-resolution analysis. 

  • Custom postprocessing support: Users can define transformation or feature engineering logic in external scripts and apply it automatically during data read-in. 

  • Drive mode for large-scale exports: For larger datasets, geeLite can export data in parallel via Google Drive before downloading and processing locally. 

  • Command-line automation: All major functions can be run from the CLI, making it easy to schedule updates or embed geeLite in broader pipelines. 


Why we built geeLite 

Across development sectors, from disaster preparedness to service delivery, the ability to track risks in real time has become not just a technical challenge but a strategic imperative. As global volatility increases and development gains become more fragile, decision-makers need faster, more granular, and more accessible data to guide their responses. Yet much of the current architecture of development data remains rooted in legacy systems: periodic surveys, expert assessments, and slow feedback loops that lag behind rapidly evolving crises. 

One area where this disconnect is especially stark — and where innovation is accelerating — is food and nutrition security. In crisis-affected regions, months can mean the difference between early intervention and widespread hunger. But traditional monitoring tools are often too slow or too limited in coverage to detect emerging risks before they escalate. Meanwhile, the drivers of food insecurity — conflict, displacement, climate shocks, and economic instability — are becoming more frequent, more intertwined, and more unpredictable. 

This changing landscape has catalyzed a shift toward real-time, observation-based monitoring systems powered by remote sensing and other high-frequency data. The goal is clear: replace static snapshots with dynamic surveillance systems that integrate diverse indicators — like market prices, rainfall anomalies, vegetation health, and human mobility — in near real time. These systems promise not only faster detection of risks but also more transparent and coordinated responses that can save lives and protect development gains. 


This shift is exemplified by the World Bank-led Joint Monitoring Report (JMR), a bimonthly, data-driven monitoring tool co-produced with FAO, WFP, UNICEF, WHO, and ACAPS. The JMR has emerged as a key innovation in the early detection of food and nutrition insecurity, combining a core set of quantitative indicators, including indicators derived from Earth Observation, to generate robust alerts ahead of crises.

And that’s where geeLite comes in. Designed as an R package for real-time monitoring of Earth Observation data via Google Earth Engine, geeLite was developed to accelerate such efforts, making it easier for analysts and policymakers to build, customize, and deploy remote-sensing-based monitoring applications across development domains.

 

Installing

You can install geeLite directly from GitHub using the following commands:

# Install from GitHub 
install.packages("devtools") 
devtools::install_github("mtkurbucz/geeLite") 
 
# Set up the required GEE Python environment 
geeLite::gee_install() 

Note: geeLite depends on the rgee package and requires a Conda environment with earthengine-api, ee_extra, and numpy Python packages.

 

Usage Example

Let’s walk through a real-world example using geeLite to track NDVI (Normalized Difference Vegetation Index) across Somalia and Yemen.

Image Workflow of geeLite and the folder structure of the generated database / Source: Kurbucz, M. T., & Andrée, B. P. J. (2025). Building and Managing Local Databases from Google Earth Engine with the geeLite R Package. Policy Research Working Paper; 11115. © World Bank

 

Configure the dataset

First, you define your region and desired dataset in a config file. For this example, we’re using the MODIS NDVI dataset and aggregating mean and standard deviation values over H3 hexagonal cells at resolution level 3 (~12,393 km² per hexagon).

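library(geeLite) 

# Create the configuration file for the database 
set_config( 
  path = "path/to/db",              # folder where the database is generated 
  regions = c("SO", "YE"),          # Somalia and Yemen 
  source = list( 
    "MODIS/061/MOD13A2" = list( 
      "NDVI" = c("mean", "sd")      # statistics aggregated per hexagon 
    ) 
  ), 
  resol = 3,                        # H3 resolution level 
  start = "2010-01-01"              # first date to collect 
)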
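To get a feel for what the resolution setting implies, the short sketch below estimates how many hexagons a region of a given size needs at different H3 resolutions. The average cell areas are rounded from the H3 documentation; the country areas are approximate figures used only for illustration, and the variable names are hypothetical. The real grid will contain more cells than this estimate, because hexagons along the borders only partly overlap the region.

```r
# Average H3 hexagon area per resolution level (km^2), rounded
h3_area_km2 <- c("2" = 86801.8, "3" = 12393.4, "4" = 1770.3, "5" = 252.9)

# Approximate land areas (km^2); rough figures for illustration only
region_area_km2 <- c(SO = 637657, YE = 527968)

# Rough lower bound on the number of cells needed at each resolution
approx_cells <- ceiling(sum(region_area_km2) / h3_area_km2)
approx_cells
```

Each step up in resolution shrinks the average cell area by roughly a factor of seven, so the number of cells (and the amount of data requested from GEE) grows by about the same factor.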

 

Build the database

Once configured, run the following to fetch data from GEE:

run_geelite(path = "path/to/db")

This command:

  • Authenticates your GEE account,

  • Divides the region into hexagonal cells,

  • Extracts NDVI values for each cell and date, 

  • Saves the results into a local SQLite database.

Modify the configuration (optional)

If the configuration file needs to change, you can use modify_config(). The call below replaces the NDVI standard deviation with the maximum NDVI value.

modify_config(path = "path/to/db", 
              keys = list( 
                c("source", "MODIS/061/MOD13A2", "NDVI") 
              ), 
              new_values = list( 
                c("mean", "max") 
              )) 
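The sketch below shows, with a plain R list, how a key path such as c("source", "MODIS/061/MOD13A2", "NDVI") points at one entry of a nested configuration. The cfg object is a hypothetical stand-in for the parsed configuration file, and the assignment only mimics the effect described above; it is not geeLite's internal implementation.

```r
# Hypothetical in-memory copy of the configuration (illustration only)
cfg <- list(
  regions = c("SO", "YE"),
  source  = list("MODIS/061/MOD13A2" = list(NDVI = c("mean", "sd"))),
  resol   = 3,
  start   = "2010-01-01"
)

# A key path walks down the nested list, one level per element
key <- c("source", "MODIS/061/MOD13A2", "NDVI")
old_stats <- cfg[[key]]

# Replacing the value at that path leaves the rest of the config untouched
cfg[[key]] <- c("mean", "max")
cfg[[key]]
```

Each element of keys is one such path, and the element of new_values at the same position is the value written to it, which is why the two lists must have the same length.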

 

Update the database (optional)

If you’ve updated the configuration or want to fetch new data, just run the same function you used to build the database:

run_geelite(path = "path/to/db")

 

Viewing the data in R

geeLite stores the results as a spatio-temporal database. You can access and analyze it like this:

# Load and aggregate data to monthly frequency 
db <- read_db(path = "path/to/db", freq = "month") 
names(db)  # Shows tables: grid + NDVI data

You’ll get:

  • db$grid: Spatial metadata with hex IDs and geometries.

  • db$`MODIS/061/MOD13A2/NDVI/mean`: Monthly mean of mean NDVI values per hexagon.

  • db$`MODIS/061/MOD13A2/NDVI/max`: Monthly mean of maximum NDVI values per hexagon. 
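The indicator tables hold one row per hexagon and one column per date, so time-series work is often easier after reshaping them to long format. The sketch below does this in base R on a small made-up table; ndvi_wide is a hypothetical stand-in for one of the tables returned by read_db(), and it assumes the date columns are named in YYYY-MM-DD form. Real tables may carry additional columns, which the pattern match simply ignores.

```r
# Hypothetical stand-in for one indicator table (values are made up)
ndvi_wide <- data.frame(
  id = c("hex_a", "hex_b"),
  `2020-01-01` = c(0.21, 0.35),
  `2020-02-01` = c(0.24, 0.33),
  `2020-03-01` = c(0.30, 0.31),
  check.names = FALSE
)

# Pick out the date columns by their YYYY-MM-DD names
date_cols <- grep("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", names(ndvi_wide), value = TRUE)

# Stack the date columns: one row per hexagon and date
ndvi_long <- data.frame(
  id    = rep(ndvi_wide$id, times = length(date_cols)),
  date  = as.Date(rep(date_cols, each = nrow(ndvi_wide))),
  value = unlist(ndvi_wide[date_cols], use.names = FALSE)
)

# Time series for a single hexagon
subset(ndvi_long, id == "hex_a")
```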

Visualization example

You can map any time slice using the leaflet package:

library(sf) 
library(leaflet) 
 
# Merge grid geometries with NDVI values by hexagon ID 
ndvi_sf <- merge(db$grid, db$`MODIS/061/MOD13A2/NDVI/mean`, by = "id") 
ndvi <- ndvi_sf$`2020-01-01` 
 
# Build the color palette once and reuse it for the polygons and the legend 
pal <- colorNumeric("viridis", domain = ndvi) 
 
# Plot 
leaflet(ndvi_sf) %>% 
  addTiles() %>% 
  addPolygons( 
    fillColor = pal(ndvi), 
    weight = 1, 
    color = "#333333", 
    fillOpacity = 0.9 
  ) %>% 
  addScaleBar() %>% 
  addLegend(pal = pal, values = ndvi, title = "NDVI")


Image Visualizing mean NDVI values in the generated dataset / Source: Kurbucz, M. T., & Andrée, B. P. J. (2025). Building and Managing Local Databases from Google Earth Engine with the geeLite R Package. Policy Research Working Paper; 11115. © World Bank.

 

Additional features

geeLite provides a range of additional features to support and streamline your analysis pipeline. For full details, please refer to [1].
 

Drive mode

To efficiently handle large data requests, geeLite offers a drive mode. In this mode, data are exported from Google Earth Engine to Google Drive in parallel batches, then automatically imported into your local SQLite database. Before using this mode, ensure that you have sufficient storage available in your Google Drive account.

# Collect and store data using drive mode 
run_geelite(path = "path/to/db", mode = "drive")


Automation with CLI

geeLite includes built-in support for command-line automation, making it especially useful for running on servers or scheduling updates via cron jobs. The main functions can be executed directly from the command line using CLI-friendly scripts automatically generated in the database’s CLI folder. For example:

# Setting the CLI files 
Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db" 
 
# Change directory to where the database will be generated 
cd "path/to/db" 
 
# Setting the configuration file 
Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01" 
 
# Collecting GEE data based on the configuration file 
Rscript cli/run_geelite.R 
 
# Modifying the configuration file 
Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))" 
 
# Updating the database based on the configuration file 
Rscript cli/run_geelite.R


 

Schedule it with cron on Linux:

 

# Monthly update with Crontab: 
(crontab -l 2>/dev/null; echo "0 0 1 * * Rscript cli/run_geelite.R") | crontab -


 

Resources

Citation

If you use geeLite in your research, please cite:

Kurbucz, M. T., & Andrée, B. P. J. (2025). Building and Managing Local Databases from Google Earth Engine with the geeLite R Package. Policy Research Working Paper; 11115. © World Bank. http://hdl.handle.net/10986/43165 License: CC BY 3.0 IGO.

References

[1] Kurbucz, M. T., & Andrée, B. P. J. (2025). Building and Managing Local Databases from Google Earth Engine with the geeLite R Package. Policy Research Working Paper; 11115. © World Bank. http://hdl.handle.net/10986/43165 License: CC BY 3.0 IGO.

[2] Aybar, C., Wu, Q., Bautista, L., Yali, R., & Barja, A. (2020). rgee: An R package for interacting with Google Earth Engine. Journal of Open Source Software, 5(51), 2272. 


Marcell T. Kurbucz

Research Fellow, Institute for Global Prosperity at University College London (UCL)

Bo Andree

Data Scientist, Development Data Group, World Bank
