Can we rely only on satellite imagery? How accurate are these results?
It is standard practice in classification studies (particularly academic ones) to assess accuracy from behind a computer. Analysts traditionally pick a random selection of points and visually compare the classified output against the raw imagery. However, these maps are meant to be left in the hands of local governments, not published in academic journals.
So, it’s important to learn how well the resulting maps reflect the reality on the ground.
Having used the algorithm to classify land cover in 10 secondary cities in Central America, we were determined to learn whether the buildings identified by the algorithm were in fact ‘industrial’ or ‘residential’. So the team packed their bags for San Isidro, Costa Rica and Santa Ana, El Salvador.
Upon arrival, each city was divided up into 100x100 meter blocks. Focusing primarily on the built-up environment, roughly 50 of those blocks were picked for validation. The image below shows the city of San Isidro with a 2km buffer circling around its central business district. The black boxes represent the validation sites the team visited.
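The block selection described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual procedure: it assumes a flat coordinate system in meters centered on the business district, and it draws the 50 blocks purely at random, whereas the team focused primarily on built-up areas. The function names `grid_blocks_in_buffer` and `sample_blocks` are hypothetical.

```python
import math
import random


def grid_blocks_in_buffer(radius_m=2000.0, block_m=100.0):
    """List the lower-left corners (x, y) of grid blocks inside a circular buffer.

    Coordinates are meters relative to the city center at (0, 0).
    A block is kept when its center point falls within the buffer radius.
    """
    n = int(math.ceil(radius_m / block_m))
    blocks = []
    for i in range(-n, n):
        for j in range(-n, n):
            center_x = (i + 0.5) * block_m
            center_y = (j + 0.5) * block_m
            if math.hypot(center_x, center_y) <= radius_m:
                blocks.append((i * block_m, j * block_m))
    return blocks


def sample_blocks(blocks, k=50, seed=0):
    """Pick k distinct blocks at random (assumption: simple random sampling)."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(blocks, k)


blocks = grid_blocks_in_buffer()
sites = sample_blocks(blocks)
print("%d candidate blocks, %d selected for field visits" % (len(blocks), len(sites)))
```

In practice the candidate list would first be filtered to blocks that the classification marks as built-up, and the coordinates would come from a projected GIS layer instead of a synthetic grid.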
Land cover validation: A sample of the 100m blocks that were picked for visits in San Isidro, Costa Rica. At each site, the semi-automated land cover classification map was compared to what the team observed on the ground, using laptops and the Waypoint mobile app (available for Android and iOS).
At each site, the team captured several photos and recorded whether the majority of the land was accurately reflected in the land cover map. Discussions with local government officials, taxi drivers, and police officers provided additional insight.
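Field notes of this kind can be tallied into a simple agreement rate per mapped class. The sketch below assumes each site visit is reduced to a pair (mapped class, whether the ground observation agreed); that record layout and the function names are assumptions, and the example records are made-up placeholders, not results from the study.

```python
from collections import defaultdict


def agreement_by_class(records):
    """Summarize field checks per mapped class.

    records: iterable of (mapped_class, matches_ground) pairs.
    Returns {mapped_class: (n_correct, n_visited, share_correct)}.
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [correct, visited]
    for mapped_class, matches_ground in records:
        counts[mapped_class][1] += 1
        if matches_ground:
            counts[mapped_class][0] += 1
    return {c: (ok, n, ok / float(n)) for c, (ok, n) in counts.items()}


def overall_agreement(records):
    """Share of all visited sites where the map matched the ground."""
    records = list(records)
    if not records:
        raise ValueError("no field records supplied")
    return sum(1 for _, matches in records if matches) / float(len(records))


# Placeholder records for illustration only (NOT data from the study).
example_records = [
    ("industrial", True),
    ("industrial", True),
    ("regular residential", True),
    ("regular residential", False),
    ("irregular residential", False),
]

per_class = agreement_by_class(example_records)
for name in sorted(per_class):
    ok, n, share = per_class[name]
    print("%s: %d of %d sites matched (%.0f%%)" % (name, ok, n, 100 * share))
print("overall agreement: %.0f%%" % (100 * overall_agreement(example_records)))
```

A per-class breakdown is more useful than a single overall figure here, since the classifier may do well on one building type and poorly on another.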
Three lessons learned:
- Lesson #1: While the imagery used in this study is slightly coarser than the 0.5m imagery visible in Google Earth, it was still granular enough to detect building sizes and distinguish between paved and unpaved roads. This is particularly important given the cost savings involved. Very high resolution imagery (0.5m) can cost upwards of 15 dollars per square kilometer, while the 1.5m resolution imagery used in this study cost close to 3 dollars per square kilometer.
- Lesson #2: In secondary cities in Latin America, where large slums are less pronounced, this classifier was much better at detecting commercial or industrial buildings than at distinguishing between irregular residential and regular residential areas. However, in larger cities like Nairobi and Dar es Salaam, where slums are larger and have distinct physical characteristics, the algorithm has a much easier time identifying irregular residential neighborhoods.
- Lesson #3: While the focus of this study was on the built-up environment, the team noticed that the classification accuracy for vegetation and bare soil was very high.
- High resolution satellite imagery (preferably with a resolution finer than 2m).
- A personal computer or laptop with enough computing power (at least 8GB of RAM).
- Install Python 2.7
- Download the open-source algorithm at: https://github.com/jgrss/spfeas
- Allocate 1-2 staff weeks per city for data processing and analysis.
- Leave some space for field work. It is always a good idea to validate and ground truth your results.
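For budgeting the imagery, the per-square-kilometer prices quoted in Lesson #1 can be turned into a rough estimate. The sketch below assumes the area of interest is the 2km circular buffer used in San Isidro and that cost scales linearly with area; it ignores any minimum order size or licensing terms a vendor might apply. The function name `imagery_cost` is hypothetical.

```python
import math


def imagery_cost(area_km2, price_per_km2):
    """Rough imagery cost, assuming price scales linearly with area."""
    return area_km2 * price_per_km2


buffer_radius_km = 2.0                        # buffer size mentioned in the text
area_km2 = math.pi * buffer_radius_km ** 2    # area of the circular buffer

cost_very_high_res = imagery_cost(area_km2, 15.0)  # ~0.5m imagery, USD per km2
cost_study_res = imagery_cost(area_km2, 3.0)       # ~1.5m imagery, USD per km2

print("Area of interest: %.1f km2" % area_km2)
print("0.5m imagery: about %.0f dollars" % cost_very_high_res)
print("1.5m imagery: about %.0f dollars" % cost_study_res)
print("Cost ratio: %.0fx" % (cost_very_high_res / cost_study_res))
```

Under these assumptions the coarser imagery costs one fifth as much for any area, so the saving grows in absolute terms with the size of the city.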
Acknowledgments. This work was made possible thanks to the generous contribution of the United Kingdom’s Department for International Development (DFID) and the Swiss State Secretariat for Economic Affairs through the Multi-Donor Trust Fund for Sustainable Urban Development. This work builds on the lessons learned from the Spatial Development of African Cities Project.