Global land cover monitoring and updating: big data challenges

Dainius Masiliūnas


The Copernicus Global Land Service (CGLS) aims to provide yearly updated land cover maps, including land cover fractions (i.e. the fraction of each land cover class within each pixel). It uses free and open source software (Python, R, GDAL) for all data processing steps. Our work in the project focuses on two aspects: land cover fraction mapping at the continental level (Africa) and yearly map updating.
To select the most appropriate method for land cover fraction mapping, we ran a number of machine learning algorithms on over 300 covariates: Proba-V image time series (100 m, 4 bands), a DEM, and soil and climate properties. Random Forest performed best, with RMSE = 16.6, MAE = 9.2, and 68±4% overall accuracy.
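As a reminder of how the two error metrics quoted above are defined, here is a minimal sketch in plain Python. The function names and the sample fractions are illustrative assumptions, not values or code from the project.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, observed):
    """Mean absolute error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)


# Hypothetical cover fractions (percent) of one class at three pixels.
predicted = [50.0, 30.0, 20.0]
observed = [40.0, 40.0, 20.0]
print(rmse(predicted, observed), mae(predicted, observed))
```

RMSE penalises large errors more heavily than MAE, so RMSE is always at least as large as MAE; a wide gap between the two suggests that a few pixels have large errors.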
To generate yearly map updates, we investigated different vegetation indices and break detection methods, as well as their scalability to big data (MODIS time series, 2009-2019). All change detection methods tended to overestimate change. The Proba-V Mission Exploitation Platform Spark cluster was used to run chunked jobs in parallel.
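The abstract does not say how the chunks were defined, so the following is only a sketch of one common approach: splitting a raster grid into fixed-size windows that can each be handed to a separate worker. The function name `chunk_windows` and the grid dimensions are hypothetical.

```python
def chunk_windows(width, height, chunk_size):
    """Yield (col_off, row_off, win_width, win_height) tiles covering a raster.

    Tiles on the right and bottom edges are clipped to the raster extent.
    """
    if width <= 0 or height <= 0 or chunk_size <= 0:
        raise ValueError("width, height and chunk_size must be positive")
    for row_off in range(0, height, chunk_size):
        for col_off in range(0, width, chunk_size):
            yield (
                col_off,
                row_off,
                min(chunk_size, width - col_off),
                min(chunk_size, height - row_off),
            )


# Hypothetical 250 x 100 pixel grid split into 100-pixel chunks.
for window in chunk_windows(250, 100, 100):
    print(window)
```

Because each window is independent of the others, the resulting list can be distributed across cluster workers, with each worker reading and processing only its own window of the time series.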
In the future, CGLS aims to move to Sentinel-1 and Sentinel-2 for land cover classification and to Landsat 5, 7 and 8 for change detection, to gain higher spatial and spectral resolution, but this will make the big data challenges even greater.