Author Archives: UmaiRabbani

About UmaiRabbani

I received my Ph.D. from Charles Sturt University, Australia, in 2014. At CSU's School of Environmental Science I worked mainly in the field of remote sensing and its applications in spatial hydrological modeling. Currently associated with the Space and Upper Atmosphere Research Commission (SUPARCO), I am working on spatial technologies in disaster management. My research interests include remote sensing, hydrological and environmental modeling, machine learning for image processing, high-performance computing for remote sensing, and structure-from-motion. However, the one ultimate objective of my life is to be able to pay tribute by releasing a cover of Pink Floyd's Meddle album.

Processing Terabytes of Satellite Imagery in Google Earth Engine: Crisis Response for 2015 Flood Season in Pakistan

Since the launch of Landsat-1 in the early ’70s, continuous observation of the globe from satellites has generated unprecedented volumes of remote sensing data. Spanning the last 40 years, the USGS Landsat program represents the longest-running record of the landscape of our planet. The change in its data distribution policy during the last decade has allowed earth scientists across the globe to benefit from this invaluable archive. Similarly, the MODIS instrument on board NASA’s Terra and Aqua satellites, with its daily global coverage, has been relentlessly generating data products since 2000.

Traditionally, processing and analysis of datasets spread across large spatial or temporal scales has been a bottleneck for large-scale environmental monitoring. Carrying out analysis and research on gigabytes of satellite data generated weekly over a period of 10-15 years (or more) poses a serious data handling and computational nightmare. In comes Google. A new project called the Google Earth Engine (GEE) attempts to solve just that! Google Earth Engine is a platform that brings together the enormous archive of current and historical satellite imagery and provides tools for visualization and analysis. This enables earth scientists everywhere to leapfrog the computational barriers to doing science with remote sensing. It allows EE trusted users to use Google’s extensive cloud computing resources to analyze and interpret satellite imagery. It can also be used through its API, available in both JavaScript and Python. Google provides a web-based IDE for the JavaScript API called the Earth Engine Playground. There is a wealth of algorithms available for image math, spatial filtering, calibration, geometry operations and machine learning tasks, and the list is growing.

The Google EE public catalog currently stores more than 5 petabytes of data in the form of 5+ million images across 200+ datasets, with 4000+ new images added every day. To name a few, the archive includes Landsat (raw, TOA, SR, composites, NDVI, etc.), MODIS daily products (NBAR, LST, NDVI, etc.), terrain (SRTM, GTOPO, NED), atmospheric datasets (NOAA NCEP, OMI, GFS), and land cover and other datasets (GlobCover, NLCD, WorldPOP, etc.). This rich treat of accessible datasets is enough to make any remote sensing scientist, enthusiast, developer or spatial data analyst salivate and drool!

In a discussion, Simon Ilyushchenko, an engineer on the Google Earth Engine team, mentioned, “We currently run daily ingestion for many of the datasets we host, including Landsat 7, Landsat 8, several MODIS products and a number of weather & climate datasets.” Discussing Google EE’s latest collaboration with European institutions to make Copernicus Sentinel-1 data available through Earth Engine, he added, “We are downloading all of the Sentinel-1 GRD data and have started ingesting it, but we’re still experimenting to determine what processing & calibration steps have to be applied to the data before it’ll be ready to use. We hope to stabilize this by Q4 and then we’ll make the whole collection available, with automatic daily updates. In addition to Sentinel-1, there are a number of other large datasets that we’re looking into including VIIRS, GOES and AVHRR, but we’re constantly adding smaller datasets. We generally decide which datasets to ingest based on user input and votes on our issue tracker, giving priority to those datasets that will be the most useful to the widest audience.”

Figure 1: A country-level pre-monsoon NDWI layer created using quality-pixel cloud-free composite (01-May-2015 to 30-June-2015). Snapshot from SACRED-SUPARCO's DisasterWatch platform.

Earth Engine became particularly useful for our team at SUPARCO’s Space Application Center for Response in Emergencies and Disasters (SACRED). This year, the runoff from monsoonal rains in Pakistan compounded the peak snow-melt flows in the Indus River, resulting in “High” to “Very High” flood levels in the lower Indus. The floods wreaked havoc in the upper catchments of the Indus River and its western tributaries, while subsequent riverine floods affected large swathes of land in the Indus floodplains. SACRED-SUPARCO’s DisasterWatch platform was used to share updated analysis and spatial information extracted from various satellite-based datasets and technologies. While DisasterWatch aims to provide the latest satellite-based information and analysis to disaster management stakeholders in the country, acquiring, processing and analyzing satellite data from myriad sources in near real time is not a trivial task. When working in crisis response with large chunks of data from multiple sources at varying resolutions, any time saved is invaluable. Therefore, we decided to take advantage of the EE platform and offload the entire workflows for open datasets (Landsat and MODIS) to EE. Using EE we were able, for example, to develop a quality-pixel cloud-free composite of Pakistan from the Landsat-8 pre-monsoon time series (01-May-2015 to 30-June-2015) and extract the river course and water bodies in a few seconds (Figure 1). Downloading and processing several gigabytes of scenes over such a large basin to arrive at the same result would have taken days on individual machines.
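To make the water-extraction step concrete, here is a minimal sketch of how an NDWI layer can be turned into a water mask. It is plain Python on tiny hand-made grids, not our Earth Engine script. It assumes the McFeeters form of the index, NDWI = (green - NIR) / (green + NIR), and a cut-off of 0.0; the post does not state which NDWI variant or threshold was used, and the reflectance values below are hypothetical.

```python
def ndwi(green, nir):
    """McFeeters NDWI for a single pixel: (green - nir) / (green + nir)."""
    total = green + nir
    if total == 0:
        return 0.0  # guard against division by zero on empty pixels
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Flag pixels whose NDWI exceeds the threshold (assumed value) as water."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Hypothetical 2x2 reflectance patches taken from a cloud-free composite.
green = [[0.10, 0.08], [0.12, 0.09]]
nir = [[0.03, 0.30], [0.04, 0.25]]

mask = water_mask(green, nir)
print(mask)
```

Water absorbs strongly in the near infrared, so pixels where green reflectance exceeds NIR reflectance come out positive and are flagged; vegetated or bare pixels come out negative.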

Being able to handle GBs and TBs of data with a few lines of code and compute results from them within minutes is a dream come true for remote sensing scientists. It takes the weight of heavy data preprocessing off your mind and frees you to generate ideas and better workflows. To get a feel for the scale of computational power at your fingertips, let us calculate how much Landsat-8 data would be required to generate a water occurrence density layer of Pakistan for the last 3 years, representing the periods between consecutive monsoon seasons. In this scenario, for each year, we need all Landsat-8 scenes over the region acquired between October and June; we then calculate TOA reflectances, mask cloudy pixels in all scenes using the quality band, and generate a composite using median values. These few steps would require more than twice the storage space of the raw data and many hours of computing for a single year. In short, a simple 3-year density map like the one shown in Figure 2 requires processing roughly 2 TB of Landsat-8 data over Pakistan. This workflow, when ported to EE, generates the desired results within minutes.
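The per-pixel logic of that workflow can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Earth Engine code behind Figure 2: the season labels, reflectance values and cloud flags are made up, water is flagged with a McFeeters NDWI above 0.0 (the post does not specify the water test), and "occurrence density" is taken to mean the fraction of seasons in which the pixel is water.

```python
from statistics import median


def clear_median(values, cloudy):
    """Median of observations not flagged cloudy; None if none are clear."""
    clear = [v for v, is_cloudy in zip(values, cloudy) if not is_cloudy]
    return median(clear) if clear else None


def is_water(green, nir, threshold=0.0):
    """Water test on composite values using NDWI (assumed variant and cut-off)."""
    if green is None or nir is None or green + nir == 0:
        return None  # no usable observation for this season
    return (green - nir) / (green + nir) > threshold


def water_occurrence(flags):
    """Fraction of seasons with valid data in which the pixel was water."""
    valid = [f for f in flags if f is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical time series for ONE pixel over three October-June seasons.
seasons = {
    "season_1": {"green": [0.10, 0.45, 0.11], "nir": [0.04, 0.40, 0.05],
                 "cloudy": [False, True, False]},
    "season_2": {"green": [0.09, 0.08, 0.10], "nir": [0.28, 0.30, 0.27],
                 "cloudy": [False, False, False]},
    "season_3": {"green": [0.12, 0.50], "nir": [0.03, 0.45],
                 "cloudy": [False, True]},
}

flags = []
for obs in seasons.values():
    # Step 1: mask cloudy observations, then take the median per band.
    green = clear_median(obs["green"], obs["cloudy"])
    nir = clear_median(obs["nir"], obs["cloudy"])
    # Step 2: classify the composite pixel as water or not.
    flags.append(is_water(green, nir))

# Step 3: occurrence density across seasons.
occurrence = water_occurrence(flags)
print(flags, occurrence)
```

Running the same three steps for every pixel of every scene over the country is what makes the job so heavy on a desktop machine.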

Figure 2: Water occurrence density map for years 2013 to 2015. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

During this year’s flooding, intense cloud cover began to limit our ability to use optical remote sensing effectively for emergency response. It became essential to integrate Synthetic Aperture Radar (SAR) data into the inundation analysis. Using Google Earth Engine, we were able to access calibrated backscatter from Sentinel-1 scenes over the flooded regions in the shortest possible time for flood detection. The Google EE team’s extended assistance in the timely ingestion of the required Sentinel-1 data for emergency response during the 2015 floods, while their Sentinel-1 ingestion was still in its development and experimental phase, was highly commendable. The traditional workflow would involve preprocessing each individual Sentinel-1 scene using the Sentinels Application Platform (SNAP). The EE team’s prompt support during the flood season aided near real-time analysis of multiple scenes over the Indus basin, leading to timely dissemination of detailed inundation information to flood managers across Pakistan (Figure 3). Their support saved all the time it would have taken to download gigabytes of Sentinel-1 data and process individual scenes. This enabled rapid inundation analysis using entire Sentinel-1 passes and information dissemination within a few hours.
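For readers new to SAR, one common way to detect open water in calibrated backscatter is simple thresholding: smooth water reflects the radar signal away from the sensor and appears very dark. The sketch below is plain Python on made-up values and is only an illustration of that idea; the pre-/post-event comparison and the -18 dB threshold are assumptions for the example, not the method or parameters used for Figure 3.

```python
import math


def to_db(sigma0_linear):
    """Convert linear backscatter (sigma nought) to decibels."""
    return 10.0 * math.log10(sigma0_linear)


def new_water(pre_db, post_db, threshold_db=-18.0):
    """Flag pixels that look like open water after the event but not before.

    threshold_db is a hypothetical cut-off; in practice it has to be tuned
    per scene, polarisation and incidence angle.
    """
    return [
        [post < threshold_db and pre >= threshold_db
         for pre, post in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre_db, post_db)
    ]


# Hypothetical 2x2 backscatter patches in dB, before and after the flood.
pre = [[-8.0, -20.0], [-9.5, -7.0]]
post = [[-19.5, -21.0], [-10.0, -22.0]]

flooded = new_water(pre, post)
print(flooded)
```

The pixel that was already dark before the event (a permanent water body, such as the river channel) is not counted as inundation, which is why the pre-event scene is included in the comparison.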

Figure 3: Detailed inundation in the Indus river derived using Sentinel-1 scenes. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

In short, the Google Earth Engine brings together over 4 decades of satellite imagery, updated daily, with scientific algorithms to analyze that data by harnessing the computational power of the Google cloud. With more and more datasets being made available, and algorithms being developed with the help of a growing community, the applications of this platform are immense. Bringing together datasets from multiple sources to solve scientific problems has never been this uncomplicated and effortless; what one can now create with GEE is limited only by what one can translate from the mind to the code.

About this post: This is a guest post by Dr. Umair Rabbani. Learn more about this blog’s authors here
