Processing Terabytes of Satellite Imagery in Google Earth Engine: Crisis Response for 2015 Flood Season in Pakistan

Since the launch of Landsat-1 in the early ’70s, continuous observation of the globe from satellites has generated unprecedented volumes of remote sensing data. Spanning the last 40 years, the USGS Landsat program represents the longest-running record of the landscape of our planet. The change in its data distribution policy during the last decade has allowed earth scientists across the globe to benefit from this invaluable archive. Similarly, the MODIS instruments on board NASA’s Terra and Aqua satellites, with their daily global coverage, have been relentlessly generating data products since 2000.

Traditionally, processing and analysis of datasets spread across large spatial or temporal scales has been a bottleneck for large-scale environmental monitoring. Carrying out analysis and research on gigabytes of satellite data generated weekly over a period of 10-15 years (or more) poses a serious data-handling and computational nightmare. In comes Google. A new project called the Google Earth Engine (GEE) attempts to solve just that! Google Earth Engine is a platform that brings together an enormous archive of current and historical satellite imagery, and provides tools for visualization and analysis. This enables earth scientists everywhere to leapfrog over the computational barriers to doing science with remote sensing. It allows trusted EE users to use Google’s extensive cloud computing resources to analyze and interpret satellite imagery. It can also be used through its API, available in both JavaScript and Python. Google provides a web-based IDE for the JavaScript API called the Earth Engine Playground. There is a wealth of algorithms available to perform image maths, spatial filtering, calibrations, geometry operations and machine learning tasks, and the list is growing.

The Google EE public catalog currently stores more than 5 petabytes of data in the form of 5+ million images across 200+ datasets, with 4000+ new images added every day. To name a few, the archive includes Landsat (raw, TOA, SR, composites, NDVI etc.), MODIS daily products (NBAR, LST, NDVI etc.), terrain (SRTM, GTOPO, NED), atmospheric datasets (NOAA NCEP, OMI, GFS), and land cover and other datasets (GlobCover, NLCD, WorldPop etc.). This rich treat of accessible datasets is enough to make any remote sensing scientist, enthusiast, developer and spatial data analyst salivate and drool!

In a discussion, Simon Ilyushchenko, an engineer on the Google Earth Engine team, mentioned, “We currently run daily ingestion for many of the datasets we host, including Landsat 7, Landsat 8, several MODIS products and a number of weather & climate datasets.” Discussing Google EE’s latest collaboration with European institutions to make Copernicus Sentinel-1 data available through Earth Engine, he added, “We are downloading all of the Sentinel-1 GRD data and have started ingesting it, but we’re still experimenting to determine what processing & calibration steps have to be applied to the data before it’ll be ready to use. We hope to stabilize this by Q4 and then we’ll make the whole collection available, with automatic daily updates. In addition to Sentinel-1, there are a number of other large datasets that we’re looking into including VIIRS, GOES and AVHRR, but we’re constantly adding smaller datasets. We generally decide which datasets to ingest based on user input and votes on our issue tracker, giving priority to those datasets that will be the most useful to the widest audience.”

Figure 1: A country-level pre-monsoon NDWI layer created using quality-pixel cloud-free composite (01-May-2015 to 30-June-2015). Snapshot from SACRED-SUPARCO's DisasterWatch platform.

Earth Engine became particularly useful for our team at SUPARCO’s Space Application Center for Response in Emergencies and Disasters (SACRED). This year, the runoff from monsoonal rains in Pakistan compounded the peak snow-melt flows in the Indus river, resulting in “High” to “Very High” flood levels in the lower Indus river. The floods wreaked havoc in the upper catchments of the Indus river and its western tributaries, while subsequent riverine floods affected large swathes of land in the Indus floodplains. SACRED-SUPARCO’s DisasterWatch platform was used to share updated analysis and spatial information extracted from various satellite-based datasets and technologies. While DisasterWatch aims to provide the latest satellite-based information and analysis to disaster management stakeholders in the country, the acquisition, processing and analysis of satellite data from myriad sources in near real time is not a trivial task. In crisis response, where we work with great chunks of data from multiple sources at varying resolutions, any time saved is invaluable. Therefore, we decided to take advantage of the EE platform and offload the entire work-flows for open datasets (Landsat and MODIS) to EE. Using EE we were able to develop, for example, a quality-pixel cloud-free composite of Pakistan from the Landsat-8 pre-monsoon time-series (01-May-2015 to 30-June-2015) and extract the river course and water bodies in a few seconds (Figure 1). Downloading and processing several gigabytes of scenes over such a large basin to come up with the same result would have taken days on individual machines.
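To make the NDWI layer in Figure 1 a little more concrete, here is a minimal sketch of the index itself. It assumes the common green/NIR formulation, NDWI = (Green - NIR) / (Green + NIR); the exact variant used for the figure is not stated here, and the function name and sample reflectance values are hypothetical. It runs on plain Python numbers rather than on the Earth Engine API.

```python
def ndwi(green, nir):
    """Return (green - nir) / (green + nir), or None when the sum is zero."""
    total = green + nir
    if total == 0:
        return None  # index is undefined for a pixel with no signal
    return (green - nir) / total


# Hypothetical TOA reflectance pairs (green, NIR) for three pixels.
pixels = {
    "river": (0.08, 0.02),
    "cropland": (0.10, 0.40),
    "bare soil": (0.20, 0.25),
}

for name, (green, nir) in pixels.items():
    # Under this formulation, open water tends towards positive values.
    print(name, round(ndwi(green, nir), 2))
```

In this formulation, water pixels come out positive because water reflects more in the green band than in NIR, while vegetation and soil come out negative.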

Being able to handle GBs and TBs of data with a few lines of code, and to compute results from them within minutes, is a dream come true for remote sensing scientists. It takes the weight of heavy data preprocessing off your mind and frees you to generate ideas and better work-flows. To get a feel for the scale of computation power at your fingertips, let us calculate how much Landsat-8 data would be required to generate a water occurrence density layer of Pakistan for the last 3 years that represents the periods between consecutive monsoon seasons. In this scenario, for each year, we need all Landsat-8 scenes over the region acquired between October and June; we then calculate TOA reflectances, mask cloudy pixels in all scenes using the quality band, and generate a composite using median values. These few steps would require more than twice the storage space of the raw data and many hours of computing for a single year. In short, a simple 3-year density map like the one shown in Figure 2 requires processing roughly 2 TB of Landsat-8 data over Pakistan. This work-flow, when ported to EE, generates the desired results within minutes.
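The per-year steps described above (mask cloudy pixels, take the per-pixel median, then count how often each pixel is water) can be sketched in plain Python. This is a toy illustration on small lists, not the Earth Engine code we ran; the function names, the sample values, and the idea of passing in ready-made yearly water masks are all assumptions made for the example.

```python
from statistics import median


def median_composite(scenes, cloud_masks):
    """Per-pixel median over scenes, ignoring observations flagged as cloudy.

    scenes: list of equally sized lists of pixel values (one list per scene).
    cloud_masks: same shape, True where the pixel is cloudy.
    Pixels with no clear observation come out as None.
    """
    composite = []
    for i in range(len(scenes[0])):
        clear = [s[i] for s, m in zip(scenes, cloud_masks) if not m[i]]
        composite.append(median(clear) if clear else None)
    return composite


def water_occurrence(yearly_water_masks):
    """Fraction of years (0.0 to 1.0) in which each pixel was flagged as water."""
    n_years = len(yearly_water_masks)
    n_pixels = len(yearly_water_masks[0])
    return [
        sum(year[i] for year in yearly_water_masks) / n_years
        for i in range(n_pixels)
    ]


# Three hypothetical scenes over three pixels; 0.90 values stand in for clouds.
scenes = [[0.30, 0.05, 0.90], [0.32, 0.04, 0.31], [0.90, 0.06, 0.29]]
clouds = [[False, False, True], [False, False, False], [True, False, False]]
print(median_composite(scenes, clouds))

# Hypothetical water masks for three years over two pixels.
print(water_occurrence([[True, False], [True, True], [False, False]]))
```

The median is a convenient choice for compositing because a few leftover bright (cloud) or dark (shadow) observations that slip past the mask do not drag the result the way a mean would.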

Figure 2: Water occurrence density map for years 2013 to 2015. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

During this year’s flooding, intense cloud cover started to limit our ability to use optical remote sensing effectively for emergency response. It became necessary to integrate Synthetic Aperture Radar (SAR) data into the inundation analysis. Using Google Earth Engine, we were able to access calibrated backscatter from Sentinel-1 scenes over the flooded regions in the shortest possible time for flood detection. The Google EE team’s extended assistance in the timely ingestion of the Sentinel-1 data required for emergency response during the 2015 floods, while their Sentinel-1 ingestion was still in its development and experimental phase, was highly commendable. A traditional work-flow would involve preprocessing each individual Sentinel-1 scene using the Sentinels Application Platform (SNAP). The Google EE team’s prompt support during the flood season aided near real-time analysis of multiple scenes over the Indus basin, leading to timely dissemination of detailed inundation information to flood managers across Pakistan (Figure 3). This support saved all the time it would have taken to download gigabytes of Sentinel-1 data and process individual scenes, and enabled rapid inundation analysis using entire Sentinel-1 passes, with information disseminated within a few hours.

Figure 3: Detailed inundation in the Indus river derived using Sentinel-1 scenes. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

In short, the Google Earth Engine brings together over 4 decades of satellite imagery that is updated daily, along with scientific algorithms to analyze that data by harnessing the computational power of the Google cloud. With more and more datasets being made available, and algorithms being developed with the help of a growing community, the applications of this platform are immense. Bringing together datasets from multiple sources to solve scientific problems has never been this uncomplicated and effortless; what one can now create with GEE is whatever one can translate from the mind into code.

About this post: This is a guest post by Dr. Umair Rabbani. Learn more about this blog’s authors here

High-Resolution Population Density Mapping by Facebook and DigitalGlobe

A few months back, we all read the news that Facebook had used satellite imagery to generate estimates of population density over different regions of the Earth. This task was accomplished by the Facebook Connectivity Lab, with the goal of identifying possible connectivity options for areas of high population density (urban areas) and low population density (rural areas). These connectivity options can range from Wi-Fi, cellular networks and satellite communication to laser communication via drones.

The Facebook Connectivity Lab found that current population density estimates from censuses are insufficient for this planning purpose, and resolved to make their own high spatial resolution population density estimates from satellite data. They took the computer vision techniques they had developed for face recognition and photo tagging suggestions, and applied the same algorithms to high-resolution satellite imagery (50 cm pixel size) from DigitalGlobe. DigitalGlobe’s Geospatial Big Data platform was made available to Facebook, along with their algorithms for mosaicking and atmospheric correction. The technical methodology employed by DigitalGlobe and the Facebook Connectivity Lab is detailed in this white paper by Facebook. DigitalGlobe’s high-resolution satellite data from the past 5 years or so (imagery from the WorldView and GeoEye satellites) was used, and only the cloud-free visible RGB bands. For cloudy imagery, third-party population data was used to fill in the gaps. On this big geospatial dataset from DigitalGlobe, the Facebook team analyzed 20 countries, 21.6 million square km, and 350 TB of imagery using convolutional neural networks. Their final dataset has 5 m resolution, particularly focusing on rural and remote areas, and improves over previous countrywide population density estimates by multiple orders of magnitude.



Augmented Reality Sandboxes

Scientists have put Microsoft / Xbox Kinect sensors to great use over the years. One of these uses is in simulation of physical effects in terms of geography and mapping, ranging from topography to water flow. By now, many of these augmented reality interactive sandboxes are in action.

One of the most popular of these sandboxes is the aptly-named Augmented Reality Sandbox built by the UC Davis W.M. Keck Center for Active Visualization in the Earth Sciences for an NSF-funded project on informal science education. Learn the latest updates, and keep up with developments on the project page here and here. This sandbox works with a Kinect 3D camera and a projector to project a real-time colored topographic map with contour lines onto the sand surface. GPU-based mathematical simulations govern the virtual water flow over the surface. This sandbox is already an interactive display at the University of California Davis Tahoe Environmental Research Center (TERC) and the Lawrence Hall of Science at the University of California, Berkeley, among many other places. There are some cool demo videos for this sandbox, depicting real-time water flow simulation with respect to topography and virtual dam failure simulation, among others.

See a detailed article on Wired about this sandbox here.

A company from the Czech Republic offers their SandyStation augmented reality sandbox. Two good lists and descriptions of other virtual reality sandboxes all over the world are available here and here.

The Definition of Research

Recently I came across an interview with the renowned Pakistani physicist and social activist Dr. Pervez Hoodbhoy in the MIT Technology Review Pakistan. Among the many interesting things discussed there, I found his definition of research the most interesting, and I am copying it below:

Research in any professional field — mathematics or physics, molecular biology or engineering, economics or archaeology — does not have a unique, precise definition. But a tentative, exploratory definition might be that research is the discovery of new and interesting phenomena, creation of concepts that have explanatory or predictive power, or the making of new and useful inventions and processes. In the world of science, the researcher must certainly do something original, not merely repeat what is already known. Just doing something for the first time is not good enough to qualify as research. So, for example, one does not do meaningful research by gathering all kinds of butterflies and listing the number caught of each kind in a particular place at a particular time, etc. Nor does it come from making standard measurements, substituting one material after the other just because “it’s not been done before.” That’s mere alchemy, i.e., pretty useless.



Mapping 3 Decades of Global Surface Water Occurrence with Landsat

Recently, I posted an analysis of Orbital Insight’s Global Water Reserves product, in which they use deep learning to automatically detect global surface water on a weekly to bi-weekly basis using Landsat images. In this post, I want to draw attention to work done by the European Commission’s Joint Research Centre (JRC), in which they used Google Earth Engine’s extensive Landsat archive to derive a global surface water occurrence map, along with probability and seasonality measures. They used Landsat 5, 7, and 8 for this study.

This work by JRC is of a much more scientific nature than Orbital Insight’s global water mapping, and it also makes it possible to study and analyze river dynamics and morphology. The study also reports some validation statistics.

See this amazing video of the talk on the study from the Google Earth Engine User Summit, October 2015. The slide deck is available here.

Other research groups are also working on similar solutions; see, for example, this news report about Amy Hudson at the University of Maryland trying to use GEE in a similar manner to analyse global surface water dynamics using Landsat.

Sentinel Delivers Postcards from Space


Sentinel Hub Postcard for Islamabad, Pakistan, May 2016.

The Sentinel Hub webpage has an interesting postcards app, which lets you create a quick postcard from the Sentinel-2 imagery archive. The image can be downloaded or shared directly from the webpage. It can be displayed in true color, enhanced color, or NDVI, along with some other satellite-measured parameters.

Orbital Insight’s Global Water Reserves: Automatic Detection of Water in Landsat Imagery using Deep Learning

A few months ago, an article in MIT Technology Review showed how Orbital Insight utilized deep learning to automatically monitor and analyze water levels over the whole world on a weekly basis utilising publicly available Landsat 7 / 8 imagery.

It is interesting to note that, on a basic level, detecting water in Landsat images is not too complex a problem: water is known to have a very weak reflectance in NIR, and very often Band 4 of Landsat 7 alone can clearly differentiate water from other land surface features.
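As a rough illustration of how basic this can be, here is a minimal sketch of a single-band NIR threshold on a tiny grid of reflectance values. The threshold of 0.05, the function names and the sample grid are hypothetical choices for the example, not values taken from Orbital Insight or from any operational product; the only fixed number is the 30 m Landsat multispectral pixel size used to convert the pixel count to an area.

```python
PIXEL_SIZE_M = 30  # Landsat multispectral pixel size in metres


def nir_water_mask(nir_rows, threshold=0.05):
    """Flag pixels whose NIR reflectance is below the (assumed) threshold."""
    return [[value < threshold for value in row] for row in nir_rows]


def water_area_km2(mask):
    """Area covered by flagged pixels, in square kilometres."""
    n_water = sum(cell for row in mask for cell in row)
    return n_water * PIXEL_SIZE_M ** 2 / 1e6


# Hypothetical 2 x 2 grid of NIR (Landsat 7 Band 4) reflectances.
nir = [[0.02, 0.30], [0.04, 0.25]]
mask = nir_water_mask(nir)
print(mask)
print(water_area_km2(mask))
```

A fixed threshold like this will also flag dark non-water pixels, which is why shadows are a known source of error in this kind of approach.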


Spectral reflection curve of water, soil and vegetation, overlaid with the spectral bands of Landsat 7. Source: http://www.seos-project.eu/modules/remotesensing/remotesensing-c01-p05.html

The amazing thing that Orbital Insight has done is to largely automate this whole processing, building a processing chain that utilizes the huge Landsat archive and runs on a weekly to bi-weekly basis. I’m sure there is a certain degree of accuracy in this, which I hope will be reported somewhere soon (maybe it has been already, but I have not come across it). Cloud shadows and mountain shadows can cause significant errors when detecting water in Landsat imagery. Orbital Insight is analyzing huge chunks of images, turning this practically into a big data problem, and automatically adjusting the algorithm for many images spread across time and all over the world is a big achievement, given the varying local conditions.

Learn more about Orbital Insight’s Global Water Reserves product here.

Measuring 3D Forest Structure through Radar Remote Sensing


Polarimetric L-band F-SAR image of the study site in southeastern Bavaria, Germany. The image is shown in false color: forest areas appear green, while surfaces with low vegetation are shown in blue / red. Image credit: DLR

Radar remote sensing can enable us to see and construct a full 3D view of forest structure and trees. In a joint research study conducted last year, NASA and DLR proved this concept in airborne flights over a test region in southeastern Bavaria, Germany, where both agencies flew their own airborne radar sensors over a period of a few days. NASA flew its well-known L-band UAVSAR sensor, while DLR flew its F-SAR system. The F-SAR system is unique in that it performs coincident radar imaging at L-, C-, and X-bands. Radar remote sensing analysts know well that lower frequencies like L-band can penetrate right down to the forest floor, C-band frequencies penetrate the canopy to some extent, while X-band frequencies are reflected from the top of the tree canopy. Utilizing these three frequencies simultaneously for forest imaging allows full 3D mapping of the forest, from the upper section of the forest crown, canopy and branches down to the under-canopy vegetation and forest floor.

See the DLR official press release for more info.


Example vertical profile of radar backscatter from F-SAR. Backscatter is scaled in shades of green. Solid green lines represent lidar-measured heights of forest floor and crown. Image credit: DLR

Many other research groups are also pursuing similar goals to measure forests in 3D using SAR remote sensing. One such technique which can be applied to both airborne and spaceborne SAR sensors is POLinSAR (Polarimetric Interferometric SAR).

The Finnish Geodetic Institute is leading a research effort to measure 3D forest structure using multiple active sensors, including SAR imagery from Sentinel-1, TerraSAR-X / TanDEM-X, and ALOS-2 PALSAR, along with optical satellite stereo imagery and Airborne Laser Scanning (ALS). Learn more about their research here and here.