Featured post

Processing Terabytes of Satellite Imagery in Google Earth Engine: Crisis Response for 2015 Flood Season in Pakistan

Since the launch of Landsat-1 in the early '70s, continuous observation of the globe from satellites has generated unprecedented volumes of remote sensing data. Spanning the last 40 years, the USGS Landsat program represents the longest running record of our planet's landscape. The change in its data distribution policy during the last decade has allowed earth scientists across the globe to benefit from this invaluable archive. Similarly, the daily global coverage of the MODIS instrument on board NASA's Terra and Aqua satellites has been relentlessly generating data products since 2000.

Traditionally, processing and analysis of datasets spread across large spatial or temporal scales has been a bottleneck for large-scale environmental monitoring. Carrying out analysis and research on gigabytes of satellite data generated weekly for a period of 10-15 years (or more) would pose a serious data handling and computational nightmare. In comes Google. A new project called Google Earth Engine (GEE) attempts to solve just that! Google Earth Engine is a platform that brings together an enormous archive of current and historical satellite imagery and provides tools for visualization and analysis. This enables earth scientists everywhere to leapfrog over the computational barriers to science with remote sensing. It allows trusted EE users to tap Google's extensive cloud computing resources to analyze and interpret satellite imagery, through APIs available in both JavaScript and Python; Google also provides a web-based IDE for the JavaScript API, called the Earth Engine Playground. There is a wealth of algorithms available to perform image maths, spatial filtering, calibrations, geometry operations and machine learning tasks, and the list is growing.

The Google EE public catalog currently stores more than 5 petabytes of data in the form of 5+ million images from 200+ datasets, adding 4000+ new images every day. To name a few, the archive includes Landsat (raw, TOA, SR, composites, NDVI etc.), MODIS daily products (NBAR, LST, NDVI etc.), terrain (SRTM, GTOPO, NED), atmospheric datasets (NOAA NCEP, OMI, GFS), and land cover and other datasets (GlobCover, NLCD, WorldPop etc.). Such a rich spread of accessible datasets is enough to make any remote sensing scientist, enthusiast, developer or spatial data analyst salivate!

In a discussion, Simon Ilyushchenko, an engineer on the Google Earth Engine team, mentioned, “We currently run daily ingestion for many of the datasets we host, including Landsat 7, Landsat 8, several MODIS products and a number of weather & climate datasets.” Discussing Google EE’s latest collaboration with European institutions to make Copernicus Sentinel-1 data available through Earth Engine, he added, “We are downloading all of the Sentinel-1 GRD data and have started ingesting it, but we’re still experimenting to determine what processing & calibration steps have to be applied to the data before it’ll be ready to use. We hope to stabilize this by Q4 and then we’ll make the whole collection available, with automatic daily updates. In addition to Sentinel-1, there are a number of other large datasets that we’re looking into including VIIRS, GOES and AVHRR, but we’re constantly adding smaller datasets. We generally decide which datasets to ingest based on user input and votes on our issue tracker, giving priority to those datasets that will be the most useful to the widest audience.”

Figure 1: A country-level pre-monsoon NDWI layer created using quality-pixel cloud-free composite (01-May-2015 to 30-June-2015). Snapshot from SACRED-SUPARCO's DisasterWatch platform.

Earth Engine became particularly useful for our team at SUPARCO's Space Application Center for Response in Emergencies and Disasters (SACRED). This year, runoff from monsoonal rains in Pakistan compounded the peak snowmelt flows in the Indus River, resulting in "High" to "Very High" flood levels in the lower Indus. The floods wreaked havoc in the upper catchments of the Indus River and its western tributaries, while subsequent riverine floods affected large swathes of land in the Indus floodplains. SACRED-SUPARCO's DisasterWatch platform was used to share updated analysis and spatial information extracted from various satellite-based datasets and technologies. While DisasterWatch aims to provide the latest satellite-based information and analysis to disaster management stakeholders in the country, the acquisition, processing and analysis of satellite data from myriad sources in near real time is not a trivial task. When working in crisis response with large volumes of data of varying resolutions from multiple sources, any time saved is invaluable. We therefore decided to take advantage of the EE platform and offload the entire work-flows for the open datasets (Landsat and MODIS) to EE. Using EE we were able, for example, to develop a quality-pixel cloud-free composite of Pakistan from a Landsat-8 pre-monsoon time series (01-May-2015 to 30-June-2015) and extract the river course and water bodies in a few seconds (Figure 1). Downloading and processing several gigabytes of scenes over such a large basin to arrive at the same result would have taken days on individual machines.
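
The NDWI layer in Figure 1 boils down to simple per-pixel band arithmetic on the green and near-infrared bands. As a hedged illustration of that arithmetic (a plain numpy sketch with made-up toy reflectances, not our actual EE script):

```python
import numpy as np

def ndwi(green, nir):
    """McFeeters NDWI: (Green - NIR) / (Green + NIR).
    Water pixels tend toward positive values; vegetation and dry land
    toward negative values."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (green - nir) / (green + nir)

def water_mask(green, nir, threshold=0.0):
    """Classify pixels as water where NDWI exceeds a threshold
    (0.0 is a common starting point; the operational cut-off is tuned)."""
    return ndwi(green, nir) > threshold

# Toy 2x2 reflectances: top row water-like, bottom row vegetation-like.
green = np.array([[0.30, 0.28], [0.10, 0.12]])
nir   = np.array([[0.05, 0.06], [0.40, 0.35]])
print(water_mask(green, nir))
```

In EE the same expression runs server-side over every Landsat-8 pixel in the composite, which is why the country-level layer comes back in seconds.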

Being able to handle gigabytes and terabytes of data with a few lines of code, and to compute results within minutes, is a dream come true for remote sensing scientists. It lifts the weight of heavy data preprocessing and frees you to focus on ideas and better work-flows. To get a feel for the scale of computational power under your fingertips, let us calculate how much Landsat-8 data would be required to generate a water occurrence density layer of Pakistan for the last 3 years, covering the periods between consecutive monsoon seasons. In this scenario, for each year we need all Landsat-8 scenes over the region acquired between October and June; we then calculate TOA reflectances, mask cloudy pixels in all scenes using the quality band, and generate a composite using median values. These few steps would require more than twice the storage space of the raw data and many hours of computing for a single year. In short, a simple density map of 3 years like the one shown in Figure 2 requires processing roughly 2 TB of Landsat-8 data over Pakistan. This work-flow, when ported to EE, generates the desired results within minutes.
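
The mask-then-median step described above, and the occurrence density itself, can be sketched in plain numpy (an illustration of the logic only; the toy arrays are made up and the real stacks are thousands of full scenes):

```python
import numpy as np

def cloudfree_median_composite(scenes, cloud_masks):
    """Per-pixel median across a time stack, ignoring cloudy pixels.
    scenes: (t, h, w) reflectance stack; cloud_masks: (t, h, w) boolean,
    True where the quality band flags cloud."""
    stack = np.where(cloud_masks, np.nan, scenes)  # drop cloudy observations
    return np.nanmedian(stack, axis=0)

def occurrence_density(water_masks):
    """Fraction of observations in which each pixel was detected as water."""
    return np.asarray(water_masks, dtype=float).mean(axis=0)

# Toy example: three scenes over a 1x2 area; one pixel is cloudy once.
scenes = np.array([[[0.2, 0.8]], [[0.4, 0.9]], [[0.3, 0.1]]])
clouds = np.array([[[False, False]], [[False, False]], [[False, True]]])
composite = cloudfree_median_composite(scenes, clouds)
```

The cloudy 0.1 value is excluded, so the second pixel's median is taken over the two clear observations only. The same per-pixel reductions are what EE parallelizes across its cluster.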

Figure 2: Water occurrence density map for years 2013 to 2015. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

During this year's flooding, intense cloud cover began to hamper our ability to use optical remote sensing for emergency response effectively, making it inevitable to integrate Synthetic Aperture Radar (SAR) data into the inundation analysis. Using Google Earth Engine, we were able to access calibrated backscatter from Sentinel-1 scenes over the flooded regions in the shortest possible time for flood detection. The Google EE team's assistance in the timely ingestion of the required Sentinel-1 data for emergency response during the 2015 flood, while their Sentinel-1 ingestion was still in its development and experimental phase, was highly commendable. A traditional work-flow would involve preprocessing each individual Sentinel-1 scene using the Sentinels Application Platform (SNAP). The EE team's prompt support during the flood season enabled near real-time analysis of multiple scenes over the Indus basin and the timely dissemination of detailed inundation maps to flood managers across Pakistan (Figure 3). It saved all the time it would otherwise have taken to download gigabytes of Sentinel-1 data and process individual scenes, enabling rapid inundation analysis over entire Sentinel-1 passes and information dissemination within a few hours.
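
For readers unfamiliar with SAR flood mapping: smooth open water reflects the radar pulse away from the sensor, so flooded pixels show up as very low backscatter regardless of cloud cover. A hedged sketch of the common thresholding approach on calibrated backscatter (the -18 dB cut-off here is purely illustrative, not our operational value):

```python
import numpy as np

def to_db(sigma0_linear):
    """Convert calibrated backscatter (sigma-nought) from linear power
    to decibels."""
    return 10.0 * np.log10(sigma0_linear)

def flood_mask(sigma0_linear, threshold_db=-18.0):
    """Flag likely open water: pixels whose backscatter falls below a
    dB threshold. Real work-flows add speckle filtering and terrain
    masking on top of this."""
    return to_db(sigma0_linear) < threshold_db

# Toy pixels: calm water, wet soil, rough vegetated land (linear power).
sigma0 = np.array([0.005, 0.02, 0.2])
print(to_db(sigma0))
```

Only the first pixel (about -23 dB) falls below the illustrative threshold; the other two stay above it.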

Figure 3: Detailed inundation in the Indus river derived using Sentinel-1 scenes. Snapshot from SACRED-SUPARCO's DisasterWatch platform.

In short, Google Earth Engine brings together over 4 decades of satellite imagery, updated daily, with scientific algorithms to analyze that data by harnessing the computational power of the Google cloud. With more and more datasets being made available, and algorithms being developed with the help of a growing community, the applications of this platform are immense. Bringing together datasets from multiple sources to solve scientific problems has never been this uncomplicated and effortless; with GEE, what one can create is limited only by what one can translate from mind into code.

About this post: This is a guest post by Dr. Umair Rabbani. Learn more about this blog’s authors here

DG Launches SpaceNet, Opening Access to Hi-Res Satellite Imagery for Deep Learning Research

DigitalGlobe has recently launched SpaceNet, an online repository of satellite imagery and associated training data for users to experiment with machine learning and deep learning algorithms. SpaceNet has been launched as a collaboration between DigitalGlobe, CosmiQ Works, and NVIDIA, and is available as a public dataset on Amazon Web Services (AWS). As a first step, SpaceNet contains DigitalGlobe's high resolution multispectral imagery from their premier WorldView-2 satellite, at its industry-leading full 8-band spectral resolution, together with over 200,000 curated building footprints across Rio de Janeiro, Brazil. This is unprecedented: never before has satellite imagery at such a high resolution (50 cm) been released publicly with building annotations. The released dataset contains over 7000 images over Rio de Janeiro. The satellite imagery is delivered in GeoTIFF format while the building footprints are in GeoJSON format.


True color WV-2 high resolution imagery sample from the SpaceNet repository, along with corresponding building footprints. Source: NVIDIA

According to SpaceNet:

This dataset is being made public to advance the development of algorithms to automatically extract geometric features such as roads, building footprints, and points of interest using satellite imagery.

Scripts are already cropping up on GitHub for manipulating and using the satellite imagery data on SpaceNet: see code examples from Development Seed here and from CosmiQ Works here. NVIDIA has also released a detailed case study of analyzing SpaceNet data using their Deep Learning GPU Training System (DIGITS) platform, demonstrating the power of GPU-based deep learning algorithms applied to high resolution satellite imagery. Application examples include detecting each building as a separate object and determining a bounding box around it, and semantic segmentation to partition the image into regions of pixels that share a common label, such as “building”, “forest”, “road”, or “water”.
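
As a small taste of working with the labels, here is a dependency-free sketch that extracts a bounding box from a GeoJSON-style Polygon footprint and scores a predicted box with intersection-over-union, the standard detection metric. (The feature layout follows the generic GeoJSON spec; SpaceNet's exact attribute schema is not assumed.)

```python
def bbox_of_footprint(feature):
    """Axis-aligned bounding box (minx, miny, maxx, maxy) of a GeoJSON
    Polygon building footprint (exterior ring only)."""
    ring = feature["geometry"]["coordinates"][0]
    xs = [x for x, y in ring]
    ys = [y for x, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))

def iou(a, b):
    """Intersection-over-union of two (minx, miny, maxx, maxy) boxes --
    the usual score for matching predicted building boxes to labels."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

# Hypothetical footprint: a 2x1 rectangular building.
footprint = {"geometry": {"type": "Polygon",
                          "coordinates": [[(0, 0), (2, 0), (2, 1), (0, 1), (0, 0)]]}}
print(bbox_of_footprint(footprint))
```

Real pipelines would read the footprints with a GeoJSON library and handle interior rings and MultiPolygons, but the scoring idea is the same.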

SpaceNet plans a massive increase in both images and labeled features to be made available over the platform in the future. Incidentally, the name SpaceNet is inspired by ImageNet, a similar database of images created to help spur early advancements in computer vision.

To read more about the launch of SpaceNet, see coverage on GISCafe, TechCrunch, MIT Technology Review, and Popular Science.

SpaceNet datasets can be accessed on AWS here.


MODIStsp: R Package for Analysing MODIS Time Series Data

An R package for automating the processing and analysis of MODIS Land Products raster datasets has recently been released by Lorenzo Busetto and Luigi Ranghetti at the Institute for Electromagnetic Sensing of the Environment, National Research Council of Italy (IREA-CNR). The package, called MODIStsp, is available for download on GitHub. It provides a user-friendly GUI, batch processing utilities, and access to the source code for user modification and customisation. MODIStsp can perform several preprocessing steps (e.g. download, mosaicking, reprojection and resizing) on MODIS products, along with on-the-fly computation of time series of spectral indices.

For more details and package release information, please visit Spatial Processing in R blog and see the paper published in Computers & Geosciences journal.


High-Resolution Population Density Mapping by Facebook and DigitalGlobe

A few months back, we all read the news that Facebook has utilized satellite imagery to generate estimates of population density over different regions of the Earth. This task was accomplished by the Facebook Connectivity Lab, with the goal of identifying possible connectivity options for areas of high population density (urban areas) and low population density (rural areas). These connectivity options range from Wi-Fi and cellular networks to satellite communication and even laser communication via drones.

The Facebook Connectivity Lab found that current population density estimates from censuses are insufficient for this planning purpose, and resolved to make their own high spatial resolution population density estimates from satellite data. They took the computer vision techniques developed for face recognition and photo tagging suggestions and applied the same algorithms to analyzing high-resolution satellite imagery (50 cm pixel size) from DigitalGlobe. DigitalGlobe's Geospatial Big Data platform was made available to Facebook, along with its algorithms for mosaicking and atmospheric correction. The technical methodology employed by DigitalGlobe and the Facebook Connectivity Lab is detailed in this white paper by Facebook. DigitalGlobe's high resolution satellite data from roughly the past 5 years (imagery from the high-resolution WorldView and GeoEye satellites) was utilized, using only the cloud-free visible RGB bands. For cloudy imagery, third-party population data was used to fill in the gaps. On this big geospatial dataset from DigitalGlobe, the Facebook team analyzed 20 countries, 21.6 million square km, and 350 TB of imagery using convolutional neural networks. The final dataset has 5 m resolution, focuses particularly on rural and remote areas, and improves the spatial resolution of previous countrywide population density estimates by multiple orders of magnitude.



Augmented Reality Sandboxes

Scientists have put Microsoft's Xbox Kinect sensors to great use over the years. One such use is the simulation of physical processes relevant to geography and mapping, ranging from topography to water flow. By now, many of these augmented reality interactive sandboxes are in action.

One of the most popular of these is the aptly-named Augmented Reality Sandbox, built by the UC Davis W.M. Keck Center for Active Visualization in the Earth Sciences for an NSF-funded project on informal science education. Find the latest updates and developments on the project pages here and here. This sandbox pairs a Kinect 3D camera with a projector to project a real-time colored topographic map with contour lines onto the sand surface, while GPU-based mathematical simulations govern the virtual water flow over the surface. The sandbox is already an interactive display at the University of California Davis Tahoe Environmental Research Center (TERC) and the Lawrence Hall of Science at University of California, Berkeley, among many other places. There are some cool demo videos for this sandbox, depicting real-time water flow simulation over the topography and a virtual dam failure simulation, among others.

See a detailed article on Wired about this sandbox here.

A company from the Czech Republic offers their SandyStation augmented reality sandbox. Two good lists and descriptions of other virtual reality sandboxes all over the world are available here and here.

The Definition of Research

Recently I came across an interview with the renowned Pakistani physicist and social activist Dr. Pervez Hoodbhoy in the MIT Technology Review Pakistan. Among the many interesting things discussed there, I found his definition of research the most interesting; I am copying it below:

Research in any professional field — mathematics or physics, molecular biology or engineering, economics or archaeology — does not have a unique, precise definition. But a tentative, exploratory definition might be that research is the discovery of new and interesting phenomena, creation of concepts that have explanatory or predictive power, or the making of new and useful inventions and processes. In the world of science, the researcher must certainly do something original, not merely repeat what is already known. Just doing something for the first time is not good enough to qualify as research. So, for example, one does not do meaningful research by gathering all kinds of butterflies and listing the number caught of each kind in a particular place at a particular time, etc. Nor does it come from making standard measurements, substituting one material after the other just because “it’s not been done before.” That’s mere alchemy, i.e., pretty useless.



Mapping 3 Decades of Global Surface Water Occurrence with Landsat

Recently, I posted an analysis of Orbital Insight's Global Water Reserves product, in which they use deep learning to automatically detect global surface water on a weekly to bi-weekly basis using Landsat images. In this post, I want to draw attention to work done by the European Commission's Joint Research Centre (JRC), which used Google Earth Engine's extensive Landsat archive to derive a global surface water occurrence map, along with probability and seasonality measures. Landsat 5, 7, and 8 were used for this study.

This work by the JRC is of a much more scientific nature than Orbital Insight's global water mapping, also enabling the study and analysis of river dynamics and morphology. The study reports validation statistics as well.

See this amazing talk video on the study from the Google Earth Engine User Summit, October 2015. The slide deck is available here.

Other research groups are also working on similar solutions; see, for example, this news report about Amy Hudson at the University of Maryland trying to use GEE in a similar manner to analyse global surface water dynamics using Landsat.

Sentinel Delivers Postcards from Space


Sentinel Hub Postcard for Islamabad, Pakistan, May 2016.

The Sentinel Hub webpage has an interesting postcards app, which lets you create a quick postcard from the Sentinel-2 imagery archive. The postcard can be displayed in true color, enhanced color, NDVI, or several other satellite-measured parameters, and can be downloaded or shared directly from the webpage.