martes, 3 de abril de 2018

Sentinel-2 imagery with Big Spatial Data software (Hadoop + HBase + GeoWave + GeoServer)


This tutorial shows results of publishing Sentinel-2 imagery with opensource Big Spatial Data software.

It continues this previous article focused on vector data. You can read it to learn about first steps with this BigData technology and to setup your own platform. Of course, you can read the fantastic GeoWave documentation.

Now we are going to use raster data, such as Sentinel-2 imagery (Level-2A) from Sentinel 2A and 2B satellites. The idea is to download and ingest Sentinel-2 data into GeoWave in order to offer new products processed from these images and finally publish them as OGC services.

Landsat, Sentinel-1 and Sentinel-2 data are available for anyone to use via many repositories; Open Access Hub, Sentinel-hub, Amazon S3, and so on. In most of them, only Sentinel-2 Level-1C (Top Of Atmosphere reflectance) imagery is freely accessible, onward processing from Level-1C to Level-2A (Bottom Of Atmosphere reflectance) is performed by the user using the Sentinel-2 Toolbox or similar. As you can know, this process is a CPU-RAM intensive task, and sometimes it takes many time too.

I know an interesting Sentinel-2 repository in France, Theia Land Data Center, that provides, among other things, Sentinel-2 Level-2A products transformed with MAJA processor. My idea is to download and directly consume these data in a GeoWave platform.

For that purpose I have developed an extension for GeoWave to directly ingest Theia data with similar tools that current Landast8 extension provides.


Sentinel-2 imagery


The Sentinel-2 mission (ESA) is a land monitoring constellation of two satellites that provide high resolution optical imagery and provide continuity for the current SPOT and Landsat missions.




The mission provides a global coverage of the Earth's land surface every 10 days with one satellite and 5 days with 2 satellites, making the data of great use in on-going studies. The satellites are equipped with the state-of-the-art MSI (Multispectral Imager) instrument, that offers high-resolution optical imagery.

Sentinel-2 delivers high-resolution optical images for land monitoring, emergency response and security services. The satellites carry a multispectral imager with a swath of 290 km. The imager provides a versatile set of 13 spectral bands spanning from the visible and near infrared to the shortwave infrared, featuring four spectral bands at 10 meter, six bands at 20 meter and three bands at 60 meter spatial resolution.

Learn more about Sentinel-2 data products.

Theia Land Data Center


The Theia Land Data Center is a French national inter-agency organization designed to foster the use of images issued from the space observation of land surfaces. Theia is offering scientific communities and public policy actors a broad range of images at different scales, methods and services.




The Sentinel-2 Level-2A data distributed by Theia (CNES) are based on Copernicus Sentinel-2 Level-1C data, but processed with MAJA processor.

Theia does not offer data of the whole world, but most of western Europe is covered:



It is adequate for my test, I am using it in my BigData platform to avoid the CPU-RAM intensive task of sen2cor tool by ESA, and also I am testing other fun ways.

Theia extension for GeoWave


We can download the Theia imagery with the great theia_download tool by Olivier Hagolle. Later, we would load the images with the available raster commands of GeoWave (Raster demo here).

But I have developed a new module for GeoWave to directly ingest the Theia Sentinel-2 Level-2A data. This extension works in a very similar way as current Landast8 extension does.

The module complements GeoWave commandline tools with direct access to Theia public imagery. To use, ensure the module is on the classpath for your geowave commandline tools and then you should have geowave theia options available to you. "analyze" and "download" are completely separate from storage within GeoWave. The "ingest" routines wrap download with the additional step of ingesting into GeoWave. "ingestraster" and "ingestvector" are fairly self-explanatory and ingest just wraps both in a single command so for all of the scenes and bands you have ingested into your grid coverage (raster) layer, you will have the vector layers of scenes and bands with associated metadata.

The following is the commandline usage help listing the set of available commands and options:

Usage: geowave theia [options] 
Commands: 
 analyze 
  Print out basic aggregate statistics for available Theia imagery. 
 download 
  Download Theia imagery to a local directory. 
 ingest 
  Ingest routine for locally downloading Theia imagery and ingesting
  ... 
 ingestraster 
  Ingest routine for locally downloading Theia imagery and ingesting
  ... 
 ingestvector 
  Ingest routine for searching Theia scenes that match certain
  filter...

Read full description of options here.

The Theia REST service only returns metadata of 500 records by request, if you do not obtain correct results, maybe you have to limit the response defining a more restrictive criteria. Although the CQL expression can contain many attributes, the module accepts a set of main filter fields to pre-filter server-side the data to consider.

The available parameters to filter are:
  • --collection: Product collection to fetch ('SENTINEL2', 'Snow'). 
  • --platform: Satellite ('SENTINEL2A', 'SENTINEL2B'). 
  • --location: Product location, 100 km Grid Square ID of the Military Grid Reference System (EX: 'T30TWM'). 
  • --startDate & --endDate: Date window to consider. 
  • --orbitNumber: Orbit number, 
  • --relativeOrbitNumber: Relative orbit number, 
  • --cql: CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), location (String), productIdentifier (String), productType (String), collection (String), platform (String), processingLevel (String), startDate (Date), quicklook (String), thumbnail (String), bands (String), resolution (int), cloudCover (int), snowCover (int), waterCover (int), orbitNumber (int), relativeOrbitNumber (int). 

Usage, config and ingest Theia data


First of all, remember, I have my GeoWave platform running with HBASE and Hadoop (Read this for first steps), but now also it contains my Theia extension.

I defined my own environment variable with a base command in order to execute GeoWave processes as comfortable as possible:

> set GEOWAVE=
   java -cp "%GEOWAVE_HOME%/geowave-deploy-0.9.7-SNAPSHOT-tools.jar"
   mil.nga.giat.geowave.core.cli.GeoWaveMain

Then, I can easily run commands typing %geowave% [...]. To check the GeoWave version:

> %geowave% --version

Now, let's me practice with a small spatial coverage, Navarra (North of Spain), the most wonderful place in the world :-)




It covers 4 product tiles (T30TWM, T30TWN, T30TXM and T30TXN), full Theia product description here. I am always going to set a basic spatial criteria to only work with my tiles of interest:

--cql "BBOX(shape, -1.8274, 42.3253, -1.6256, 42.4735)"

Typing the "analyze" command:

> %geowave% theia analyze 
  --cql "BBOX(shape,-1.8274,42.3253,-1.6256,42.4735)"


The module prints information of scene/products and their bands: 20 scenes for this request (Several Dates), 10 bands for each one of them:




A more precise criteria using a Date range:

> %geowave% theia analyze
  --startDate "2018-01-28"
  --endDate "2018-01-30"
  --cql "BBOX(shape,-1.8274,42.3253,-1.6256,42.4735)"


Ok, our four tiles for a Date:




Next step, download data. The Sentinel-2 Level-2A data distributed by Theia are based on Copernicus Sentinel-2 Level-1C data, which are subject to this pdf). To download you need an user that you can registrer here.

When you get your credentials (user and password) you will include them in the "download" command:

> %geowave% theia download
  --startDate "2018-01-28" 
  --endDate "2018-01-30" 
  --cql "BBOX(shape,-1.8274,42.3253,-1.6256,42.4735)" 
  --userident ? 
  --password ? 



You could have noticed the size of files to download, if "CloudCover" value is low, zip files are around 2gb. The process automatically unzips the content of each compressed file in order to be ingested in a later step.

Now, we can load data into GeoWave. We need to create two GeoWave stores for this ingest. "theia-raster-store" will store the raster images, and "theia-vector-store" will store the vector metadata associated with those images:

> %geowave% config addstore theia-raster-store 
  --gwNamespace geowave.theia-raster 
  -t hbase 
  --zookeeper localhost:2222 

> %geowave% config cpstore theia-raster-store theia-vector-store 
  --gwNamespace geowave.theia-vector


And some indexes:

> %geowave% config addindex -t spatial theia-spindex 
  --partitionStrategy ROUND_ROBIN 

> %geowave% config addindex -t spatial_temporal theia-dyindex 
  --partitionStrategy ROUND_ROBIN 
  --period DAY

Typing the "ingest" command to only load bands 'B4' and 'B8', each of them will go to an independent coverage layer:

> %geowave% theia ingest
  --startDate "2018-01-28"
  --endDate "2018-01-30"
  --cql
   "BBOX(shape,-1.8274,42.3253,-1.6256,42.4735)
    AND 
   (band='B4' OR band='B8')"
  --userident ?
  --password ?
  --retainimages
  --vectorstore theia-vector-store
  --vectorindex theia-spindex,theia-dyindex
  --pyramid
  --coverage theia_navarra_mosaic-${band}
  theia-raster-store
  theia-spindex 



Or we can ingest a multiband raster layer with R+G+B channels, as GeoWave raster demo documentation shows:

> %geowave% theia ingest 
  --startDate "2018-01-28" 
  --endDate "2018-01-30" 
  --cql 
   "BBOX(shape,-1.8274,42.3253,-1.6256,42.4735) 
    AND 
   (band='B2' OR band='B3' OR band='B4')" 
  --userident ? 
  --password ? 
  --retainimages 
  --vectorstore theia-vector-store 
  --vectorindex theia-spindex,theia-dyindex 
  --pyramid 
  --coverage theia_navarra_mosaic-rgb 
  theia-raster-store 
  theia-spindex


Use the "gs" command of GeoWave to register Data Stores and Layers in a GeoServer instance:

> %geowave% gs addlayer theia-raster-store --add ALL 
> %geowave% gs addlayer theia-vector-store --add ALL

Here our grid of envelopes:




We have got two new vector layers, "theia-scene" layer with all metadata per product or scene, and "theia-band" layer that adds information for each band. Depending on you ingested the imagery, you have got a raster layer per band, or a multiband raster layer (e.g. a mix of visible channels: Red, Green and Blue bands).

A GetMap request for the Band #8 ("geowave:theia_navarra_mosaic-B8" layer):




Great! we are working with these layers in GeoServer as usual. With bands #4 and #8 we can compose the NDVI index ( (B08-B04)/(B04+B08) ), SAVI index ( (1.5*(B08-B04))/(B04+B08+0.5) ), or use them in any WPS raster process, or for whatever you need.


Time series data


A very important attribute of the imagery is the acquisition date. The scenes are packed per day, and it is very useful provide a mechanism to fetch them by a specific Date. It is supposed we are going to load images for several days, namely, we are creating a mosaic of coverages. In GeoServer terminology, we are speaking about time-series data.

We can easily configure the time-dimension settings of ingested vector layers, but there is a problem here, how can we configure it for raster layers? As far I know, the GeoServer ImageMosaic plugin does not allow to directly associate a vector layer as source of attributes (time, elevation, etc) of another raster layer. This association is not possible in Geowave platform either, but the solution would go through extend the GeoWave GridCoverageReader class with the StructuredGridCoverage2DReader interface to link vector and raster layers.


Next?


F̶o̶r̶ ̶t̶h̶e̶ ̶t̶i̶m̶e̶ ̶b̶e̶i̶n̶g̶ ̶I̶ ̶p̶u̶b̶l̶i̶s̶h̶ ̶t̶h̶e̶ ̶f̶u̶l̶l̶ ̶T̶h̶e̶i̶a̶ ̶m̶o̶d̶u̶l̶e̶ ̶i̶n̶ ̶m̶y̶ ̶g̶i̶t̶h̶u̶b̶ ̶f̶o̶r̶k̶,̶ ̶a̶n̶d̶ ̶I̶ ̶o̶f̶f̶e̶r̶ ̶a̶ ̶p̶u̶l̶l̶ ̶r̶e̶q̶u̶e̶s̶t̶ ̶w̶i̶t̶h̶ ̶i̶t̶ ̶t̶o̶ ̶G̶e̶o̶W̶a̶v̶e̶ ̶p̶r̶o̶j̶e̶c̶t̶.̶ ̶I̶f̶ ̶I̶ ̶d̶i̶d̶ ̶n̶o̶t̶ ̶d̶e̶v̶e̶l̶o̶p̶ ̶a̶ ̶m̶e̶s̶s̶ ̶a̶n̶d̶ ̶I̶ ̶a̶m̶ ̶l̶u̶c̶k̶y̶ ̶:̶-̶)̶,̶ ̶t̶h̶e̶ ̶c̶o̶r̶e̶ ̶m̶a̶i̶n̶t̶a̶i̶n̶e̶r̶s̶ ̶c̶o̶u̶l̶d̶ ̶m̶e̶r̶g̶e̶ ̶i̶t̶. UPDATE: The code was merged, thanks GeoWave team!

I think there are very interesting areas to explore, add raster time series support to GeoWave (It is useful for other data sources, like Landsat8 data), develop a new module to ingest Sentinel-2 from Amazon (L2A data is available over EU Central region - Requester Pays buckets)... UPDATE: The code was merged.

lunes, 8 de enero de 2018

Fast Cesium terrain rendering with the new quantized-mesh output of the opensource CesiumTerrainBuilder tool

Cesium is an open-source JavaScript library for creating 3D globes and 2D maps in a web browser without a plugin. It uses WebGL for hardware-accelerated graphics, and is cross-platform, cross-browser, and tuned for dynamic-data visualization.
Cesium is developed by an open-source community, with support from the Cesium Consortium, founded by Analytical Graphics, Inc (AGI). and Bentley Systems.
Terrain support
Cesium supports streaming and visualizing global high-resolution terrain and water effects for oceans, lakes, and rivers. Mountain peaks, valleys, and other terrain features really show the benefit of a 3D globe compared to a 2D map.
Cesium supports several methods for requesting terrain using terrain providers. Most terrain providers use a REST interface over HTTP to request terrain tiles as height maps. Terrain providers differ based on how requests are formatted and how terrain data is organized. You can read details in this tutorial.
How can you add your own terrain data to Cesium? 
First, you need to check if you really need it. You have the option to stream high-resolution terrain data directly from the servers at AGI. It's free to use on public sites under the terms of use. If you want to host the terrain data on your own servers, AGI provides a commercial product - the STK Terrain Server.
But if you want to do it yourself, you have to follow reading next :-).
Cesium supports two terrain formats:
  1. heightmap
  2. quantized-mesh
Read this great article here for more details. About formats, it says:
  • The quantized-mesh format follows the same tile structure as heightmap tiles, but each tile is better optimised for large-scale terrain rendering. Instead of creating a dense, uniform triangle mesh in the browser, an irregular triangle mesh is pre-rendered for each tile. It's a better representation of the landscape, having less detail in flat areas while increasing the density in steep terrain. The mesh terrain is also more memory efficient and renders faster.
You can generate heightmap tiles with Cesium Terrain Builder, a great opensource command-line utility by Homme Zwaagstra at the GeoData Institute, University of Southampton.
Here you can see how much dense are the tiles using the heightmap format:
At the present time, this tool can't create tiles in quantized-mesh format. So I have added the quantized-mesh support to "ctb-tile" tool to create tiles in this lightweight format :-)
Briefly, ctb-tile tool provides two new features:
  • New quantized-mesh format ouput.
  • New metadata file output.
The format parameter ("-f" or "--output-format") provides the new "Mesh" option in order to output quantized-mesh tiles. It reads from a Digital Elevation Model (DEM) as usual, and applies the Chuncked LOD strategy by Thatcher Ulrich to convert the regular grid of each tile in an equivalent and simplified irregular mesh.
The process takes care of an input geometric error to safely merge each pair of triangles with its parent one if there is not a visible loss of quality. Also, you can define a factor of this geometric error using the new "--mesh-qfactor" parameter (1.0 by default) to apply your own error value. Larger values should mean minor quality.
How does ctb-tile calculate the geometric error of each tile? It uses a similar algorithm as Cesium chooses what level of tiles has to render. The error for level zero of a geodetic quadtree is around 77,067.34 meters:
This errors table is applied to decide if a triangle has to be written to output mesh or it has to be splitted to its next child ones. To do not collapse all tiles of lower levels, the process adds more triangles than geometric error indicates, otherwise we would get a squared globe.
And to finish, the new "-l" (or "--layer") option will allow you to create the "layer.json" file with all metadata information that describes the tileset. It fills the "available" json-attribute too, it seems mandatory to successfully load the terrain in Cesium.
Do you like this? You can freely test my github fork!. I proposed a pull request with this to CTB repository too, If owners agree, I would happy they merge it.