Stetl: Preparing Rich GML Data for deegree - The ETL Challenge


DESCRIPTION

Presentation given at the deegree Community Space, November 13, 2012, Bonn. More and more often we need to work with rich/complex GML. How can we tame this mess? The deegree WMS/WFS server is well suited to storing and serving rich GML, but how do we get our data in? What spatial ETL options do we have? This is essentially a tutorial on Stetl (pronounced "staddle"), with particular focus on deegree integration. NB: Stetl used to be called sETL...


Preparing Your Rich GML Data for deegree

-

the ETL Challenge

Just van den Broecke

deegree Community Space 2012, Bonn

November 13, 2012

www.justobjects.nl

About Me

Independent Open Source Geospatial Professional

Trailblazer OSGeo Dutch Local Chapter

Just van den Broecke

just@justobjects.nl

www.justobjects.nl

THE DUTCH KADASTER

GETS INSPIRED WITH

deegree day - Nov 16, 2010

THE DUTCH KADASTER

GETS INSPIRED WITH

deegree Community Space - Nov 13, 2012

50+ DATASETS

WMS/WFS/WCS/ATOM

METADATA

+

NL INSPIRE ACCESS POINT

PDOK - Open Source & Open Standards

Applications

OGC/ISO/INSPIRE Web Services

Storage

Conversion

Services

sETL

FME?

OSGeo - Bolsena - 2010

BOLSENA 2012

ALL OVER? (ALLES VORBEI?)


We have a Problem

The Rich GML Problem

Rich GML = Complex Mess

INSPIRE

Dutch National DSs

AFIS-ALKIS-ATKIS


The Streetname!

Complex Model

Transformations

Millions of Objects

10s of Millions of <Elements>

Multiple Transformation Steps

Solution is Spatial ETL

A.K.A.

Thank You for your

Attention!

But what about... FOSS? ... Stetl?

FOSS ETL - High Level

FOSS ETL - Lower Level

But Each Powerful by Itself

ogr2ogr

FOSS ETL - DIY ? (No!)

FOSS ETL - How to Combine?

[Figure: several tools, ogr2ogr among them, added together = ?]

FOSS ETL - Add Python to Equation

[Figure: the same tools, ogr2ogr among them, grouped in parentheses = ?]

[Figure: the same tools, ogr2ogr among them, grouped in parentheses = Stetl]

Stetl = Simple Streaming Spatial Speedy ETL

Process Chain

Input Filter Output gml

Filter
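The process chain idea (one input, one or more filters, one output) can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Stetl's actual classes: the `Chain` class, its attribute names, and the convention that a filter returns `None` to drop an item are all assumptions made for this sketch.

```python
from typing import Callable, Iterable, List


class Chain:
    """Hypothetical process chain: one input, any number of filters, one output."""

    def __init__(self, source: Iterable, filters: List[Callable], sink: Callable):
        self.source = source    # the Input: anything iterable
        self.filters = filters  # the Filters: applied in order
        self.sink = sink        # the Output: called once per surviving item

    def run(self) -> int:
        """Push every item through all filters; return the number written."""
        count = 0
        for item in self.source:
            for apply_filter in self.filters:
                item = apply_filter(item)
                if item is None:  # assumed convention: None drops the item
                    break
            if item is None:
                continue
            self.sink(item)
            count += 1
        return count


collected = []
chain = Chain(
    source=["<a/>", "", "<b/>"],
    filters=[str.strip, lambda s: s or None, str.upper],
    sink=collected.append,
)
written = chain.run()
```

Because each component only sees the item handed to it, filters can be reordered or swapped without touching the input or output.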

Stetl concepts

Speed: Streaming

Input Filter Output gml
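One way to picture streaming is incremental XML parsing: each feature is handed on as soon as it is complete, and then discarded, so the full document never sits in memory. The sketch below uses the standard library's `xml.etree.ElementTree.iterparse`; the helper name `stream_features` and the sample element names are made up for illustration, and this is not necessarily how Stetl implements it.

```python
import io
import xml.etree.ElementTree as ET


def stream_features(source, tag):
    """Yield each element named `tag` as soon as its end tag has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            elem.clear()  # drop the element's content once the consumer is done


# Tiny stand-in for a large GML file (element names are illustrative only).
sample = io.BytesIO(
    b"<collection>"
    b"<feature><name>A</name></feature>"
    b"<feature><name>B</name></feature>"
    b"</collection>"
)
names = [feature.findtext("name") for feature in stream_features(sample, "feature")]
```

The consumer must finish with each feature before asking for the next one, since the element is cleared when the generator resumes.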


Speed: Going Native

Input Filter Output gml

ogr2ogr sETL sETL

Native C Libs/Progs

Calls
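"Going native" can be read as delegating the heavy lifting to an existing native program rather than reimplementing it in Python. A minimal sketch, assuming `ogr2ogr` is installed and on the PATH: the function names and file paths are hypothetical, and only the basic `ogr2ogr -f <format> <dst> <src>` form of the command line is used.

```python
import shutil
import subprocess


def build_ogr2ogr_cmd(dst, src, fmt="ESRI Shapefile", extra=()):
    """Build the argument list for: ogr2ogr -f <fmt> <dst> <src> [extra...]."""
    return ["ogr2ogr", "-f", fmt, dst, src, *extra]


def run_ogr2ogr(dst, src, **kwargs):
    """Run ogr2ogr as a child process; raises if it is missing or fails."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH")
    subprocess.run(build_ogr2ogr_cmd(dst, src, **kwargs), check=True)


# Building the command does not require GDAL; the paths are placeholders.
cmd = build_ogr2ogr_cmd("out/cities.shp", "in/cities.gml")
```

Keeping command construction separate from execution makes the wrapper easy to inspect without GDAL present.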


Example: GML to PostGIS

Reader XMLSplitter ogr2ogr

gml
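The slide does not say what the XMLSplitter step does internally. One plausible reading, stated here as an assumption, is that it cuts a long feature stream into smaller batches so that each ogr2ogr call handles a manageable chunk. The helper name `split_batches` is hypothetical.

```python
from itertools import islice


def split_batches(items, size):
    """Yield consecutive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:  # input exhausted
            return
        yield batch


# Five stand-in "features" split into batches of two.
batches = list(split_batches(range(5), 2))
```

Each batch could then be written out and loaded separately, which keeps memory use bounded by the batch size.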


Example: INSPIRE Model Transform

ogr2ogr XSLT Writer gml


Example: deegree Store

ogr2ogr XSLT deegreeWriter


Process Chain - How?

Input Filters Output


Example: XML to Shape

The Source

Example: XML to Shape

First: XSLT Transform to GML

Example: XML to Shape

XMLInput XSLT ogr2ogr

Example: XML to Shape

The SETL Chain Config File

ProcessChain

Reader

XSLT

ogr2ogr
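The config file itself is not reproduced in this text, so the following is only a guess at its general shape: an INI-style file that names the chain and gives each component its own section. Every section name, option name, and file path below is hypothetical and should not be taken as the tool's actual schema. The sketch parses such a file with Python's standard `configparser`.

```python
import configparser

# Hypothetical chain config: a chain definition plus one section per component.
CONFIG = """
[etl]
chain = reader|xslt|ogr2ogr

[reader]
file = input/source.xml

[xslt]
script = source2gml.xsl

[ogr2ogr]
format = ESRI Shapefile
target = output/result.shp
"""


def load_chain(text):
    """Return the chain as an ordered list of (component name, options) pairs."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    names = parser["etl"]["chain"].split("|")
    return [(name, dict(parser[name])) for name in names]


steps = load_chain(CONFIG)
```

A declarative file like this lets a chain be changed (for example, a different output) without writing any code.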

Example Components

Input        Filters           Output
XMLFile      XSLT              GMLFile
ogr2gml      GMLSplitter       gml2ogr
LineStream   XMLValidator      WFS-T
deegree*     FeatureExtractor  deegree*
YourInput    YourFilter        YourOutput

Data Structures


✴ Components exchange Packet
✴ Packet contains data
✴ Data format:
    xml_line_stream
    etree_doc
    etree_feature_array
    xml_doc_as_string
    any
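A packet that carries data plus a format label could look roughly like the sketch below. The format names come from the slide; the `Packet` fields, the `end_of_stream` flag, and the `accepts` helper are assumptions made for illustration and may differ from the real implementation.

```python
from dataclasses import dataclass
from typing import Any

# Format names as listed on the slide.
FORMATS = {
    "xml_line_stream",
    "etree_doc",
    "etree_feature_array",
    "xml_doc_as_string",
    "any",
}


@dataclass
class Packet:
    """Hypothetical unit of exchange between two components."""

    data: Any = None
    format: str = "any"
    end_of_stream: bool = False  # assumed flag, not taken from the slides

    def __post_init__(self):
        if self.format not in FORMATS:
            raise ValueError(f"unknown data format: {self.format}")


def accepts(consumes: str, packet: Packet) -> bool:
    """True if a component consuming `consumes` can take this packet."""
    return "any" in (consumes, packet.format) or consumes == packet.format


packet = Packet(data="<doc/>", format="xml_doc_as_string")
```

Tagging each packet with its format allows a chain to be checked for mismatched neighbours before any data flows.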

deegree Integration


✴ Input
    DeegreeBlobstoreInput
✴ Output
    DeegreeBlobstoreOutput
    DeegreeFSLoaderOutput
    WFSTOutput

Cases

✴ INSPIRE Download Services
    publish to deegree store (WFS)
    GML files (for Atom Feed)
✴ National GML Datasets
    GML to PostGIS

Case: Dutch Addresses

Source <GML>

sETL sETL deegree

WFS

sETL INSPIRE <GML>

Atom Feed

Other

INSPIRE Addresses

Dutch Addresses + Buildings

Dutch Geocoder

deegree blobstore