Weather Station Data Publication at Irstea: an implementation Report

MIA network meeting, 14 October 2014, Montpellier.

Page 1: Weather Station Data Publication at Irstea: an implementation Report

www.irstea.fr

To better assert its missions, Cemagref becomes Irstea

Catherine ROUSSEY, Stephan BERNARD, Géraldine ANDRE, Oscar CORCHO, Gil DE SOUSA, Daniel BOFFETY, Jean-Pierre CHANET

October 13th 2014

Weather Station Data Publication at Irstea: an implementation Report

Thanks to Jean Paul CALBIMONT, the W3C SSN Working Group and the SSN reviewers

Page 2: Weather Station Data Publication at Irstea: an implementation Report

Outline

• Irstea needs

• a data provider

• From open data to linked open data

• State of the art about meteorological dataset publication

• Dataset

• Weather dataset from the Montoldre weather station

• CSV files

• Model the data, use standard vocabularies

• Semantic Sensor Network (SSN) ontology

• Networks of ontologies around SSN: SSN + GeoSPARQL + locn, SSN + AWS + Climate and Forecast, SSN + QU + Time

• Convert data to linked data representation

• Conclusion and Perspectives

Page 3: Weather Station Data Publication at Irstea: an implementation Report

Irstea: an environmental data provider

Irstea uses and provides several datasets.

Its teams belong to several environmental observatories.

• Database on avalanches

• BDOH, a database on hydrology: https://bdoh.irstea.fr/

• Data on soil pollution

Scientific data may be used by other public and research institutes.

Scientific data → open data (non-proprietary format) → linked open data (linked RDF)

Page 4: Weather Station Data Publication at Irstea: an implementation Report

What is Open Data?

Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.

• Availability and Access: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.

• Reuse and Redistribution: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets.

• Universal Participation: everyone must be able to use, reuse and redistribute - there should be no discrimination against fields of endeavour or against persons or groups.

Source: Open Data Handbook, http://opendatahandbook.org/en/what-is-open-data/

Page 5: Weather Station Data Publication at Irstea: an implementation Report

What is 5 star Open Data?

source: Tim Berners-Lee, http://5stardata.info/

Page 6: Weather Station Data Publication at Irstea: an implementation Report

How to build 5 star Open Data

1. Prepare Stakeholders

2. Select a dataset

3. Model the data.

4. Specify an appropriate open data license

5. Create good URIs for Linked Data

6. Use standard vocabularies

7. Convert data to a Linked Data representation

8. Provide machine access to data

9. Announce the new data sets on an authoritative domain

10. Recognize the social contract

Source: Hyland, B., Atemezing, G., & Villazón-Terrazas, B. (2014). Best Practices for Publishing Linked Data. W3C Working Group Note. http://www.w3.org/TR/ld-bp/

Page 7: Weather Station Data Publication at Irstea: an implementation Report

Linked Open Data cloud

An extension of the current Web… where data are given well-defined and explicitly represented meaning, … so that it can be shared and used by humans and machines, … better enabling them to work in cooperation.

And clear principles on how to publish data.

Page 8: Weather Station Data Publication at Irstea: an implementation Report

State of the Art: SSN FOR PUBLISHING METEOROLOGICAL DATA

Vocabularies used alongside SSN in each project (feature of interest, spatial, time):

• AEMET (Agencia Estatal de Meteorologia): AEMET, WGS84, Geobuddies, W3C Time

• Swiss Experiment project: SWEET, WGS84, QUDT

• ACORN-SAT (Australian Bureau of Meteorology): WGS84, UK Intervals, DUL, Data Cube

• SMEAR (Finnish Station for Measuring Ecosystem Atmosphere Relations): SWEET, Geoname, WGS84, DUL, Data Cube, Situation Theory

Page 9: Weather Station Data Publication at Irstea: an implementation Report

Irstea Weather Station MONTOLDRE

Montoldre, in central France

Vantage Pro 2 station from Davis Instruments

Sensors and the properties they measure:

• temperature sensor: outdoor temperature

• atmospheric pressure sensor: external pressure

• air humidity sensor: outdoor relative humidity

• weathervane: wind direction

• anemometer: wind speed

• rain gauge: precipitation quantity + precipitation rate

• solar radiation sensor: solar radiation

Measurements from 2010 to 2013, every 30 minutes

Conversion of CSV files

Page 10: Weather Station Data Publication at Irstea: an implementation Report

Irstea Weather Station

Page 11: Weather Station Data Publication at Irstea: an implementation Report

Semantic Sensor Network Ontology

Page 12: Weather Station Data Publication at Irstea: an implementation Report

Network of Ontologies

Semantic Sensor Network: the backbone

• Sensing Device: ontology for meteorological sensors (aws)

• Feature of Interest: Climate and Forecast (cf-feature + cf-property)

• Platform location: GeoSPARQL and Location Core Vocabulary (geosparql + locn)

• Observation: W3C Time Ontology (time)

• Observation value: Library of Quantity Kinds and Units (qu + dim)

• Dolce Ultra Light (dul)

Page 13: Weather Station Data Publication at Irstea: an implementation Report

Description of Weather Station SSN + LOCATION + GEOMETRY

What is a weather station?

It is both an ssn:Platform and an ssn:System.

• A Platform is not the set of software used to manage the sensor nodes. A Platform is an entity to which other entities can be attached.

Where is the weather station?

The location is always associated with a Platform individual.

• The WGS84 vocabulary does not distinguish between a spatial feature and its geometrical representation (a point). A spatial feature may have several geometrical representations depending on the scale (point, polygon, etc.).

Spatial queries: where are the sensors near "Clermont Ferrand"?
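A spatial query such as the one about Clermont Ferrand would normally be delegated to a GeoSPARQL-aware triple store. As a rough illustration of what "near" could mean, the sketch below uses a plain great-circle (haversine) distance in Python instead. The coordinates are approximate values assumed for illustration, not taken from the dataset, and the search radius is arbitrary.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Approximate coordinates (latitude, longitude), assumed for illustration only.
CLERMONT_FERRAND = (45.78, 3.09)
PLATFORMS = {"lesPalanquinsVP2_1": (46.33, 3.45)}  # rough position of the Montoldre station

def platforms_near(center, radius_km):
    """Return the names of the platforms within radius_km of center."""
    return sorted(
        name for name, (lat, lon) in PLATFORMS.items()
        if haversine_km(center[0], center[1], lat, lon) <= radius_km
    )

print(platforms_near(CLERMONT_FERRAND, 100.0))
```

A point-to-point distance only works because the platform is represented here by a single point; a polygon geometry would need a real spatial library or the triple store's own spatial functions.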

Page 14: Weather Station Data Publication at Irstea: an implementation Report

Description of Weather Station SSN + LOCATION + GEOMETRY

Page 15: Weather Station Data Publication at Irstea: an implementation Report

Description of sensors SSN + AWS + CF-PROPERTY

Which type of sensor?

• It is hard to find the specific type of sensor.

• The documentation is incomplete and not precise enough.

What type of phenomenon does the sensor observe?

The cf-property individuals are not declared as instances of the ssn:Property class. This is not a problem: the constraint on the ssn:observes property allows a reasoner to infer that these individuals are instances of ssn:Property.

Which station do the sensors belong to?

The property ssn:onPlatform should be used between a sensor and the weather station.

• Query: how many sensors are onPlatform lesPalanquinsVP2_1? No results.
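The inference mentioned on this slide can be pictured with a small Python sketch. It assumes the constraint on ssn:observes is an rdfs:range axiom pointing to ssn:Property, and applies the standard RDFS range rule to a toy set of triples. The sensor and property names are hypothetical examples, not identifiers from the dataset.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

def apply_range_rule(triples):
    """Return the triples plus every (object, rdf:type, class) entailed by rdfs:range."""
    ranges = {(s, o) for s, p, o in triples if p == RDFS_RANGE}
    inferred = set(triples)
    for s, p, o in triples:
        for prop, cls in ranges:
            if p == prop:
                # The object of a statement using prop belongs to the range class.
                inferred.add((o, RDF_TYPE, cls))
    return inferred

triples = {
    # Assumed axiom: the range of ssn:observes is ssn:Property.
    ("ssn:observes", RDFS_RANGE, "ssn:Property"),
    # Hypothetical sensor and observed property.
    ("ex:thermometer_1", "ssn:observes", "cf-property:air_temperature"),
}

entailed = apply_range_rule(triples)
print(("cf-property:air_temperature", RDF_TYPE, "ssn:Property") in entailed)
```

The type triple only appears after the rule has been applied. A store that runs without a reasoner would presumably need such triples to be materialized before loading.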

Page 16: Weather Station Data Publication at Irstea: an implementation Report

Description of Sensors

Page 17: Weather Station Data Publication at Irstea: an implementation Report

Description of Observation SSN (DUL) + CF-FEATURE +CF-PROPERTY+ QU

Observation describes the context of measurement.

Which sensor made the measurement?

What is measured?

What is the measured data?

What is the unit of the data?

• The dul properties and the qu properties are redundant: which ones should be used, and why?

• Lots of (blank) nodes between the observation and the data value

• It is hard to find a URI pattern for observations: at_Time_of_Plateform_Sensor_on_Property

A sensor (rain gauge) can observe several properties.
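The URI pattern at_Time_of_Plateform_Sensor_on_Property can be sketched as a small Python helper. The base URI, the timestamp format, and the sensor and property names below are assumptions made for illustration; only the ordering of the parts comes from the slide.

```python
from datetime import datetime

BASE = "http://example.org/weather/observation/"  # hypothetical base URI

def observation_uri(when, platform, sensor, prop):
    """Build at_<Time>_of_<Platform>_<Sensor>_on_<Property> under BASE."""
    # Colons are dropped from the timestamp; this format is an assumption.
    stamp = when.strftime("%Y-%m-%dT%H%M%S")
    return f"{BASE}at_{stamp}_of_{platform}_{sensor}_on_{prop}"

t = datetime(2012, 6, 1, 14, 30)
# Hypothetical sensor and property names for the rain gauge.
quantity_uri = observation_uri(t, "lesPalanquinsVP2_1", "rainGauge", "precipitationQuantity")
rate_uri = observation_uri(t, "lesPalanquinsVP2_1", "rainGauge", "precipitationRate")
print(quantity_uri)
print(rate_uri)
```

Including the property in the URI is what keeps the two rain gauge observations made at the same instant distinct.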

Page 18: Weather Station Data Publication at Irstea: an implementation Report

Description of Observation

Page 19: Weather Station Data Publication at Irstea: an implementation Report

Description of Observation SSN + TIME

Observation describes the context of measurement.

When was the measurement made?

A measurement can be an instantaneous event: temperature, pressure, humidity.

A measurement can be an interval event: precipitation quantity, precipitation rate, wind direction, wind speed, solar radiation.

• Lack of documentation (wind direction)

Aggregation queries:

Find the unusual days.

On which days is the average temperature above the expected monthly temperature?

Find the days when the farmer cannot go to work (too much precipitation or wind).

On which dates is the daily precipitation quantity above a threshold?
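The last aggregation query (daily precipitation above a threshold) can be sketched in plain Python over an in-memory list of observations. On the published dataset this would more naturally be a SPARQL aggregate query; the sample values and the threshold below are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

def wet_days(observations, threshold_mm):
    """Return the dates whose summed precipitation exceeds threshold_mm."""
    totals = defaultdict(float)
    for end_time, millimetres in observations:
        # Each interval observation is assigned to the day of its end time.
        totals[end_time.date()] += millimetres
    return sorted(day for day, total in totals.items() if total > threshold_mm)

# Invented sample: (end of the 30-minute interval, precipitation in mm).
sample = [
    (datetime(2012, 6, 1, 0, 30), 0.2),
    (datetime(2012, 6, 1, 1, 0), 6.0),
    (datetime(2012, 6, 1, 1, 30), 5.0),
    (datetime(2012, 6, 2, 0, 30), 0.4),
]

print(wet_days(sample, 10.0))
```

Summing is only meaningful for interval quantities such as precipitation; an instantaneous property like temperature would be averaged instead.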

Page 20: Weather Station Data Publication at Irstea: an implementation Report

Time Instant Observation

Page 21: Weather Station Data Publication at Irstea: an implementation Report

Time Interval Observation

Page 22: Weather Station Data Publication at Irstea: an implementation Report

Convert data to linked data representation

TRANSFORMATION FROM CSV TO RDF

• Timestamp and duration creation

• Wind direction conversion

• Split by month
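The three transformation steps can be sketched in Python as follows. This is not the authors' converter. The column names, the date format, the reading of each timestamp as the end of a 30-minute interval, and the idea that wind direction arrives as a 16-point compass label to be converted to degrees are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

STEP = timedelta(minutes=30)  # sampling period stated on the station slide
DURATION = "PT30M"            # the same period as an ISO 8601 duration

COMPASS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]

def compass_to_degrees(point):
    """Map a 16-point compass label to degrees clockwise from north."""
    return COMPASS.index(point) * 22.5

def convert(csv_text):
    """Group rows by month, adding interval bounds and wind direction in degrees."""
    by_month = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed columns: date (YYYY-MM-DD), time (HH:MM), wind_dir (compass label).
        end = datetime.strptime(row["date"] + " " + row["time"], "%Y-%m-%d %H:%M")
        by_month[end.strftime("%Y-%m")].append({
            "start": (end - STEP).isoformat(),
            "end": end.isoformat(),
            "duration": DURATION,
            "wind_deg": compass_to_degrees(row["wind_dir"]),
        })
    return dict(by_month)

# Invented sample rows.
SAMPLE = "date,time,wind_dir\n2012-01-31,23:30,NNE\n2012-02-01,00:00,SW\n"

for month, rows in convert(SAMPLE).items():
    print(month, rows)
```

Each monthly group could then be serialized to its own RDF file, which keeps individual files small enough to load and review.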

Page 23: Weather Station Data Publication at Irstea: an implementation Report

Provide Machine Access to Data DEMO

http://ontology.irstea.fr

select weather data

SPARQL endpoint

http://ontology.irstea.fr/weather/snorql/

RDF server: Jena Fuseki

No reasoner

Dataset

8 types of measurement × 48 measurements per day × 365 days × 4 years = 560,640 observations

9,300,000 triples
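The dataset size can be checked with a few lines of Python, which also give the average number of triples per observation. The per-observation average is derived here and is not stated on the slide.

```python
types_of_measurement = 8
per_day = 48          # one measurement every 30 minutes
days_per_year = 365   # the slide's figure; the leap day of 2012 is ignored
years = 4             # 2010 to 2013

observations = types_of_measurement * per_day * days_per_year * years
triples = 9_300_000   # rounded total reported on the slide

print(observations)                      # 560640
print(round(triples / observations, 1))  # 16.6
```

Roughly 16 to 17 triples per observation is consistent with the many intermediate nodes between an observation and its value noted earlier in the deck.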

Page 24: Weather Station Data Publication at Irstea: an implementation Report

Recommendations

• Find a set of ontologies that are built to be connected together

• Never create a new class, just reference existing classes from other ontologies

• Good URIs are not so easy

• Define a pattern (see Cool URIs)

• Create URIs for individuals with / only (#?)

• Avoid blank nodes so that the dataset can be browsed

• Review your dataset with several reviewers (ssn workshop)

Page 25: Weather Station Data Publication at Irstea: an implementation Report

Conclusion & Perspectives

Not so easy to do well!

Promote our dataset

• Find a suitable licence

• Publish it on Datahub

Use it as a benchmark to run aggregation queries

New dataset about hydrology

Query a dataset in French and in natural language

"One day to publish a dataset"?

OK, we did it in 6 months.

Page 26: Weather Station Data Publication at Irstea: an implementation Report

Thanks for your attention!

Page 27: Weather Station Data Publication at Irstea: an implementation Report

W3C Semantic Sensor Network Incubator Group: SSN-XG

SSN-XG: March 2009

41 participants from 16 organisations, including leading names in the fields of ontologies and sensor networks: CSIRO, Wright State University, OGC, DERI, OEG, Knoesis, etc.

Objectives:

• Propose a unified model of sensor data and metadata

• Survey the state of the art of existing sensor ontologies

• Propose methods for developing intelligent applications that work on sensor data

Result:

An ontology that integrates several existing ontologies, validated in projects.

Final Report, 28 June 2011

http://www.w3.org/2005/Incubator/ssn/XGR-ssn-20110628/

Page 28: Weather Station Data Publication at Irstea: an implementation Report

Semantic Sensor Network Ontology

OWL 2 format, available on the web and documented

(!!) Sensor-oriented only, compatible with OGC standards

Aligned with the upper-level ontology Dolce Ultra Light (DUL)

Eases integration with other ontologies

SSN is never used alone (!!): each application reuses only a subset of the ontology

Modular ontology based on design patterns

Import only the necessary parts

Eases the evolution of the ontology

Addresses several use cases (4)

Allows several levels of description

"Redundancy" is intentional and necessary

Semantic Sensor Network Ontology: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn

M. Compton et al. The SSN ontology of the W3C semantic sensor network incubator group. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 17, December 2012, pp. 25–32

Page 29: Weather Station Data Publication at Irstea: an implementation Report

Ontology Design Pattern: ODP SSO STIMULUS SENSOR OBSERVATION

A sensor is anything that observes

How does it sense?

What is sensed?

What senses?

Page 30: Weather Station Data Publication at Irstea: an implementation Report

Ontology Design Pattern: SSO in SSN STIMULUS SENSOR OBSERVATION

A sensor is anything that observes

How does it sense?

What is sensed?

What senses?

Page 31: Weather Station Data Publication at Irstea: an implementation Report

DUL and SSN