iMarine Catalogue of Services. Pasquale Pagano (CNR), iMarine Technical Director, pasquale.pagano@isti.cnr.it. iMarine data platform for collaborations, 7th March 2014, 09:00 – 17:30, Food and Agriculture Organization of the United Nations (FAO) Headquarters

iMarine catalogue of services

DESCRIPTION

iMarine solutions and benefits for communities.

Page 1: iMarine catalogue of services

iMarine Catalogue of Services

Pasquale Pagano (CNR), iMarine Technical Director, pasquale.pagano@isti.cnr.it

iMarine data platform for collaborations, 7th March 2014, 09:00 – 17:30

Food and Agriculture Organization of the United Nations (FAO) Headquarters

Page 2: iMarine catalogue of services

The Catalogue of Services

iMarine is exploiting a Hybrid Data Infrastructure combining over 500 software components into a coherent and centrally managed system of hardware, software, and data resources.

Page 3: iMarine catalogue of services

Born from the user needs

I need to host my applications in a secure and scalable environment

I need to maintain my database

I need to back up my data

I need to deliver my data to a set of known people

I need to analyse my big datasets

Page 4: iMarine catalogue of services

Born from the user needs

I need to manage and analyze biological and ecological data

I need to manage the full data life-cycle from import to validation, curation, harmonization and publication

I need to offer my team a powerful tool to manage code lists

I need to store and analyze geospatially explicit information

I want to offer a flexible sharing, storage, reporting, search and retrieval tool

Page 5: iMarine catalogue of services

Born from the user needs

I need to access authoritative biological and ecological data

I wish to simplify access to my geospatial data

I need to mash up statistical and biodiversity data

I need to reduce the data maintenance costs of my department

I need to validate my datasets and provide standard access to them

Page 6: iMarine catalogue of services

User Needs Analysis

• Needs
– Not isolated
– Not disconnected
– Not trivial

• Solutions
– Current, but with an eye to the future
– Designed for individuals but looking at the community

Page 7: iMarine catalogue of services

Capacities: Storage as Service

• Scalability and high availability

• Across sites

• ISO 19115/19139 Metadata

• Catalogue

• Open source RDBMS

• Up to 1 TB data

• Secure

• Fault-tolerant

• Replication

Virtual Workspace

Relational Databases

Large and Active data storage

Spatial Database

Page 8: iMarine catalogue of services

Capacities: Computing as Service

Hadoop

Statistical Manager

R clusters

• MapReduce

• Analysis/clustering/modeling

• Windows and Linux

1000 CPUs Currently Available

Page 9: iMarine catalogue of services

Applications

A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective

Management and interpretation of biological and ecological data in the environment

Complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools

Storage and interpretation of geospatially explicit information, including WPS processing

Flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities

Page 10: iMarine catalogue of services

Applications

A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective

• Occurrence and Taxonomic Data Discovery
• Occurrence Data Processing
• Species Distribution Modeling
• Species Distribution Maps Discovery
• Taxonomic Data Comparison
• Taxonomic Data Matching

• Code List Discovery
• Code List Management
• Statistical Engine
• Tabular Data Discovery
• Tabular Data Enrichment
• Tabular Data Management
• Tabular Data Processing

• Geospatial Data Discovery
• Geospatial Data Processing

• Enhanced Documents Management
• Fact-sheets Management
• Information Object Discovery
• Messaging
• Shared Workspace
• Social Networking Facilities

Page 11: iMarine catalogue of services

Features Clustering with StatsCube

Presence Points (FishBase + OBIS)

Density Based Clustering: DBSCAN (with outliers)

Other methods are also available …

K-Means

X-Means
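To make the clustering step concrete, here is a minimal pure-Python sketch of DBSCAN applied to a handful of presence points. It is not the StatsCube implementation: the coordinates, the `eps` and `min_pts` values, and the use of plain Euclidean distance on x/y pairs (rather than a geodesic distance on longitude/latitude) are all assumptions made for illustration.

```python
from math import hypot

NOISE = -1  # label given to outliers


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep growing
    return labels


# Hypothetical presence points: two dense groups and one isolated record.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

With these made-up inputs the two dense groups receive labels 0 and 1, and the isolated record is flagged as an outlier, which matches the "with outliers" note on the slide.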

Page 12: iMarine catalogue of services

Data Analysis with StatsCube

Import CodeLists

Validate Datasets

Analyse and Project

Page 13: iMarine catalogue of services

Ecological Modeling with BiolCube

Page 14: iMarine catalogue of services

Maps Comparison with GeosCube

FAO Eleutheronema tetradactylum VS AquaMaps Eleutheronema tetradactylum

MEAN=0.81
VARIANCE=0.02
NUMBER_OF_ERRORS=6691
NUMBER_OF_COMPARISONS=259200
ACCURACY=97.42
MAXIMUM_ERROR=1.0
MAXIMUM_ERROR_POINT=3005:363:1
COHENS_KAPPA=0.218
COHENS_KAPPA_CLASSIFICATION_LANDIS_KOCH=Fair
COHENS_KAPPA_CLASSIFICATION_FLEISS=Marginal
TREND=EXPANSION
RESOLUTION=0.5
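The reported figures can be sanity-checked with a few lines of arithmetic. The slide does not say how GeosCube derives them, so the sketch below rests on two assumptions: that the comparisons cover every cell of a global longitude/latitude grid at the stated resolution in degrees, and that accuracy is the percentage of compared cells that are not counted as errors. The function names are hypothetical; the kappa formula and the Landis and Koch bands are the standard textbook ones.

```python
def grid_cells(resolution_deg):
    """Cells in a global lon/lat grid (360 x 180 degrees) at this resolution."""
    return round(360 / resolution_deg) * round(180 / resolution_deg)


def accuracy_percent(errors, comparisons):
    """Assumed definition: share of compared cells not counted as errors."""
    return 100.0 * (comparisons - errors) / comparisons


def cohens_kappa(observed_agreement, expected_agreement):
    """Standard Cohen's kappa from observed and chance agreement (0..1)."""
    return (observed_agreement - expected_agreement) / (1.0 - expected_agreement)


def landis_koch(kappa):
    """Landis and Koch (1977) verbal scale for a kappa value."""
    if kappa < 0:
        return "Poor"
    bands = [(0.20, "Slight"), (0.40, "Fair"),
             (0.60, "Moderate"), (0.80, "Substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "Almost perfect"


cells = grid_cells(0.5)                                # 720 x 360 cells
accuracy = round(accuracy_percent(6691, 259200), 2)    # values from the slide
kappa_class = landis_koch(0.218)                       # kappa from the slide
```

Under these assumptions the numbers are mutually consistent: a 0.5 degree grid has 259200 cells, 6691 errors out of 259200 comparisons give 97.42, and a kappa of 0.218 falls in the "Fair" band. High accuracy alongside a low kappa is typical when most cells agree trivially (for example, both maps marking the species as absent), because kappa discounts agreement expected by chance.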

Page 15: iMarine catalogue of services

iMarine

Data: OBIS, WoRMS, WoRDS, GBIF, CoL, ITIS, IRMNG, NCBI, MyOcean, WOA, EuroStat, Data.FAO, …

iMarine Registries

Validation

Enriching

Processing

Sharing

Page 16: iMarine catalogue of services

Data

Biological and Ecological Data: DarwinCore / ISO19139
• >35 M Observations (OBIS)
• ≈ 120 K Observed Species (OBIS)
• ≈ 500 K Taxa (WoRMS)
• >600 K Scientific Names (ITIS)
• >12 K Species Maps (AquaMaps)
• ≈ 600 Species Extent (FAO)
• … FishBase, SeaLifeBase
• … CoL, GBIF

Statistical Data: SDMX *
• FAO CodeLists
• IRD CodeLists
• FAO datasets
• Eurostat
• …

GeoSpatial Data: ISO19139 (OGC W*S)
• 10 years Chemical and Physical variables in 2D space
• Ice concentration and velocity, Chlorophyll, Oxygen, Nitrate, Phosphate, Phytoplankton as carbon, Salinity, Temperature, …
• On-demand Chemical and Physical variables in 3D space
• Apparent Oxygen Utilization, Dissolved Oxygen, Salinity, Temperature, …

> 350 variables

Documents: OAI-PMH, OpenSearch
• FAO Factsheets
• Aquatic Commons
• Bioline International
• Biodiversity Heritage
• OceanDocs
• Nature, Pensoft Journals
• …

Ontologies and Data Warehouses: RDF, OWL
• FAO FLOD
• Marine Top Level Ontology
• IRD Ecoscope
• FactForge, Yago2
• …
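The document sources on this slide are listed as reachable through OAI-PMH. As a rough illustration of what a harvesting request looks like, the sketch below builds a `ListRecords` URL. The verb and the `metadataPrefix` and `set` arguments come from the OAI-PMH 2.0 protocol; the base URL, the set name, and the helper function are hypothetical, and the slide does not name any actual endpoint.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, not a real iMarine or FAO service.
url = oai_list_records_url("https://repository.example.org/oai")
url_with_set = oai_list_records_url("https://repository.example.org/oai",
                                    set_spec="fisheries")
```

A harvester would issue an HTTP GET on such a URL and page through the XML response using the resumption token the repository returns.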

Page 17: iMarine catalogue of services

Is this enough?

• An ecosystem of participatory data e-Infrastructures
• Regulated by policies
• Enabled by standards
• Promoting not only access but mash-up of heterogeneous data

User centric

Page 18: iMarine catalogue of services

Virtual Research Environment

iMarine is user-centric and workflow-oriented thanks to the gCube VRE technology

A Virtual Research Environment (VRE) is
• a distributed and dynamically created environment
• where a subset of data, services, computational, and storage resources
• regulated by tailored policies
• are assigned to a subset of users via interfaces
• for a limited timeframe
• at little or no cost for the providers of the participatory data e-infrastructures

L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12

Page 19: iMarine catalogue of services

iMarine Technology

• iMarine is powered by gCube

https://www.ohloh.net/p/gCube

Page 22: iMarine catalogue of services

iMarine e-infrastructure

iMarine is exploiting D4Science.org

Geographically Distributed Computing Infrastructure

Across administrative boundaries

Across private and commercial providers

Service Allocations, Deployment, Monitoring, and Operation

Uniform resource and data access

Operation: Built on SLAs

Support monitoring, auditing, reporting, and notification

Trust: Privacy, governance, and attribution

Security, trusted network

Page 23: iMarine catalogue of services

Landscape

D4Science e-Infrastructure

gCube Framework

gCube Apps

Discussion

www.i-marine.eu

i-marine.d4science.org

Page 24: iMarine catalogue of services

Google Analytics iMarine portal