Health-e-Child: A Grid Platform for European Paediatrics. On behalf of the Health-e-Child Consortium (& with thanks to David Manset, Maat-G). CHEP07, 3rd September 2007, Victoria, CA. Richard McClatchey, UWE-Bristol UK


Page 1: CHEP07 3 rd  September 2007, Victoria, CA

Health-e-Child: A Grid Platform for European Paediatrics

On behalf of the Health-e-Child Consortium (& with thanks to David Manset, Maat-G)

CHEP07 3rd September 2007, Victoria, CA

Richard McClatchey, UWE-Bristol UK

Page 2: CHEP07 3 rd  September 2007, Victoria, CA

Contents

• Project Objectives & Challenges
• Health-e-Child Gateway
• The HeC Grid architecture
• Data Integration & Ontologies
• Futures
• Conclusions

Page 3: CHEP07 3 rd  September 2007, Victoria, CA

Motivation for the Project

• Clinical demand for integration and exploitation of heterogeneous biomedical information
  - vertical dimension: multiple data sources
  - horizontal dimension: multiple sites
• Need for generic and scalable platforms (Grid?)
  - integrate traditional and emerging sources (in vivo and in vitro)
  - provide decision support
  - ubiquitous access to knowledge repositories in clinical routine
  - connect stakeholders in clinical research
• Need for complex integrated disease models
  - build holistic views of the human body
  - early disease detection exploiting in vitro information
  - personalized diagnosis, therapy and follow-up

Page 4: CHEP07 3 rd  September 2007, Victoria, CA

Objectives of Health-e-Child

• Build enabling tools & services that improve the quality of care and reduce cost with
  - Integrated disease models
  - Database-guided decision support systems
  - Cross-modality information fusion and data mining for knowledge discovery
• Establish multi-site, vertical, and longitudinal integration of data, information and knowledge
• Develop a GRID-based platform, supported by robust search, optimisation and matching

[Figure: "Healthy Child" diagram. Labels: Decision Support Systems, Integrated Disease Modeling, Knowledge Discovery (Augment, Guidance, Enrich, Real-time alert, On-line learning); Observation Process and Sensors; data sources: Imaging, Genomics, Lab Data, Proteomics, Demographics, Physician Notes, Life Style; Vertical Data Integration over Time across the levels Population, Individual, Organ, Tissue, Cell, Molecule; Integrated Medical Database]

Page 5: CHEP07 3 rd  September 2007, Victoria, CA

A Geographically Distributed Environment

Introduction

[Figure: map of the consortium partners, each marked as a Clinical Site or an R&D Site: GOSH, NECKER, UWE, CERN, IGG, SIEMENS, ASPER, UOA, INRIA, LYNKEUS, UCL, EGF, FGG, MAAT]

Page 6: CHEP07 3 rd  September 2007, Victoria, CA

Applications Integration Challenge

Introduction

[Figure labels: IGG, NECKER, GOSH]

Highlights

- Different Networks: LANs, WANs, Internet
- Security Constraints: Local & National Regulations
- Bandwidth Limitations: LAN/WAN & Internet uplinks …

Page 8: CHEP07 3 rd  September 2007, Victoria, CA

HeC System Overview

• Grid Infrastructure: databases, resource and user management, data security
• HeC Gateway: HeC specific models and Grid services like query processing, security
• Applications: Heart Disease Applications, Inflammatory Diseases Applications, Brain Tumour Applications
• Common Client Applications: user interface for authentication, viewing, editing, similarity search

Page 9: CHEP07 3 rd  September 2007, Victoria, CA

Health-e-Child Technical Review Meeting, Paris, 17 January 2007

Data Management & Integration

The Health-e-Child Gateway

• Our building blocks are Services
• Our Gateway is a SOA
• Existing blocks are available in the community
  - WSRF Containers: Globus Toolkit 4, Tomcat, WSRF::Lite…
  - Data Access Layers: OGSA-DAI, AMGA…
  - File Transfer facilities: GridFTP…
• Applies to other distributed systems
• Our approach is to efficiently reuse and combine blocks to satisfy our requirements
• Building blocks might be enhanced, e.g. HeC Authentication Service

Page 10: CHEP07 3 rd  September 2007, Victoria, CA

Gateway

The HeC Gateway

• An intermediary access layer to decouple client applications from the complexity of the grid
• Towards a platform-independent implementation
• To add domain-specific functionality not available in Grid middleware

Status
√ SOA architecture and design
√ Implementation of privacy and security modules
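The decoupling idea above can be sketched in a few lines of Python. This is an illustration only, not the HeC implementation: the `GridBackend` interface, the `InMemoryBackend` stand-in and the `Gateway` method names are all hypothetical, chosen to show clients talking to one access layer while the middleware behind it stays replaceable.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class GridBackend(Protocol):
    """What the gateway needs from any grid middleware (hypothetical interface)."""

    def store(self, name: str, data: bytes) -> str: ...
    def fetch(self, handle: str) -> bytes: ...


@dataclass
class InMemoryBackend:
    """Stand-in for real middleware, so the sketch runs without a grid."""

    prefix: str = "mem"
    _files: dict[str, bytes] = field(default_factory=dict)

    def store(self, name: str, data: bytes) -> str:
        handle = f"{self.prefix}://{name}"
        self._files[handle] = data
        return handle

    def fetch(self, handle: str) -> bytes:
        return self._files[handle]


class Gateway:
    """Single access point: clients call this, never the middleware directly."""

    def __init__(self, backend: GridBackend, authorised_users: set[str]) -> None:
        self._backend = backend
        self._authorised = authorised_users

    def _check(self, user: str) -> None:
        # Domain-specific functionality layered on top of the middleware.
        if user not in self._authorised:
            raise PermissionError(f"{user} is not registered at this gateway")

    def upload(self, user: str, name: str, data: bytes) -> str:
        self._check(user)
        return self._backend.store(name, data)

    def download(self, user: str, handle: str) -> bytes:
        self._check(user)
        return self._backend.fetch(handle)


gateway = Gateway(InMemoryBackend(), authorised_users={"dr_smith"})
handle = gateway.upload("dr_smith", "scan-001", b"image bytes")
```

Swapping `InMemoryBackend` for a class that wraps real middleware would leave client code unchanged, which is the platform independence the slide aims for.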

Page 11: CHEP07 3 rd  September 2007, Victoria, CA

Our Approach

[Figure labels: User, Workstation, Gateway, Hospital X]

Highlights

- Simplicity, “all-in-one box”
- Modularity & Scalability, “off-the-shelf” components
- State-of-the-art Approaches

One Access Point per institution

One Key to enter the system

Page 12: CHEP07 3 rd  September 2007, Victoria, CA

Platform & Grid Middleware

The Health-e-Child Access Point

[Figure: one Access Point with virtualization. Labels: Hosting Domain 1 (Stable & Secure Environment), Hosting Domain 2 (Stable & Secure Environment); Security & Registration, Job Management, HeC Gateway, Storage, HeC Scheduling, Monitoring, Info System; Computation Unit, Data Unit, HeC DBMS; sizes 200GB, 50GB, 1TB; software: Health-e-Child, EGEE gLite, HeC Gateway]

Page 13: CHEP07 3 rd  September 2007, Victoria, CA

The Platform

The Health-e-Child Gateway (1)

Inside the box…

[Figure labels: Client Applications, One Access Point, HeC Gateway, Computing Resources, Functionality Access, Infrastructure Abstraction]

Page 14: CHEP07 3 rd  September 2007, Victoria, CA

The Platform

The Health-e-Child Gateway (2)

Inside the box…

[Figure labels: Security, GT4, One Access Point, HeC Gateway]

Page 15: CHEP07 3 rd  September 2007, Victoria, CA

Platform & Grid Middleware

The Health-e-Child Gateway (3)

Inside the box…

[Figure labels: Grid, GT4, One Access Point, HeC Gateway]

Page 16: CHEP07 3 rd  September 2007, Victoria, CA

The Platform

The Health-e-Child Gateway (4)

Inside the box…

[Figure labels: Client Connectivity, GT4, One Access Point, HeC Gateway]

Page 17: CHEP07 3 rd  September 2007, Victoria, CA

The Platform

The Health-e-Child Gateway (5)

Inside the box…

[Figure labels: Client Applications, One Access Point, HeC Gateway, Data Integration]

Page 19: CHEP07 3 rd  September 2007, Victoria, CA

Grid

• Grid technology (gLite 3.0) as the enabling infrastructure
  - A distributed platform for sharing storage and computing resources
• HeC Specific Requirements
  - Need support for medical (DICOM) images
  - Need high responsiveness for use in clinical routine
  - Need to guarantee patient data privacy: access rights management, storage of anonymized patient data only

Status
√ Testbed installation since May 2006
√ HeC Certificate Authority
√ HeC Virtual Organisation
√ Security Prototype (clients & services)
√ Logging Portal & Appender
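The requirement to store only anonymized patient data can be illustrated with a small Python sketch. It is not the HeC security prototype: the field names, the `IDENTIFYING_FIELDS` set and the keyed-hash scheme are assumptions for illustration. Strictly speaking, replacing an identifier with a keyed hash is pseudonymisation, and a real deployment would follow its own legal and clinical rules.

```python
import hashlib
import hmac

# Hypothetical list of directly identifying fields; a real project would
# take this from its data protection rules.
IDENTIFYING_FIELDS = {"patient_name", "birth_date", "address"}


def pseudonym(patient_id: str, site_key: bytes) -> str:
    """Derive a stable pseudonym with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(site_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymise(record: dict, site_key: bytes) -> dict:
    """Return a copy without identifying fields and with the patient id
    replaced by a pseudonym, so the original record stays at the hospital."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["patient_id"] = pseudonym(record["patient_id"], site_key)
    return cleaned


record = {
    "patient_id": "H1-000123",
    "patient_name": "Jane Doe",
    "birth_date": "2001-04-05",
    "modality": "MR",
}
safe_record = anonymise(record, site_key=b"secret kept inside the hospital")
```

Because the key never leaves the hospital, the same patient always maps to the same pseudonym at that site, while the grid only ever sees `safe_record`.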

Page 20: CHEP07 3 rd  September 2007, Victoria, CA

Grid Platform

Grid Architecture @ Month 18

[Figure: grid architecture diagram with the annotations "Virtualization Applied", "2nd Test Node Deployed" and "Being deployed"]

Page 21: CHEP07 3 rd  September 2007, Victoria, CA

Solution Facts & Conclusion

Grid Platform

- Test-bed Installed & Tested @ CERN
- Validated the Middleware Architecture (V1.0) to be installed at the different Sites
- However: Architecture still too centralized (VOMS, LFC…)
  - Being addressed in the second planning period

[Figure label: CERN]

Page 23: CHEP07 3 rd  September 2007, Victoria, CA

Vertical Integration

• Vertical Levels
  - Split the domain into vertical levels
  - Data/Information from multiple levels
  - Implicit semantics connecting these levels
• Vertical Integration in applications
  - Data models should provide seamless access to all the information stored in HeC
• Vertically Integrated Modelling
  - Coherently integrate information from different levels (instances)
  - Provide integrated view of the data to the users (concept-oriented views, etc.)
  - Queries against the integrated knowledge
• Topics
  - Vertical Integration along the longitudinal axis in pediatrics
  - Semantics of relevance for presenting clinical data
  - Fragment extraction, alignment and integration from biomedical knowledge sources

[Figure: the vertical levels Population, Individual, Organ, Tissue, Cell, Molecule]
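One way to picture the vertical levels is as a tag on every piece of data, so that a single query can assemble a patient view across levels. The sketch below is an assumption-laden illustration, not the HeC data model: the `Observation` fields, the sample values and the `integrated_view` helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Vertical levels from the slide, ordered from smallest to largest scale."""

    MOLECULE = 1
    CELL = 2
    TISSUE = 3
    ORGAN = 4
    INDIVIDUAL = 5
    POPULATION = 6


@dataclass(frozen=True)
class Observation:
    patient: str
    level: Level
    source: str  # e.g. "Genomics", "Imaging"
    value: str


def integrated_view(observations, patient):
    """Group one patient's observations by level, largest scale first."""
    view = {}
    for obs in sorted(observations, key=lambda o: o.level, reverse=True):
        if obs.patient == patient:
            view.setdefault(obs.level.name, []).append(obs.value)
    return view


# Made-up sample data for illustration.
observations = [
    Observation("p1", Level.MOLECULE, "Genomics", "gene variant report"),
    Observation("p1", Level.ORGAN, "Imaging", "cardiac MRI series"),
    Observation("p2", Level.ORGAN, "Imaging", "brain MRI series"),
    Observation("p1", Level.INDIVIDUAL, "Demographics", "age 7"),
]
view = integrated_view(observations, "p1")
```

Here `view` collects the entries for patient `p1` under `INDIVIDUAL`, `ORGAN` and `MOLECULE`, which is the kind of integrated, concept-oriented view the slide asks the data models to provide.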

Page 24: CHEP07 3 rd  September 2007, Victoria, CA

Data and knowledge modelling

[Figure: data and knowledge modelling diagram. Labels: Requirements: Data Acquisition Protocols (WP9), Users Requirements Specifications (WP2), Modelling with Domain Experts (WP6); Integrated Data Modelling; Knowledge Representation (ontologies); HeC-universal and Application specific parts; Applications: DSS, Similarity, Query]

Page 25: CHEP07 3 rd  September 2007, Victoria, CA

Ontologies in Health-e-Child

• Reuse existing medical knowledge to structure our feature space
  - Working with established knowledge
  - Linkage to “live” knowledge base(s)
  - Saves effort on validating the models with domain experts
  - Non-domain specific conceptualization (upper level ontologies): reusing these, we establish shareable/shared knowledge
• Ontological modelling of paediatric medicine
  - Integration of information: resolving semantic heterogeneity
  - Different countries, hospitals, protocols
• Research potential
  - OWL DL representation of paediatrics medical knowledge, identification of knowledge patterns
  - Decision-support systems
  - Similarity metrics and query enhancement
  - Alignment of ontologies that overlap over the Health-e-Child domain
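A toy example may help show how an ontology supports the points above: mapping each hospital's vocabulary onto shared concepts, expanding a query to subconcepts, and scoring similarity. Everything here is illustrative: the concept names, the local terms and the tiny hierarchy are made up, and the similarity function is a Wu-Palmer style measure swapped in for whatever metric the project actually uses.

```python
# Toy is-a hierarchy (child -> parent). Concept names are illustrative only.
PARENT = {
    "HeartDisease": "Disease",
    "InflammatoryDisease": "Disease",
    "BrainTumour": "Disease",
    "Cardiomyopathy": "HeartDisease",
    "JuvenileIdiopathicArthritis": "InflammatoryDisease",
}

# Hypothetical local vocabularies at two hospitals, mapped to shared concepts.
LOCAL_TERMS = {
    ("hospital_a", "JIA"): "JuvenileIdiopathicArthritis",
    ("hospital_b", "juvenile arthritis"): "JuvenileIdiopathicArthritis",
    ("hospital_b", "CMP"): "Cardiomyopathy",
}


def to_concept(site, term):
    """Resolve semantic heterogeneity: local term -> shared concept."""
    return LOCAL_TERMS[(site, term)]


def lineage(concept):
    """The concept followed by its ancestors up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def expand(concept):
    """Query enhancement: the concept plus all of its descendants."""
    return {c for c in set(PARENT) | {concept} if concept in lineage(c)}


def similarity(a, b):
    """Wu-Palmer style score: 2 * depth(common ancestor) / (depth(a) + depth(b)).
    Assumes both concepts share a root, as they do in this toy hierarchy."""
    line_b = lineage(b)
    common = next(c for c in lineage(a) if c in line_b)
    return 2 * len(lineage(common)) / (len(lineage(a)) + len(lineage(b)))
```

With this hierarchy, a query for `InflammatoryDisease` also matches records that either hospital labelled with its own term for juvenile idiopathic arthritis, and `similarity("Cardiomyopathy", "HeartDisease")` scores higher than a comparison across disease areas.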

Page 27: CHEP07 3 rd  September 2007, Victoria, CA

Clinical and Application Roadmap

• Phase I (- 06/06)
• Phase II (07/06 - 06/07)
• Phase III (07/07 - 12/08)
• Phase IV (2009)

[Figure: roadmap timeline across the four phases. Activities shown: User Requirements; State of the Art Reports; Study Design and Approval; Data acquisition, genetic tests, ground truth annotations; Knowledge Discovery Methods; Segmentation/Registration; Feature Extraction from Imaging; Disease Model Development (generic, subtype specific, patient and treatment specific); Integrated decision support; Classifiers Based on Genetics; Clinical Validation; Refinement of Models and Algorithms; Dissemination]

Page 28: CHEP07 3 rd  September 2007, Victoria, CA

Future Work – Decentralized Architecture

Grid Platform

Hospital 1 (1 Box, 4 Domains)
- WN
- site BDII, CE/LRMS or gsissh, UI, FTS
- top BDII, WMSLB, LFC (*M), DPM, MON ?
- VOMS (S), PX (S), Hydra (D), AMGA (*M)

Hospital 2 (1 Box, 4 Domains)
- WN
- site BDII, CE/LRMS or gsissh, LRMS, UI, FTS
- top BDII, WMSLB, LFC (*S), DPM, MON ?
- VOMS (S), PX (S), Hydra (D), R-GMA ?, AMGA (*S)

Hospital 3 (1 Box, 4 Domains)
- WN
- site BDII, CE/LRMS or gsissh, UI, FTS
- top BDII, WMSLB, LFC (*S), DPM, MON ?
- VOMS (M), PX (M), Hydra (D), AMGA (*S)

[Figure: each box hosts its domains as Virtual Computers on SMP hardware, labelled "SMP 2 - n CPUs", "SMP 4 - n CPUs" and "SMP 4 - n CPUs"]

Working out possible alternatives to decentralize key components of the grid middleware

Autonomous Sites
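The per-hospital layout above lends itself to a simple consistency check. The sketch below transcribes the role markers from the slide into a Python dictionary and verifies that each replicated service has exactly one master. Reading (M) as master, (S) as slave or replica and (D) as distributed is an assumption, as is ignoring the asterisks, whose meaning the slide does not explain.

```python
# Service roles per site, transcribed from the slide. Interpreting "M" as
# master, "S" as slave/replica and "D" as distributed is an assumption.
SITES = {
    "Hospital 1": {"LFC": "M", "AMGA": "M", "VOMS": "S", "PX": "S", "Hydra": "D"},
    "Hospital 2": {"LFC": "S", "AMGA": "S", "VOMS": "S", "PX": "S", "Hydra": "D"},
    "Hospital 3": {"LFC": "S", "AMGA": "S", "VOMS": "M", "PX": "M", "Hydra": "D"},
}


def masters(service):
    """Sites that hold the master copy of a service."""
    return sorted(site for site, roles in SITES.items() if roles.get(service) == "M")


def layout_problems():
    """List services that do not have exactly one master across all sites."""
    problems = []
    services = sorted({name for roles in SITES.values() for name in roles})
    for service in services:
        roles = {r[service] for r in SITES.values() if service in r}
        if roles == {"D"}:
            continue  # fully distributed, so no master is expected
        found = masters(service)
        if len(found) != 1:
            problems.append(f"{service}: expected one master, found {found}")
    return problems
```

Under this reading, `layout_problems()` returns an empty list for the slide's layout: the LFC and AMGA masters sit at Hospital 1, the VOMS and PX masters at Hospital 3, and Hydra is distributed everywhere, so no single site carries every key component.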

Page 29: CHEP07 3 rd  September 2007, Victoria, CA

Ontology layer in the data management architecture

[Figure: ontology layer diagram. Labels: GUI; Applications (e.g. DSS); Query Processing Engine; Ontological Layer with HeC Ontology and Semantics Rules; Ontology-Data Model Mappings; HeC-External Ontologies Mappings; External Biomedical ontologies; External public DBs; Hospitals’ DBs]
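The role of the ontology-data model mappings and the query processing engine can be sketched with two in-memory SQLite databases that use different schemas. This is a minimal illustration under stated assumptions: the concept name `Patient.ageYears`, the table and column names and the sample ages are all hypothetical, and the real engine is certainly richer than a single comparison query.

```python
import sqlite3

# Hypothetical mappings from one shared concept to each hospital's schema.
MAPPINGS = {
    "hospital_a": {"Patient.ageYears": ("patients", "age")},
    "hospital_b": {"Patient.ageYears": ("kids", "age_years")},
}


def make_db(table, column, ages):
    """Create a throwaway in-memory database with made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE {table} ({column} INTEGER)")
    db.executemany(f"INSERT INTO {table} VALUES (?)", [(a,) for a in ages])
    return db


DATABASES = {
    "hospital_a": make_db("patients", "age", [3, 9, 14]),
    "hospital_b": make_db("kids", "age_years", [5, 11]),
}


def query_greater_than(concept, threshold):
    """Rewrite one concept-level query into each site's SQL and merge results."""
    results = []
    for site, db in DATABASES.items():
        # Identifiers come from the trusted mapping table, values are bound.
        table, column = MAPPINGS[site][concept]
        rows = db.execute(
            f"SELECT {column} FROM {table} WHERE {column} > ?", (threshold,)
        )
        results += [(site, value) for (value,) in rows]
    return sorted(results)


matches = query_greater_than("Patient.ageYears", 8)
```

The caller asks one question in ontology terms and never sees that the two hospitals name their tables and columns differently, which is the separation the ontological layer is meant to provide.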

Page 31: CHEP07 3 rd  September 2007, Victoria, CA

HEP vs. BioMed - Comparison

Number & location of sites, Job vs. Interactive processing, Data distribution & governance, Nature of applications, skill sets of users, Data Types & lifetimes, Data Replication

Middleware-centric vs. Operating system-centric

Page 32: CHEP07 3 rd  September 2007, Victoria, CA

Obstacles for Biomed Users

• Grid Middleware is difficult to set up, configure, maintain and use. New skills needed.
• Support for non-scientific applications is limited.
• Hierarchical environments favoured, largely client-server, inflexible in nature - no support for P2P.
• Grid software has concentrated on job-oriented (batch-like) computing rather than interactive computing.
• Cannot share the CPU (CE) or storage capabilities (SE) with the Grid: all or nothing for the resourcing.
• We need out-of-the-box Grid functionality for biomedical end-users.

Page 33: CHEP07 3 rd  September 2007, Victoria, CA

Conclusions

• HeC Gateway designed; implementation progressing through 2007-2008
• Grid platform decided: EGEE gLite
• BUT Grid is not yet mature in non-HEP settings
• Biomedical communities are very different to HEP communities, so there is no one-size-fits-all
• Future must be towards user-participation in Grids. What about a Grid OS?