Grid computing and health information sharing: A platform proposal. Ian Foster, Director, Computation Institute, and Chan Soon-Shiong Scholar, U. Chicago & Argonne Natl Lab, National Coalition for Health Integration. Carl Kesselman, Co-Director, Center for Health Informatics, University of Southern California.

Grid And Healthcare For IOM July 2009

DESCRIPTION

Carl Kesselman and I (along with our colleagues Stephan Erberich, Jonathan Silverstein, and Steve Tuecke) participated in an interesting workshop at the Institute of Medicine on July 14, 2009. Along with Patrick Soon-Shiong, we presented our views on how grid technologies can help address the challenges inherent in healthcare data integration.

Page 1: Grid And Healthcare For IOM July 2009

Grid computing and health information sharing

— A platform proposal —

Ian Foster
Director, Computation Institute
Chan Soon-Shiong Scholar
U. Chicago & Argonne Natl Lab
National Coalition for Health Integration

Carl Kesselman
Co-Director, Center for Health Informatics
University of Southern California

Page 2: Grid And Healthcare For IOM July 2009

Responding to a pandemic

Page 3: Grid And Healthcare For IOM July 2009

Addressing urban health needs

Page 4: Grid And Healthcare For IOM July 2009

Important characteristics

We must integrate systems that may not have worked together before
These are human systems, with differing goals, incentives, capabilities
All components are dynamic: change is the norm, not the exception
Processes are evolving rapidly too

We are not building something simple like a bridge or an airline reservation system

Page 5: Grid And Healthcare For IOM July 2009

Healthcare is a complex adaptive system

A complex adaptive system is a collection of individual agents that have the freedom to act in ways that are not always predictable and whose actions are interconnected such that one agent’s actions changes the context for other agents.

Crossing the Quality Chasm, IOM, 2001; pp 312-13

Non-linear and dynamic
Agents are independent and intelligent
Goals and behaviors often in conflict
Self-organization through adaptation and learning
No single point(s) of control
Hierarchical decomposition has limited value

Page 6: Grid And Healthcare For IOM July 2009

We need to function in the zone of complexity

[Figure: Stacey diagram. Axes: agreement about outcomes (low to high) and certainty about outcomes (low to high). Labeled regions: plan and control, zone of complexity, chaos. Source: Ralph Stacey, Complexity and Creativity in Organizations, 1996.]

Page 7: Grid And Healthcare For IOM July 2009

We need to function in the zone of complexity

[Figure: Stacey diagram. Axes: agreement about outcomes (low to high) and certainty about outcomes (low to high). Labeled regions: plan and control, chaos. Source: Ralph Stacey, Complexity and Creativity in Organizations, 1996.]

Page 8: Grid And Healthcare For IOM July 2009

We call these groupings virtual organizations (VOs)

A set of individuals and/or institutions engaged in the controlled sharing of resources in pursuit of a common goal

Healthcare = dynamic, overlapping VOs, linking
Patient – primary care
Sub-specialist – hospital
Pharmacy – laboratory
Insurer – …

But U.S. health system is marked by fragmented and inefficient VOs with insufficient mechanisms for controlled sharing

I advocate … a model of virtual integration rather than true vertical integration … G. Halvorson, CEO Kaiser
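The VO definition on this slide (members, a common goal, controlled sharing of resources) can be sketched as a small data structure. This is an illustrative sketch only: the class name, its fields, and the example care team are assumptions made for this example, not part of any Grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganization:
    """A group of members sharing resources under explicit rules (hypothetical model)."""
    name: str
    goal: str
    members: set = field(default_factory=set)
    # Maps a resource name to the set of members allowed to use it.
    sharing_rules: dict = field(default_factory=dict)

    def share(self, resource, with_members):
        """Grant access to a resource; non-members are silently excluded."""
        allowed = set(with_members) & self.members
        self.sharing_rules.setdefault(resource, set()).update(allowed)

    def can_access(self, member, resource):
        """Controlled sharing: access needs both membership and a rule."""
        return (member in self.members
                and member in self.sharing_rules.get(resource, set()))


care_team = VirtualOrganization(
    name="care-team",
    goal="coordinate care for one patient",
    members={"patient", "primary care", "pharmacy", "laboratory"},
)
care_team.share("lab results", ["primary care", "patient", "insurer"])

print(care_team.can_access("primary care", "lab results"))  # member with a rule
print(care_team.can_access("pharmacy", "lab results"))      # member without a rule
print(care_team.can_access("insurer", "lab results"))       # not a member of this VO
```

Because one party can belong to several such objects at once, overlapping VOs fall out of the model without any central hierarchy.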

Page 9: Grid And Healthcare For IOM July 2009

The Grid paradigm

Principles and mechanisms for dynamic VOs
Leverage service oriented architecture (SOA)
Loose coupling of data and services
Open software, architecture

[Figure: timeline marked 1995, 2000, 2005, 2010, labeled with Computer science, Physics, Astronomy, Engineering, Biology, Biomedicine, Healthcare.]

Page 10: Grid And Healthcare For IOM July 2009

The Grid paradigm and healthcare information integration

[Figure: layered diagram. Data sources: Radiology, Medical records, Pathology, Genomics, Labs, RHIO. Platform services: make data accessible over the network; name data and move it around; make data usable and useful; manage who can do what.]

Page 11: Grid And Healthcare For IOM July 2009

The Grid paradigm and healthcare information integration

[Figure: layered diagram. Data sources: Radiology, Medical records, Pathology, Genomics, Labs, RHIO. Platform services: Publication, Management, Integration, Security and policy. Further labels: transform data into knowledge; enhance user cognitive processes; incorporate into business processes.]

Page 12: Grid And Healthcare For IOM July 2009

The Grid paradigm and healthcare information integration

[Figure: layered diagram. Data sources: Radiology, Medical records, Pathology, Genomics, Labs, RHIO. Platform services: Publication, Management, Integration, Security and policy. Value services: Analysis, Cognitive support, Applications.]

Page 13: Grid And Healthcare For IOM July 2009

We partition the multi-faceted interoperability problem

Process interoperability
  Integrate work across healthcare enterprise
Data interoperability
  Syntactic: move structured data among system elements
  Semantic: use information across system elements
Systems interoperability
  Communicate securely, reliably among system elements

[Figure: stack of layers labeled Applications, Analysis, Integration, Management, Publication.]

Page 14: Grid And Healthcare For IOM July 2009

Security and policy: Managing who can do what

Familiar division of labor

Publication level: bridge between local and global

Integration level: VO-specific policies, based on attributes

Attribute authorities

Page 15: Grid And Healthcare For IOM July 2009

Identity-based authZ: simplest, not scalable
Unix Access Control Lists (Discretionary Access Control: DAC): groups, directories, simple admin
POSIX ACLs/MS-ACLs: finer-grained admin policy
Role-based Access Control (RBAC): separation of role/group from rule admin
Mandatory Access Control (MAC): clearance, classification, compartmentalization
Attribute-based Access Control (ABAC): generalization of attributes

>>> Policy language abstraction level and expressiveness >>>
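To make the last step of this progression concrete, the following is a minimal sketch of an attribute-based decision: access depends on attributes of the subject and the resource rather than on a list of individual identities. The policy layout, attribute names, and the function `abac_permits` are assumptions for illustration; they do not reproduce the policy language of Globus, caGrid GAARDS, or any other product.

```python
def abac_permits(policy, subject, resource, action):
    """Return True if any rule matches the subject, resource, and action."""
    for rule in policy:
        subject_ok = all(subject.get(k) == v for k, v in rule["subject"].items())
        resource_ok = all(resource.get(k) == v for k, v in rule["resource"].items())
        if subject_ok and resource_ok and action in rule["actions"]:
            return True
    return False  # default deny when no rule matches


# A VO-specific rule: radiologists in the trial VO may read its DICOM images.
policy = [
    {
        "subject": {"role": "radiologist", "vo": "imaging-trial"},
        "resource": {"type": "dicom-image", "vo": "imaging-trial"},
        "actions": {"read"},
    },
]

# Subject attributes would normally be asserted by an attribute authority.
alice = {"id": "alice", "role": "radiologist", "vo": "imaging-trial"}
bob = {"id": "bob", "role": "billing-clerk", "vo": "imaging-trial"}
image = {"type": "dicom-image", "vo": "imaging-trial"}

print(abac_permits(policy, alice, image, "read"))
print(abac_permits(policy, bob, image, "read"))
print(abac_permits(policy, alice, image, "delete"))
```

Note that the rule never names Alice or Bob. Adding a new radiologist to the VO requires a new attribute assertion, not a policy change, which is the scalability argument made on this slide.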

Page 16: Grid And Healthcare For IOM July 2009

Globus / caGrid GAARDS

Page 17: Grid And Healthcare For IOM July 2009

Publication: Make information accessible

Make data available in a remotely accessible, reusable manner

Leave mediation for integration layer

Gateway from local policy/protocol into wide area mechanisms (transport, security, …)

Page 18: Grid And Healthcare For IOM July 2009

Imaging clinical trials use case

[Figure labels: NANT, COG, Children's Oncology Group VO, Neuroblastoma Cancer Foundation VO.]

Page 19: Grid And Healthcare For IOM July 2009

Automating service creation, deployment

Introduce
  Define service
  Create skeleton
  Discover types
  Add operations
  Configure security
Grid Remote Application Virtualization Infrastructure (gRAVI)
  Wrap executables

[Figure: service lifecycle. Labels: Appln Service, Introduce, Create, Store, Repository Service, Transfer GAR, Deploy, Container, Advertize, Index service, Discover, Invoke; get results.]

caGrid, Introduce, gRAVI: Ohio State, U.Chicago
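The lifecycle named on this slide (create, deploy, advertise, discover, invoke and get results) can be mimicked in a few lines. This is a toy, in-memory sketch: `IndexService`, the `container` dictionary, and `word_count` are hypothetical stand-ins and do not reflect the actual Introduce, gRAVI, or Globus container interfaces.

```python
class IndexService:
    """In-memory stand-in for an index service (hypothetical API)."""

    def __init__(self):
        self._entries = {}

    def advertise(self, name, operations, endpoint):
        """Record a deployed service, its operation names, and how to reach it."""
        self._entries[name] = {"operations": set(operations), "endpoint": endpoint}

    def discover(self, operation):
        """Return the names of services that offer the given operation."""
        return sorted(n for n, e in self._entries.items()
                      if operation in e["operations"])

    def endpoint(self, name):
        """Return the handle used to invoke a previously advertised service."""
        return self._entries[name]["endpoint"]


# Create: wrap an existing routine as a service operation.
def word_count(text):
    return len(text.split())


# Deploy and advertise: a dict of callables stands in for a service container.
container = {"WordCountService": {"count": word_count}}
index = IndexService()
index.advertise("WordCountService", ["count"], container["WordCountService"])

# Discover, then invoke and get results.
matches = index.discover("count")
result = index.endpoint(matches[0])["count"]("name data and move it around")
print(matches, result)
```

The client never hard-codes where the service lives; it learns that from the index, which is what keeps data and services loosely coupled.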

Page 20: Grid And Healthcare For IOM July 2009

As of Oct 19, 2008:

122 participants
105 services (70 data, 35 analytical)

Page 21: Grid And Healthcare For IOM July 2009

Management: Naming and moving data

Persistent, uniform global naming of objects, independent of type
Orchestration of data movement among services

[Figure: three diagrams, each showing data D and services S1, S2, S3.]

Page 22: Grid And Healthcare For IOM July 2009

Naming health objects: A prerequisite to management

The naming problem:
“Health objects” = patient information, images, records, etc.
“Names” refer to health objects in records, files, databases, papers, reports, research, emails, etc.

Challenges:
No systematic way of naming health objects
Many health objects, like DICOM images and reports, include references to other objects through non-unique, ambiguous, PHI-tainted identifiers

A framework for distributed digital object services: Kahn, Wilensky, 1995

Page 23: Grid And Healthcare For IOM July 2009

Health Object Identifier (HOI) naming system

uri:hdl://888.us.npi.1234567890.dicom/8A648C33-A5…4939EBE

uri:hdl: HOI’s URI schema identifier, based on Handle
888: CHI’s top-level naming authority
us.npi.1234567890: National Provider Id used in hierarchical identifier namespace
dicom: application context’s namespace, governed by provider naming authority
8A648C33-A5…4939EBE: random string for identifier body, PHI-free and guaranteed unique
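The identifier layout shown on this slide can be exercised with a short sketch that builds and parses names of that shape. Several things here are assumptions rather than the HOI specification: the function names, the strict five-part prefix, and the use of a random UUID for the identifier body (the slide's example body is elided, so its exact format is not known). A random UUID is unique with overwhelming probability, which is weaker than a guarantee.

```python
import uuid

SCHEME = "uri:hdl://"


def make_hoi(naming_authority, npi, context):
    """Build an HOI-shaped name from PHI-free parts (illustrative only)."""
    if not (npi.isdigit() and len(npi) == 10):
        raise ValueError("a National Provider Identifier has 10 digits")
    body = str(uuid.uuid4()).upper()  # assumed body format: random UUID
    return f"{SCHEME}{naming_authority}.us.npi.{npi}.{context}/{body}"


def parse_hoi(hoi):
    """Split an HOI-shaped name into its hierarchical parts."""
    if not hoi.startswith(SCHEME):
        raise ValueError("not an HOI-shaped name")
    prefix, _, body = hoi[len(SCHEME):].partition("/")
    parts = prefix.split(".")
    # Expected prefix: <authority>.us.npi.<npi>.<context>
    if len(parts) != 5 or parts[1:3] != ["us", "npi"] or not body:
        raise ValueError("unexpected HOI structure")
    return {
        "naming_authority": parts[0],
        "npi": parts[3],
        "context": parts[4],
        "body": body,
    }


name = make_hoi("888", "1234567890", "dicom")
print(name)
print(parse_hoi(name))
```

Nothing in the generated name is derived from patient data, so the name can appear in logs, emails, and reports without leaking PHI.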

Page 24: Grid And Healthcare For IOM July 2009

Data movement in clinical trials

Page 25: Grid And Healthcare For IOM July 2009

Community public health: Digital retinopathy screening network

Page 26: Grid And Healthcare For IOM July 2009

Integration: Making data usable and useful

[Figure: plot of degree of communication (0% to 100%) against degree of prior syntactic and semantic agreement (0% to 100%). Labels: rigid standards-based approach, loosely coupled approach, adaptive approach, and a "?" marker.]

Page 27: Grid And Healthcare For IOM July 2009

Integration: Generally used approaches

Allow free text and lose interoperability
Tightly encode data elements specific to purpose but lose expressivity/re-use and interoperability
Post-hoc tying data elements to biomedical vocabularies
Constraining choices to concepts in biomedical vocabularies
Assemble raw data into warehouses

Page 28: Grid And Healthcare For IOM July 2009

Semantic expressivity is generally problematic in biomedical data

Biomedical concepts are context dependent
For billing data, ICD and CPT work
For quality/effectiveness/research more detail is required
Encode data for semantic interoperability and re-use, or collect specific to context?
Physicians prefer free text
Biomedical researchers collect data in highly specific contexts
-> tying data to standard vocabularies alone is insufficient and burdensome

Page 29: Grid And Healthcare For IOM July 2009

Integration via mediation

Map between models
Scoped to domain use
Multiple concurrent use
Bottom-up mediation
  between standards and versions
  between local versions in absence of agreement

[Figure: mediator architecture (Levy 2000). Labels: Global Data Model; Query Reformulation; query in union of exported source schema; Query Optimization; distributed query execution; Query Execution Engine; query in the source schema; Wrapper (two shown).]
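The mediator idea on this slide can be illustrated with a toy example: a query is posed once against a global data model, reformulated into each source's own schema by a wrapper, executed there, and the answers are translated back. Everything in this sketch is hypothetical (the field names, the two sources, the `Wrapper` and `Mediator` classes), and it omits query optimization entirely.

```python
# Two sources that hold the same kind of fact under different local schemas.
SOURCE_A = [{"pt_id": "a1", "dx_code": "C71.9"}]
SOURCE_B = [
    {"patient": "b7", "diagnosis": "C71.9"},
    {"patient": "b8", "diagnosis": "E11.9"},
]


class Wrapper:
    """Exposes one source through the global data model."""

    def __init__(self, rows, field_map):
        self.rows = rows
        self.field_map = field_map  # global field name -> source field name

    def query(self, global_filters):
        # Query reformulation: rewrite global field names into the source schema.
        source_filters = {self.field_map[f]: v for f, v in global_filters.items()}
        # Query execution in the source schema.
        hits = [r for r in self.rows
                if all(r.get(f) == v for f, v in source_filters.items())]
        # Translate matching rows back into the global data model.
        return [{g: r[s] for g, s in self.field_map.items()} for r in hits]


class Mediator:
    """Sends one global query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, **global_filters):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.query(global_filters))
        return results


mediator = Mediator([
    Wrapper(SOURCE_A, {"patient_id": "pt_id", "diagnosis_code": "dx_code"}),
    Wrapper(SOURCE_B, {"patient_id": "patient", "diagnosis_code": "diagnosis"}),
])
print(mediator.query(diagnosis_code="C71.9"))
```

Neither source had to change its schema or agree with the other in advance; the mapping lives in the wrapper, which is what makes the approach workable in the absence of agreement.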

Page 30: Grid And Healthcare For IOM July 2009

ECOG 5202 integrated sample management

[Figure: a web portal and mediator query sites through OGSA-DQP, with an OGSA-DAI service and a CHI appliance at each site. Site labels: ECOG CC, MD Anderson, ECOG PCO. Annotation: no coordinated data systems.]

Page 31: Grid And Healthcare For IOM July 2009

Analytics: Transform data into knowledge

“The overwhelming success of genetic and genomic research efforts has created an enormous backlog of data with the potential to improve the quality of patient care and cost effectiveness of treatment.”

— US Presidential Council of Advisors on Science and Technology, Personalized Medicine Themes, 2008

Page 32: Grid And Healthcare For IOM July 2009

Microarray clustering using Taverna

1. Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub

2. Normalize microarray data using GenePattern analytical service: node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService

3. Hierarchical clustering using geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage

[Figure: Taverna workflow diagram. Legend: workflow in/output, caGrid services, “Shim” services, others.]

Page 33: Grid And Healthcare For IOM July 2009

Many many tasks: Identifying potential drug targets

2M+ ligands x protein target(s)

(Mike Kubal, Benoit Roux, and others)

Page 34: Grid And Healthcare For IOM July 2009

[Figure: virtual screening workflow for one protein target, running from start to report and end.]

Inputs:
PDB protein descriptions: 1 protein (1 MB)
ZINC 3-D structures: 2M structures (6 GB)
DOCK6 receptor and FRED receptor files, manually prepared (1 per protein: defines pocket to bind to)
NAB script parameters (defines flexible residues, #MD steps) and NAB script template, fed to BuildNABScript to produce a NAB script

Stages:
DOCK6 / FRED: ~4M x 60s x 1 cpu, ~60K cpu-hrs; select best ~5K (label appears twice)
Amber: ~10K x 20m x 1 cpu, ~3K cpu-hrs; select best ~500
  Amber prep: 2. AmberizeReceptor, 4. perl: gen nabscript
  Amber score: 1. AmberizeLigand, 3. AmberizeComplex, 5. RunNABScript
GCMC: ~500 x 10hr x 100 cpu, ~500K cpu-hrs

Data passed between stages: ligands, complexes

For 1 target: 4 million tasks, 500,000 cpu-hrs (50 cpu-years)
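The per-stage costs on this slide follow from multiplying task count, time per task, and CPUs per task. The sketch below redoes that arithmetic using only the slide's approximate inputs; the stage tuples and the `cpu_hours` helper are illustrative names. The slide's own totals are rounded, so the raw products come out somewhat higher than the quoted ~60K, ~3K, and 500,000 cpu-hrs.

```python
HOURS_PER_YEAR = 365 * 24

# (stage, number of tasks, seconds per task, CPUs per task), from the slide.
stages = [
    ("DOCK6/FRED", 4_000_000, 60, 1),
    ("Amber", 10_000, 20 * 60, 1),
    ("GCMC", 500, 10 * 3600, 100),
]


def cpu_hours(tasks, seconds_per_task, cpus_per_task):
    """CPU-hours consumed by one stage."""
    return tasks * seconds_per_task * cpus_per_task / 3600


costs = {name: cpu_hours(n, secs, cpus) for name, n, secs, cpus in stages}
for name, hours in costs.items():
    print(f"{name}: {hours:,.0f} cpu-hrs")

total_hours = sum(costs.values())
print(f"total: {total_hours:,.0f} cpu-hrs, "
      f"about {total_hours / HOURS_PER_YEAR:.0f} cpu-years")
```

The final GCMC stage dominates: it handles the fewest candidates but accounts for most of the compute, which is why the earlier, cheaper stages are used to filter 2M+ ligands down to a few hundred.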

Page 35: Grid And Healthcare For IOM July 2009

DOCK on BG/P: ~1M tasks on 118,000 CPUs

CPU cores: 118784
Tasks: 934803
Elapsed time: 7257 sec
Compute time: 21.43 CPU years
Average task time: 667 sec
Relative efficiency: 99.7% (from 16 to 32 racks)
Utilization: sustained 99.6%, overall 78.3%

[Figure: plot over time (secs).]
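As a rough consistency check on these figures, overall utilization can be estimated as compute time divided by the CPU time available during the run (cores multiplied by elapsed time). That definition is an assumption made for this sketch, as is the 365-day year; the slide does not say how its utilization numbers were computed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed 365-day year

cores = 118_784
elapsed_seconds = 7_257
compute_cpu_years = 21.43

# CPU time available if every core were busy for the entire run.
available_cpu_years = cores * elapsed_seconds / SECONDS_PER_YEAR

# Assumed definition: fraction of the available CPU time spent computing.
overall_utilization = compute_cpu_years / available_cpu_years

print(f"available: {available_cpu_years:.2f} CPU years")
print(f"overall utilization: {overall_utilization:.1%}")
```

Under these assumptions the estimate lands within a fraction of a percentage point of the 78.3% overall utilization reported on the slide.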

Page 36: Grid And Healthcare For IOM July 2009

Recap

Increased recognition that information systems and data understanding are limiting factor
  … much of the promise associated with health IT requires high levels of adoption … and high levels of use of interoperable systems (in which information can be exchanged across unrelated systems) …. RAND COMPARE

Health system is complex, adaptive system
  There is no single point(s) of control. System behaviors are often unpredictable and uncontrollable, and no one is “in charge.” W Rouse, NAE Bridge

With diverse and evolving requirements and user communities
  … I advocate … a model of virtual integration rather than true vertical integration…. G. Halvorson, CEO Kaiser

Page 37: Grid And Healthcare For IOM July 2009

Functioning in the zone of complexity

[Figure: Stacey diagram. Axes: agreement about outcomes (low to high) and certainty about outcomes (low to high). Labeled regions: plan and control, chaos. Source: Ralph Stacey, Complexity and Creativity in Organizations, 1996.]

Page 38: Grid And Healthcare For IOM July 2009

The Grid paradigm and healthcare information integration

[Figure: layered diagram. Data sources: Radiology, Medical records, Pathology, Genomics, Labs, RHIO. Platform services: Publication, Management, Integration, Security and policy. Value services: Analysis, Cognitive support, Applications.]