PrIMe: Provenance Incorporating Methodology
Steve Munroe (sjm@ecs.soton.ac.uk)
The EU Grid Provenance Project
University of Southampton UK
www.gridprovenance.org


EU Grid Provenance Consortium

• University of Southampton – Luc Moreau, Steve Munroe, Sheng Jiang, Paul Groth, Simon Miles

• IBM UK (Project Coordinator) – John Ibbotson, Neil Hardman, Alexis Biller

• University of Wales, Cardiff – Omer Rana, Arnaud Contes, Vikas Deora

• Universitat Politècnica de Catalunya (UPC) – Steven Willmott, Javier Vazquez

• SZTAKI – Laszlo Varga, Arpad Andics

• German Aerospace – Andreas Schreiber, Guy Kloss, Frank Danneman


Overview of Talk

• Introducing Provenance
• Introducing PrIMe
• Stepping through PrIMe
– Step 1. Provenance use cases
– Step 2. Information items
– Step 3. Identifying actors
– Step 4. Actor interactions
– Step 5. Knowledgeable actors
– Step 6. Adaptations
• Evaluation
• Summary
• Conclusions


Provenance: dictionary definition

• Oxford English Dictionary:
– the fact of coming from some particular source or quarter; origin, derivation
– the history or pedigree of a work of art, manuscript, rare book, etc.; concretely, a record of the ultimate derivation and passage of an item through its various owners.


Provenance Definition

• Our definition of provenance in the context of applications for which process matters to end users:

The provenance of a piece of data is the process that led to that piece of data

• Our aim is to conceive a computer-based representation of provenance that allows us to perform useful analysis and reasoning to support our use cases

• We use the notion of Process Documentation, which is composed of p-assertions


Provenance Applications

Aerospace engineering: maintain a historical record of design processes, up to 99 years.

Organ transplant management: tracking of previous decisions, crucial to maximise the efficiency in matching and recovery rate of patients


Types of p-assertions (1)

– Interaction p-assertion: an assertion of the contents of a message, made by an actor that has sent or received that message

I received M1, M4
I sent M2, M3


Types of p-assertions (2)

– Relationship p-assertion: an assertion, made by an actor, that describes how the actor obtained output data, or the whole message sent in an interaction, by applying some function to input data or messages from other interactions.

M2 is in reply to M1
M3 is caused by M1
M2 is caused by M4

M3 = f1(M1)
M2 = f2(M1, M4)


Types of p-assertions (3)

– Actor state p-assertion: assertion made by an actor about its internal state in the context of a specific interaction

I used a SPARC processor

I used algorithm x, version x.y.z
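
The three kinds of p-assertion above can be pictured as plain records. The following is a minimal sketch only: the class and field names are illustrative assumptions, not the data model of the provenance middleware, and attaching the actor state to M3 is an arbitrary choice made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InteractionPAssertion:
    """An actor asserts the contents of a message it sent or received."""
    asserter: str      # actor making the assertion
    message_id: str    # e.g. "M1"
    role: str          # "sender" or "receiver"
    content: str       # message contents as seen by the asserter


@dataclass(frozen=True)
class RelationshipPAssertion:
    """An actor asserts how an output message was derived from inputs."""
    asserter: str
    effect: str                 # output message, e.g. "M3"
    causes: Tuple[str, ...]     # input messages, e.g. ("M1",)
    relation: str               # e.g. "f1"


@dataclass(frozen=True)
class ActorStatePAssertion:
    """An actor asserts part of its internal state during an interaction."""
    asserter: str
    message_id: str    # interaction that gives the state its context
    state: str


# Process documentation for the slide examples, asserted by one actor.
documentation = [
    InteractionPAssertion("actor", "M1", "receiver", "contents of M1"),
    InteractionPAssertion("actor", "M4", "receiver", "contents of M4"),
    InteractionPAssertion("actor", "M2", "sender", "contents of M2"),
    InteractionPAssertion("actor", "M3", "sender", "contents of M3"),
    RelationshipPAssertion("actor", "M3", ("M1",), "f1"),
    RelationshipPAssertion("actor", "M2", ("M1", "M4"), "f2"),
    # Assumption: the state is recorded in the context of M3.
    ActorStatePAssertion("actor", "M3", "algorithm x, version x.y.z"),
]


def causes_of(message_id, assertions):
    """Return the messages asserted to have caused `message_id`."""
    found = []
    for a in assertions:
        if isinstance(a, RelationshipPAssertion) and a.effect == message_id:
            found.extend(a.causes)
    return found
```

Calling `causes_of("M2", documentation)` walks the relationship p-assertions and collects the inputs of f2, which is the kind of question the process documentation is meant to answer.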


The Provenance Middleware Architecture


Introducing PrIMe

• A software engineering methodology for making applications provenance-aware


Introducing PrIMe: Key aims

Provide software engineering guidelines for:
• identifying and expressing provenance use cases
• identifying the kinds of information items that are required to satisfy use cases
• identifying actors and the interactions between them in order to effect the recording of process documentation
• identifying the set of adaptations that integrate the provenance architecture with the application to expose the right kinds of information.


Introducing PrIMe: Overview

[Figure: PrIMe overview. Application Structure; Step 1. Use Cases; Step 2. Information Items; Step 3. Actors; Step 4. Interactions; Step 5. Knowledgeable Actors; Step 6. Adaptations]


Step 1: Provenance Use Cases

[Figure: PrIMe overview showing Application Structure and Step 1. Use Cases]


Step 1: Provenance Use Cases: Gathering Use Cases

• It is not always obvious to users what use cases they could expect the provenance middleware architecture to support.

• We provide a simple requirements elicitation process to help designers collect the core provenance use cases
– Give definitions of provenance
– Give examples of the general questions that can be answered using the architecture


Step 1: Provenance Use Cases: Definition of Provenance

• The provenance of a result is the process that produced that result.


Step 1: Provenance Use Cases: Elicitation Steps

• Three important steps:
– Step (1) Describe something that already happens in the application.

– Step (2) Describe a specific provenance-related use case question that cannot currently be answered (easily), but that the provenance functionality could help to answer.

– Step (3) Identify the relevant services required for answering the use case.


Step 1: Provenance Use Cases: Example use case

• Donor A’s organs are screened for potential donation.

• What is the provenance of the donor organ’s diagnosis?


Step 2: Information items

[Figure: PrIMe overview showing Application Structure, Step 1. Use Cases and Step 2. Information Items]


Step 2: Information Items: Overview

• The kinds of information that will answer your use case

• May be one piece or many pieces of information
– e.g. a given result, or a sequence of decisions

• For each use case, identify the information items required to satisfy the use case.


Step 2: Information Items: Examples

• Information items may be:

– Data items, e.g. the result of some calculation or decision (found in interactions or actor state).

– Whole or part processes, e.g. the sequence of decisions that led to a donor’s organ being rejected for donation.

– Relationships, e.g. what the causal determinants of a given decision were.


Step 2: Information Items: Capture

• Information items are to be captured by process documentation, i.e. p-assertions

– Data items: Interaction or actor state p-assertions

– Processes: Interaction and relationship p-assertions
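
The capture rules above amount to a small lookup from the kind of information item to the p-assertion types that can document it. The sketch below is illustrative only: the table and function names are assumptions, and it covers just the two kinds the slide lists. For data items the slide says "or", so the returned set should be read as candidate types rather than as a requirement to record all of them.

```python
# Hypothetical lookup table: kind of information item -> p-assertion types
# that can capture it, following the two rules stated on the slide.
CAPTURED_BY = {
    "data item": {"interaction", "actor state"},
    "process": {"interaction", "relationship"},
}


def candidate_p_assertions(items):
    """Return the union of candidate p-assertion types for (name, kind) pairs.

    Raises KeyError for a kind the slide does not cover.
    """
    needed = set()
    for _name, kind in items:
        needed |= CAPTURED_BY[kind]
    return needed


# Information items for the organ transplant use case (names are examples).
items = [
    ("donor organ diagnosis", "data item"),
    ("decisions that led to rejection", "process"),
]
candidates = candidate_p_assertions(items)
```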


Step 3: Actors

[Figure: PrIMe overview showing Application Structure, Step 1. Use Cases, Step 2. Information Items and Step 3. Actors]


Step 3: Actors: Description

• An actor is an entity within the application that performs actions and interacts with other actors, e.g. Web Services, components, machines, people, etc.
– One actor may be seen as being composed of other actors.


Step 3: Actors: Identification Heuristics

• Identify the components that receive information. E.g. a component/service in a workflow, a script command, the GUI/desktop application into which a user enters information.

• Identify the components that provide the information in each interaction. These could be, for example, a workflow engine, a script executor, a user.


Step 4: Interactions

[Figure: PrIMe overview showing Application Structure, Step 1. Use Cases, Step 2. Information Items, Step 3. Actors and Step 4. Interactions]


Step 4: Interactions: Information exchange


Step 3: Actors: Information

Actor: Donor data collector

Sending
Message ID   Data item   Receiver ID
M2           Q1          EHCR
M4           Pid         Testing Lab
M6           r1          UI

Receiving
Message ID   Data item   Sender ID
M1           q1          UI
M3           pid         EHCR
M5           R1          Testing lab
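
The Sending and Receiving entries for the Donor data collector can be kept as one list of interaction records, from which either view is derived. This is a sketch under assumptions: the `Interaction` type and the helper functions are hypothetical, and the rows are transcribed from the slide with its original spelling of the data items.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    message_id: str
    data_item: str
    sender: str
    receiver: str


DDC = "Donor data collector"

# Rows transcribed from the Sending and Receiving entries on the slide.
interactions = [
    Interaction("M1", "q1", "UI", DDC),
    Interaction("M2", "Q1", DDC, "EHCR"),
    Interaction("M3", "pid", "EHCR", DDC),
    Interaction("M4", "Pid", DDC, "Testing Lab"),
    Interaction("M5", "R1", "Testing lab", DDC),
    Interaction("M6", "r1", DDC, "UI"),
]


def sent_by(actor, rows):
    """Message IDs of interactions in which `actor` is the sender."""
    return [r.message_id for r in rows if r.sender == actor]


def received_by(actor, rows):
    """Message IDs of interactions in which `actor` is the receiver."""
    return [r.message_id for r in rows if r.receiver == actor]
```

Keeping a single list avoids the two views drifting apart when an interaction is added or changed during later steps.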


Step 5: Knowledgeable Actors

[Figure: PrIMe overview showing Application Structure, Step 1. Use Cases, Step 2. Information Items, Step 3. Actors, Step 4. Interactions and Step 5. Knowledgeable Actors]


Step 5: Knowledgeable Actors

• Knowledgeable actors have access to Information items

• Sometimes, for a given information item, a knowledgeable actor cannot be found

• Further decomposition might be necessary or,

• New actors may need to be introduced (Step 6)
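
Finding knowledgeable actors can be sketched as a lookup over which actor has access to which information item. Everything in this example is hypothetical (the function names, the access map and the item names); it only illustrates how items with no knowledgeable actor surface as candidates for further decomposition or for Step 6.

```python
def knowledgeable_actors(item, access):
    """Return the sorted names of actors that have access to `item`."""
    return sorted(actor for actor, items in access.items() if item in items)


def unanswered_items(required, access):
    """Return required items for which no knowledgeable actor exists."""
    return [item for item in required if not knowledgeable_actors(item, access)]


# Hypothetical access map: actor -> information items it can see.
access = {
    "Donor data collector": {"patient id", "test result"},
    "Testing lab": {"test result", "test method"},
    "User interface": {"query"},
}

# Hypothetical information items required by the use cases.
required = ["test result", "test method", "clinician decision"]
missing = unanswered_items(required, access)
```

Here `missing` contains only "clinician decision", the one item that no actor in the map can supply.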


Step 5: Knowledgeable Actors: Who knows what?

[Figure: Hospital; EHCRS; Testing lab]


Step 5: Knowledgeable Actors: OTM Example

[Figure: actors in the OTM example: User interface; Donor data collector; Electronic health care records; Testing laboratory; Hospital]


Steps 3, 4, 5: Knowledgeable Actors: Repeat as necessary

– Step 3: Identify actors
– Step 4: Identify interactions
– Step 5: Identify knowledgeable actors


Step 6: Adaptations

[Figure: PrIMe overview showing Application Structure, Step 1. Use Cases, Step 2. Information Items, Step 3. Actors, Step 4. Interactions, Step 5. Knowledgeable Actors and Step 6. Adaptations]


Step 6: Adaptations: Modifying actors

• A non-knowledgeable actor may be modified so that it gains access to information items not currently available to itself or other actors in the system.


Step 6: Adaptations: Actor Introduction

• A new actor can be introduced to the application to help in the answering of use cases


Step 6: Adaptations: Interaction Extension

• An interaction in the application can be extended to exchange more information between a knowledgeable actor and a non-knowledgeable actor, making the latter knowledgeable.

[Figure: Before, the interaction between the two actors carries a; After, it carries a, b]


Step 6: Adaptations: Interaction Introduction

• A new interaction between actors can be introduced into the application in which a knowledgeable actor sends the information item to another actor, which then becomes knowledgeable.

[Figure: Before, the two actors do not interact; After, a new interaction between them carries b]
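
Interaction extension and interaction introduction can be contrasted in a small model of who knows what. This is a sketch only: the `Application` class, its methods and the actor names A, B and C are all assumptions made for illustration, not part of PrIMe or the provenance architecture.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Application:
    # actor -> information items the actor has access to
    knows: Dict[str, Set[str]] = field(default_factory=dict)
    # (sender, receiver) -> items exchanged in that interaction
    interactions: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)

    def _send(self, sender, receiver, item):
        # Only a knowledgeable actor can pass an item on.
        if item not in self.knows.get(sender, set()):
            raise ValueError(f"{sender} is not knowledgeable about {item!r}")
        self.interactions.setdefault((sender, receiver), set()).add(item)
        self.knows.setdefault(receiver, set()).add(item)

    def extend_interaction(self, sender, receiver, item):
        """Interaction extension: an existing interaction carries one more item."""
        if (sender, receiver) not in self.interactions:
            raise KeyError("no existing interaction to extend")
        self._send(sender, receiver, item)

    def introduce_interaction(self, sender, receiver, item):
        """Interaction introduction: a new interaction is added to carry the item."""
        if (sender, receiver) in self.interactions:
            raise KeyError("interaction already exists; extend it instead")
        self._send(sender, receiver, item)


app = Application(
    knows={"A": {"a", "b"}, "B": {"a"}, "C": set()},
    interactions={("A", "B"): {"a"}},
)
app.extend_interaction("A", "B", "b")     # before: a; after: a, b
app.introduce_interaction("A", "C", "b")  # new interaction carrying b
```

In both cases the receiving actor ends up knowledgeable about b; the difference is only whether the application already had an interaction between the two actors.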


Step 6: Adaptations: Provenance Functionality

• The provenance wrapper exposes an actor’s input and output data, relationships and aspects of the actor’s state.
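
One way to picture a provenance wrapper is as a decorator around an actor's function that exposes the input, the output, the relationship between them and some actor state on every call. This is a hedged sketch, not the project's wrapper: `provenance_wrapper`, the in-memory `RECORDS` list and the record layout are assumptions, and a real deployment would submit the records to a provenance store.

```python
import functools

# Stand-in for a provenance store (assumption): records are kept in memory.
RECORDS = []


def provenance_wrapper(actor, state=None):
    """Wrap a function so each call exposes inputs, output, relationship and state."""
    def decorate(func):
        @functools.wraps(func)
        def wrapped(*args):
            result = func(*args)
            RECORDS.append({"type": "interaction", "actor": actor,
                            "direction": "input", "data": args})
            RECORDS.append({"type": "interaction", "actor": actor,
                            "direction": "output", "data": result})
            RECORDS.append({"type": "relationship", "actor": actor,
                            "effect": result, "causes": args,
                            "relation": func.__name__})
            RECORDS.append({"type": "actor state", "actor": actor,
                            "state": dict(state or {})})
            return result
        return wrapped
    return decorate


@provenance_wrapper("Testing lab", state={"algorithm": "x", "version": "x.y.z"})
def diagnose(sample):
    # Hypothetical application logic, unchanged by the wrapper.
    return f"diagnosis for {sample}"
```

The wrapped function keeps its original behaviour and return value, so the application code that calls `diagnose` does not need to change.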


Step 6: Adaptations: The Client Side Library

• A collection of functions

– To allow provenance-aware applications to communicate with provenance store services

– An implementation of the actor side library should contain at least one of the query library, the record library and the management library

– Helps application developers follow architecture rules


Recording Provenance


Evaluation

• Protein compressibility experiment

• 10% overhead for asynchronous recording
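
The slides do not say how asynchronous recording was implemented in the evaluation. As a general illustration of the idea, the sketch below queues p-assertions and stores them on a background thread so that the application does not wait for the store. The `AsyncRecorder` class is hypothetical and makes no claim to reproduce the measured overhead.

```python
import queue
import threading


class AsyncRecorder:
    """Queue p-assertions and store them on a background thread."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, store):
        self._store = store            # any list-like sink (assumption)
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, p_assertion):
        # Returns immediately; the application does not wait for the store.
        self._queue.put(p_assertion)

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._store.append(item)

    def close(self):
        # Flush everything queued so far, then stop the worker.
        self._queue.put(self._STOP)
        self._worker.join()


store = []
recorder = AsyncRecorder(store)
for i in range(5):
    recorder.record(f"p-assertion {i}")
recorder.close()
```

Because a single worker drains a FIFO queue, the p-assertions reach the store in the order they were recorded.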


Summarising PrIMe

• Step 1: Identify the provenance use cases

• Step 2: Identify relevant information items

– Step 3: Identify actors
– Step 4: Identify interactions
– Step 5: Identify knowledgeable actors

• Step 6: Make necessary adaptations

[Figure label: Granularity]


Conclusions

• PrIMe provides a clear and easy guide to making applications provenance-aware

• Crucial in the adoption of the Provenance Middleware Architecture


Questions?

Steve Munroe

sjm@ecs.soton.ac.uk