
Bridging Persistent Data and Process Data

Jianwen Su

University of California at Santa Barbara


Outline

Activity → data-centricity → artifact

Lessons from practice

BP as a Service

Extending the artifact concept: Help from data integration? (or not)

Cross reference paths

The updatability requirement

Isolation of process “footprints” or dataprints

Many challenges ahead

Conclusions

2013/08/26 DAB 2013 2


Traditional BP Modeling

Activity-centric, focusing on control flow (e.g., BPMN)

Mainly aimed at business management in general (instead of software design/development), e.g., resource planning, logistics, and management

Missing data is a key hindrance to software design and management

Many miserable stories, including:

Hangzhou Housing Management Bureau (HHMB)

Kingfore Corporation (KFC, Beijing)

RuiJing Hospital (Shanghai) & Cottage Hospital (Santa Barbara, CA)

IBM Global Financing (IGF)


Four Kinds of Data

Business data: essential for business logic

− Examples: items, shipping addresses

Enactment status: the current execution snapshot

− Examples: order sent, shipping request made

Resource usage and state needed for service execution

− Examples: cargo space reserved, truck schedule to be determined

Correlation between process instances

− Example: 3 warehouse fulfillment process instances for Jane’s order

Need models that include both activities and data
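The four kinds of data can be pictured as four fields of one record per BP instance. The sketch below is only an illustration: the class name `ProcessInstanceData`, its field names, and the sample values are assumptions made for this example, not part of any model discussed in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessInstanceData:
    """Hypothetical container for the four kinds of data of one BP instance."""
    # Business data: essential for the business logic.
    business_data: dict = field(default_factory=dict)
    # Enactment status: the current execution snapshot.
    enactment_status: list = field(default_factory=list)
    # Resource usage and state needed for service execution.
    resources: dict = field(default_factory=dict)
    # Correlation: identifiers of related process instances.
    correlated_instances: list = field(default_factory=list)


# Sample values loosely follow the examples on the slide.
order = ProcessInstanceData(
    business_data={"items": ["item A"], "shipping_address": "Jane's address"},
    enactment_status=["order sent", "shipping request made"],
    resources={"cargo_space": "reserved", "truck_schedule": "to be determined"},
    correlated_instances=["fulfillment-1", "fulfillment-2", "fulfillment-3"],
)
```

A model that covers only `enactment_status` is activity-centric; a model that covers all four fields is what the slide asks for.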


Four Classes of BP Models

Data agnostic models: data mostly absent

– WF (Petri) nets, BPMN, UML Activity Diagrams, …

Data-aware models: data present (as variables), but storage and management hidden

– BPEL, YAWL, …

Storage-aware models: schemas for persistent stores, mappings to/from data in BPs defined and managed manually

– jBPM, …

Data-encapsulating models: logical data modeling, automated modeling of the other 3 types, data-storage mapping

– Business objects, artifact-centric models


Artifact = Biz Process

A business artifact is a key conceptual business element that is used in guiding the operation of the business

FedEx package delivery, patient visit, application form, insurance claim, order, financial deal, registration, …

Consists of a business entity and a lifecycle [Nigam-Caswell IBM Sys J 03]

Very natural to business managers and BP modelers

For this talk: artifact is a synonym of BP (practically beneficial)

[Figure: a business (biz) entity and its lifecycle: application, preliminary review, secondary review, approval, license fee payment, certificate delivery]
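A minimal sketch of "artifact = business entity + lifecycle", using the license-application stages from the figure. The strictly linear stage order, the class name `Artifact`, and the sample attributes are assumptions made for illustration; real lifecycles may branch.

```python
from dataclasses import dataclass, field

# Lifecycle stages named in the figure; a linear order is assumed here.
LIFECYCLE = [
    "application",
    "preliminary review",
    "secondary review",
    "approval",
    "license fee payment",
    "certificate delivery",
]


@dataclass
class Artifact:
    """A business entity (data) paired with its position in the lifecycle."""
    entity: dict = field(default_factory=dict)
    stage_index: int = 0

    @property
    def stage(self) -> str:
        return LIFECYCLE[self.stage_index]

    def advance(self, **updates) -> str:
        """Record data produced by the current stage, then move to the next stage."""
        if self.stage_index >= len(LIFECYCLE) - 1:
            raise ValueError("lifecycle already completed")
        self.entity.update(updates)
        self.stage_index += 1
        return self.stage


license_app = Artifact(entity={"applicant": "Jane"})
license_app.advance(form_received=True)
```

The point of the pairing is that every lifecycle step reads or writes the same entity, so data and control flow are modeled together.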


Story 1: Toy Application Systems

Development of application systems in a DB course

Last winter: a bank system

Accounts, clients, transactions; a small number of typical transactions; teller & management: monthly statements, tax reports

Typical development approach: Entity-Relationship modeling → Java classes/modules → Java & JDBC code

Most frequent mistakes:

Mismatch between the data design in Java and in ER: omissions, incompatible semantics

Too bad: this is the best available to teach

The two sides of the coin are indeed separated


Story 2: An Application System

Heating repair workflow for Kingfore in Beijing

The primary workflow consists of reporting problems, assigning service persons, onsite repair, and post-repair review visits

3-month development contracted to BUPT

Their problem:

Mid-way requirement changes, in particular adding an activity to the repair workflow, demand rewriting a lot of code

Artifact BP helps in conceptualizing changes, but…

A close look: the rewritten code mostly involves DB accesses


Database Design & Biz Entity Design

Typical development steps:

Enterprise database design

The repair workflow modeled in XPDL (BPMN)

Each activity in the workflow coded; the “biz entity” was never designed, just coded as needed

Developers made isolated decisions to “link” the biz entity to the database (via SQL) (in contrast to the BP model)

Elevating to the conceptual level:

Biz entity → artifact info model

Link → database-entity mappings

could enable automating the coding of DB accesses

Integrating the two sides helps application development

[Sun-S.-Wu-Yang 2013]


An XXX Application System

Ad hoc design, developed over time, patches, multiple technologies, … a typical legacy system

Problems: embedded business logic, hard to learn, hard to maintain, costly to add new functionality, hard to change/evolve


SOA Paints a Bright Picture

Services encapsulate system details and reflect business logic, easier to learn

Easier to manage, even without technical expertise

New functions on top of services

[Figure: services: Tax Calculation, Reassessment, Title Change, Inheritance, Sales-transaction, Determine tax base, Appraisal]


The LEGO Fantasy

Towards a goal of Business Process as a Service (BPaaS)

Enterprises may run virtual IT systems

[Figure: Enterprise System; labels: Tax Calculation, Reassessment, Title Change, Inheritance, Determine tax base, AppraisalPAL, TaxPAL, TitlePAL, HR_PAL, AccountingPAL, AssessorPAL]

How do we do it?


Service Programming is an Art

[Figure: services: Tax Calculation, Reassessment, Title Change, Inheritance, Sales-transaction, Determine tax base, Appraisal; new service: Certificate; condition label: age>55 & …]

How to compose? Is it “correct”?

How to query? E.g., warn if #applications for title change involving tax reassessment reaches 5

How to change & evolve? E.g., add new edu tax

How to do transactions?

The real world is not very kind

HELP NEEDED


Conceptualizing Running Workflows

[Figure: workflow instances and the database]

Each workflow (BP) instance consists of a biz entity and a lifecycle

Data mappings are ad hoc


Data Integration: A Bird’s View

Global as View (GAV): the global database is a view (result of a query) on local data sources

Local as View (LAV): each local data source stores the result of a view on the virtual global database

Research focused on query evaluation

Schema mapping (e.g., Clio) focuses on computing general target databases [Popa et al VLDB 02] [Fagin et al, ICDT 03]

[Figure: local data sources … and the global database]

[Lenzerini PODS 02]
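To make the direction of the two definitions concrete, here is a toy sketch over Python sets. In GAV the global relation is computed from the sources; in LAV a source is described by a view over the global database. The relations, the union query, and the city selection are all hypothetical choices for illustration.

```python
# Two local sources holding (name, city) tuples (hypothetical data).
source_a = {("Jane", "Santa Barbara")}
source_b = {("Li", "Beijing")}


def gav_global(*sources):
    """GAV: the global relation is defined by a query over the sources (here, a union)."""
    result = set()
    for s in sources:
        result |= s
    return result


def lav_view_city(global_db, city):
    """LAV: a local source is described by a view over the global DB (here, a selection)."""
    return {row for row in global_db if row[1] == city}


# GAV direction: sources -> global database.
global_db = gav_global(source_a, source_b)

# LAV direction: global database -> what a "Beijing" source is expected to hold.
expected_beijing_source = lav_view_city(global_db, "Beijing")
```

In a real LAV setting the global database is virtual and never materialized; it is materialized here only so the view can be evaluated in a few lines.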


Data Integration for Workflows?

GAV is not suitable:

Data is not stored in workflow instances

The number of instances changes at runtime

LAV? Data is not stored in workflow instances

[Figure: workflow instances and the database]


Soundness and Completeness

A local view is

sound: only contains (part of) results of the view

complete: contains all results of the view

Workflow data mappings?

Must be exact, i.e., both sound and complete

Open problem: demands a better understanding of data mappings

[Figure: local data source and global database]

[Lenzerini PODS’02]
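Read as set containments, the three notions are easy to check on small data. The sketch below assumes both the stored content and the view result are available as Python sets; the function name `classify_local_view` and the sample rows are illustrative.

```python
def classify_local_view(stored, view_result):
    """Compare what a local source stores with the full result of its view."""
    sound = stored <= view_result       # stores only results of the view
    complete = stored >= view_result    # stores all results of the view
    if sound and complete:
        return "exact"
    if sound:
        return "sound"
    if complete:
        return "complete"
    return "neither"


view_result = {"r1", "r2", "r3"}
```

For example, `classify_local_view({"r1"}, view_result)` reports a sound but incomplete source, while a workflow data mapping would have to come out as `"exact"`.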


Example: The Database (& Lifecycle)

Includes keys, foreign keys, and a cardinality specification on each foreign key

[Figure: database schema alongside the lifecycle: Repair Application, Application Review, Repairperson Assignment, On-site Repair, Post-repair Visit, Document Archive; activities are annotated with attribute reads (r) and writes (w): w(ID), w(Customer Name), r(Customer Address), w(Service ID), w(Repairperson Name), w(Repairperson Phone), w(Material ID), w(Material)]


Example: The Biz Entity


Tuple and (nested) set constructs


Example: Cross Reference Paths

aID : tRepair.tRepairID

aReason = aReason.aRepair Info.aID@tRepair(tRepairID).tReason

aCust Addr = aCust Addr.aCust Name.[aCust Last Name, aCust First Name]@tUser(tLastName, tFirstName).tAddress


More Cross Reference Paths

aServiceID : tServiceInfo.tServiceID when aServiceID.aService Info.aID = tServiceInfo.tRepairID SI

aTime = aTime.aServiceID@tServiceInfo(tServiceID).tTime

In summary, two kinds of mapping rules:

Key mapping rules: existentially quantified

Non-key mapping rules: access path with equality
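One way to read a non-key rule such as aReason = …aID@tRepair(tRepairID).tReason: take the entity's key attribute(s), find the table row whose key column(s) equal them, and return the target column. The sketch below evaluates rules in that reading. It flattens the nested entity into one dict and uses made-up rows, so it is an illustration of the idea rather than the rule language itself.

```python
def follow_path(entity, key_attrs, table, key_cols, target_col):
    """Evaluate a rule of the form  attr = [key_attrs]@table(key_cols).target_col
    by an equality lookup of the entity's key attributes in the table."""
    wanted = tuple(entity[a] for a in key_attrs)
    matches = [row for row in table if tuple(row[c] for c in key_cols) == wanted]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one row for key {wanted}, got {len(matches)}")
    return matches[0][target_col]


# Hypothetical rows; table and column names follow the slides.
tRepair = [{"tRepairID": 7, "tReason": "no heat"}]
tUser = [{"tLastName": "Doe", "tFirstName": "Jane", "tAddress": "12 Main St"}]

# Flattened entity: nesting (aRepair Info, aCust Name) is omitted for brevity.
entity = {"aID": 7, "aCustLastName": "Doe", "aCustFirstName": "Jane"}

a_reason = follow_path(entity, ["aID"], tRepair, ["tRepairID"], "tReason")
a_cust_addr = follow_path(entity, ["aCustLastName", "aCustFirstName"],
                          tUser, ["tLastName", "tFirstName"], "tAddress")
```

The lookup insists on exactly one matching row, which is what the key and foreign-key constraints on the database are meant to guarantee.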


Entity-Database Cover

An ED cover consists of one mapping rule for each primitive attribute in the biz entity

Key attributes use key mapping rules

Non-key attributes use equality access rules

Great news: DB accesses can be auto-generated

Workflow modifies its entity, DB hidden

Can every update on the DB be propagated to the entity?

Can every update on the entity be propagated to the DB?

[Figure: workflow and DB]
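A sketch of what "DB accesses can be auto-generated" could look like for a single non-key rule: the rule is stored as data, and the read and write statements are derived from it. The dict encoding of the rule and the helper names are assumptions for this example; the standard `sqlite3` module stands in for the enterprise database.

```python
import sqlite3

# One non-key mapping rule written as data (a hypothetical encoding of
# aReason = aID@tRepair(tRepairID).tReason).
RULE = {"attr": "aReason", "key_attr": "aID",
        "table": "tRepair", "key_col": "tRepairID", "col": "tReason"}


def read_sql(rule):
    # Identifiers come from the trusted rule; values are passed as parameters.
    return f"SELECT {rule['col']} FROM {rule['table']} WHERE {rule['key_col']} = ?"


def write_sql(rule):
    return f"UPDATE {rule['table']} SET {rule['col']} = ? WHERE {rule['key_col']} = ?"


def load(conn, rule, entity):
    """Fill the entity attribute from the database."""
    row = conn.execute(read_sql(rule), (entity[rule["key_attr"]],)).fetchone()
    entity[rule["attr"]] = row[0] if row else None


def store(conn, rule, entity):
    """Propagate the entity attribute back to the database."""
    conn.execute(write_sql(rule), (entity[rule["attr"]], entity[rule["key_attr"]]))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tRepair (tRepairID INTEGER PRIMARY KEY, tReason TEXT)")
conn.execute("INSERT INTO tRepair VALUES (7, 'no heat')")

entity = {"aID": 7}
load(conn, RULE, entity)            # the workflow sees only the entity
loaded_reason = entity["aReason"]
entity["aReason"] = "leaking pipe"
store(conn, RULE, entity)           # the change reaches the hidden DB
stored = conn.execute("SELECT tReason FROM tRepair WHERE tRepairID = 7").fetchone()[0]
```

Adding an activity to the workflow then means adding or reusing rules, not hand-writing new SQL, which is the rewriting cost observed in Story 2.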


Updatability

Database updatability: for each update $\Delta_d$ on $d$, there is an $e$ such that $e = m(\Delta_d(d))$

Entity updatability: for each update $\Delta_e$ on $e = m(d)$, there is a $d'$ such that $m(d') = \Delta_e(e)$

[Figure: the workflow sees the entity $e = m(d)$; the DB holds $d$; $m$ maps $d$ to $e$]


Updatability

Database updatability: for each update $\Delta_d$ on $d$, there is an update $\Delta_e$ such that $\Delta_e(m(d)) = m(\Delta_d(d))$

Entity updatability: for each update $\Delta_e$ on $m(d)$, there is an update $\Delta_d$ such that $m(\Delta_d(d)) = \Delta_e(m(d))$

[Figure: commuting diagram: $m$ maps $d$ to $e = m(d)$ and $\Delta_d(d)$ to $\Delta_e(m(d))$]
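Both definitions ask that the same square commutes: updating the database and then mapping gives the same entity as mapping first and then updating the entity. A toy check, assuming a mapping that projects one column and updates that overwrite one value (the table layout and function names are illustrative):

```python
def m(d):
    """Mapping from a database state to an entity: keep only the reason column."""
    return {rid: row["reason"] for rid, row in d.items()}


def delta_d(d, rid, reason):
    """A database update: change the reason of one row, leaving other columns alone."""
    new = {k: dict(v) for k, v in d.items()}
    new[rid]["reason"] = reason
    return new


def delta_e(e, rid, reason):
    """The corresponding entity update."""
    new = dict(e)
    new[rid] = reason
    return new


d = {7: {"reason": "no heat", "note": "called twice"}}

# The square commutes when both paths produce the same entity.
left = delta_e(m(d), 7, "leaking pipe")      # map, then update the entity
right = m(delta_d(d, 7, "leaking pipe"))     # update the DB, then map
```

Entity updatability is the harder direction because `delta_d` must be chosen so that data hidden by `m` (here the `note` column) is left in a sensible state.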


Entity Update & View Update

Database updatability: forward, can always be done

Entity updatability: backward, often not possible

Very closely related to the database view update problem [Bancilhon-Spyratos TODS 81]

View complement [BS81] [Lechtenbörger et al PODS 03]

Clean source [Dayal-Bernstein TODS 82] [Wang et al DKE 06]

Fortunate here:

Theorem: Every non-overlapping ED cover is entity updatable [Sun-S.-Wu-Yang ICDE ’14]


SeGA: A Service Wrapper/Mediator

[Sun-Xu-S.-Yang CoopIS ’12]

SeGA separates data from the execution engine

Serves as a mediator

[Figure: SeGA architecture; labels: Event Queue, Dispatcher, Artifact Repository, Barcelona Engine 1, …, Barcelona Engine n, EZ-Flow Engine 1, Incoming event, BP instance, Schema, Outgoing event]

1. SeGA receives incoming events

2. A dispatcher fetches the correlated BP instances according to the type of the incoming event

3. The dispatcher sends the incoming event, the BP instances, and their schemas to the corresponding engine

4. The engine then processes the incoming event, updates the BP instances, and sends outgoing events

5. The dispatcher retrieves the updated BP instances from the engine and stores them back to the repository

Possible only if the “footprints” of BP instances are disjoint
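The five steps can be sketched as a small dispatcher loop. This is not SeGA's code: `EchoEngine` is a hypothetical stand-in for a workflow engine (it is not the Barcelona or EZ-Flow API), and the event and instance dict layouts are assumptions made for illustration.

```python
from collections import deque


class EchoEngine:
    """Hypothetical stand-in for a workflow engine."""

    def process(self, event, instances, schema):
        updated = [dict(inst, last_event=event["name"]) for inst in instances]
        outgoing = [{"type": event["type"], "name": event["name"] + " handled"}]
        return updated, outgoing


class Dispatcher:
    def __init__(self, repository, schemas, engines):
        self.repository = repository   # instance id -> BP instance (a dict)
        self.schemas = schemas         # event type -> schema
        self.engines = engines         # event type -> engine
        self.queue = deque()           # incoming event queue
        self.outgoing = []

    def receive(self, event):
        # Step 1: incoming events are queued.
        self.queue.append(event)

    def run_once(self):
        event = self.queue.popleft()
        # Step 2: fetch the correlated BP instances for this event.
        instances = [i for i in self.repository.values()
                     if i["type"] == event["type"] and i["id"] in event["instance_ids"]]
        # Step 3: hand event, instances, and schema to the matching engine.
        engine = self.engines[event["type"]]
        # Step 4: the engine processes the event and returns updates and outgoing events.
        updated, outgoing = engine.process(event, instances, self.schemas[event["type"]])
        self.outgoing.extend(outgoing)
        # Step 5: store the updated instances back into the repository.
        for inst in updated:
            self.repository[inst["id"]] = inst


repo = {"r1": {"id": "r1", "type": "repair", "status": "reported"}}
dispatcher = Dispatcher(repo, {"repair": {"attributes": ["status"]}}, {"repair": EchoEngine()})
dispatcher.receive({"type": "repair", "name": "assign", "instance_ids": ["r1"]})
dispatcher.run_once()
```

Because the engine only ever sees the instances handed to it in step 3, the scheme is safe only when no two instances write the same data, which is the disjoint-footprint condition.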


Isolation of BP Instances

m is isolating if each update on a single entity (instance) does not affect the write (and/or read) attributes of other entity instances

Theorem: Isolation can be tested

Testing “conflicting” updates

EXPTIME with conditional updates

[Figure: snapshot, DB d, mapping m]

[Sun-S.-Wu-Yang ICDE ’14]
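A much weaker, snapshot-level version of the idea can be checked directly: compute the set of database cells each entity instance may write and look for overlaps. This is not the decision procedure behind the theorem (which reasons about the mapping for all databases); the footprint encoding, names, and sample data are assumptions for illustration.

```python
from itertools import combinations


def write_footprint(instance_id, rules, key_of):
    """Database cells (table, key, column) that one entity instance may write,
    derived from the (table, column) pairs of its mapping rules."""
    return {(table, key_of[instance_id][table], column) for table, column in rules}


def conflicting_pairs(footprints):
    """Pairs of instances whose write footprints share at least one cell."""
    return [(a, b) for a, b in combinations(sorted(footprints), 2)
            if footprints[a] & footprints[b]]


# Hypothetical rules and keys: each repair instance writes its repair reason
# and its customer's address.
rules = [("tRepair", "tReason"), ("tUser", "tAddress")]
key_of = {
    "repair-1": {"tRepair": 1, "tUser": "Jane Doe"},
    "repair-2": {"tRepair": 2, "tUser": "Jane Doe"},   # same customer row
    "repair-3": {"tRepair": 3, "tUser": "Li Wei"},
}
footprints = {i: write_footprint(i, rules, key_of) for i in key_of}
conflicts = conflicting_pairs(footprints)
```

Here two repair instances for the same customer both write that customer's address cell, so their footprints overlap and they cannot be handed to independent engines.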


Connecting Biz Entities and Databases

Fundamentals

What are these mappings? DB queries were phrased in the 1960s but not understood until [Chandra-Harel JCSS 79, Bancilhon-Paredaens IPL 79]

Updatability, what else?

Mapping languages

Design principles

Isolation: for lifecycles? Runtime mechanisms?

Data design completeness, needs ontology

Implementability: translating IOPEs on artifacts to the DB

Transactions

Workflow vs databases


Conclusions

Research on artifact BPs: need to look outside

Data is the enabler/destroyer

Holistic approaches including data and BPs can benefit practice, i.e., software design for enterprises

BPaaS requires independence of service and data management [S. ICSOC’12]

Need a new forum to explore holistic approaches