Scalable Trigger Processing Discussion of publication by Eric N. Hanson et al Int Conf Data Engineering 1999 CS561

Page 1: Scalable Trigger Processing

Scalable Trigger Processing

Discussion of publication by Eric N. Hanson et al.

Int Conf Data Engineering 1999

CS561

Page 2: Scalable Trigger Processing

Motivation

Triggers are popular for:
  Integrity constraint checking
  Alerting, logging, etc.

Commercial database systems:
  Limited triggering capabilities
  1 trigger per update type on a table; or at best 100.

But:
  Current technology doesn’t scale well
  And internet and web-based applications may need millions of triggers.

Page 3: Scalable Trigger Processing

An Example Trigger

Example “stock ticker notification”:
  Stock holding: 100*IBM
  Query: Inform an agent whenever the price of the stock holding crosses $10,000

create trigger stock-watch
from quotes q
on update(q.price)
when q.name = ‘IBM’ and 100*q.price > 10,000
do raise event ThresholdCrossed(100*q.price)

Note:
  We may need thousands or millions of such triggers
  A web interface may allow users to create such triggers

Page 4: Scalable Trigger Processing

What Next?

Problem description
TriggerMan system architecture
Predicate index
Trigger processing

Page 5: Scalable Trigger Processing

Problem Definition

Given: Relational DB, trigger statements, data stream
Find: Triggers corresponding to each stream item
Objective: Scalable trigger processing system

Assumptions:
  The number of distinct structures of trigger expressions is relatively small
  All trigger expression structures are small enough to fit in main memory

Page 6: Scalable Trigger Processing

The Problem, once more.

Requires millions of triggers (on huge data).

Steps for trigger processing:
  Event monitoring
  Condition evaluation
  Executing the triggered action

Response time for database operations is critical!

Page 7: Scalable Trigger Processing

Related Work

Range predicates, marking-based [Hans96b, Ston90] (large memory, complicated storage)
AI [Forg82, Mira87] (smaller rule set)
Parallel processing [Gupt89, Hell98]
Indexing
ECA model (not scalable)

Page 8: Scalable Trigger Processing

Overall Driving Idea

If a large number of triggers are created, then many of them have the same format.

Such triggers share the same expression signature; only the substituted parameters (constants) differ.

Group the predicates from trigger conditions into equivalence classes based on their expression signatures.

Store them in efficient main memory data structures.

Page 9: Scalable Trigger Processing

TriggerMan System

Page 10: Scalable Trigger Processing

Components

TriggerMan DataBlade (lives inside Informix)
Data sources
  Local/remote tables/streams; must capture updates and transmit them to TriggerMan (place in a queue)
TriggerMan client applications
  Create/drop triggers, etc.
TriggerMan driver
  Periodically invokes the TmanTest() function to perform condition testing and action execution.
TriggerMan console
  Direct user interaction interface for trigger creation, system shutdown, etc.

Page 11: Scalable Trigger Processing

TriggerMan Syntax

Trigger syntax

create trigger <triggerName> [in setName]
[optionalFlags]
from fromList
[on eventSpec]
[when condition]
[group by attributeList]
[having groupCondition]
do action

Page 12: Scalable Trigger Processing

Example : Salary Increases

Update Fred’s salary when Bob’s salary is updated

create trigger updateFred

from emp

on update (emp.salary)

when emp.name = ’Bob’

do execSQL ’update emp set salary=:NEW.emp.salary where emp.name=’’Fred’’’

Page 13: Scalable Trigger Processing

Example : Real Estate Database

“If a new house is added in a neighborhood that salesperson Iris represents, then notify her”

House (hno, address, price, nno, spno)
Salesperson (spno, name, phone)
Represents (spno, nno)
Neighborhood (nno, name, location)

create trigger IrisHouseAlert
on insert to house
from salesperson s, house h, represents r
when s.name = ‘Iris’ and s.spno = r.spno and r.nno = h.nno
do raise event NewHouseInIrisNeighborhood(h.hno, h.address)

Page 14: Scalable Trigger Processing

Trigger Condition Structure

Expression signature

An expression signature consists of:
  Data source ID
  Operation code, e.g. insert, delete, etc.
  Generalized expression (parameterized)

Example:
  FROM: data source: emp
  ON: event: update
  WHEN: boolean expression: emp.name = CONSTANT

Page 15: Scalable Trigger Processing

Condition structure (contd)

Steps to obtain the canonical representation of the WHEN clause:
  Translate the expression to CNF
  Group the conjuncts by the data sources they refer to

A selection predicate will be of the form:

(C11 OR C12 OR ...) AND ... AND (Ck1 OR ...),

where each Cij refers to the same tuple variable.

Each conjunct refers to zero, one, or more data sources.
Group conjuncts by the set of sources they refer to:
  If one data source, then selection predicate
  If two data sources, then join predicate
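The grouping step can be sketched in a few lines of Python. This is only an illustration, not TriggerMan's parser: it assumes the WHEN clause is already in CNF and given as a list of conjunct strings, and it finds tuple variables with a simple regular expression (so string constants containing dots would confuse it). The helper names are hypothetical.

```python
import re
from collections import defaultdict

def tuple_variables(conjunct):
    """Return the tuple variables (names before a dot) that a conjunct refers to."""
    return frozenset(re.findall(r"\b([A-Za-z_]\w*)\.[A-Za-z_]\w*", conjunct))

def group_conjuncts(conjuncts):
    """Group CNF conjuncts by the set of tuple variables they refer to."""
    groups = defaultdict(list)
    for conjunct in conjuncts:
        groups[tuple_variables(conjunct)].append(conjunct)
    return dict(groups)

def classify(variables):
    """One source: selection. Two sources: join (assumption: more than two also count as join)."""
    if len(variables) == 1:
        return "selection predicate"
    if len(variables) >= 2:
        return "join predicate"
    return "no data source"

# Conjuncts of the IrisHouseAlert WHEN clause from the real estate example.
iris_conjuncts = ["s.name = 'Iris'", "s.spno = r.spno", "r.nno = h.nno"]
for variables, group in group_conjuncts(iris_conjuncts).items():
    print(sorted(variables), classify(variables), group)
```

For the IrisHouseAlert condition this yields one selection predicate on s and two join predicates, on (s, r) and on (r, h).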

Page 16: Scalable Trigger Processing

Triggers for stock ticker notification

Create trigger T1 from stock when stock.ticker = ‘GOOG’ and stock.value < 500 do notify_person(P1)

Create trigger T2 from stock when stock.ticker = ‘MSFT’ and stock.value < 30 do notify_person(P2)

Create trigger T3 from stock when stock.ticker = ‘ORCL’ and stock.value < 20 do notify_person(P3)

Create trigger T4 from stock when stock.ticker = ‘GOOG’ do notify_person(P4)

Page 17: Scalable Trigger Processing

Expression Signature

Idea: Common structures in condition of triggers

Expression Signature: E1: stock.ticker = const1 and stock.value < const2

Expression Signature:

E2: stock.ticker = const3

Expression signature defines equivalence class of all instantiations of expression with different constants

Instances of E2:
  T4: stock.ticker = ‘GOOG’

Instances of E1:
  T1: stock.ticker = ‘GOOG’ and stock.value < 500
  T2: stock.ticker = ‘MSFT’ and stock.value < 30
  T3: stock.ticker = ‘ORCL’ and stock.value < 20
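A rough way to see how conditions collapse into equivalence classes is to replace every constant with a placeholder and group by the resulting string. The sketch below is an assumption-laden illustration: it covers only the generalized expression (not the data source ID or operation code), it recognizes only quoted strings and plain numbers as constants, and it numbers placeholders per condition, so E2 comes out as "stock.ticker = const1" rather than the slide's const3. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Constants are assumed to be single-quoted strings or plain numbers.
CONSTANT = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")

def signature(condition):
    """Replace each constant with a numbered placeholder; return (signature, constants)."""
    constants = []

    def to_placeholder(match):
        constants.append(match.group(0).strip("'"))
        return f"const{len(constants)}"

    return CONSTANT.sub(to_placeholder, condition), constants

triggers = {
    "T1": "stock.ticker = 'GOOG' and stock.value < 500",
    "T2": "stock.ticker = 'MSFT' and stock.value < 30",
    "T3": "stock.ticker = 'ORCL' and stock.value < 20",
    "T4": "stock.ticker = 'GOOG'",
}

# Equivalence classes: signature -> list of (trigger ID, constants).
classes = defaultdict(list)
for trigger_id, condition in triggers.items():
    sig, consts = signature(condition)
    classes[sig].append((trigger_id, consts))
```

Four triggers produce only two signatures here, which is the property the approach relies on: the number of classes stays small while the number of triggers grows.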

Page 18: Scalable Trigger Processing

What to do now

Since there are only a few distinct expression signatures, build data structures to represent them explicitly (in memory)

Create constant tables that store all different constants, and link them to their expression signature

Page 19: Scalable Trigger Processing

Main Structures

A-Treat network
  Network for trigger condition testing
  For a trigger to fire, all conditions must be true

Expression signature
  Common structure in a trigger
  E1: stock.ticker = const1 and stock.value < const2

Constant tables
  Constants for each expression signature

Page 20: Scalable Trigger Processing

A-Treat Network to represent a trigger

For each trigger condition, e.g. stock.ticker = const1 and stock.value < const2

[Figure: A-Treat network. The Root feeds two alpha nodes, one per predicate: Node 1 (stock.ticker = const1) and Node 2 (stock.value < const2).]

Page 21: Scalable Trigger Processing

Condition Testing

A-Treat network is a discrimination network for trigger condition testing.

For a predicate to be satisfied, all its conjuncts should be true.

This is checked using A-Treat network.

Page 22: Scalable Trigger Processing

A-Treat network (Hanson 1992)

Define rule SalesClerk

If emp.sal>30,000

And emp.dno=dept.dno

And dept.name=“sales”

And emp.jno=job.jno

And job.title=“clerk”

Then Action
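The SalesClerk rule can be read as three selection predicates plus two join predicates. The sketch below only mimics that shape: each selection filters its relation into a list (playing the role of an alpha node's memory), and the joins then combine the survivors. It is a simplification under stated assumptions: a real A-Treat network processes update tokens incrementally, whereas this recomputes everything from scratch, and the sample rows and function name are invented for illustration.

```python
def sales_clerk_matches(emp, dept, job):
    """Return (emp, dept, job) combinations satisfying the SalesClerk rule condition."""
    # Selection predicates, one filtered set per tuple variable.
    alpha_emp = [e for e in emp if e["sal"] > 30000]
    alpha_dept = [d for d in dept if d["name"] == "sales"]
    alpha_job = [j for j in job if j["title"] == "clerk"]

    # Join predicates: emp.dno = dept.dno and emp.jno = job.jno.
    return [
        (e, d, j)
        for e in alpha_emp
        for d in alpha_dept if e["dno"] == d["dno"]
        for j in alpha_job if e["jno"] == j["jno"]
    ]

# Invented sample data.
emp = [
    {"name": "Ann", "sal": 40000, "dno": 1, "jno": 10},
    {"name": "Bob", "sal": 25000, "dno": 1, "jno": 10},
    {"name": "Cy", "sal": 50000, "dno": 2, "jno": 10},
]
dept = [{"dno": 1, "name": "sales"}, {"dno": 2, "name": "toys"}]
job = [{"jno": 10, "title": "clerk"}, {"jno": 11, "title": "manager"}]

matches = sales_clerk_matches(emp, dept, job)
```

With this data only Ann qualifies: Bob fails the salary selection and Cy's department is not "sales".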

Page 23: Scalable Trigger Processing

Expression Signature Table

Ex. ID  Data Source  Signature Description  Constant Table  Number of Constants  Constant Organization
E1      stock        …                      const_e1        2                    Main memory
E2      stock        …                      const_e2        1                    Main memory

E1: stock.ticker = const1 and stock.value < const2
E2: stock.ticker = const3

Page 24: Scalable Trigger Processing

Constant Tables
Tables of constants in trigger conditions

const_e1
Ex. ID  Trigger ID  Constant 1  Constant 2  Next Node  Rest
E1      T1          GOOG        500         Node 2
E1      T2          MSFT        30          Node 2
E1      T3          ORCL        20          Node 2

T1: stock.ticker = ‘GOOG’ and stock.value < 500
T2: stock.ticker = ‘MSFT’ and stock.value < 30
T3: stock.ticker = ‘ORCL’ and stock.value < 20

const_e2
Ex. ID  Trigger ID  Constant 1  Next Node  Rest
E2      T4          GOOG        Null

T4: stock.ticker = ‘GOOG’
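As a rough picture of a constant table held in main memory, the sketch below stores the rows from the two tables above and adds a hash index on Constant 1 so that all rows for a given ticker are found in one lookup. The class and field names are hypothetical, and indexing on the first constant is an assumption made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ConstRow:
    expr_id: str
    trigger_id: str
    constants: tuple
    next_node: Optional[str]  # None stands for the "Null" entry

class ConstantTable:
    """In-memory constant table with a hash index on the first constant."""

    def __init__(self, name):
        self.name = name
        self.rows = []
        self._by_first = defaultdict(list)

    def add(self, row):
        self.rows.append(row)
        self._by_first[row.constants[0]].append(row)

    def lookup(self, value):
        """Return all rows whose Constant 1 equals value."""
        return list(self._by_first.get(value, []))

const_e1 = ConstantTable("const_e1")
const_e1.add(ConstRow("E1", "T1", ("GOOG", 500), "Node 2"))
const_e1.add(ConstRow("E1", "T2", ("MSFT", 30), "Node 2"))
const_e1.add(ConstRow("E1", "T3", ("ORCL", 20), "Node 2"))

const_e2 = ConstantTable("const_e2")
const_e2.add(ConstRow("E2", "T4", ("GOOG",), None))
```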

Page 25: Scalable Trigger Processing

Tables

Primary tables:
  trigger_set (tsID, name, comments, creation_date, isEnabled)
  trigger (triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …)

Trigger cache in main memory for recently accessed triggers.

Page 26: Scalable Trigger Processing

Predicate Index

Tables:
  expression_signature (sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization)
  const_tableN (exprID, triggerID, nextNetworkNode, const1, …, constK, restOfPredicate)

The root of the predicate index is linked to the data source predicate indices.
Each data source contains an expression signature list.
Each expression signature links to its constant table.
Index each expression on its most selective conjunct (test the rest on the fly).

Page 27: Scalable Trigger Processing

Predicate Index

Goal: Given an update, identify all predicates that match it.

[Figure: predicate index. The root uses hash(src-ID) to reach the predicate index of each data source.]

Page 28: Scalable Trigger Processing

Processing Trigger Definition

Parse the trigger and validate it
Convert the when clause to conjunctive normal form
Group the conjuncts by the distinct sets of tuple variables they refer to
Form a trigger condition graph, that is, an undirected graph with a node for each tuple variable and an edge for each join predicate
Build the A-Treat network
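The trigger condition graph from the steps above can be sketched as two dictionaries: one mapping each tuple variable (node) to its selection predicates, and one mapping each pair of tuple variables (edge) to its join predicates. This is a minimal illustration with hypothetical names; it assumes the conjuncts have already been grouped by tuple variable and rejects predicates over zero or more than two variables rather than modeling them.

```python
def build_condition_graph(tuple_vars, predicates):
    """Build an undirected condition graph.

    predicates is a list of (variables, text) pairs, where variables is the
    set of tuple variables the predicate refers to.
    """
    nodes = {v: [] for v in tuple_vars}  # node -> selection predicates
    edges = {}                           # frozenset({a, b}) -> join predicates
    for variables, text in predicates:
        variables = frozenset(variables)
        if len(variables) == 1:
            (v,) = variables
            nodes[v].append(text)
        elif len(variables) == 2:
            edges.setdefault(variables, []).append(text)
        else:
            raise ValueError(f"unsupported predicate: {text}")
    return nodes, edges

# IrisHouseAlert: from salesperson s, house h, represents r
nodes, edges = build_condition_graph(
    ["s", "h", "r"],
    [
        ({"s"}, "s.name = 'Iris'"),
        ({"s", "r"}, "s.spno = r.spno"),
        ({"r", "h"}, "r.nno = h.nno"),
    ],
)
```

For IrisHouseAlert the graph is a chain s, r, h: node s carries the only selection predicate, and the two edges carry the join predicates.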

Page 29: Scalable Trigger Processing

Processing trigger definition (2)

For each selection predicate:
  If a predicate with the same signature has not been seen before:
    Add the signature of the predicate to the list
    Add the signature to the expression_signature table
    If the signature has a constant placeholder in it, create a constant table for the signature
    Add the constants
  Else, if the predicate has constants, add a row to the constant table for the expression
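The registration procedure above might look roughly like the following in memory. This is a sketch under assumptions, not TriggerMan's code: the catalog class, the way signature IDs and constant table names are generated, and the row layout are all hypothetical, and "has a constant placeholder" is approximated by "the predicate carries at least one constant".

```python
class PredicateCatalog:
    """Toy stand-in for the expression_signature table and its constant tables."""

    def __init__(self):
        self.expression_signature = {}  # (data source, signature) -> metadata row
        self.const_tables = {}          # constant table name -> list of rows

    def add_selection_predicate(self, trigger_id, data_source, signature, constants):
        key = (data_source, signature)
        row = self.expression_signature.get(key)
        if row is None:
            # Signature not seen before: record it and create its constant table.
            sig_id = f"E{len(self.expression_signature) + 1}"
            table = f"const_{sig_id.lower()}" if constants else None
            row = {"sigID": sig_id, "dataSrcID": data_source,
                   "signatureDesc": signature, "constTableName": table}
            self.expression_signature[key] = row
            if table is not None:
                self.const_tables[table] = []
        if constants:
            # Known or new signature: add one row of constants for this trigger.
            self.const_tables[row["constTableName"]].append(
                {"exprID": row["sigID"], "triggerID": trigger_id,
                 "constants": tuple(constants)})
        return row["sigID"]

catalog = PredicateCatalog()
E1 = "stock.ticker = CONST and stock.value < CONST"
E2 = "stock.ticker = CONST"
sig_ids = [
    catalog.add_selection_predicate("T1", "stock", E1, ("GOOG", 500)),
    catalog.add_selection_predicate("T2", "stock", E1, ("MSFT", 30)),
    catalog.add_selection_predicate("T3", "stock", E1, ("ORCL", 20)),
    catalog.add_selection_predicate("T4", "stock", E2, ("GOOG",)),
]
```

Registering T1 to T4 creates two signature rows and two constant tables; T2 and T3 only add rows to the table created for T1.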

Page 30: Scalable Trigger Processing

Alternate Organizations

Storage for the expression signature’s equivalence class, ordered from most efficient to most scalable:
  Main memory lists
  Main memory index
  Non-indexed database table
  Indexed database table

For each expression signature, choose a structure depending on the number of triggers.
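Choosing an organization per expression signature could be as simple as comparing the size of its equivalence class against a few cut-offs. The thresholds below are purely illustrative assumptions (the slides give no numbers), and the function and enum names are hypothetical; only the four organizations and their ordering come from the list above.

```python
from enum import Enum

class Organization(Enum):
    MAIN_MEMORY_LIST = "main memory list"
    MAIN_MEMORY_INDEX = "main memory index"
    DB_TABLE = "non-indexed database table"
    INDEXED_DB_TABLE = "indexed database table"

# Illustrative cut-offs only: (largest class size, organization).
THRESHOLDS = [
    (16, Organization.MAIN_MEMORY_LIST),
    (10_000, Organization.MAIN_MEMORY_INDEX),
    (100_000, Organization.DB_TABLE),
]

def choose_organization(num_triggers):
    """Pick the first organization whose cut-off covers the class size."""
    for limit, organization in THRESHOLDS:
        if num_triggers <= limit:
            return organization
    return Organization.INDEXED_DB_TABLE
```

Small classes stay in cheap in-memory structures, while very large ones fall through to an indexed database table, trading lookup speed for capacity.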

Page 31: Scalable Trigger Processing

Processing update descriptors

On getting an update descriptor (token): (data source ID, operation code, old/new tuple)
  Locate the data source predicate index from the root of the predicate index.
  For each expression signature, find the constants matching the token using the index.
  Check additional predicate clauses against the token.
  When all predicate clauses of a trigger have matched, pin the trigger in main memory.
  Bring in the A-Treat network representing that trigger to process the remaining part of the trigger, such as joins.
  If the trigger condition is satisfied, execute the action.

Page 32: Scalable Trigger Processing

Processing an Update

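Update: Stock (ticker=GOOG, value=495)

[Figure: the Root of the predicate index leads to the predicate index for the stock data source (and to other source predicate indices). That index holds expression signatures E1 and E2, with an index on stock.ticker = const1; E1 links to constant table const_e1 and E2 links to const_e2.]

E1: stock.ticker = const1 and stock.value < const2

const_e1
Trigger ID  Constant 1  Constant 2  Next Node
T1          GOOG        500         Node 2
T2          MSFT        30          Node 2
T3          ORCL        20          Node 2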
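The selection-matching part of update processing, applied to the update Stock (ticker=GOOG, value=495), can be sketched as follows. Assumptions are labeled in the comments: the nested-dictionary layout of the predicate index is invented for the example, the operation code of the token is ignored, and the later steps (pinning the trigger, running the A-Treat network for joins, executing the action) are left out.

```python
import operator

# Hypothetical in-memory predicate index: data source -> signature -> constant table.
# Each constant table is indexed on the ticker constant (the indexed conjunct);
# "rest" describes the remaining clause that is checked on the fly.
PREDICATE_INDEX = {
    "stock": {
        "E1": {  # stock.ticker = const1 and stock.value < const2
            "indexed_attr": "ticker",
            "rest": ("value", operator.lt),
            "rows": {"GOOG": [("T1", 500)], "MSFT": [("T2", 30)], "ORCL": [("T3", 20)]},
        },
        "E2": {  # stock.ticker = const3
            "indexed_attr": "ticker",
            "rest": None,
            "rows": {"GOOG": [("T4", None)]},
        },
    }
}

def match_token(data_source, new_tuple):
    """Return IDs of triggers whose selection predicate matches the new tuple."""
    matched = []
    for sig in PREDICATE_INDEX.get(data_source, {}).values():
        key = new_tuple[sig["indexed_attr"]]
        for trigger_id, constant in sig["rows"].get(key, []):
            if sig["rest"] is None:
                matched.append(trigger_id)
            else:
                attr, compare = sig["rest"]
                if compare(new_tuple[attr], constant):
                    matched.append(trigger_id)
    return matched

matched = match_token("stock", {"ticker": "GOOG", "value": 495})
```

For this update the index lookup on GOOG reaches one row in each constant table; T1 also passes the remaining check 495 < 500, so both T1 and T4 match, while T2 and T3 are never examined.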

Page 33: Scalable Trigger Processing

Concurrency

Better scalability even on single processor

Page 34: Scalable Trigger Processing

Concurrency

Identified elements that can be parallelized:
  Token-level: multiple tokens processed in parallel
  Condition-level: multiple selection conditions tested concurrently
  Rule-action-level: multiple rule actions fired at the same time
  Data-level: set of data values in the network processed in parallel
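Token-level concurrency, the first item in the list above, can be illustrated with a thread pool that hands each token to a worker. This is a structural sketch only, not how TriggerMan schedules work: the trigger list and function names are hypothetical, and in CPython threads mainly pay off when token processing waits on I/O such as database access.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical triggers of the form: ticker = <name> and value < <limit>.
TRIGGERS = [("T1", "GOOG", 500), ("T2", "MSFT", 30), ("T3", "ORCL", 20)]

def process_token(token):
    """Test one token against every trigger condition; return matching trigger IDs."""
    ticker, value = token
    return [tid for tid, name, limit in TRIGGERS if name == ticker and value < limit]

def process_tokens(tokens, max_workers=4):
    """Process tokens concurrently; results come back in token order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_token, tokens))

tokens = [("GOOG", 495), ("MSFT", 31), ("ORCL", 19)]
results = process_tokens(tokens)
```

Because the trigger list is only read, the workers need no locking here; a real system would also have to coordinate updates to shared index structures.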

Page 35: Scalable Trigger Processing

Conclusion : Overall Key Points

If a large number of triggers are created, many of them have almost the same format

Group triggers with same structure together into expression signature equivalence classes

Number of distinct signatures is small enough to fit into main memory (index)

Develop a selection predicate index structure

Architecture to build a scalable trigger system