Distributed Cyber-Physical Systems

Tarek Abdelzaher
University of Illinois at Urbana-Champaign

5/28/2010

Where is Computer Science Research Going?

Core

The beginning: Centralized machines

Cyber-Physical (Distributed) Computing

Distributed Cyber-Physical Systems

For example:

In the US, the President's Council of Advisors on Science and Technology named systems that interact with the physical world the #1 research priority in the US.

Distributed Cyber-Physical Systems: The Next Frontier

[Figure: from today (clusters, farms, grids, WWW) to cyber-physical systems: processors embedded everywhere (transparent, context-aware, mobile, miniature, ubiquitous: smart attire, smart spaces, …). Surrounding themes: situation-aware distributed embedded systems, emergency response, autonomic computing, privacy, security.]

Wearable and Ambient Sensors: Opportunistic Context Measurement

[Figure: a biological system observed through biometric sensing, implantable medical devices and biosensors, and context and activity sensing (NEAT-factor, etc.). Logged data feeds personal data services, medical care services (care providers, physicians), and a community/social network (sanitized community data, comparative analysis). Feedback paths: direct control, bio-loops, context factors and bio-feedback, social factors.]

Personalized Healthcare

[Figure: human-centric sensing: point of care devices and bio-feedback sensors; implanted sensors (insulin pumps, pacemakers, glucose monitors, …); smart spaces, wearable activity and biometric monitoring; micro- and nano-sensors, biochips; transparent testing.]

Browsing the Physical World
Feng Zhao
http://atom.research.microsoft.com/sensormap/

Military Applications

[Figure: signal data, visual data, and human data from sensors, witnesses, sources, … and social networks; extract objects and linkages into an information network; deliver information with high QoI and quantifiable uncertainty for prioritized situation-awareness; optimize resource allocation, with feedback to sensor and communication networks.]

Confluence of Trends: The Overarching Challenge

Trend 1: Device/Data Proliferation (by Moore’s Law)
Trend 2: Integration at Scale (Isolation has cost)
Trend 3: Autonomy (Humans are not getting faster)

Distributed Cyber-Physical Systems: Interaction Challenges

Interaction Domains

In distributed cyber-physical systems, computation, communication and sensing interact in several domains:
• Part I: Temporal interactions (real-time systems)
• Part II: Spatial interactions (sensor networks)
• Part III: Social interactions (human-centric CPS)

Part I: Temporal Interactions (Distributed Real-time CPS)

Applications: Mission-critical Systems

Building Timely, Predictable, Reliable Systems

Temporal Analysis

Fundamental question: How to determine in a cyber-physical system that timing and throughput requirements are met?

Past, the focus of the real-time community (optimality, accuracy, moderate scale):
• Periodic tasks
• Known workload
• Small systems
• Microscopic execution models
• Complex analysis for non-trivial cases

Future, the needed solutions (simplicity, sufficiency, large scale):
• Aperiodically arriving tasks
• Largely unpredictable workload
• Large distributed systems
• Aggregate execution models
• Simple back-of-the-envelope analysis

(Courtesy of Lockheed Martin, 2002)

Motivating Application (US Navy)
Courtesy of Patrick Lardieri, ATL, Lockheed Martin

Moving from custom stovepipe infrastructures to a common, COTS-based infrastructure

FUTURE: Total Ship Computing (TSCE)
• 1,000s of computer nodes connected by standard/COTS middleware on distributed switched backplane
• N-version redundancy (no single failure point)
• Virtually unlimited growth capability
• Software replicated on many CPUs/nodes
• Essentially invulnerable to battle damage

UNDER DEVELOPMENT: Networked Processing (Aegis Baseline 7)
• Open HW + operating system (COTS/industry standards)
• Distributed LAN interconnects
• Redundancy plus reconfigurability
• Significant growth capability
• Software distribution possible
• Vulnerable to large scale damage
• Highly constrained by legacy stove-pipe systems

TODAY: Adjunct Processing (Aegis Baselines 5P3, 6)
• UYK-43s w/COTS processors
• Point-to-point interconnection
• Display LAN
• Limited growth capability
• Vulnerable to damage

Total Ship Computing Environment (TSCE) Vision
Courtesy of Patrick Lardieri, ATL, Lockheed Martin (2002)

TSCE Design Goals
• Use COTS infrastructure technology
• Enable plug-n-play component architecture

TSCE Benefits
• Improve performance
• Increase extensibility
• Break apart application stovepipes

TSCE Design Challenges
• Using shared computing and networking resources
• Satisfying a mix of hard and soft real-time performance requirements for periodic and aperiodic tasks
• Assuring mission and safety critical processing will always meet their deadlines

Radar Scheduling: Middleware and Applications

[Figure: threat discovery, target tracking, weapon control; communication and alarm handling in networked cooperative engagement]

• Unpredictable environment
• Time/QoS constraints
• Limited system capacity

Target Scenario

• A distributed system of multiple battleships
• A multifunction radar on each ship (node)
• A set of targets to be jointly tracked

Goal: Each target should be tracked by at least one node and at most K nodes.

[Figure: target negotiation and target assignment between nodes, with schedulability analysis on each node]

The First Prototype

Zumwalt Class Destroyer Total Ship Computing Environment

Challenge

Establish a new analytic foundation for robust timing guarantees in highly dynamic, largely unpredictable, time-critical software systems

Methodology

Develop a more robust schedulability theory
• A schedulability theory for aperiodic tasks (fewer rigid assumptions)
• Extend the theory to distributed systems
• Analyze end-to-end behavior of complex task graphs

Goal: Analyze Distributed Highly Coupled Systems

Complex timing behavior: Everything depends on everything

Historical Perspective: Utilization Bounds

• Consider a set of periodic tasks
• Each task invocation must finish by the end of its period (Period = Deadline)
• How to tell if all invocations will meet their deadlines?

Liu and Layland Utilization Bound

Well-known result (1973):
• Assume that each task $T_i$ executes for $C_i$ every period $P_i$.
• Processor utilization needed for this task is: $U_i = C_i/P_i$
• The task set is schedulable by an optimal fixed-priority scheduling policy if $\sum_i C_i/P_i < 0.69$ (more precisely $\ln 2 \approx 0.693$, the large-n limit of the bound $n(2^{1/n} - 1)$)
• Optimal fixed priority policy is rate-monotonic (higher rate = higher priority)
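
As a rough illustration of the utilization test, the following Python sketch computes the total utilization of a periodic task set and compares it with the exact n-task Liu and Layland bound, which approaches 0.69 for large n. The task values and function names are hypothetical, chosen only for this example.

```python
def utilization(tasks):
    """Total utilization sum(C_i / P_i) for a list of (C, P) pairs."""
    return sum(c / p for c, p in tasks)


def liu_layland_bound(n):
    """Liu and Layland bound n(2^(1/n) - 1); tends to ln 2 (about 0.693)."""
    return n * (2 ** (1.0 / n) - 1)


def rm_schedulable(tasks):
    """Sufficient (not necessary) test under rate-monotonic priorities."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical task set: (computation time C, period P), deadline = period.
tasks = [(1, 4), (1, 5), (2, 10)]
print(utilization(tasks), liu_layland_bound(len(tasks)), rm_schedulable(tasks))
```

The test is only sufficient: a task set that fails it may still be schedulable, which an exact per-task analysis would reveal.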

Real-Time Overview

[Figure, built up over several slides: taxonomy of scheduling results for real-time tasks.
Periodic tasks, Deadline = Period: Rate Monotonic (utilization bounds: classical, hyperbolic; optimality result) and EDF (bound; optimality result).
Periodic tasks, Deadline < Period: Deadline Monotonic (bound (poor); per-task tests: simple, recursive) and EDF (processor demand).
Aperiodic tasks, fixed-priority and dynamic-priority servers: Polling, Deferrable, Sporadic, Priority Exchange, Slack Stealing, Total Bandwidth, DPE, CBS, TBS+, IPE, EDL.
Final build: treat periodic tasks as aperiodic.]

Why An Aperiodic Theory for Real-Time Systems?

Reason #1: Aperiodic tasks are an increasing proportion of workload. They are no longer the “exception”.
• Consider a server serving randomly arriving requests
• Each request has a desired response time
• Can one invent an aggregate measurable utilization-like metric (we call it instantaneous utilization, U), such that all deadlines are met as long as U is below some threshold, Umax?
• Feasible region: 0 < U < Umax

Reason #2: Even systems where tasks arrive periodically suffer aperiodic artifacts if there are multiple execution stages.

[Figure: Stage 1, Stage 2, Stage 3 in series. Labels along the flow: periodic arrivals, no longer periodic, even less periodic, look completely aperiodic.]

Aperiodic Task and Instantaneous Utilization

Instantaneous utilization U(t) is a function of time, t

U(t) is defined over the current invocations (arrived but deadline has not expired):

$U(t) = \sum_i C_i/D_i$

[Figure: timeline of invocations with deadlines D1, D2, D3]
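
A minimal Python sketch of this definition follows. It sums C/D over invocations that have arrived but whose deadline has not yet expired. The tuple layout (arrival, C, D) and the sample values are assumptions made for illustration.

```python
def instantaneous_utilization(invocations, t):
    """U(t): sum of C/D over invocations with arrival <= t < arrival + D.

    invocations: list of (arrival_time, computation_time C, relative_deadline D).
    """
    return sum(c / d for arrival, c, d in invocations
               if arrival <= t < arrival + d)


# Hypothetical invocations: (arrival, C, D).
invocations = [(0.0, 1.0, 4.0), (1.0, 1.0, 2.0), (5.0, 2.0, 8.0)]

for t in (0.0, 1.5, 3.5, 4.5, 6.0):
    print(t, instantaneous_utilization(invocations, t))
```

Note that an invocation keeps contributing C/D until its deadline, even if it finishes executing earlier; that is what makes the metric easy to track at admission time.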

The Single Queue Problem: Relaxing the Periodicity Assumption

Consider processing requests that arrive at a resource queue in some aperiodic pattern
• Each request needs a processing time $C_i$
• Each request must meet a latency bound $D_i$

Is there a bound on $U(t) = \sum_i C_i/D_i$ (instantaneous utilization) of eligible requests such that they are schedulable if this bound is not exceeded?

Note: The average utilization of the resource (percentage non-idle time) is equal to the average of synthetic utilization U(t) (of accepted tasks)

Solution: A set of aperiodic tasks is schedulable using an optimal fixed-priority policy if:

$$U(t) \le 2 - \sqrt{2}$$

Allows reasoning about temporal correctness and real-time throughput of applications with irregular workloads

[Figure: cumulative utilization versus time, with an arrival curve and an expiration curve rising in steps of $C_i/D_i$; U(t) is marked between them]
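
One way such a bound could be used is as an admission test: accept a request only if the instantaneous utilization stays at or below the aperiodic bound, $2 - \sqrt{2} \approx 0.586$ (equivalently $1/(1+\sqrt{1/2})$). The sketch below is an illustration under that assumption; the class name, method names, and request values are hypothetical and not taken from the slides.

```python
import math

# Aperiodic utilization bound for an optimal fixed-priority policy.
APERIODIC_BOUND = 2 - math.sqrt(2)  # about 0.586


class AdmissionController:
    """Hypothetical utilization-based admission control for one resource."""

    def __init__(self, bound=APERIODIC_BOUND):
        self.bound = bound
        self.current = []  # (absolute_deadline, C/D) of admitted requests

    def utilization(self, now):
        # Drop requests whose deadline has expired, then sum what remains.
        self.current = [(dl, u) for dl, u in self.current if dl > now]
        return sum(u for _, u in self.current)

    def try_admit(self, now, c, d):
        """Admit a request with computation time c and relative deadline d."""
        u = c / d
        if self.utilization(now) + u <= self.bound:
            self.current.append((now + d, u))
            return True
        return False


ac = AdmissionController()
print(ac.try_admit(0, 1, 4), ac.try_admit(0, 1, 4), ac.try_admit(0, 1, 4))
```

Because the check is a single comparison per arrival, its cost does not grow with the complexity of the workload, which is the "back-of-the-envelope" property the slides argue for.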

Observation

The utilization bound for aperiodic tasks is smaller than the Liu and Layland bound for periodic tasks. Why?

Example

Consider two tasks

[Figure: two tasks on a timeline with intervals labeled D, D, and D/2; U(t) = 0.75]

Main Idea of Derivation

Minimize, over all arrival patterns z, the maximum $U_z(t)$ that precedes a latency violation, where $U_z(t) = \sum_i C_i/D_i$ (of eligible requests)

[Figure: $U_z(t)$ versus t, marking the maximum $U_z(t)$ and the latency violation]

Derivation

Observe that each request i contributes $C_i$ to the area under the $U_z(t)$ curve: it adds height $C_i/D_i$ for a duration $D_i$.

[Figure: $U_z(t)$ versus t, with one request shown as a block of width $D_i$ and height $C_i/D_i$; the maximum $U_z(t)$ and the latency violation are marked]

Corollary

The total area under the $U_z(t)$ curve is $\sum_i C_i$ over all arrived tasks*

*Note: that’s why the average U(t) is equal to the average resource utilization

A Geometric Interpretation

The sum $\sum_i C_i$ (area) must be at least equal to some deadline $D_n$ if latency violations occur

Minimize curve height given area = $D_n$

[Figure: $U_z(t)$ versus t with area $D_n$ under the curve before the latency violation; the horizontal extent is labeled $D_n$ and $< D_n$, and the maximum $U_z(t)$ is marked]

A Realizable Worst-Case Arrival Pattern

[Figure, built up over several slides: $U_z(t)$ curve with area = $D_n$, composed of tasks with computation time $C_i$ that each add height $C_i/D_n$; the curve falls with slope $1/D_n$. One label reads "Dmax=Dn-".]

How big are the individual tasks when the bound is minimized?

Basic Result

• Area cannot be rectangular because task departures gradually decrease utilization
• Proved that minimum-height curve is a trapezoid with slope $1/D_n$

[Figure: trapezoid of height U with slope $1/D_n$, over intervals labeled $D_n$ and $< D_n$]

$$U\,(2D_n + 2D_n - U D_n)/2 = D_n$$
$$U^2 - 4U + 2 = 0$$
$$U = 2 - \sqrt{2}$$

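Extensions

$$U(t) \le \frac{1}{1 + \sqrt{1/2}}$$

[Figure: trapezoid of height U with slope $1/D_n$]

Other scheduling policies:

• α is the urgency inversion (the minimum ratio of deadline of low priority task to deadline of high priority task which can preempt it).
• γ = max (Blocking/D) is the maximum time a task can be blocked (by a lower priority task) relative to its deadline.

[Equation: extended bound on U(t) in terms of α and γ; not recoverable from the extracted text]

[Figures: trapezoids with slope $1/D_n$ over intervals labeled $D_n$ and $< D_n/α$, one with an interval labeled $D_n − γ$]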

Multi-stage Resource Pipelines

Can one derive a utilization-based expression for schedulability of a pipeline?

[Figure: Machine 1 and Machine 2 in series, with utilizations U1 and U2]

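Multi-stage Resource Pipelines

[Figure: pipeline of Stage 1, …, Stage j−1, Stage j, Stage j+1, …, Stage N; each stage is a queue feeding a resource]

• Assume delay of task n on stage j does not exceed L
• The sum $\sum_i C_i$ under the utilization curve is at least L.

[Figure: utilization curve $U_j$ with slope $1/D_n$ over intervals labeled L and $D_n$]

The Stage Delay Theorem

Consider the worst case pattern:

[Figure: trapezoid of height $U_j$ with area = L and slope $1/D_n$, over intervals labeled L and $D_n$]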

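The Stage Delay Theorem

Consider the worst case pattern:

[Figure: trapezoid of height $U_j$ with area = L and slope $1/D_n$, over intervals labeled L and $D_n$]

$$U\,(2L + 2D_n - U D_n)/2 = L$$
$$U D_n\,(1 - U/2) = L\,(1 - U)$$
$$\frac{L}{D_n} = \frac{U\,(1 - U/2)}{1 - U}$$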

Constructing Feasible Regions: The Stage Delay Theorem

The Stage Delay Theorem: If the instantaneous utilization of resource i does not exceed $U_i$, then no task is queued on resource i under deadline monotonic scheduling for more than a fraction β of its end-to-end deadline, where:

$$\beta = \frac{U_i\,(1 - U_i/2)}{1 - U_i}$$

[Figure: resource i with utilization $U_i$ and delay fraction β; instantaneous utilization $U(t) = \sum_i C_i/D_i$ over invocations with deadlines D1, D2, D3]
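
The delay fraction in the theorem is easy to evaluate numerically. The Python sketch below (function name hypothetical) computes β for a given stage utilization and checks that β reaches 1 exactly at $U = 2 - \sqrt{2}$, which is where a single stage would consume the whole end-to-end deadline.

```python
import math


def stage_delay_fraction(u):
    """beta = U(1 - U/2)/(1 - U): worst-case fraction of the end-to-end
    deadline spent at a stage whose instantaneous utilization is at most u."""
    if not 0 <= u < 1:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    return u * (1 - u / 2) / (1 - u)


for u in (0.1, 0.2, 0.5, 2 - math.sqrt(2)):
    print(round(u, 4), round(stage_delay_fraction(u), 4))
```

The fraction grows faster than linearly in U, so the last few percent of utilization on a stage are the most expensive in terms of delay budget.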

Main Result: Schedulability of Resource Pipelines

Let U1, U2, …, Un be the instantaneous utilization values of n stages in a pipeline

[Figure: Stage 1, Stage 2, …, Stage n with utilizations U1, U2, …, Un]

All end-to-end deadlines are met if:

$$\sum_{i=1}^{n} \frac{U_i\,(1 - U_i/2)}{1 - U_i} \le 1$$
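
The pipeline condition sums the per-stage delay fractions $U_i(1 - U_i/2)/(1 - U_i)$ and requires the total to be at most 1. A small Python sketch of that check, with hypothetical function names and utilization values, could look like this:

```python
def stage_delay_fraction(u):
    """Per-stage term U(1 - U/2)/(1 - U) of the pipeline condition."""
    if not 0 <= u < 1:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    return u * (1 - u / 2) / (1 - u)


def pipeline_feasible(utilizations):
    """True if the sum of per-stage delay fractions does not exceed 1."""
    return sum(stage_delay_fraction(u) for u in utilizations) <= 1


print(pipeline_feasible([0.5, 0.2]))  # sum is 0.975
print(pipeline_feasible([0.5, 0.3]))  # sum is about 1.114
```

The condition defines a feasible region in the space of stage utilizations: any operating point inside it is safe, so load can be traded between stages as long as the sum stays below 1.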

Schedulability Regions of Arbitrary Task Graphs

Let U1, U2, …, Un be the synthetic utilization values of n servers S1, S2, …, Sn in a distributed system

Client requests traverse a task graph with multiple flows

Client requests meet their end-to-end deadlines if:

$$\max_{path} \sum_{S_i \in path} \frac{U_i\,(1 - U_i/2)}{1 - U_i} \le 1$$

[Figure: task graph with Path 1, Path 2, and Path 3]
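
For a task graph, the same per-server term $U_i(1 - U_i/2)/(1 - U_i)$ is summed along each path and the worst path is compared against 1. The sketch below assumes the paths are already enumerated as lists of server names; the server names, utilizations, and paths are hypothetical.

```python
def stage_delay_fraction(u):
    """Per-server term U(1 - U/2)/(1 - U)."""
    if not 0 <= u < 1:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    return u * (1 - u / 2) / (1 - u)


def graph_feasible(utilization, paths):
    """Return (feasible, worst_path_sum) over all enumerated paths.

    utilization: dict mapping server name -> synthetic utilization.
    paths: list of paths, each a list of server names.
    """
    worst = max(sum(stage_delay_fraction(utilization[s]) for s in path)
                for path in paths)
    return worst <= 1, worst


# Hypothetical servers and flows.
utilization = {"S1": 0.3, "S2": 0.2, "S3": 0.4, "S4": 0.1}
paths = [["S1", "S2"], ["S1", "S3"], ["S4", "S3"]]
print(graph_feasible(utilization, paths))
```

Only the most heavily loaded path matters for the test, so lightly loaded branches of the graph do not tighten the condition.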

Example

Is every task T schedulable in the pipeline below?

Task     Stage 1 Comp. Time   Stage 2 Comp. Time   Deadline
Task 1   0.1                  0.1                  1
Task 2   0.2                  0.1                  2
Task 3   1                    0.125                5
Task 4   2                    0.5                  20

Stage 1: U = 0.5, Stage 2: U = 0.2

0.5(1-0.25)/0.5 + 0.2(1-0.1)/0.8 = 0.75 + 0.225 = 0.975 < 1 OK!
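
The numbers in this example can be reproduced directly from the task table. The following Python sketch recomputes the two stage utilizations as sums of C/D and then evaluates the pipeline condition; the variable names are chosen for illustration. Evaluating the two terms gives 0.75 + 0.225 = 0.975, which is below 1.

```python
# Task table from the example: (stage 1 comp. time, stage 2 comp. time, deadline).
tasks = {
    "Task 1": (0.1, 0.1, 1),
    "Task 2": (0.2, 0.1, 2),
    "Task 3": (1, 0.125, 5),
    "Task 4": (2, 0.5, 20),
}


def stage_delay_fraction(u):
    """Per-stage term U(1 - U/2)/(1 - U)."""
    return u * (1 - u / 2) / (1 - u)


# Worst-case instantaneous utilization per stage: all four tasks current at once.
u1 = sum(c1 / d for c1, _, d in tasks.values())
u2 = sum(c2 / d for _, c2, d in tasks.values())
total = stage_delay_fraction(u1) + stage_delay_fraction(u2)

print(u1, u2, total, total <= 1)
```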

References

Utilization bound for aperiodic tasks:

Tarek Abdelzaher, Vivek Sharma, Chenyang Lu, “A Utilization Bound for Aperiodic Tasks and Priority Driven Scheduling,” IEEE Transactions on Computers, Vol. 53, No. 3, March 2004. Earlier version appeared in: Tarek Abdelzaher, Chenyang Lu, “Schedulability Analysis and Utilization Bounds for Highly Scalable Real-Time Services,” IEEE Real-Time Technology and Applications Symposium, Taipei, Taiwan, June 2001.

Extension to resource requirements:

Tarek Abdelzaher and Vivek Sharma, “A Synthetic Utilization Bound for Aperiodic Tasks with Resource Requirements,” 5th Euromicro Conference on Real-Time Systems, Porto, Portugal, July 2003.

Extension to pipelines:

Tarek Abdelzaher, Gautam Thaker, Patrick Lardieri, “A Feasible Region for Meeting Aperiodic End-to-end Deadlines in Resource Pipelines,” IEEE International Conference on Distributed Computing Systems, Tokyo, Japan, March 2004.

Capacity Planning for Real-Time Wireless Sensor Networks

• Seminal recent work established wireless network capacity bounds
• What if traffic has latency requirements and only bits that make it by the latency constraint are counted towards throughput?
• Problem: express real-time network capacity that quantifies the throughput of timely bits only as a function of network parameters and time constraints

[Figure: n nodes deployed over area S; label: sampling rate?]

Defining Real-time Network Capacity

Network bandwidth is the bottleneck

Intuitively, network ability to meet time constraints decreases with:
• Increased data size, C
• Increased distance between source and destination, L
• Decreased end-to-end latency constraint, D

Schedulability decreases with CL/D

Is there a bound $Capacity_{RT}$, such that all packets, i, reach their destinations by their deadlines if:

$$\sum_i \frac{C_i L_i}{D_i} \le Capacity_{RT}$$

The Single Hop Problem

Only one node in the vicinity of a receiver can send at a time

[Figure: nodes A and B within the receiver’s radio range, modeled as an equivalent virtual queue]

Throughput Optimization: Maximizing Real-Time Capacity

Consider a localized communication pattern where each node communicates with nodes at most N hops away.

On any path, maximize $\sum_j U_j$ subject to

$$\sum_{j=1}^{N} \frac{U_j\,(1 - U_j/2)}{1 - U_j} \le 1$$

From symmetry, $U_j = U$, so

$$\frac{U\,(1 - U/2)}{1 - U} = \frac{1}{N}$$

Hence,

$$U = 1 + \frac{1}{N} - \sqrt{1 + \left(\frac{1}{N}\right)^2}$$
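
The symmetric solution can be checked numerically. Setting every hop's utilization to U and requiring $N \cdot U(1 - U/2)/(1 - U) = 1$ gives the quadratic $U^2 - 2(1 + 1/N)U + 2/N = 0$, whose smaller root is $U = 1 + 1/N - \sqrt{1 + (1/N)^2}$. The Python sketch below (function names hypothetical) evaluates this root and confirms that it satisfies the constraint with equality.

```python
import math


def per_hop_utilization(n_hops):
    """Symmetric per-hop utilization U = 1 + 1/N - sqrt(1 + (1/N)^2)."""
    inv = 1.0 / n_hops
    return 1 + inv - math.sqrt(1 + inv * inv)


def stage_delay_fraction(u):
    """Per-hop term U(1 - U/2)/(1 - U)."""
    return u * (1 - u / 2) / (1 - u)


for n in (1, 2, 5, 10, 100):
    u = per_hop_utilization(n)
    # The sum over N identical hops should come out as 1.
    print(n, round(u, 4), round(n * stage_delay_fraction(u), 6))
```

For N = 1 the value reduces to $2 - \sqrt{2}$, the single-resource aperiodic bound, and for large N it approaches 1/N.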

Real-Time Capacity

The total capacity theorem: In a load-balanced network of n nodes, each with a radio of transmission speed W and m neighbors on average, if communication is localized within at most N hops:

$$Capacity_{Opt} = \frac{nW}{m}\left(1 + \frac{1}{N} - \sqrt{1 + \left(\frac{1}{N}\right)^2}\right)$$

For large N:

$$Capacity_{Opt} \approx \frac{nW}{mN}$$

Real-time Capacity of Multi-hop Data-Collection in Sensor Networks

In a data collection sensor network with K collection points, maximum path length N, and radio transmission speed W, what is a sufficient bound on real-time capacity?

[Figure: example network with K = 3, N = 2]

Data conservation constraints:

$$\sum_{j \in path} \frac{U_j\,(1 - U_j/2)}{1 - U_j} \le 1$$

[Equation: constraint relating $U_j$ to the hop index j; exact form not recoverable from the extracted text]

$$Capacity_{DC} = \frac{KNW}{1 + 0.5 \ln N}$$

In a data collection sensor network with K collection points, maximum path length N, and radio transmission speed W, a sufficient bound on real-time capacity is:

$$Capacity_{DC} = \frac{KNW}{2 + \ln N}$$

Extensions

• Contention due to nodes outside the receiver’s neighborhood cuts capacity in half (worst case).
• An interference region (area) $m_i$ that is larger than the reception region $m_r$ scales capacity down by $m_r/m_i$:

$$Capacity_{DC} = \frac{m_r}{m_i} \cdot \frac{KNW}{2 + \ln N}$$

Other considerations:
• MAC-layer overhead
• Different traffic prioritization policies

Evaluation: Effect of the Number of Sinks on Schedulability

Simulation versus analytic prediction of the onset of deadline misses in a 1600 node network

Evaluation: Effect of the Sensor Radio Radius on Schedulability

Simulation versus analytic prediction of the onset of deadline misses in a 1600 node network

Experimental Evaluation

Comparing empirically measured and theoretical real-time capacity on a testbed of Mica2 motes

TDMA-based MAC layer with locally prioritized queues

[Figure: bar chart of theoretical versus empirical real-time capacity for 9, 16, and 25 nodes; vertical axis from 0 to 14]

References

Real-time capacity:

Tarek F. Abdelzaher, Shashi Prabh, Raghu Kiran, "On Real-time Capacity Limits of Multihop Wireless Sensor Networks," IEEE Real-time Systems Symposium, Lisbon, Portugal, December 2004.

Methodology

Develop a more robust schedulability theory
• A schedulability theory for aperiodic tasks (fewer rigid assumptions)
• Extend the theory to distributed systems
• Analyze end-to-end behavior of complex task graphs

Towards a Theory for Temporal Composition

Kirchhoff's laws provide simple composition rules to reduce complex circuits to single components

Composition rules for distributed systems?

Delay? Capacity?

Analogy with circuit or control theory

Distributed Cyber-Physical Systems

Given two points, determine end-to-end schedulability properties
- Reduce system to equivalent single block

Main Results in Schedulability of Multistage Execution

[Figure: Task T traverses Machine 1, Machine 2, …, Machine n with utilizations U1, U2, …, Un]

$$\sum_{i=1}^{n} \frac{U_i\,(1 - U_i/2)}{1 - U_i} \le 1$$

Tight bound! Worst case scenario: cross traffic

What if all tasks follow the same multistage path?

[Figure: Task T and others all traverse Machine 1, Machine 2, …, Machine n with utilizations U1, U2, …, Un]

Better schedulability! Adversary has fewer degrees of freedom!

Main Results in Schedulability of Multistage Execution

A fundamental question: How does delay of a low priority task depend on execution times of higher priority tasks on each stage?

[Figure: Task T and others traverse Machine 1, Machine 2, …, Machine n with utilizations U1, U2, …, Un]

Delay Composition in Pipelined Execution

Many jobs, one stage: $Delay \propto \sum_{i \in jobs} C_{i,1}$

Many stages, one job: $Delay \propto \sum_{j \in stages} C_{1,j}$

Many jobs, many stages: Is $Delay \propto \sum_{i \in jobs} \sum_{j \in stages} C_{i,j}$?

[Figure: execution timelines of jobs J1, J2, J3 on a single stage; of one job J1 across stages S1, S2, S3; and of several jobs across stages S1, S2, S3]

Delay Composition

[Figure, built up over several slides: Tasks T1, T2, and T3 traverse Machine 1, Machine 2, …, Machine n; execution timelines are shown on Machine 1, Machine 2, and Machine 3]

Equivalent uniprocessor? Terms shown: $2\,C_{max}$ of T1, $2\,C_{max}$ of T2, and $\sum_j C_{max}$ of stage j.

Proof Outline

Proof by induction on task priority

Prove delay composition theorem for 2 task system; add a task of higher priority and show that theorem still holds

[Figure: two-task system; labels: sum of J2’s execution times, sum of J1’s execution times]

No. of terms in delay expression = No. of tasks + (No. of stages – 1)

Intuition: Two Task System (1/2)

Case 1: J2 arrives together with or before J1

N+1 stage computation times; one from each stage, except stage L which has two

Intuition: Two Task System (2/2)

Case 2: J2 arrives after J1

Let J2 preempt J1 at stage j; let stage L be the last stage where J1 waits for J2

Delay Composition in Pipelined Execution

$$Delay \le \sum_{\text{higher priority jobs } i} 2 \max_{\text{all stages } j} C_{i,j} \;+\; \sum_{\text{all stages } j} \; \max_{\text{higher priority jobs } i} C_{i,j}$$

Computation times $C_{i,j}$ of job i on stage j:

C1,1 C2,1 C3,1 C4,1
C1,2 C2,2 C3,2 C4,2
C1,3 C2,3 C3,3 C4,3
C1,4 C2,4 C3,4 C4,4

[Figure: Task Tm and others traverse Machine 1, Machine 2, …, Machine n; jobs 1 to 4 have computation times $C_{i,1}$ on Machine 1, $C_{i,2}$ on Machine 2, …, $C_{i,n}$ on Machine n]
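
The delay composition bound can be evaluated mechanically from a matrix of computation times. The Python sketch below assumes the preemptive form of the bound, in which each higher-priority job contributes twice its largest stage computation time and each stage contributes the largest computation time of any job on it. The matrix values are hypothetical, and which jobs belong in the matrix (for example, whether the job under analysis is included) is left to the caller; treat this as an illustration rather than the authors' exact formulation.

```python
def delay_bound(C):
    """Delay composition bound for a pipeline.

    C[i][j] is the computation time of job i on stage j.
    Returns sum_i 2*max_j C[i][j] + sum_j max_i C[i][j].
    """
    n_stages = len(C[0])
    # Job-additive part: two times each job's largest stage computation time.
    per_job = sum(2 * max(row) for row in C)
    # Stage-additive part: the largest computation time on each stage.
    per_stage = sum(max(row[j] for row in C) for j in range(n_stages))
    return per_job + per_stage


# Hypothetical matrix: 2 jobs, 3 stages.
C = [
    [1, 2, 1],
    [3, 1, 1],
]
print(delay_bound(C))
```

The bound grows with the number of jobs plus the number of stages, not with their product, which is the sub-additive behaviour that makes a reduction to an equivalent uniprocessor possible.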

Properties

● Worst case delay of a job is sub-additive in the execution times of higher priority jobs that share its path
● Results applicable to both periodic and aperiodic tasks
● Results applicable under any scheduling policy where job prioritization is consistent across stages

[Figure: Task T and others traverse Machine 1, Machine 2, …, Machine n with utilizations U1, U2, …, Un]

$$Delay \le \sum_{\text{higher priority jobs } i} 2 \max_{\text{all stages } j} C_{i,j} \;+\; \sum_{\text{all stages } j} \; \max_{\text{higher priority jobs } i} C_{i,j}$$

An Idea:

● Can we create an artificial uni-processor task set for which Task T has the same delay as that above?
● Hint: Express the computation time of each “uni-processor” task in terms of the pipeline tasks

Reduce pipeline to equivalent uniprocessor!

A Recent Result: Delay Composition Algebra

A compositional algebraic framework to analyze timing issues in distributed systems
• Operands represent workload on composed sub-systems
• Set of operators to systematically transform distributed real-time task systems to uniprocessor task-systems
• Traditional uniprocessor analysis can be applied to infer schedulability of distributed tasks

Delay Composition Algebra

Two operators that apply to node workloads while simplifying the resource graph

W = PIPE (W1, W2)

W = SPLIT (W1)

[Figure, built up over several slides: successive PIPE and SPLIT reductions of a resource graph]

The Workload Matrix

Each cell i,j holds the delay that job i imposes on job j in the subsystem that the matrix represents.

Example: System consisting of four jobs (J1-J4) of which only J1, J2, and J4 execute on the stage

The Operators

Schedulability Analysis Results

Example

[Figure: resource graph with stages S1, S2, S3, S4, S5, S6, S7, S8]

T1: P1 = D1 = 10 units
T2: P2 = D2 = 20 units
T3: P3 = D3 = 20 units

Every job requires one unit at each stage

Example contd.

Step 1: A1 PIPE A3 = A3’
Step 2: A2 PIPE A3’ = A3’’
Step 3: A7 PIPE A8 = A7’
Step 4: A4 PIPE A5 = A4’
Step 5: A6 PIPE A7’ = A6’
Step 6: SPLIT(A3’’) => A31, A32
Step 7: A31 PIPE A4’ = A4’’
Step 8: A32 PIPE A6’ = A6’’
Step 9: A4’’ PIPE A6’’ = A final

SPLIT Operator:
- Replicate matrix and zero out columns corresponding to jobs that don’t follow arc in question
- Replace (qi,k, ri,k) with (0, qi,k + ri,k), if Ji and Jk follow different arcs (in the output matrix containing Jk)

[Figures: the resource graph after each step, with merged stages S3’, S3’’, S7’, S4’, S6’, S31, S32, S4’’, and Sfinal. The PIPE operator definition appeared as an image and is not recoverable from the extracted text.]

Schedulability test for T3

• Create T1* with comp. time 4 (two times the value in T1’s row and T3’s column), period 10;
• T2*: Comp. time 2 units, period 20;
• T3*: Comp. time 1+5 = 6 units (adding the stage additive component), period 20;
• Apply uniprocessor test: response time analysis
• T3*’s delay computed as 16 units < period; Schedulable
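
The last step can be reproduced with the standard fixed-point iteration for uniprocessor response time analysis. The Python sketch below assumes the transformed tasks are listed in priority order (T1*, T2*, T3*, with T3* lowest) and that each deadline equals its period; the function name is hypothetical. For the values above the iteration goes 6, 12, 16, 16 and so converges at 16 units.

```python
import math


def response_time(tasks, index):
    """Worst-case response time of tasks[index] under fixed priorities.

    tasks: list of (computation_time, period), highest priority first.
    Returns None if the iteration exceeds the task's period (deadline).
    """
    c, period = tasks[index]
    r = c
    while True:
        # Own computation plus interference from all higher-priority tasks.
        nxt = c + sum(math.ceil(r / p) * ci for ci, p in tasks[:index])
        if nxt == r:
            return r
        if nxt > period:
            return None
        r = nxt


# Transformed uniprocessor task set from the example: T1*, T2*, T3*.
tasks = [(4, 10), (2, 20), (6, 20)]
print(response_time(tasks, 2))
```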

Example contd.

Handling Non-Acyclic Graphs

• Idea: Modify non-acyclic graph into an acyclic graph
• CUT operator: ‘cut’ one of the arcs forming the cycle; each job crossing the arc that is cut is replaced by two independent jobs, one before the cut and one after
• CUT only relaxes precedence constraints on arrival times of jobs, allowing jobs to arrive in a manner that can cause worst case delay; this reduces schedulability, so the transformation errs on the safe side

Conclusions

• Presented a delay composition rule for pipelined systems
• Leads to a reduction of the pipeline to single-stage systems
• Can now apply well known uniprocessor schedulability analysis to analyze pipelines
• Presented Delay Composition Algebra, for composing together resource stages in real-time distributed systems, under both preemptive and non-preemptive scheduling
• Operators and operands to successively reduce distributed system to a single equivalent uniprocessor for the purpose of schedulability analysis
• Future work: develop a complete calculus for modeling and analyzing real-time distributed systems

Placement with Previous Theoretic Foundations

Network Calculus
• deterministic and stochastic versions
• network-traffic oriented
• assumptions on traffic arrivals
• generally accurate analysis

Real-Time Scheduling Theory
• deterministic analysis
• mostly for periodic tasks
• hard to analyze aperiodic distributed systems
• detailed low-level models
• accurate analysis

Feasible Region Analysis
• deterministic analysis
• aperiodic tasks
• very simple analysis
• high-level models
• only sufficient conditions

Queueing Theory
• probabilistic analysis
• known distribution assumptions
• hard to analyze (e.g., correlated traffic)
• accurate analysis

Real-Time Queueing Theory
• probabilistic analysis of miss ratio
• liquid task model only
• accurate analysis

Towards Sufficient Simplicity

Current analysis techniques are: exact, complex, resource optimal

Envisioned analysis techniques should be: less error prone (simplicity), more scalable (simplicity), and safe (sufficiency)

Feasible Region Analysis
• deterministic analysis
• aperiodic tasks
• very simple analysis
• high-level models
• only sufficient conditions
• the safe “quick-and-dirty” solution

References

Delay Composition Algebra

Praveen Jayachandran and Tarek Abdelzaher, “A Delay Composition Theorem for Real-Time Pipelines,” Euromicro Conference on Real-Time Systems, Pisa, Italy, July 2007.

Praveen Jayachandran and Tarek Abdelzaher, “Delay Composition in Preemptive and Non-preemptive Real-Time Pipelines,” Journal of Real-Time Systems, Volume 40, Number 3, 2008.

Praveen Jayachandran and Tarek Abdelzaher, “Transforming Distributed Acyclic Systems into Equivalent Uniprocessors Under Preemptive and Non-Preemptive Scheduling,” ECRTS, Prague, Czech Republic, July 2008.

Praveen Jayachandran and Tarek Abdelzaher, “Delay Composition Algebra: A Reduction-based Schedulability Algebra for Distributed Real-Time Systems,” IEEE Real-time Systems Symposium, Barcelona, Spain, December 2008.

Praveen Jayachandran and Tarek Abdelzaher, “End-to-End Delay Analysis of Distributed Systems with Cycles in the Task Graph,” Euromicro Conference on Real-Time Systems, Dublin, Ireland, July 2009.

Part II: Spatial Interactions (Distributed Sensing Systems)

Tarek Abdelzaher, University of Illinois at Urbana-Champaign

Sensor Networks

Applications
• Precision Agriculture
• Habitat Monitoring
• Infrastructure Protection
• Disaster Response
• Target Tracking
• Border Control

Features
• Ad hoc deployment
• Massive distribution
• Interaction with a physical environment
• Unattended operation


A Fundamental Challenge: Sensor Network Application Development

Cost of sensor network software will dominate total system cost

Example: U. Virginia’s VigilNet
• Cost of hardware: $10K (100 nodes * $100/node at scale)
• Cost of software development/debugging/testing: $120K (5 graduate students * 20 weeks * 40 hrs * $30/hr)
• Hardware cost is decreasing but programmers’ cost isn’t

Reducing software cost
• Development cost: reusable components, high-level abstractions
• Debugging cost: automated checking and analysis tools
• Testing cost: realistic simulation/emulation environments

Typical Sensor Network Application Developer

“Can you implement the protocol on MicaZ?”
“Sure. It will be done in 3 days.”

“Ok! Coding complete. Let’s test the application on 2 motes.”

“Hurray! It works for 2 motes! Lights are blinking!”

“Ok! It’s time to show off. Let’s run the application on 30 motes.”

“What? What happened? Everything is sooo… dead!”

“Think hard! What could possibly be wrong?”

“What? Still dead? I thought I fixed all bugs!”

“I thought you said 3 days!”
“It works on 3 nodes!”

Debugging in Sensor Networks

Many clever debugging techniques have been proposed to increase visibility into state

They help programmers step through distributed code, narrow down errors to code snippets (e.g., a bad pointer reference), use high-level source, etc.

We would like to complement those with a diagnostic capability

Initial Thoughts

Log state from multiple runs

Label runs as “good” or “bad” depending on whether errors were manifested

Use a classifier to infer a predicate on state that distinguishes “good” from “bad” runs


Case Study: Envirotrack

Envirotrack
• Distributed target tracking protocol
• Assigns unique leader
• Members send data to leader

Target movement causes leader handoff

Sometimes the protocol resulted in more than one “detected target” for one physical target.
• Why?

Good run = no spurious targets

Bad run = spurious targets

Logged state: all message headers

Primary cause of failure: no member-to-leader messages (singleton group)

Uncovered bug: singleton groups are not addressed
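The classification idea used in this case study can be sketched as follows. This is a minimal illustration, assuming each run is summarized as a dictionary of numeric counters derived from the logged message headers. The decision stump and the counter names (`member_to_leader_msgs`, `handoffs`) are hypothetical and are not the classifier or features of the original tool.

```python
def best_stump(runs, labels):
    """Find the single-feature threshold predicate that best separates runs.

    runs   : list of dicts mapping a state variable name to a number
    labels : list of "good"/"bad" strings, one per run
    Returns (feature, threshold, accuracy) for the predicate
    `run[feature] <= threshold  =>  bad`.
    """
    best = None
    for feature in runs[0]:
        for threshold in sorted({run[feature] for run in runs}):
            correct = sum(
                (run[feature] <= threshold) == (label == "bad")
                for run, label in zip(runs, labels)
            )
            accuracy = correct / len(runs)
            if best is None or accuracy > best[2]:
                best = (feature, threshold, accuracy)
    return best


# Hypothetical per-run counters (illustrative values, not measured data).
runs = [
    {"member_to_leader_msgs": 12, "handoffs": 3},
    {"member_to_leader_msgs": 9, "handoffs": 4},
    {"member_to_leader_msgs": 0, "handoffs": 3},
    {"member_to_leader_msgs": 0, "handoffs": 5},
]
labels = ["good", "good", "bad", "bad"]

feature, threshold, accuracy = best_stump(runs, labels)
print(f"bad if {feature} <= {threshold} (accuracy {accuracy:.2f})")
```

A single-feature stump is chosen here only because the predicate it produces is directly readable; any classifier that yields an interpretable predicate would serve the same purpose.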

However… Success was Limited

Cause of failure is often distributed across multiple components rather than local to a single one

No single component is to blame

No predicate over current state explains failure

Unexpected sequences of events lead to problems

Troubleshooting requires correlation of distributed sequences of events with manifestations of anomalies


A “Kitchen” Analogy

• Prepare the chicken, onions…
• Add salt
• Add seasoning
• Marinate the chicken
• Put in the oven (3.00 pm)
• It’s time to stop the oven (4.00 pm)
• Delicious chicken

A “Kitchen” Analogy

• Prepare the chicken, onions…
• Add salt
• Add seasoning
• Marinate the chicken
• Put in the oven (3.00 pm)
• It’s time to stop the oven (4.00 pm)
• It’s daylight saving time, set the clock one hour behind
• Who burnt the chicken?

Target: Intermittent error manifestations due to interactive complexity


Main Idea

Run the system multiple times and log key runtime events

Sometimes it works, sometimes it fails

Find the frequent sequences of events when the system worked

Find the frequent sequences of events when the system failed

Contrast the two to identify the “culprit” sequences of events (correlated with failure)
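The contrast step can be sketched in a few lines. This is a simplified illustration, not the tool's algorithm: it assumes patterns are contiguous runs of events and that support is the number of logs containing a pattern. The event names and the `min_support` value are made up for the example.

```python
from collections import Counter


def frequent_patterns(logs, length, min_support):
    """Contiguous patterns of `length` events found in at least
    `min_support` logs (each log counts a pattern at most once)."""
    counts = Counter()
    for log in logs:
        seen = {tuple(log[i:i + length]) for i in range(len(log) - length + 1)}
        counts.update(seen)
    return {pattern for pattern, n in counts.items() if n >= min_support}


def culprits(good_logs, bad_logs, length=2, min_support=2):
    """Patterns that are frequent in bad logs but not frequent in good logs."""
    good = frequent_patterns(good_logs, length, min_support)
    bad = frequent_patterns(bad_logs, length, min_support)
    return bad - good


# Hypothetical logs: two runs that worked and two that failed.
good_logs = [["send", "ack", "sleep"], ["send", "ack", "sleep"]]
bad_logs = [["send", "timeout", "send"], ["send", "timeout", "sleep"]]

print(culprits(good_logs, bad_logs))
```

Swapping the arguments gives the complementary set: patterns frequent only when the system worked, which can point to a step that is missing in failed runs.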


How do we find frequent sequences of events?


Main Idea of Sequence Mining

Logged sequence of events: X,A,B,C,A,B,D,A,C,B,D,A,C,B

Assume “frequent” means the pattern occurs at least 3 times

Candidate patterns of length 1: A=4, B=4, C=3, D=2, X=1

Frequent patterns (L1): A=4, B=4, C=3

Candidate patterns of length 2 (L1 x L1): AB=4, AC=3, BA=3, BC=3, CA=2, CB=3, CC=1, AA=2, BB=2

Frequent patterns (L2): AB=4, AC=3, BA=3, BC=3, CB=3

Candidate patterns of length 3 (L2 x L1): ABC, BAC

Frequent patterns: Empty
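The first two levels of this walkthrough can be reproduced with a short sketch. The slides do not state how support is counted, so the rule below is an assumption: a pattern is matched as a (possibly non-contiguous) subsequence, scanning left to right, and a new occurrence starts only after the previous one completes. With that rule the length-1 and length-2 counts match the numbers on the slide; it is not guaranteed to match the slide's length-3 result.

```python
from itertools import product


def support(pattern, log):
    """Count non-overlapping, left-to-right occurrences of `pattern`
    as a subsequence of `log` (assumed counting rule)."""
    count = 0
    i = 0  # index of the next pattern symbol to match
    for event in log:
        if event == pattern[i]:
            i += 1
            if i == len(pattern):
                count += 1
                i = 0
    return count


log = "X,A,B,C,A,B,D,A,C,B,D,A,C,B".split(",")
MIN_SUPPORT = 3

# Level 1: count every distinct event, keep the frequent ones.
c1 = {s: support((s,), log) for s in sorted(set(log))}
l1 = [s for s, n in c1.items() if n >= MIN_SUPPORT]

# Level 2: candidates are L1 x L1, then filter by support again.
c2 = {a + b: support((a, b), log) for a, b in product(l1, repeat=2)}
l2 = {p: n for p, n in c2.items() if n >= MIN_SUPPORT}

print(c1)
print(c2)
print(l2)
```

Building level-2 candidates only from frequent level-1 events is what keeps the search tractable: an infrequent event such as D or X can never be part of a frequent longer pattern.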


How it Works?

[Figure: diagnostic pipeline]
• Application: execute n times
• Log collection front end: produces N log files
• Data preprocessing middleware: labels logs using a high-level “bad” behavior metric (Delay > 5 ms => Bad)
• Good logs 1…K and bad logs 1…L
• Frequent sequence generation algorithm: frequent sequences in the set of good logs, and frequent sequences in the set of bad logs
• Remove common sequences
• Output: frequent sequences only in the set of good logs, and frequent sequences only in the set of bad logs
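The labeling step of this pipeline can be sketched as a simple partition of logs by a user-supplied metric. The log record layout (`name`, `delay_ms`) and the sample values are assumptions made for illustration; only the “Delay > 5 ms => Bad” rule comes from the slide.

```python
def label_logs(logs, is_bad):
    """Split logs into (good, bad) according to a user-supplied predicate."""
    good, bad = [], []
    for log in logs:
        (bad if is_bad(log) else good).append(log)
    return good, bad


def max_delay_ms(log):
    """Largest delay recorded in one log (hypothetical record layout)."""
    return max(event["delay_ms"] for event in log)


# Two hypothetical logs, each a list of event records.
logs = [
    [{"name": "msgSent", "delay_ms": 1.2}, {"name": "ackReceived", "delay_ms": 0.8}],
    [{"name": "msgSent", "delay_ms": 7.5}, {"name": "ackReceived", "delay_ms": 0.9}],
]

# The slide's example metric: a delay above 5 ms marks the run as bad.
good, bad = label_logs(logs, lambda log: max_delay_ms(log) > 5)
print(len(good), "good log(s),", len(bad), "bad log(s)")
```

Keeping the metric as a plain predicate means the same pipeline can be reused for crashes, dropped packets, or any other application-level symptom.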

Set of Data Collection Front-Ends
• Front-End I: Passive Listener
• Front-End II: Runtime Logging
• Front-End III: Diagnostic Simulation


Extension – I: Preventing generation of out-of-order patterns in loops

Extension – I contd.

Good Log
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …

Bad Log
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …
Radio Receive pkt, Signal new data, App read pkt from buffer, Write data to kernel buffer, App Process Pkt, ** CRASH **
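The problem this extension addresses can be shown with a small sketch. When a log contains a loop, the out-of-order pair (read, then write) also appears in the good log if a match is allowed to span two iterations, so it would wrongly be discarded as common to both logs. The slides do not say how the tool prevents this; splitting the log at a loop-start event, as done below, is one hypothetical way to do it.

```python
def is_subsequence(pattern, events):
    """True if `pattern` occurs in `events` in order (gaps allowed)."""
    remaining = iter(events)
    return all(p in remaining for p in pattern)


def split_iterations(log, start_event):
    """Cut a log into loop iterations, each beginning at `start_event`."""
    iterations, current = [], []
    for event in log:
        if event == start_event and current:
            iterations.append(current)
            current = []
        current.append(event)
    if current:
        iterations.append(current)
    return iterations


def occurs_within_one_iteration(pattern, log, start_event):
    return any(is_subsequence(pattern, part)
               for part in split_iterations(log, start_event))


START = "Radio Receive pkt"
NORMAL = [START, "Signal new data", "Write data to kernel buffer",
          "App read pkt from buffer", "App Process Pkt"]
SWAPPED = [START, "Signal new data", "App read pkt from buffer",
           "Write data to kernel buffer", "App Process Pkt"]

good_log = NORMAL + NORMAL
bad_log = NORMAL + SWAPPED
pattern = ("App read pkt from buffer", "Write data to kernel buffer")

print(is_subsequence(pattern, good_log))                      # match spans two iterations
print(occurs_within_one_iteration(pattern, good_log, START))  # no match inside one iteration
print(occurs_within_one_iteration(pattern, bad_log, START))   # match inside the faulty iteration
```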

Extension – II: A subsequence of a good thing is not always good


Extension – II contd.

Good Log
Enable Radio, Message Sent, Ack Received, Disable Radio, …
Enable Radio, Message Sent, Ack Received, Disable Radio, …

Bad Log
Enable Radio, Message Sent, Ack Received, Disable Radio, …
Enable Radio, Message Sent, Disable Radio, … ** Crash **

Rule for subsequence elimination:

If a subsequence has the same support as a larger sequence, “eliminate” the subsequence
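The elimination rule can be sketched directly on the radio example. The `support` function uses an assumed counting rule (non-overlapping, left-to-right subsequence matches), which the slides do not specify; the rest follows the rule as stated.

```python
def support(pattern, log):
    """Non-overlapping, left-to-right subsequence occurrences (assumed rule)."""
    count, i = 0, 0
    for event in log:
        if event == pattern[i]:
            i += 1
            if i == len(pattern):
                count, i = count + 1, 0
    return count


def is_subsequence(short, long):
    remaining = iter(long)
    return all(item in remaining for item in short)


def eliminate_subsequences(patterns):
    """Drop any pattern that is a subsequence of a longer pattern
    with the same support. `patterns` maps pattern tuple -> support."""
    kept = {}
    for p, s in patterns.items():
        absorbed = any(len(q) > len(p) and t == s and is_subsequence(p, q)
                       for q, t in patterns.items())
        if not absorbed:
            kept[p] = s
    return kept


FULL = ("Enable Radio", "Message Sent", "Ack Received", "Disable Radio")
NO_ACK = ("Enable Radio", "Message Sent", "Disable Radio")

good_log = list(FULL) + list(FULL)
bad_log = list(FULL) + list(NO_ACK)

good = eliminate_subsequences({p: support(p, good_log) for p in (FULL, NO_ACK)})
bad = eliminate_subsequences({p: support(p, bad_log) for p in (FULL, NO_ACK)})

print(good)  # only the full sequence survives in the good log
print(bad)   # the no-ack sequence survives in the bad log
```

Without the rule, the no-ack sequence would be counted in the good log as well (it is a subsequence of every full exchange) and would be removed as common to both logs. After elimination it remains only on the bad side.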

Finding infrequent events with frequent side effects?


Extension – III: Two-stage mining

How Does Two-stage Mining Work?

Bad Log
…
msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed,
flashWriteDone, msgSent, ackReceived,
msgReceived, ackSent, oldDataDeleted, flashWriteDone, temperatureSensed,
msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed,
newDataEntered, newNeighborFound, msgDropped, msgSent, ackReceived,
msgSent, ackReceived, msgReceived, ackSent, flashWriteDone,
temperatureSensed, Reboot,
flashWriteDone, msgReceived, msgDropped, newneighborfound,
msgReceived, msgDropped, msgReceived, msgDropped, msgReceived, msgDropped,
msgReceived, msgDropped, msgReceived, msgDropped,
…

Good Log
…
msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed,
flashWriteDone, msgSent, ackReceived,
msgReceived, ackSent, oldDataDeleted, flashWriteDone, temperatureSensed,
msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed,
newDataEntered, newNeighborFound, msgDropped, msgSent, ackReceived,
msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, flashWriteDone,
temperatureSensed, msgSent, ackReceived, msgSent, ackReceived,
msgReceived, ackSent, flashWriteDone, temperatureSensed, newDataEntered,
newNeighborFound, msgDropped,
…
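The slides show the logs but do not spell out the two-stage procedure, so the following is only one plausible reading. Stage one finds a frequent symptom that distinguishes the bad log (here the repeated `msgReceived, msgDropped` pair). Stage two looks in a short window before the first symptom for events that never appear in the good log, which can surface a rare trigger such as `Reboot`. The window size and the shortened logs are illustrative assumptions.

```python
def first_index(log, symptom):
    """Index of the first contiguous occurrence of `symptom`, or -1."""
    n = len(symptom)
    for i in range(len(log) - n + 1):
        if tuple(log[i:i + n]) == tuple(symptom):
            return i
    return -1


def suspects_before_symptom(bad_log, good_log, symptom, window):
    """Stage two: events in the `window` events preceding the first
    symptom that never occur in the good log."""
    start = first_index(bad_log, symptom)
    if start < 0:
        return []
    preceding = bad_log[max(0, start - window):start]
    known_good = set(good_log)
    return [event for event in preceding if event not in known_good]


# Shortened versions of the logs on the slide.
normal = ["msgSent", "ackReceived", "msgReceived", "ackSent",
          "flashWriteDone", "temperatureSensed"]
good_log = normal + normal
bad_log = normal + ["Reboot", "flashWriteDone",
                    "msgReceived", "msgDropped",
                    "msgReceived", "msgDropped"]

symptom = ("msgReceived", "msgDropped")  # assumed output of stage one
suspects = suspects_before_symptom(bad_log, good_log, symptom, window=4)
print(suspects)
```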


Case Study – II: LiteOS Bug

• A simple data collection algorithm
• Implemented on MicaZ mote testbed

Failure scenario
• Some of the nodes would crash occasionally (at higher loads) and nondeterministically.
• Different nodes would crash, and at different times.


LiteOS Bug – contd.

Recorded Events                      Attribute List

Higher Level Events
Packet_Received                      Null
Packet_Sent                          Null

Radio Operation Events
Get_Current_Radio_Info_Address       Null
Get_Current_Radio_Handle_Address     Null
Disable_Radio_State                  Null
Get_Current_Radio_Handle             Null
Get_Radio_Mutex                      Null
Get_Radio_Send_Function              Null

Thread Operation Related Events
Get_Current_Thread_Address           Null
Mutex_Unlock_Function                Null
Get_Current_Thread_Index             Null
Post_Thread_Task                     Null

Serial Operation Related Events
Get_Serial_Mutex                     Null
Get_Current_Serial_Info_Address      Null
Get_Serial_Send_Function             Null

LiteOS Bug – contd.

Frequent sequence of events found in Bad Log

<Context_Switch_to_User_Thread>,<Get_Current_Thread_Address>,<Get_Serial_Send_Function>
<Packet_Received>,<Context_Switch_to_User_Thread>,<Get_Serial_Send_Function>
<Packet_Received>,<Post_Thread_Task>,<Get_Serial_Send_Function>
<Packet_Received>,<Get_Current_Thread_Index>,<Get_Serial_Send_Function>
<Packet_Received>,<Get_Current_Thread_Address>,<Get_Serial_Send_Function>

Frequent sequence of events found in Good Log

<Packet_Received>,<Packet_Sent>,<Get_Current_Radio_Handle>
<Packet_Received>,<Get_Current_Radio_Handle_Address>,<Get_Current_Radio_Handle>
<Packet_Received>,<Mutex_Unlock_Function>,<Get_Current_Radio_Handle>
<Packet_Received>,<Disable_Radio_State>,<Get_Current_Radio_Handle>
<Packet_Received>,<Post_Thread_Task>,<Get_Current_Radio_Handle>

The missing <Get_Current_Radio_Handle> event highlights the missed registration process

Case Study – III: Performance problem with multi-channel MAC protocol

MAC Protocol Overview

• A multichannel MAC that clusters tightly communicating nodes together and assigns different clusters separate home channels to reduce interference
• Nodes can communicate across clusters by temporarily using the receiver’s home channel
• If receiver not found, sender scans channels
• Acks sent from home channel. Acks overheard to update home channel of originator


Recorded Events

Recorded Events                      Attribute List

Ack_Received                         Null
Home_Channel_Changed                 oldChannel, newChannel
TimeSyncMsg                          referenceTime, localTime
Channel_Update_Msg_Sent              homeChannel
Data_Msg_Sent_On_Same_Channel        destId, homeChannel
Data_Msg_Sent_On_Different_Channel   destId, homeChannel, destChannel
Channel_Update_Msg_Received          homeChannel, neighborId, neighborChannel
Try_Next_Channel                     oldChannelTried, nextChannelToTry
No_Ack_Received                      Null

Extracted Sequence of Events

<No_Ack_Received>,<Try_Next_Channel>

<Try_Next_Channel>,<No_Ack_Received>

<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,
<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,
<Try_Next_Channel>,<Try_Next_Channel: oldchanneltried:1>,
<No_Ack_Received>

<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,
<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,
<Try_Next_Channel: nextchanneltotry:2>

<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,
<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,
<Try_Next_Channel: oldchanneltried:2>,<Try_Next_Channel: nextchanneltotry:3>,
<No_Ack_Received>,
<Try_Next_Channel: oldchanneltried:3>


MAC Protocol Overview - Revisited

• A multichannel MAC that clusters tightly communicating nodes together and assigns different clusters separate home channels to reduce interference
• Nodes can communicate across clusters by temporarily using the receiver’s home channel
• If receiver not found, sender scans channels
• Acks sent from home channel. Acks overheard to update home channel of originator

Performance Improvement

[Figure: bar chart of number of messages (axis 0 to 350000) for Successful Send and Successful Receive, comparing multichannel MAC performance with the bug and with the bug fix]

Performance improved by up to 50%

Quick fix: Keep ACK enabled all the time

Next: Symbolic Debugging

Finding symbolic patterns in raw event sequences

Motivating Example

[Figure: a Sensor sends Msg X to an Aggregator, which sends Msg Y to a Forwarder; shown for three groups of nodes: (2, 5, 8), (12, 9, 18), and (3, 8, 15)]

Pattern 1 (support=1)
1. Sensor 3 (A) sends Msg X to Aggregator 8 (B)
2. Aggregator 8 (B) sends Msg Y to Forwarder 15 (C)

Pattern 2 (support=1)
1. Sensor 2 (A) sends Msg X to Aggregator 5 (B)
2. Aggregator 5 (B) sends Msg Y to Forwarder 8 (C)

Pattern 3 (support=1)
1. Sensor 12 (A) sends Msg X to Aggregator 9 (B)
2. Aggregator 9 (B) sends Msg Y to Forwarder 18 (C)

Symbolic Pattern (support=3)
1. Sensor A sends Msg X to Aggregator B
2. Aggregator B sends Msg Y to Forwarder C
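The generalization in this example can be sketched by renaming concrete node IDs to symbols in order of first appearance and then counting identical symbolic patterns. The tuple layout used to encode each step is an assumption made for illustration.

```python
from collections import Counter


def symbolize(pattern):
    """Replace concrete node ids with symbols A, B, C, ... in order of
    first appearance (supports up to 26 distinct nodes per pattern)."""
    names = {}
    steps = []
    for src_role, src, msg, dst_role, dst in pattern:
        for node in (src, dst):
            if node not in names:
                names[node] = chr(ord("A") + len(names))
        steps.append((src_role, names[src], msg, dst_role, names[dst]))
    return tuple(steps)


# The three concrete patterns from the slide, each with support 1.
patterns = [
    [("Sensor", 3, "X", "Aggregator", 8), ("Aggregator", 8, "Y", "Forwarder", 15)],
    [("Sensor", 2, "X", "Aggregator", 5), ("Aggregator", 5, "Y", "Forwarder", 8)],
    [("Sensor", 12, "X", "Aggregator", 9), ("Aggregator", 9, "Y", "Forwarder", 18)],
]

symbolic_support = Counter(symbolize(p) for p in patterns)
for pattern, count in symbolic_support.items():
    print(count, pattern)
```

Note that node 8 is an aggregator in one pattern and a forwarder in another. Because symbols are assigned per pattern, this does not prevent the three patterns from collapsing into one.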

Another Example

Pattern 1
<msgSent, senderId = 1, msgType = 0>,
<msgReceived, receiverId = 2, msgType = 0>

Symbolic pattern 1
<msgSent, senderId = X, msgType = 0>,
<msgReceived, receiverId = neighbor(X), msgType = 0>

Pattern 2
<msgSent, senderId = 3, msgType = 0>,
<msgReceived, receiverId = 5, msgType = 0>

Symbolic pattern 2
<msgSent, senderId = X, msgType = 0>,
<msgReceived, receiverId = neighbor(X), msgType = 0>

How to Generate Symbolic Patterns?

1. Initially, log multi-attribute events
<msgSent, nodeId, msgType, TimeStamp>,<msgReceived, nodeId, msgType, TimeStamp>

2. Find frequent single attribute patterns
<msgSent, msgType=X>,<msgReceived, msgType=X>

3. Reconstruct multi-attribute events
<msgSent, nodeId=*, msgType=X, TimeStamp=*>,<msgReceived, nodeId=*, msgType=X, TimeStamp=*>

4. Generate symbolic patterns (Relation: Neighbor, Hop-distance, etc.)
<msgSent, nodeId=A, msgType=X, TimeStamp=*>,<msgReceived, nodeId=Relation(A), msgType=X, TimeStamp=*>

5. a. Calculate support for candidate symbolic pattern.
   b. Replace original pattern with symbolic pattern if support is “similar”.
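Steps 4 and 5 can be sketched as a support check for one candidate relation. The topology, the occurrence list, and the `SIMILARITY` ratio are hypothetical; the slides only say that the symbolic pattern replaces the original when support is “similar”, without giving a threshold.

```python
# Hypothetical neighbor table for a small network.
NEIGHBORS = {1: {2}, 2: {1, 3}, 3: {2, 5}, 5: {3}}


def is_neighbor(sender, receiver):
    return receiver in NEIGHBORS.get(sender, set())


def symbolic_support(occurrences, relation):
    """Number of (sender, receiver) occurrences that satisfy
    receiver = Relation(sender)."""
    return sum(1 for sender, receiver in occurrences if relation(sender, receiver))


# Concrete occurrences of <msgSent, nodeId=*>,<msgReceived, nodeId=*>.
occurrences = [(1, 2), (3, 5), (2, 3), (1, 5)]

concrete = len(occurrences)                              # support of the original pattern
candidate = symbolic_support(occurrences, is_neighbor)   # support of the symbolic pattern

SIMILARITY = 0.7  # assumed meaning of "similar"
use_symbolic = candidate >= SIMILARITY * concrete
print(concrete, candidate, use_symbolic)
```

Other relations named on the slide, such as hop distance, would plug in as different `relation` functions with the same signature.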

Case Study IV

Directed Diffusion

Observed problem
• Occasionally, the base station stops receiving packets after a while although the source node was generating data

Directed Diffusion Protocol Overview

[Figure: nodes 1, 2, and 3 in a chain. Interest X propagates from node 1 through node 2 to node 3, setting up a gradient at each hop.]

Node 1 interest cache: Interest Id X, forward data to: None

Node 2 interest cache: Interest Id X, forward data to: 1

Node 3 interest cache: Interest Id X, forward data to: 2

Interest cache keeps track of which way to forward received data

Data cache keeps track of last received packet

Received data are dropped if no matching interest is found for the received data
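The cache rules above can be sketched as follows. This is a toy model, not the Directed Diffusion implementation: the `Node` class, its method names, and the `delivered`/`dropped` lists are assumptions made for illustration, and reinforcement, timeouts, and broadcast are left out.

```python
class Node:
    """Toy node holding the two caches named on the slide."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.interest_cache = {}  # interest id -> next hop (None at the sink)
        self.data_cache = {}      # interest id -> last received packet
        self.delivered = []       # packets consumed here (sink only)
        self.dropped = []         # packets with no matching interest

    def receive_interest(self, interest_id, from_node):
        # Set up a gradient pointing back toward the sender of the interest.
        self.interest_cache[interest_id] = from_node

    def receive_data(self, interest_id, packet, network):
        if interest_id not in self.interest_cache:
            # No matching interest: the packet is dropped.
            self.dropped.append(packet)
            return
        self.data_cache[interest_id] = packet
        next_hop = self.interest_cache[interest_id]
        if next_hop is None:
            self.delivered.append(packet)
        else:
            network[next_hop].receive_data(interest_id, packet, network)


network = {i: Node(i) for i in (1, 2, 3)}
network[1].receive_interest("X", None)  # node 1 is the sink
network[2].receive_interest("X", 1)
network[3].receive_interest("X", 2)

network[3].receive_data("X", "p1", network)  # travels 3 -> 2 -> 1

# Illustrate the drop rule: node 2 no longer holds interest X.
del network[2].interest_cache["X"]
network[3].receive_data("X", "p2", network)  # dropped at node 2
```

The second packet shows only how a missing interest entry silently stops delivery; it is not a claim about the root cause in the case study.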

Conclusions

We need to analyze distributed sequences of events to identify causes of anomaly

Discriminative sequence mining is a promising choice for debugging interactive bugs and corner-case protocol bugs

Easy to use

References

Classification-based diagnostics:

Maifi Khan, Tarek Abdelzaher, Liqian Luo, "SNTS: Sensor Network Troubleshooting Suite," DCoSS, Santa Fe, New Mexico, June 2007.

Sequence-mining-based diagnostics:

Mohammad Khan, Tarek Abdelzaher and Kamal Gupta, "Towards Diagnostic Simulation in Sensor Networks," DCoSS, Santorini, Greece, June 2008.

Mohammad Khan, Hieu Khac Le, Hossein Ahmadi, Tarek Abdelzaher, Jiawei Han, "DustMiner: Troubleshooting Interactive Complexity Bugs in Sensor Networks," Sensys, Raleigh, NC, November 2008.

Symbolic mining:

Mohammad Khan, Tarek Abdelzaher, Jiawei Han and Hossein Ahmadi, "Finding Symbolic Bug Patterns in Sensor Networks," International Conference on Distributed Computing in Sensor Systems (DCOSS), Marina Del Rey, CA, June 2009.

Part III: Social Interactions (Human-centric CPS)

Tarek Abdelzaher, University of Illinois at Urbana-Champaign

A Future Cyber-Physical System might look something like this …

Human-centric CPS: Empirical Evidence

[Figure: technology introduced by 2007: http://www.sensatex.com, Nike-iPod, Wii, Spot, and "the master controller"]

Business Case? Health and Wellness

[Figure: http://www.sensatex.com, Nike-iPod, Wii, Spot, the master controller, and HealthVault]

Business Case? Sports and Entertainment

[Figure: http://www.sensatex.com, Nike-iPod, Wii, Spot, and the master controller]

Business Case? Multiplayer Games

[Figure: http://www.sensatex.com, Nike-iPod, Wii, Spot, and the master controller]

Community Sensing Applications: CarTel (Sam Madden)

Reprinted from http://cartel.csail.mit.edu/overview.html

• An ad hoc network of vehicles with sensors

• Can measure road congestion

• Can generate maps of road conditions, etc.

• Given uncontrolled mobility model, how to perform global data operations (query dissemination, collection, etc.)?

Emergence of a Web of Sensors

WWW: a gathering place around topics of mutual interest

Future Sensor Web: a gathering place around mutually interesting data pools (and derived info)

Feng Zhao: MSR Sensor Map

Dave Clark: The future Internet will link more sensors and embedded devices than traditional hosts

Van Jacobson: Named-data networking paradigm (we use the Internet as an information source, not a communication medium)

Community Sensing

Individuals with sensing devices collect sensor data

Shared data used towards a common goal

Internet

Enabling Community Sensing

The privacy dilemma:

Individuals do not want to share private data (e.g., GPS trajectories)

Useful applications can be built if enough data is shared (e.g., real-time traffic maps)

Two types of solutions:

Anonymity: reveal data, hide owner

Data alteration: reveal owner, hide data

Data Sharing: A Privacy Problem Formulation

Users have time-series sensor data

Examples: weight of a user on a diet measured once a day, speed of a user measured periodically on a given route

Goal: Let users “lie” about their data, yet allow computing an accurate distribution over the community at any point

[Figure: data curves of User 1, User 2, User 3, …, User N]

An Example

Dieters want to share weight information to determine the efficacy of the given diet, without revealing their true weight, average, trend (loss or gain of weight), etc.

Perturb data? Add Noise?

[Figure: weight curve perturbed by adding independent random noise]

[Figure: estimation using PCA to breach the privacy of the user]

Add Noise and Random Offset?

[Figure: weight curve perturbed by adding independent random noise and a random offset]

[Figure: estimation using PCA to estimate the data of the user]

Challenge

Develop perturbation that preserves privacy of individuals

Cannot infer individuals’ data without large error

Reconstruction of community distribution can be achieved within proven accuracy bounds

Perturbation can be applied by non-expert users

Intuitive Approach

Add virtual user curve to real curve

[Figure: real user curve, virtual user curve, and the resulting perturbed data curve]

Traffic Analyzer

Users share perturbed speed data with aggregation server

Server combines perturbed speed data and uses de-convolution with noise model to compute original speed distribution

Garmin GPS used for data collection

Results are from real data collection in Urbana-Champaign in 2008

[Map: Dept. of Computer Science and the roads for which we want to estimate average speed]
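A simplified sketch of the server side, under stated assumptions. Instead of the full de-convolution named on the slide, it recovers only the community mean and variance by subtracting the known noise moments. The Gaussian noise model, its parameters `NOISE_MEAN` and `NOISE_STD`, and the synthetic speeds are hypothetical stand-ins; the scheme in the talk uses realistic "virtual user" curves precisely because simple independent noise can be filtered out of a time series.

```python
import random
import statistics

# Assumed, publicly known noise-model parameters (hypothetical values).
NOISE_MEAN = 10.0
NOISE_STD = 20.0


def perturb(speed, rng):
    """Client side: add noise drawn from the shared noise model."""
    return speed + rng.gauss(NOISE_MEAN, NOISE_STD)


def reconstruct_mean_and_variance(perturbed):
    """Server side: subtract the known noise moments from the aggregate."""
    mean = statistics.fmean(perturbed) - NOISE_MEAN
    variance = statistics.pvariance(perturbed) - NOISE_STD ** 2
    return mean, max(variance, 0.0)


rng = random.Random(42)
true_speeds = [rng.gauss(30.0, 5.0) for _ in range(5000)]  # synthetic community
shared = [perturb(s, rng) for s in true_speeds]

est_mean, est_var = reconstruct_mean_and_variance(shared)
true_mean = statistics.fmean(true_speeds)
# Average error if the server naively trusted each individual report.
individual_error = statistics.fmean(abs(p - s) for p, s in zip(shared, true_speeds))
```

The community mean is recovered closely even though each individual report is far from the true value, which is the trade-off the slide describes.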

Perturbing Speed

Reconstruction of Average Speed

Reconstruction of Community Speed Distribution

[Figure: real community distribution of speed versus reconstructed community distribution of speed]

Perturbing Speed and Location

Clients lie about both location and speed

Reconstruction Accuracy

[Figure: real versus reconstructed speed; real community distribution of speed versus reconstructed community distribution of speed]

More on Reconstruction Accuracy

[Figure: real versus reconstructed speed on Washington St., Champaign; real community distribution of speed versus reconstructed community distribution of speed]

How Many are Speeding?

Street | Real % Speeding | Estimated % Speeding

University Ave | 15.6% | 17.8%

Neil Street | 21.4% | 23.7%

Washington Street | 0.5% | 0.15%

Elm Street | 6.9% | 8.6%

Real versus estimated percentage of speeding vehicles on different streets (from data of users who “lie” about both speed and location)

Privacy and Optimal Perturbation

Is there an optimal perturbation scheme?

What is the measure of privacy?

How can we generate the optimal perturbation?

Privacy Measure

We use the mutual information I(X;Y) to measure the information about X contained in Y

Minimal information leak under noise power constraint

Upper Bound on Privacy

Lemma (Ihara, 78)

The noise that minimizes the upper bound on information leak is a Gaussian noise

[Equation annotations: covariance of signal, covariance of noise, mutual information (leak)]
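For orientation, the textbook Gaussian case relates the three quantities annotated above. This is a sketch under stated assumptions: the shared data is Y = X + N, the signal X and noise N are independent zero-mean Gaussian vectors, and the symbols \Sigma_X and \Sigma_N for their covariances are notation introduced here. The exact bound on the slide may be stated differently.

```latex
% Assumption: Y = X + N, X ~ N(0, \Sigma_X), N ~ N(0, \Sigma_N), X and N independent.
I(X;Y) = h(Y) - h(Y \mid X)
       = \tfrac{1}{2}\log\det(\Sigma_X + \Sigma_N) - \tfrac{1}{2}\log\det(\Sigma_N)
       = \tfrac{1}{2}\log\frac{\det(\Sigma_X + \Sigma_N)}{\det(\Sigma_N)}
```

Under these assumptions the leak depends only on the two covariance matrices, which is why the search for the optimal noise reduces to choosing the noise covariance.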

Finding the Optimal Noise

Solving for the optimal noise’s covariance matrix

Optimal Noise

The noise generation method can be seen as the optimal allocation of noise energy in the frequency domain

Utility vs. Privacy Trade-off

References

Privacy for single data streams:

Raghu Ganti, Nam Pham, Yu-En Tsai, Tarek Abdelzaher, “PoolView: Stream Privacy for Grassroots Participatory Sensing,” Sensys, Raleigh, NC, November 2008.

Privacy for multidimensional data:

Nam Pham, Raghu Ganti, Md. Yusuf Uddin, Suman Nath, Tarek Abdelzaher, “Privacy-Preserving Reconstruction of Multidimensional Data Maps in Vehicular Participatory Sensing,” European Conference on Wireless Sensor Networks (EWSN), Coimbra, Portugal, February 2010.

Optimality and bounds on privacy:

Nam Pham, Tarek Abdelzaher, Suman Nath, “On Bounding Data Stream Privacy in Distributed Cyber-physical Systems,” IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (IEEE SUTC), Newport Beach, CA, June 2010. (Invited)

Information Extraction

Two types of community sensing:

Sensing to compute aggregate statistics (e.g., traffic speed)

Results pertain only to the locales where data was collected

Example: average speed on one street does not necessarily help predict speed on another street

Sensing to compute generalizable models that can be used for prediction

Results from sparsely sampled data apply to a broader context

The Participant Data Modeling Challenge

A phenomenon is sampled by participants in spatial and temporal dimensions

Sampling is sparse (at least in conditions of partial adoption)

The phenomenon is high-dimensional

Question: how to generalize models obtained from the limited samples to cover the high-dimensional phenomenon space?

Example: Fuel Efficient Routing with GreenGPS

Individuals contribute fuel consumption values of their cars on various streets at different times of the day (sampling of a high-dimensional phenomenon)

Not too many cars are doing this (sparse sampling, partial deployment)

Fuel consumption depends on attributes of cars, streets, drivers, etc. (high-dimensional phenomenon)

Service computes general model for predicting the most fuel efficient route for an arbitrary vehicle between arbitrary source and destination points

Saves 6% over shortest path

Saves 13% over fastest path

Preliminary Participatory Sensing Deployment

DashDyno: OBD scanner with GPS to collect location-tagged car sensor data

16 different compact and mid-sized sedans (e.g., Ford, Toyota, Honda)

Over 1000 miles of data collected

Users record sensor data and GPS on SD card and upload to our service

[Figures: DashDyno device; coverage map]

Sampling Regression Modeling Framework

Fuel consumption of 16 cars driven on a few roads

Predict fuel consumption of any car on any road

Fuel Consumption Model

Simple model for fuel consumption derived from physics principles

Approximate based on easily measurable parameters (e.g., stop signs, speed limits)

Generalization and Modeling

Regression modeling:

Problem: one size does not fit all. Who says that Fords and Toyotas have the same regression model?

Regression model per car?

Problem: Cannot use data collected by some cars to predict fuel consumption of others.

Challenge: Must jointly determine both (i) regression models and (ii) their scope of applicability, to cover the whole data space with acceptable modeling error.

Generalization Hierarchy

Generalize data by some common attribute(s)?

Example of Regression Cubes

Goal: predict fuel consumption

Group by make, model, or year

Regression Cubes

Data cells correspond to:

Output attributes Yc = {yi}

Each associated with k input attributes xi1, … , xik, Xc = {xij}

Data cells store the following measures:

Regression coefficients:

Regression modeling error:

The Challenge of Regression Cubes

Main challenge: compute cuboid measures, the model and error, recursively (without reprocessing raw data)

Model parameters and estimation error at cell c

Not distributive

Compressed Representation

Compressed representation of a cell c:

a scalar value

a vector of size k

a k by k matrix

nc: number of samples

These matrices are distributive measures

Compressible Measures

Model coefficients:

Error:
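A sketch of the idea behind the two slides above, assuming ordinary least squares and assuming the compressed measures are the usual sufficient statistics (sample count, sum of squared outputs, X^T y, and X^T X). The slides' own symbols were lost in extraction, so the function names and the sample data below are hypothetical. The point shown is that these measures add across cells, so a parent cell's model and error follow from its children without touching raw data.

```python
def cell_stats(X, y):
    """Compressed measures for one cell: (n, y'y, X'y, X'X)."""
    k = len(X[0])
    yy = sum(v * v for v in y)
    xty = [sum(row[j] * v for row, v in zip(X, y)) for j in range(k)]
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    return len(y), yy, xty, xtx


def merge(a, b):
    """Aggregate two cells by adding their measures (distributive)."""
    xty = [p + q for p, q in zip(a[2], b[2])]
    xtx = [[p + q for p, q in zip(r, s)] for r, s in zip(a[3], b[3])]
    return a[0] + b[0], a[1] + b[1], xty, xtx


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * p for x, p in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def fit(stats):
    """Model coefficients and mean squared error from the measures alone."""
    n, yy, xty, xtx = stats
    beta = solve(xtx, xty)
    # At the least-squares solution the residual sum of squares is y'y - beta'X'y.
    rss = yy - sum(b * v for b, v in zip(beta, xty))
    return beta, rss / n


# Hypothetical samples: rows are [1, speed], outputs are fuel used.
X_ford, y_ford = [[1.0, 20.0], [1.0, 30.0], [1.0, 40.0]], [2.1, 2.9, 4.2]
X_toyota, y_toyota = [[1.0, 25.0], [1.0, 35.0], [1.0, 45.0]], [2.0, 3.1, 3.9]

merged_beta, merged_err = fit(merge(cell_stats(X_ford, y_ford),
                                    cell_stats(X_toyota, y_toyota)))
pooled_beta, pooled_err = fit(cell_stats(X_ford + X_toyota, y_ford + y_toyota))
```

Fitting from the merged measures gives the same model as fitting on the pooled raw samples, up to floating-point rounding.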

Example of Regression Cubes

Goal: predict fuel consumption

Group by make, model, or year

Model and modeling error are efficiently computed for each possible generalization.

Model Reduction

Independently find a subset of attributes for each cell, such that:

The cell is reliable

Corresponding error is minimized

Exponential number of possible subsets

Our heuristic:

Attributes: Velocity (v), Mass (m), Frontal area (A), Stop signs (S)

Attributes | Error | Reliable

L = {v} | 0.031 | yes

L = {m} | 0.152 | yes

L = {A} | 0.043 | yes

L = {S} | 0.056 | yes

L = {v, m} | 0.021 | no

L = {v, A} | 0.030 | yes

L = {v, S} | 0.028 | yes

L = {v, S, m} | 0.024 | no

L = {v, S, A} | 0.026 | no

Reduced Model: {v, S}
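The slides do not spell the heuristic out, but the build sequence (single attributes, then pairs containing v, then triples containing v and S) suggests a greedy forward search. The sketch below assumes that reading: at each step it adds the one attribute that yields the lowest error among reliable subsets, and stops when no reliable extension improves the error. The function names are hypothetical; the error and reliability values are those shown on the slides.

```python
# Error and reliability per attribute subset, copied from the slides' example.
RESULTS = {
    frozenset({"v"}): (0.031, True),
    frozenset({"m"}): (0.152, True),
    frozenset({"A"}): (0.043, True),
    frozenset({"S"}): (0.056, True),
    frozenset({"v", "m"}): (0.021, False),
    frozenset({"v", "A"}): (0.030, True),
    frozenset({"v", "S"}): (0.028, True),
    frozenset({"v", "S", "m"}): (0.024, False),
    frozenset({"v", "S", "A"}): (0.026, False),
}


def evaluate(subset):
    """Stand-in for fitting a cell's model on `subset`: returns (error, reliable)."""
    return RESULTS[frozenset(subset)]


def reduce_model(attributes, evaluate):
    """Greedy forward selection over attribute subsets (assumed heuristic)."""
    selected = frozenset()
    best_error = float("inf")
    while True:
        candidates = []
        for attr in attributes:
            if attr in selected:
                continue
            subset = selected | {attr}
            error, reliable = evaluate(subset)
            if reliable and error < best_error:
                candidates.append((error, subset))
        if not candidates:
            return selected
        best_error, selected = min(candidates, key=lambda c: c[0])


reduced = reduce_model(["v", "m", "A", "S"], evaluate)
```

With the slide's numbers the search picks {v}, extends it to {v, S}, and stops because both three-attribute extensions are unreliable, matching the reduced model shown.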

Accuracy Results

The sampling regression cube improves prediction accuracy significantly

Sparse sampling challenge: A regression cube without model reduction is worse than a single “one-size-fits-all” model!

Model Performance

All driven paths are split into smaller segments to capture variations in fuel consumption on individual streets

[Figure: a path split into Segment 1, Segment 2, and Segment 3]

Long Path Error

Reduction in cumulative error with increasing path length

Fuel Savings Evaluation

Experiment: Given the shortest and fastest routes, GreenGPS predicts the best route.

Driver drives both routes repeatedly and compares average fuel consumption of the two.

Car Details | Landmarks | Route | Savings %

Honda Accord 2001 | H1 to Mall | Shortest | 31.4

Honda Accord 2001 | H1 to Gym | Shortest | 19.7

Ford Taurus 2001 | H2 to Restaurant | Shortest | 26

Toyota Celica 2001 | H2 to Work | Fastest | 10.1

Nissan Sentra 2009 | H3 to CUPHD | Fastest | 8.4

Honda Civic 2002 | Grad to Work | Fastest | 18.7

Conclusions

An emerging sensor network paradigm is one where sensing is done by volunteers interested in the data to provide a service

The talk covered sample challenges:

Privacy: “Interested in the service but cannot share my data”

Sparse sampling: In conditions of sparse deployment, must generalize to cover high-dimensional spaces

Interdisciplinary solutions are needed from information theory and data mining

More to Do…

Many future challenges remain

Architecture and foundations for networks as information sources as opposed to networks as data carriers

Understanding the interplay between social, physical and information networks

Closing the loop and the human factor: how to account for humans in the loop (e.g., humans responding to GreenGPS instructions)

More privacy: what can and cannot be inferred from my data?

Usability and trust:

What technical features inspire human trust in the data collection system?

How to ensure trustworthiness of data collection results

Provenance: Where did the data come from?

A repository of data sets?

References

Raghu Ganti, Nam Pham, Hossein Ahmadi, Saurabh Nangia, Tarek Abdelzaher, “GreenGPS: A Participatory Sensing Fuel-Efficient Maps Application,” Mobisys, San Francisco, CA, June 2010.

Stability Challenges

Additional Optional material

Why Does Software Have Dynamics?

Output of component A depends on past output of B (delay)

Output of component A depends on integral of B (e.g., queue level is an integral of difference in request rates)

Filters (averaging, moving average)

Sampling
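As a small illustration of the integral case, the sketch below models a queue whose level accumulates the difference between arrival and service rates in discrete time steps. The function name and the rates are hypothetical values chosen for illustration.

```python
def simulate_queue(arrivals, service_rate, initial=0.0):
    """Queue level after each step: the running sum of (arrival - service), floored at 0."""
    levels = []
    level = initial
    for arrived in arrivals:
        level = max(0.0, level + arrived - service_rate)
        levels.append(level)
    return levels


# Three steps of overload followed by three steps of underload.
levels = simulate_queue([12, 12, 12, 8, 8, 8], service_rate=10)
```

The queue keeps growing while arrivals exceed service and drains only gradually afterwards, so the current level depends on past inputs and not only on the current one.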

Composition of Self-regulating Components

Challenge:

Larger systems composed of a larger number of components

Modularity and separation of concerns: components are designed independently

Autonomy: self-regulating behavior in many components

Composition of self-regulating components can have unexpected adverse interactions

How to build well-behaved systems from large numbers of self-regulating components while designing in a modular fashion?

[Figure: three feedback loops, Subsystem I, Subsystem II, and Subsystem III, with interacting components A, B, and C]

Example: Energy Management in Data Centers

Energy expended on:

Computing (powering up racks of machines)

Sensors: Machine utilization, Delay, Throughput, …

Actuators: DVS, turning machines On/Off

Cooling

Sensors: Temperature, air flow, …

Actuators: Air-conditioning units, fans, …

Energy bill is 40-50% of total profit

Energy control:

Cooling energy optimization and computing energy optimization are done separately

Challenge:

Coordination/decoupling of multiple “control loops”

Uncoordinated manipulation of multiple knobs can lead to instability or poor efficiency

Example: Energy Management in Data Centers

[Figure: example comparing DVS + On/Off, DVS alone, On/Off alone, and the optimal joint policy]

References

Jin Heo, Dan Henriksson, Xue Liu, and Tarek Abdelzaher, “Integrating Adaptive Components: An Emerging Challenge in Performance-Adaptive Systems and a Server Farm Case-Study,” RTSS, Tucson, AZ, December 2007.