Design Considerations for High Fan-in Systems: The HiFi Approach

Presented by Shawn Jeffery, CIDR '05, 1/7/05

Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, and Wei Hong

UC Berkeley, Intel Research Berkeley

1/7/05 Shawn Jeffery, HiFi Project, UCB EECS

Itinerary

• Introduction: High Fan-in Systems
• HiFi Overview
• Initial Prototype
• Ongoing Work and Future Directions
• Conclusions

Introduction

• Receptors everywhere!
  • Wireless sensor networks, RFID technologies, digital home, network monitors, ...
• Somehow need to make sense of this data to provide near real-time decision support

High Fan-in Systems

Large numbers of receptors = large data volumes
Hierarchical, successive aggregation

The “Bowtie”

Challenges in 3 dimensions:
• Geography
• Time
• Resources

Supply-Chain Management (SCM)

[Figure: SCM hierarchy. Labels, from the edge inward: RFID receptors; dock doors, shelves; warehouses, stores; regional centers; headquarters]

State of the Art

• Not seen as a data management issue
  • Focus on protocol design
  • Different “data models” at each level
  • Reinventing “query languages” at each level
• Piecemeal/stovepipe approach
  • Each type of receptor (RFID, sensors, etc.) handled separately
  • Current solutions tend to be hand-coded, script-based approaches

No end-to-end, integrated solution for managing distributed receptor data

HiFi: Cascading Stream Processing in a High Fan-in System

• A data management infrastructure for high fan-in environments

• Uniform Declarative Framework
  • Every node is a data stream processor that speaks SQL-ese stream-oriented queries at all levels

• Hierarchical, stream-based views as an organizing principle

Hierarchical Query Processing

“I provide raw readings for Soda Hall”

“I provide avg daily values for Berkeley”

“I provide avg weekly values for California”

“I provide national monthly values for the US”

• Continuous and Streaming
  • Windows
  • Sharing
• Hierarchical
  • Temporal granularity vs. geographic scope

SELECT S.area, AVG(S.temp)
FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
GROUP BY S.area
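The windowed query and the "daily for Berkeley, weekly for California" rollup can be illustrated with a small Python sketch. It is not HiFi or TelegraphCQ code: the function names and the `(timestamp, area, temp)` tuple layout are assumptions made for illustration. It also assumes each node forwards `(sum, count)` partial states rather than finished averages, a common way to keep averages exact when results from several children are merged one level up.

```python
from collections import defaultdict

def window_partials(readings, width):
    """Group (timestamp_sec, area, temp) readings into tumbling windows.

    Returns {(window_start, area): (sum, count)} so that results from
    several nodes can be merged exactly at the next level up.
    The tuple layout is a hypothetical stand-in for SENSOR_STREAM.
    """
    acc = defaultdict(lambda: (0.0, 0))
    for ts, area, temp in readings:
        start = (ts // width) * width  # window covers [start, start + width)
        total, n = acc[(start, area)]
        acc[(start, area)] = (total + temp, n + 1)
    return dict(acc)

def merge_partials(*partials):
    """Combine partial (sum, count) states reported by child nodes."""
    merged = defaultdict(lambda: (0.0, 0))
    for part in partials:
        for key, (total, n) in part.items():
            t0, n0 = merged[key]
            merged[key] = (t0 + total, n0 + n)
    return dict(merged)

def averages(partials):
    """Turn (sum, count) states into final averages."""
    return {key: total / n for key, (total, n) in partials.items()}

# Two child nodes report readings for the same area and window.
node_a = window_partials([(0.5, "berkeley", 20.0), (3.0, "berkeley", 22.0)], width=5.0)
node_b = window_partials([(1.0, "berkeley", 30.0)], width=5.0)
parent = averages(merge_partials(node_a, node_b))
```

Here the parent reports 24.0 for the first window, the average over all three readings. Averaging the two child averages (21.0 and 30.0) would give 25.5 instead, which is why the sketch carries counts upward.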

Basic HiFi Architecture

• Hierarchical federation of nodes
• Each node:
  • Data Stream Query Processor (DSQP)
  • HiFi Glue
• Views drive system functionality
• Metadata Repository (MDR)

HiFi Glue:
• DSQP Management
• Query Planning
• Archiving
• Internode coordination and communication

[Figure: a hierarchy of nodes, each labeled HiFi Glue and DSQP, plus an MDR]

In the paper…

HiFi Design Considerations
• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control

Envisioning HiFi
Building HiFi

A Tale of Two Systems

• TelegraphCQ
  • Data stream processor
  • Continuous, adaptive query processing with aggressive sharing
• TinyDB
  • Declarative query processing for wireless sensor networks
  • In-network aggregation

Initial Prototype

[Figure: initial prototype. Labels: TelegraphCQ, TinyDB, PC, Stargates, RFID Wrappers, Sensor Networks & RFID Readers]

Initial Prototype

Demoed @ VLDB ‘04

HiFi Design Considerations

• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control

CSAVA: Processing RFID Data in HiFi

• RFID data is gross!
  • Lost readings
  • Errant readings
  • Duplicate readings
• Use queries to make the data usable
  • CSAVA: Clean, Smooth, Arbitrate, Validate, Analyze

CSAVA: Processing RFID Data in HiFi

Clean

CREATE VIEW cleaned_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM rfid_stream rs
 WHERE read_strength >= strength_T)
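As a rough illustration of what the Clean view does, the Python sketch below applies the same per-tuple threshold filter to an in-memory list. The dictionary layout, the function name `clean`, and the sample strengths are assumptions for illustration only; the deck does not say how `strength_T` is chosen.

```python
def clean(readings, strength_threshold):
    """Keep (receptor_id, tag_id) for readings at or above the threshold.

    readings: iterable of dicts with keys receptor_id, tag_id, read_strength
    (a hypothetical in-memory stand-in for rfid_stream).
    """
    return [
        (r["receptor_id"], r["tag_id"])
        for r in readings
        if r["read_strength"] >= strength_threshold
    ]

# Made-up sample data; the strength scale is an assumption.
rfid_stream = [
    {"receptor_id": "shelf0", "tag_id": "A", "read_strength": 0.9},
    {"receptor_id": "shelf0", "tag_id": "B", "read_strength": 0.2},
    {"receptor_id": "shelf1", "tag_id": "A", "read_strength": 0.5},
]
cleaned = clean(rfid_stream, strength_threshold=0.5)
```

The weak reading of tag B is dropped, and the other two readings pass through with only the projected columns.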

CSAVA: Processing RFID Data in HiFi

Smooth

CREATE VIEW smoothed_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM cleaned_rfid_stream [range by '5 sec', slide by '5 sec']
 GROUP BY receptor_id, tag_id
 HAVING count(*) >= count_T)
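A minimal Python sketch of the Smooth step, assuming each cleaned reading carries an explicit timestamp in seconds (the view itself relies on the stream's implicit timestamps). The function name `smooth` and the sample data are hypothetical.

```python
from collections import Counter

def smooth(cleaned, count_threshold, width=5.0):
    """Per tumbling window, keep (receptor_id, tag_id) pairs seen often enough.

    cleaned: iterable of (timestamp_sec, receptor_id, tag_id) tuples
    (hypothetical layout). Returns a sorted list of
    (window_start, receptor_id, tag_id).
    """
    counts = Counter()
    for ts, receptor_id, tag_id in cleaned:
        window_start = (ts // width) * width  # window [start, start + width)
        counts[(window_start, receptor_id, tag_id)] += 1
    return sorted(key for key, n in counts.items() if n >= count_threshold)

cleaned = [
    (0.1, "shelf0", "A"),
    (1.2, "shelf0", "A"),
    (2.3, "shelf0", "A"),
    (3.0, "shelf0", "B"),  # a single stray read of B
    (6.0, "shelf0", "A"),  # falls in the next window
]
smoothed = smooth(cleaned, count_threshold=2)
```

Only tag A in the first window survives: the lone read of B and the lone read of A in the second window both fall below the count threshold.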

CSAVA: Processing RFID Data in HiFi

Arbitrate

CREATE VIEW arbitrated_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM smoothed_rfid_stream rs [range by '5 sec', slide by '5 sec']
 GROUP BY receptor_id, tag_id
 HAVING count(*) >= ALL
   (SELECT count(*)
    FROM smoothed_rfid_stream [range by '5 sec', slide by '5 sec']
    WHERE tag_id = rs.tag_id
    GROUP BY receptor_id))
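The arbitration rule, where a tag goes to whichever receptor read it most often in the window, can be sketched in Python as follows. This is an illustration under assumptions, not the HiFi implementation: the input is taken to be the `(receptor_id, tag_id)` readings of a single window, and the function name `arbitrate` is hypothetical.

```python
from collections import Counter, defaultdict

def arbitrate(window_readings):
    """Assign each tag to the receptor(s) that read it most in one window.

    window_readings: iterable of (receptor_id, tag_id) pairs from a single
    window (hypothetical layout). Ties keep every top receptor, matching
    the `>= ALL` comparison in the view.
    """
    counts = Counter(window_readings)  # (receptor_id, tag_id) -> count
    best = defaultdict(int)            # tag_id -> highest count seen
    for (receptor_id, tag_id), n in counts.items():
        best[tag_id] = max(best[tag_id], n)
    return sorted(pair for pair, n in counts.items() if n == best[pair[1]])

window = (
    [("shelf0", "A")] * 3   # shelf0 reads tag A three times
    + [("shelf1", "A")]     # shelf1 also picks up A once
    + [("shelf1", "B")] * 2
)
arbitrated = arbitrate(window)
```

Tag A is attributed to shelf0 and tag B to shelf1; the single cross-read of A by shelf1 is discarded.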

CSAVA: Processing RFID Data in HiFi

Validate

CREATE VIEW validated_tags AS
(SELECT tag_name
 FROM arbitrated_rfid_stream rs [range by '5 sec', slide by '5 sec'],
      known_tag_list tl
 WHERE tl.tag_id = rs.tag_id)

CSAVA: Processing RFID Data in HiFi

Analyze

CREATE VIEW tag_count AS
(SELECT tag_name, count(*)
 FROM validated_tags vt [range by '5 min', slide by '1 min']
 GROUP BY tag_name)
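Unlike the earlier steps, the Analyze query uses an overlapping window: a 5-minute range that slides every minute. The Python sketch below shows one way to read that, assuming a window ending at time `end` covers the interval `(end - range, end]` and that validated tuples carry explicit timestamps in seconds. Both are assumptions for illustration, as are the function name and sample data.

```python
from collections import Counter

def sliding_tag_counts(validated, range_sec=300.0, slide_sec=60.0):
    """Count tag_name occurrences in overlapping windows.

    validated: iterable of (timestamp_sec, tag_name) tuples (hypothetical).
    A window ending at `end` covers (end - range_sec, end].
    Returns {window_end: Counter({tag_name: count})}.
    """
    validated = list(validated)
    if not validated:
        return {}
    last_ts = max(ts for ts, _ in validated)
    results = {}
    end = slide_sec
    while end - slide_sec < last_ts:  # stop once the last reading is covered
        results[end] = Counter(
            name for ts, name in validated if end - range_sec < ts <= end
        )
        end += slide_sec
    return results

validated_tags = [(10, "milk"), (20, "milk"), (130, "eggs"), (310, "milk")]
counts = sliding_tag_counts(validated_tags)
```

Each reading contributes to five consecutive results. The window ending at 300 s still counts both early milk reads, while the window ending at 360 s has aged them out and counts only the read at 310 s.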

CSAVA: Processing RFID Data in HiFi

[Figure: pipeline stages. Labels: Clean, Smooth, Arbitrate, Validate, Analyze, together with Augment, Convert, Aggregate]

CSAVA: Bridging the Physical-Virtual Divide

• An example of HiFi processing, but instrumental in dealing with real-world data

CSAVA Generalization:
• Clean: Single Tuple
• Smooth: Window
• Arbitrate: Multiple Receptors

Complexity of Hierarchical Windowed Query Processing

• Naïve dissemination (sending the query unchanged to every level) introduces a lag in query results

Additive Lag in Hierarchical Windowed Query Processing

SELECT S.area, AVG(temp)
FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
GROUP BY S.area

[Figure: timeline (axis: Time) for Level 0, Level 1, Level 2, and User, showing an Event, a Window at each level, and the Result Tuple(s) each level emits. Annotation: Additive Lag!]

Sketch of a Solution

• Solution is to use both time-based windows and NOW windows

SELECT S.area, AVG(temp)
FROM SENSOR_STREAM S [range by '5 seconds' slide by '5 seconds']
GROUP BY S.area

[Figure: timeline (axis: Time) for Level 0, Level 1, Level 2, and User, showing an Event, one Time-based window, two NOW windows, and the Result Tuple(s) passed up to the User]
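The difference between the naïve plan and the mixed-window plan can be worked through with a small Python sketch. It rests on simplifying assumptions that are not from the slides: windows are aligned and half-open, a tuple is released when the window containing it closes, a NOW window adds no delay, and the single time-based window sits at Level 0. The function names are hypothetical.

```python
import math

def emit_time(arrival, width):
    """Time at which a tuple arriving at `arrival` leaves one level.

    width > 0: tumbling time-based window; the result appears when the
    window containing the tuple closes. width == 0 models a NOW window,
    which passes results on without waiting (a simplification).
    """
    if width == 0:
        return arrival
    return (math.floor(arrival / width) + 1) * width

def result_time(event_time, widths):
    """Propagate one event up the hierarchy; widths[i] is level i's window."""
    t = event_time
    for width in widths:
        t = emit_time(t, width)
    return t

event = 1.0
naive = result_time(event, [5, 5, 5])  # same 5-second window at every level
mixed = result_time(event, [5, 0, 0])  # one time-based window, then NOW windows
```

Under these assumptions the naïve plan delivers the result at t = 15 s, because each level waits for its own window to close, while the mixed plan delivers it at t = 5 s, as soon as the one time-based window closes.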

System Management

• Our small deployment:
  • 20+ individual devices (4 types of devices)
  • 5 different platforms (OS + hardware)

Management nightmare

• System-wide management is crucial
  • Both coarse and fine-grained
• Where we’re headed:
  • System monitoring needed: turn the lens inwards to introspect on system state
  • Use uniform declarative framework to provide failover and load balancing

Ongoing Work and Future Directions

• Bridging the physical-virtual divide
  • Generalize CSAVA-type processing to other receptors
• Hierarchical query processing
  • Query planning, dissemination
• Complex event processing
  • Unify event and data processing
• System deployment and management
• Archiving and prioritization

Conclusions

• Receptors everywhere → High Fan-in Systems
• Uniform declarative framework is the key to building these systems
• The HiFi project is exploring this approach
• Our initial prototype
  • Leveraged TelegraphCQ and TinyDB
  • Validated the HiFi approach
  • Identified research directions
• Broad in scope = much work to be done!

Questions?

hifi.cs.berkeley.edu