Semantics and Evaluation Techniques for Window Aggregates in Data Streams

Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, Peter Tucker

Presented by: Venkatesh Raghvan and Charudatta Wad
CS 525 class discussion

Overview

- Background
- Problem statement
- Window semantics
- WID approach
- Discussion

Background

- Disorder handling: punctuations.
- Aggregate queries: in SQL? In CQL (without WIDs)?
- In sliding windows, what causes an output?

Problem Statement

- Lack of explicit window semantics.
- Implementation efficiency.
- Out-of-order arrival of data.

Running Example

Consider the example from the paper.

Schema: <seg-id, speed, ts>

Query:

    SELECT seg-id, max(speed), min(speed)
    FROM Traffic [RANGE 300 seconds SLIDE 60 seconds WATTR ts]
    GROUP BY seg-id
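To make the running example concrete, here is a minimal Python sketch of one way to evaluate it: each tuple is assigned to every window that contains its timestamp, and a running max/min is kept per (window id, seg-id) pair. The window numbering (window w covers timestamps in [(w+1)*SLIDE - RANGE, (w+1)*SLIDE)), the function name `max_min_per_window`, and the tuple layout are assumptions made for illustration; they are not necessarily the paper's definitions or its operator.

```python
RANGE, SLIDE = 300, 60  # seconds, taken from the query


def max_min_per_window(tuples):
    """tuples: iterable of (seg_id, speed, ts) with non-negative integer ts.

    Returns {(window_id, seg_id): (max_speed, min_speed)}.
    Assumed convention: window w covers [(w+1)*SLIDE - RANGE, (w+1)*SLIDE).
    """
    acc = {}
    for seg_id, speed, ts in tuples:
        # Every window whose extent contains ts under the assumed convention.
        for w in range(ts // SLIDE, (ts + RANGE) // SLIDE):
            key = (w, seg_id)
            if key in acc:
                hi, lo = acc[key]
                acc[key] = (max(hi, speed), min(lo, speed))
            else:
                acc[key] = (speed, speed)
    return acc


# Arrival order does not matter here: the tuple with ts=5 arrives last.
traffic = [(1, 50, 10), (1, 70, 70), (2, 40, 20), (1, 30, 5)]
result = max_min_per_window(traffic)
print(result[(0, 1)], result[(1, 1)])
```

Because a tuple's windows depend only on its own timestamp, the sketch never has to buffer raw tuples or rely on arrival order. Deciding when a window's result is final (for example, on a punctuation) is left out.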

Running Example

- This picture is taken from the paper itself.

Big Picture

- Mapping of tuples to window extents and vice versa.
- New window semantics.
- Window specifications: RANGE, SLIDE and WATTR.
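The two directions of the mapping can be sketched as a pair of functions: `extent` goes from a window id to the tuples it contains, and `wids` goes from a tuple's windowing-attribute value to the window ids it belongs to. The names `Reading`, `wids` and `extent`, and the boundary convention (window w covers [(w+1)*slide - range, (w+1)*slide)), are assumptions for this sketch. The paper's exact definitions may differ in numbering and boundary details.

```python
from typing import List, NamedTuple


class Reading(NamedTuple):
    seg_id: int
    speed: float
    ts: int  # seconds, assumed to be a non-negative integer


def wids(ts: int, range_: int, slide: int) -> range:
    """Window ids whose extent contains a tuple with timestamp ts."""
    return range(ts // slide, (ts + range_) // slide)


def extent(w: int, tuples: List[Reading], range_: int, slide: int) -> List[Reading]:
    """Tuples in window w, i.e. with ts in [(w+1)*slide - range_, (w+1)*slide)."""
    end = (w + 1) * slide
    return [t for t in tuples if end - range_ <= t.ts < end]


stream = [Reading(1, 55.0, 0), Reading(1, 60.0, 59), Reading(2, 42.0, 60)]
for t in stream:
    print(t.ts, list(wids(t.ts, 300, 60)))
```

With RANGE 300 and SLIDE 60, each tuple lands in 300 / 60 = 5 windows. The two functions are inverse views of the same relation: a tuple is in `extent(w, ...)` exactly when `w` is in `wids(t.ts, ...)`.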

Window specification

Time-based query: count the number of vehicles in each segment over the past 60 minutes, updating the result every 20 minutes.

    SELECT seg-id, count(*)
    FROM Traffic [RANGE 60 minutes SLIDE 20 minutes WATTR ts]
    GROUP BY seg-id

Window specification

Tuple-based query: count the number of vehicles in each segment over the past 100 rows, updating the result every 10 rows.

    SELECT seg-id, count(*)
    FROM Traffic [RANGE 100 rows SLIDE 10 rows WATTR row-num]
    GROUP BY seg-id
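A tuple-based window can be sketched with a bounded buffer: keep the most recent RANGE rows and emit per-segment counts every SLIDE rows. This sketch assumes row-num is a single global counter assigned in arrival order starting at 0, and that early windows are simply partially filled. Both are assumptions for illustration, and `count_by_rows` is a hypothetical helper, not something from the paper.

```python
from collections import Counter, deque


def count_by_rows(seg_ids, range_rows=100, slide_rows=10):
    """Per-segment counts over the last range_rows rows, emitted every slide_rows rows.

    seg_ids: seg-id values in arrival order; row-num is the position (assumed).
    Returns one {seg_id: count} dict per emitted window.
    """
    buf = deque(maxlen=range_rows)  # old rows fall out automatically
    out = []
    for row_num, seg_id in enumerate(seg_ids):
        buf.append(seg_id)
        if (row_num + 1) % slide_rows == 0:  # a slide boundary triggers output
            out.append(dict(Counter(buf)))
    return out


# Small parameters so the behaviour is easy to follow by hand.
windows = count_by_rows([1, 1, 2, 1, 2, 2], range_rows=4, slide_rows=2)
print(windows)
```

In this reading, the arrival of every tenth row is what causes an output. No clock is involved.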

Window specification

Can RANGE and SLIDE be specified on different attributes? Yes: RATTR names the attribute for RANGE and SATTR names the attribute for SLIDE.

    SELECT seg-id, count(*)
    FROM Traffic [RANGE 300 seconds SLIDE 10 rows RATTR ts SATTR row-num]
    GROUP BY seg-id
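One plausible reading of a window with RANGE on ts and SLIDE on row-num is sketched below: every SLIDE rows, report counts over the tuples seen so far whose ts falls within the last RANGE seconds, measured back from the ts of the row that triggered the output. The choice of reference timestamp, the half-open boundary (ts - RANGE, ts], the global row numbering, and the helper name `count_mixed` are all assumptions. The paper's formal semantics for this case may differ.

```python
from collections import Counter


def count_mixed(tuples, range_secs=300, slide_rows=10):
    """tuples: iterable of (seg_id, ts) in arrival order.

    Every slide_rows rows, emit (trigger_ts, {seg_id: count}) over tuples seen
    so far with trigger_ts - range_secs < ts <= trigger_ts (assumed boundaries).
    """
    seen = []  # kept unbounded for simplicity; a real operator would prune
    out = []
    for row_num, (seg_id, ts) in enumerate(tuples):
        seen.append((seg_id, ts))
        if (row_num + 1) % slide_rows == 0:
            lo = ts - range_secs
            counts = Counter(s for s, t in seen if lo < t <= ts)
            out.append((ts, dict(counts)))
    return out


# Small slide so the example can be checked by hand.
mixed = count_mixed([(1, 0), (1, 100), (2, 350), (1, 400)], range_secs=300, slide_rows=2)
print(mixed)
```

Here the row count decides when a result is produced, while the timestamp decides which tuples it covers.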

WID Approach

Explained by Venky.