Probabilistic Similarity Search for Uncertain Time Series. Presented by CAO Chen, 21st Feb, 2011


Page 1: Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Similarity Search for Uncertain Time Series
Presented by CAO Chen
21st Feb, 2011

Page 2: Probabilistic Similarity Search for Uncertain Time Series

Outline
• Introduction
• Background
  • Time Series
  • Similarity Search
• Motivation & Contribution
• Uncertain Time Series Query
• Uncertainty Approximation
• Step-wise Refinement
• Evaluation
• Related Literature Review
• Q & A

CAO Chen, DB Group, CSE, HKUST, 21/2/2011

Page 3: Probabilistic Similarity Search for Uncertain Time Series

Background – Time Series


Page 4: Probabilistic Similarity Search for Uncertain Time Series

Background – Time Series (cont’d)
• Source of Time Series Data
  • Traffic measurements
    • Uncorrelated
  • Location tracking of moving objects
  • Measuring environmental parameters (temperature)
    • Correlated


Page 5: Probabilistic Similarity Search for Uncertain Time Series

Background – Similarity Search
• Similarity Search
  • Pattern Matching
  • Shape Matching


Page 6: Probabilistic Similarity Search for Uncertain Time Series

Background – Similarity Search (cont’d)
• Range Query
  • Returns all tuples that fit between an upper and a lower boundary
  • The number of results is not known in advance
  • Slower than top-k because there is no upper bound to prune with
• Sequence Matching
  • Whole matching: sequences with the same length
  • Subsequence matching
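
The contrast between a range query and a top-k query under whole matching can be sketched as follows. This is a minimal illustration, assuming exact (certain) time series, Euclidean distance, and a range expressed as a distance threshold `eps` around a query; the names `range_query`, `top_k_query`, and the toy `database` are hypothetical and do not come from the slides.

```python
import math

def euclidean(a, b):
    """L2 distance between two equal-length sequences (whole matching)."""
    if len(a) != len(b):
        raise ValueError("whole matching requires sequences of the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_query(database, query, eps):
    """Return every series within distance eps; the result size is not fixed."""
    return [name for name, series in database.items()
            if euclidean(series, query) <= eps]

def top_k_query(database, query, k):
    """Return the k closest series; the result size is fixed at k."""
    ranked = sorted(database, key=lambda name: euclidean(database[name], query))
    return ranked[:k]

# Toy data, assumed for illustration only.
database = {
    "a": [1.0, 2.0, 3.0],
    "b": [1.0, 2.0, 4.0],
    "c": [5.0, 5.0, 5.0],
}
query = [1.0, 2.0, 3.0]

in_range = range_query(database, query, eps=1.0)
closest = top_k_query(database, query, k=1)
```

With these toy values the range query returns two series while the top-k query returns exactly one, which is the difference in result size that the slide points out.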


Page 7: Probabilistic Similarity Search for Uncertain Time Series

Motivation & Contribution
• Uncertainty
  • Moving objects
  • Object identification
  • Sensor network monitoring


Page 8: Probabilistic Similarity Search for Uncertain Time Series

Motivation & Contribution (cont’d)
• Contribution
  • First to formalize the notion of uncertain time series
  • Two novel types of probabilistic range queries over uncertain time series
  • Pruning strategy based on approximate representations of uncertainty
  • Explicit evaluation of the refinement (processing) time cost



Page 10: Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Queries Over Uncertain TS
• Definition of Uncertain Time Series
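
The definition itself did not survive on this slide. The sketch below assumes the common sample-based model, in which an uncertain time series of length n stores a set of sample observations at each time slot. The class name `UncertainTimeSeries` and its methods are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UncertainTimeSeries:
    # samples[t] holds the sample observations recorded at time slot t
    # (assumed model; the slide's own definition is not available).
    samples: List[List[float]]

    def __post_init__(self):
        if any(len(slot) == 0 for slot in self.samples):
            raise ValueError("every time slot needs at least one sample")

    def __len__(self):
        """Length n of the series, i.e. the number of time slots."""
        return len(self.samples)

    def num_materializations(self):
        """Number of exact series obtainable by picking one sample per slot."""
        count = 1
        for slot in self.samples:
            count *= len(slot)
        return count

# Three time slots with two samples each (toy values).
x = UncertainTimeSeries([[1.0, 1.2], [2.0, 2.1], [2.9, 3.0]])
```

Under this model, a series with s samples in each of n slots stands for s^n possible exact series.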


Page 11: Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Uncertain Lp-Distance
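
The formula for the uncertain Lp-distance is missing from the extracted slide. The sketch below assumes that the distance between two uncertain time series is the collection of Lp distances over all combinations of one sample per time slot from each series. The function names are hypothetical.

```python
from itertools import product

def lp_distance(a, b, p=2):
    """Ordinary Lp distance between two exact, equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def uncertain_lp_distances(x_samples, y_samples, p=2):
    """All Lp distances between materializations of X and Y.

    x_samples[t] and y_samples[t] are the sample lists at time slot t.
    """
    if len(x_samples) != len(y_samples):
        raise ValueError("both series must cover the same time slots")
    return [lp_distance(xs, ys, p)
            for xs in product(*x_samples)   # one sample per slot from X
            for ys in product(*y_samples)]  # one sample per slot from Y

# Two slots each; X is uncertain in slot 0, Y in slot 1 (toy values).
x = [[0.0, 1.0], [0.0]]
y = [[0.0], [0.0, 2.0]]
dists = uncertain_lp_distances(x, y, p=1)
```

The result is a multiset of distance observations rather than a single number, which is why the queries on the following slides are probabilistic.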


Page 12: Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Probabilistic Range Queries
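
The query definitions are not preserved in the extracted text. The sketch below rests on two assumptions: a bounded query with distance threshold `eps` and probability threshold `tau` returns every series whose fraction of distance observations within `eps` is at least `tau`, and a rank variant orders the series by that fraction. The functions `hit_probability`, `pbrq`, and `prrq` are hypothetical names, and the evaluation is deliberately naive.

```python
from itertools import product

def hit_probability(q_samples, x_samples, eps, p=2):
    """Fraction of sample combinations whose Lp distance is at most eps."""
    total = 0
    hits = 0
    for qs in product(*q_samples):
        for xs in product(*x_samples):
            d = sum(abs(a - b) ** p for a, b in zip(qs, xs)) ** (1.0 / p)
            total += 1
            if d <= eps:
                hits += 1
    return hits / total

def pbrq(database, q_samples, eps, tau):
    """Bounded variant (assumed semantics): keep series with probability >= tau."""
    return [name for name, x in database.items()
            if hit_probability(q_samples, x, eps) >= tau]

def prrq(database, q_samples, eps):
    """Rank variant (assumed semantics): order by descending probability."""
    scored = [(hit_probability(q_samples, x, eps), name)
              for name, x in database.items()]
    ranked = sorted(scored, key=lambda item: -item[0])
    return [name for prob, name in ranked if prob > 0]

# Toy data: the query is certain, the stored series are uncertain in slot 0.
q = [[0.0], [0.0]]
database = {
    "near": [[0.0, 0.5], [0.0]],
    "mixed": [[0.0, 3.0], [0.0]],
    "far": [[5.0], [5.0]],
}
```

For example, `pbrq(database, q, eps=1.0, tau=0.9)` keeps only the series whose distance observations almost all fall within the threshold.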


Page 13: Probabilistic Similarity Search for Uncertain Time Series

Challenge in Processing Range Queries with Uncertainty
• Naïve Solution
  • Computing all distance observations
  • CPU-bound vs. I/O-bound
  • Long time series and high sample rates (large n)
• Naïve Solution
  • Number of distance computations
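
The count of distance computations is missing from the extracted slide. Assuming every time slot holds the same number of samples and the naive solution materializes every sample combination of query and object, the count grows exponentially in the series length n. The helper below is a hypothetical illustration of that growth.

```python
def naive_distance_count(n, s_query, s_object):
    """Distance computations when every sample combination is evaluated.

    n        : number of time slots (series length)
    s_query  : samples per slot in the query series (assumed constant)
    s_object : samples per slot in the stored series (assumed constant)
    """
    return (s_query ** n) * (s_object ** n)

small = naive_distance_count(n=3, s_query=2, s_object=2)
larger = naive_distance_count(n=10, s_query=5, s_object=5)
```

Even ten time slots with five samples each already require about 9.5 * 10^13 distance computations per object, which is why the processing cost is CPU-bound rather than I/O-bound.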



Page 15: Probabilistic Similarity Search for Uncertain Time Series

Approximate Representation


Page 16: Probabilistic Similarity Search for Uncertain Time Series

Approximate Representation (cont’d)
• Two Levels of Approximate Representation
  • They differ in whether multiple (K) groups of sample observations exist in one time slot


By K-means clustering
Only one group at each time slot
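
The two levels can be sketched as follows, assuming each group of samples is approximated by its minimal bounding interval. The slide names K-means clustering for the multi-group level; the tiny one-dimensional K-means below, with a deterministic initialization, is a stand-in written for this illustration. All function names are hypothetical.

```python
def bounding_interval(values):
    """Smallest interval (low, high) that contains all values."""
    return (min(values), max(values))

def single_level(samples):
    """One bounding interval per time slot (only one group per slot)."""
    return [bounding_interval(slot) for slot in samples]

def kmeans_1d(values, k, iterations=20):
    """Very small 1-D K-means; returns the non-empty groups."""
    ordered = sorted(values)
    # Deterministic start: evenly spaced picks from the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    groups = [ordered]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return [g for g in groups if g]

def multi_level(samples, k):
    """Up to k bounding intervals per time slot, one per cluster."""
    return [[bounding_interval(g) for g in kmeans_1d(slot, min(k, len(slot)))]
            for slot in samples]

samples = [[1.0, 1.1, 0.9, 5.0, 5.2], [2.0, 2.1]]
coarse = single_level(samples)
fine = multi_level(samples, k=2)
```

In the first slot the single interval spans the gap between the two clusters of samples, while the two-group representation bounds each cluster separately and is therefore tighter.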

Page 17: Probabilistic Similarity Search for Uncertain Time Series

Distance Approximations


Page 18: Probabilistic Similarity Search for Uncertain Time Series

Distance Approximations (cont’d)


Page 19: Probabilistic Similarity Search for Uncertain Time Series

Distance Approximations (cont’d)
• Lemma 1
• Lemma 2
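
The lemma statements are not preserved in the extracted text, and the code below is not a reconstruction of them. It is a sketch of the standard interval-based bounds, assuming each time slot of both series is approximated by a bounding interval: the smallest and largest possible per-slot differences yield a lower and an upper bound on every Lp distance between sample combinations. The function name `lp_bounds` is hypothetical.

```python
def lp_bounds(x_intervals, y_intervals, p=2):
    """Lower and upper bound on the Lp distance between any two
    materializations whose values lie inside the given intervals.

    Each argument is a list of (low, high) tuples, one per time slot.
    """
    if len(x_intervals) != len(y_intervals):
        raise ValueError("both series must cover the same time slots")
    lower_sum = 0.0
    upper_sum = 0.0
    for (lx, ux), (ly, uy) in zip(x_intervals, y_intervals):
        # Smallest possible |x - y|: zero if the intervals overlap,
        # otherwise the gap between them.
        gap = max(0.0, lx - uy, ly - ux)
        # Largest possible |x - y|: distance between the far endpoints.
        span = max(ux - ly, uy - lx)
        lower_sum += gap ** p
        upper_sum += span ** p
    return lower_sum ** (1.0 / p), upper_sum ** (1.0 / p)

x_iv = [(0.0, 1.0), (0.0, 0.0)]
y_iv = [(3.0, 4.0), (0.0, 2.0)]
lower, upper = lp_bounds(x_iv, y_iv, p=1)
```

Both bounds cost one pass over the n time slots, in contrast to the exponential number of exact distance observations.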


Page 20: Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Bounded Range Queries (PBRQ)


True Hit

True Drop
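
How distance bounds can separate true hits from true drops is sketched below. It assumes a distance threshold `eps`, a probability threshold strictly greater than zero, and a lower and upper bound that enclose all distance observations between the query and an object. The function `classify` and the label `"candidate"` are hypothetical.

```python
def classify(lower, upper, eps):
    """Decide an object from its distance bounds alone, if possible.

    lower, upper : bounds enclosing every distance observation
    eps          : distance threshold of the range query
    """
    if upper <= eps:
        # Every distance observation is within eps, so the probability is 1.
        return "true hit"
    if lower > eps:
        # No distance observation is within eps, so the probability is 0.
        return "true drop"
    # The bounds straddle eps; the object needs refinement.
    return "candidate"

decisions = [classify(0.5, 0.9, 1.0),
             classify(1.5, 2.0, 1.0),
             classify(0.5, 2.0, 1.0)]
```

Only the objects left as candidates require the more expensive refinement step.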


Page 22: Probabilistic Similarity Search for Uncertain Time Series

Step-Wise Refinement
• When to refine?
  • Time series that could not be filtered out or decided simply by comparing the interval between the lower and upper bounds
• Refinement Goal
  • To identify an uncertain time series as a true hit or a true drop
• Condition to increase the lower bound
  • Increase in the number of qualified distances
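
The idea that the lower bound rises as more qualified distances are found can be illustrated with a simplified stand-in. It is not the refinement procedure of the presentation: distance observations are evaluated one at a time, bounds on the hit probability are tightened after each one, and processing stops as soon as the object is decided. The names `refine`, `eps`, and `tau` are hypothetical.

```python
from itertools import product

def refine(q_samples, x_samples, eps, tau, p=2):
    """Return (decision, number of distances actually computed)."""
    total = 1
    for slot in list(q_samples) + list(x_samples):
        total *= len(slot)
    if total == 0:
        raise ValueError("every time slot needs at least one sample")
    hits = 0
    seen = 0
    for qs in product(*q_samples):
        for xs in product(*x_samples):
            d = sum(abs(a - b) ** p for a, b in zip(qs, xs)) ** (1.0 / p)
            seen += 1
            if d <= eps:
                hits += 1          # a qualified distance raises the lower bound
            lower = hits / total                     # unseen ones all miss
            upper = (hits + total - seen) / total    # unseen ones all hit
            if lower >= tau:
                return "true hit", seen
            if upper < tau:
                return "true drop", seen
    # Reached only if the loops end without a decision (not expected).
    return ("true hit" if hits / total >= tau else "true drop"), seen

q = [[0.0], [0.0]]
x = [[0.0, 0.5, 3.0, 4.0], [0.0]]
hit = refine(q, x, eps=1.0, tau=0.5)
drop = refine(q, x, eps=1.0, tau=0.9)
```

In both calls the decision is reached before all four distance observations have been computed, which is the saving that refinement aims for.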


Page 23: Probabilistic Similarity Search for Uncertain Time Series

Step-Wise Refinement (cont’d)

• Refinement heuristics



Page 25: Probabilistic Similarity Search for Uncertain Time Series

Evaluation
• Benchmark
  • UCI Time Series Data Mining Archive
  • CBF, GUN/POINT, CONTROL CHART, OSU LEAF
• Uncertainty
  • Generating samples uniformly distributed around the given exact values
• Evaluation
  • Overall Speed-Up
  • Refinement Speed-Up
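
The way uncertainty is generated for the benchmark can be sketched as follows. The slide does not state the number of samples per time slot or the width of the uniform interval, so `num_samples` and `radius` are assumed parameters, and `make_uncertain` is a hypothetical name.

```python
import random

def make_uncertain(series, num_samples, radius, seed=None):
    """Turn an exact series into an uncertain one.

    Each exact value v is replaced by num_samples values drawn uniformly
    from [v - radius, v + radius].
    """
    rng = random.Random(seed)  # local generator, reproducible with a seed
    return [[rng.uniform(v - radius, v + radius) for _ in range(num_samples)]
            for v in series]

exact = [1.0, 2.0, 3.0]
uncertain = make_uncertain(exact, num_samples=5, radius=0.1, seed=0)
```

A fixed seed keeps the generated benchmark reproducible across runs.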


Page 26: Probabilistic Similarity Search for Uncertain Time Series

Evaluation (cont’d)
• Speed-up for Probabilistic Bounded Range Query (PBRQ)


Page 27: Probabilistic Similarity Search for Uncertain Time Series

Evaluation (cont’d)
• Speed-up for Probabilistic Rank Range Query (PRRQ)


Page 28: Probabilistic Similarity Search for Uncertain Time Series

Evaluation (cont’d)
• Speed-up w.r.t. scalability


Page 29: Probabilistic Similarity Search for Uncertain Time Series

Evaluation (cont’d)
• Refinement
  • S-S: using the proposed strategy
  • R-R: random processing in both steps
• Logarithm of the number of required calculations



Page 31: Probabilistic Similarity Search for Uncertain Time Series

Q & A


• Thank You