Fair Use Agreement

Page 1: Fair Use Agreement

Themis Palpanas 1VLDB - Aug 2004

Fair Use Agreement

This agreement covers the use of all slides on this CD-Rom, please read carefully.

• You may freely use these slides for teaching, if
    • You send me an email telling me the class number/ university in advance.
    • My name and email address appears on the first slide (if you are using all or most of the slides), or on each slide (if you are just taking a few slides).

• You may freely use these slides for a conference presentation, if
    • You send me an email telling me the conference name in advance.
    • My name appears on each slide you use.

• You may not use these slides for tutorials, or in a published work (tech report/ conference paper/ thesis/ journal etc). If you wish to do this, email me first, it is highly likely I will grant you permission.

(c) Eamonn Keogh, [email protected]

Page 2: Fair Use Agreement

Indexing Large Human-Motion Databases

Eamonn Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos
University of California, Riverside

Marc Cardle
University of Cambridge

Page 3: Fair Use Agreement

Motion Capture

records motion data from live actors

Page 4: Fair Use Agreement

Motion Capture

• records motion data from live actors
• used for data-driven animation

Page 5: Fair Use Agreement

Motion Capture in Games Industry

Street NBA

Madden

Page 6: Fair Use Agreement

Motion Capture in Movie Industry

Troy

Lord of the Rings

Page 7: Fair Use Agreement

Motivation

• motion capture data
    • segmented in short sequences, stored in motion libraries
    • composed to create long, realistic motion sequences

• important to find similar sequences
    • from pool of similar sequences choose the most promising, to continue the motion

Page 8: Fair Use Agreement

Motivation

• Dynamic Time Warping (DTW)
    • considers only local adjustments in time, to match two time series
    • however, sometimes global adjustments are required

• DTW is being extensively used
    • uniform scaling is complementary

• combination of both techniques offers rich, high-quality result set

[Figure: two panels, labeled DTW and Uniform Scaling]

Page 9: Fair Use Agreement

Uniform Scaling

• time series query, Q, length n
• candidate, C, length m (m>n)

[Figure: candidate C and query Q; time axis 0 to 400]

Page 10: Fair Use Agreement

Uniform Scaling

• time series query, Q, length n
• candidate, C, length m (m>n)

• stretch Q to length p (n ≤ p ≤ m): $Q^p$

    $Q^p_j = Q_{\lceil j \cdot n/p \rceil}, \quad 1 \le j \le p$

• scaling factor, sf = p/n
• max scaling factor, sfmax = m/n

[Figure: candidate C and query Q (first panel); query Q and stretched query Qp (second panel); time axis 0 to 400]
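The stretching rule above can be sketched in a few lines of Python. This is only an illustration: the function name `stretch` is hypothetical, and the slide's 1-based indices are mapped onto Python's 0-based lists.

```python
def stretch(q, p):
    """Uniformly stretch sequence q (length n) to length p.

    Follows the slide's rule Qp_j = Q_ceil(j*n/p) for 1 <= j <= p,
    where j and the index into Q are 1-based.
    """
    n = len(q)
    if p < n:
        raise ValueError("this sketch only stretches: p must be >= len(q)")
    # (j*n + p - 1) // p is an integer ceiling of j*n/p, which avoids
    # floating-point rounding; the trailing -1 converts to a 0-based index.
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]
```

For instance, stretching a three-point query to six points repeats every point twice, which corresponds to a scaling factor sf = 6/3 = 2.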

Page 11: Fair Use Agreement

Problem Statement

• given
    • time series, Q
    • database of candidate time series, {D}

• find $\arg\min_p \{ dist(Q^p, \{D\}) \}$
    • $dist(Q^p, \{D\})$ = Euclidean Distance between time series

Page 12: Fair Use Agreement

Problem Statement

• given
    • time series, Q
    • database of candidate time series, {D}

• find $\arg\min_p \{ dist(Q^p, \{D\}) \}$
    • $dist(Q^p, \{D\})$ = Euclidean Distance between time series

• challenges
    • quickly solve the problem for two time series
    • extend solution to scale up to large time series databases

Page 13: Fair Use Agreement

Outline

• Speeding Up Search
• Scaling Up To Large Databases
• Experimental Evaluation
• Related Work
• Conclusions

Page 14: Fair Use Agreement

Best Uniform Scaling Match

• brute force algorithm:
    for each time series in {D}
        for each sf, 1 ≤ sf ≤ sfmax
            compute distance between the two time series
    find the best overall match

• time complexity: O(|D|(m-n))
    • extremely expensive!
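The brute force loop above might look as follows in Python. This is a rough sketch under stated assumptions, not the authors' implementation: the names `stretch`, `euclidean`, and `best_uniform_scaling_match` are hypothetical, and the stretched query of length p is assumed to be compared against the first p points of each candidate, which the slides do not spell out.

```python
import math


def stretch(q, p):
    """Stretch q (length n) to length p: Qp_j = Q_ceil(j*n/p), 1-based j."""
    n = len(q)
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_uniform_scaling_match(q, database):
    """Try every stretched length p for every candidate in the database.

    Returns (distance, candidate_index, p) of the best match found.
    Assumption: Qp is compared with the length-p prefix of the candidate.
    """
    n = len(q)
    best = (math.inf, None, None)
    for idx, c in enumerate(database):
        for p in range(n, len(c) + 1):       # sf = p/n runs from 1 to m/n
            d = euclidean(stretch(q, p), c[:p])
            if d < best[0]:
                best = (d, idx, p)
    return best
```

Each candidate triggers m-n+1 distance computations, which is the cost that the lower bound on the following slides is meant to avoid.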

Page 15: Fair Use Agreement

Lower Bounding Uniform Scaling

• lower bound distance between two time series, for any sf, 1 ≤ sf ≤ sfmax

• desiderata:
    • fast to compute
    • tight bound

• results in fast pruning of candidates that are guaranteed not to belong to the solution
    • compute distance only for time series not pruned by lower bound

Page 16: Fair Use Agreement

Lower Bounding Uniform Scaling

• assume:
    • candidate C, length 100
    • query Q, length 80
    • wish to find best match for any scaling of Q between 80-100

[Figure: candidate C, m = 100; time axis 0 to 100]

Page 17: Fair Use Agreement

Lower Bounding Uniform Scaling

• assume:
    • candidate C, length 100
    • query Q, length 80
    • wish to find best match for any scaling of Q between 80-100

• build envelopes, length 80:

    $U_i = \max( C_{(i-1) \cdot m/n + 1}, \ldots, C_{i \cdot m/n} )$

    $L_i = \min( C_{(i-1) \cdot m/n + 1}, \ldots, C_{i \cdot m/n} )$

[Figure: upper envelope U and lower envelope L, n = 80; time axis 0 to 100]
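The envelope construction can be sketched in Python as a literal transcription of the two formulas on this slide. The slide does not say how fractional indices such as (i-1)*m/n are rounded, so the rounding below (floor for the window start, ceiling for the window end) is an assumption, as is the function name `build_envelopes`.

```python
def build_envelopes(c, n):
    """Build upper and lower envelopes of length n around candidate c.

    Transcribes U_i = max(C_{(i-1)*m/n+1}, ..., C_{i*m/n}) and the matching
    min for L_i, with 1-based i. Rounding of fractional indices is assumed:
    the window start is floored and the window end is rounded up.
    """
    m = len(c)
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= len(c)")
    upper, lower = [], []
    for i in range(1, n + 1):
        start = ((i - 1) * m) // n + 1   # floor((i-1)*m/n) + 1, 1-based
        end = (i * m + n - 1) // n       # ceil(i*m/n), 1-based, inclusive
        window = c[start - 1:end]        # convert to a 0-based slice
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower
```

With the slide's example sizes (m = 100, n = 80) this yields two sequences of 80 points each, with every L value at or below the corresponding U value.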

Page 20: Fair Use Agreement

Lower Bounding Uniform Scaling

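• assume:
    • candidate C, length 100
    • query Q, length 80
    • wish to find best match for any scaling of Q between 80-100

• compute lower bound:

    $LB\_Keogh(Q, C) = \sqrt{ \sum_{i=1}^{n} \begin{cases} (Q_i - U_i)^2 & \text{if } Q_i > U_i \\ (Q_i - L_i)^2 & \text{if } Q_i < L_i \\ 0 & \text{otherwise} \end{cases} }$

[Figure: time axis 0 to 100]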
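As a small illustration, the lower bound can be computed directly from a query and a pair of envelopes. The sketch below assumes the envelopes U and L have already been built and have the same length as Q; the function name `lb_keogh` simply mirrors the name on the slide.

```python
import math


def lb_keogh(q, upper, lower):
    """Sum the squared excursions of q outside [lower, upper], then take the root.

    Points of q that fall inside the envelope contribute zero.
    """
    if not (len(q) == len(upper) == len(lower)):
        raise ValueError("q, upper and lower must have the same length")
    total = 0.0
    for qi, ui, li in zip(q, upper, lower):
        if qi > ui:
            total += (qi - ui) ** 2    # query pokes above the envelope
        elif qi < li:
            total += (qi - li) ** 2    # query dips below the envelope
    return math.sqrt(total)
```

A query that stays entirely inside the envelope gets a bound of zero, so such a candidate can never be pruned by this test alone.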

Page 21: Fair Use Agreement

Envelope Indexing

• dimensionality of envelopes is high

[Figure: envelopes, 80 points; time axis 0 to 100]

Page 22: Fair Use Agreement

Envelope Indexing

• dimensionality of envelopes is high
• reduce dimensionality by approximating them
    • Piecewise Constant Approximation

[Figure: approximated envelopes U and L, 8 points; time axis 0 to 100]

Page 23: Fair Use Agreement

Envelope Indexing

• dimensionality of envelopes is high
• reduce dimensionality by approximating them
    • Piecewise Constant Approximation

• assume query Q, length 80

[Figure: query Q; time axis 0 to 100]

Page 24: Fair Use Agreement

Envelope Indexing

• dimensionality of envelopes is high
• reduce dimensionality by approximating them
    • Piecewise Constant Approximation

• assume query Q, length 80
    • we approximate it with 8 points

[Figure: approximated query Q; time axis 0 to 100]

Page 25: Fair Use Agreement

Envelope Indexing

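• dimensionality of envelopes is high
• reduce dimensionality by approximating them
    • Piecewise Constant Approximation

• assume query Q, length 80
    • approximated with 8 points

• compute approximation of lower bound:

    $MINDIST(\hat{Q}, R) = \sqrt{ \sum_{i=1}^{N} \frac{n}{N} \begin{cases} (\hat{Q}_i - \hat{U}_i)^2 & \text{if } \hat{Q}_i > \hat{U}_i \\ (\hat{Q}_i - \hat{L}_i)^2 & \text{if } \hat{Q}_i < \hat{L}_i \\ 0 & \text{otherwise} \end{cases} }$

[Figure: time axis 0 to 100]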
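A possible reading of the reduced-dimensionality bound is sketched below. Several details are assumptions rather than something the slides state: the query is approximated by segment means, the upper and lower envelopes are approximated by segment maxima and minima respectively, R is taken to be that pair of approximated envelopes, and the series length is a multiple of the number of segments N. All function names are hypothetical.

```python
import math


def paa(series, num_segments):
    """Piecewise constant approximation: the mean of each equal-width segment."""
    n = len(series)
    if n % num_segments != 0:
        raise ValueError("sketch assumes len(series) is a multiple of num_segments")
    width = n // num_segments
    return [sum(series[k * width:(k + 1) * width]) / width
            for k in range(num_segments)]


def approximate_envelope(upper, lower, num_segments):
    """Assumed reduction: segment maximum for U, segment minimum for L."""
    n = len(upper)
    if n != len(lower) or n % num_segments != 0:
        raise ValueError("envelopes must match and divide evenly into segments")
    width = n // num_segments
    u_hat = [max(upper[k * width:(k + 1) * width]) for k in range(num_segments)]
    l_hat = [min(lower[k * width:(k + 1) * width]) for k in range(num_segments)]
    return u_hat, l_hat


def mindist(q_hat, u_hat, l_hat, n):
    """Bound on N approximated points, each weighted by n/N (n = original length)."""
    num_segments = len(q_hat)
    total = 0.0
    for qi, ui, li in zip(q_hat, u_hat, l_hat):
        if qi > ui:
            total += (n / num_segments) * (qi - ui) ** 2
        elif qi < li:
            total += (n / num_segments) * (qi - li) ** 2
    return math.sqrt(total)
```

In the slide's running example, n = 80 and N = 8, so each approximated point stands in for ten original points and carries a weight of 10 in the sum.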

Page 26: Fair Use Agreement

Algorithms for Secondary Storage

• use a multidimensional index
    • VA-file -> FastScan algorithm
    • R-tree -> RtreeProbe algorithm

• 2-pass algorithms:
    1. scan approximated envelopes, prune search space
    2. find exact answer using original series
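The two-pass idea can be illustrated with a generic filter-and-refine loop. This is not the FastScan or RtreeProbe algorithm, whose details the slides do not give; it is only a sketch that assumes a cheap `lower_bound(i)` and an expensive `exact_distance(i)` are available for each candidate index i, with the lower bound never exceeding the exact distance. All names are hypothetical.

```python
import math


def two_pass_search(num_candidates, lower_bound, exact_distance):
    """Generic filter-and-refine nearest-neighbour search.

    Pass 1 computes a cheap lower bound for every candidate.
    Pass 2 visits candidates in increasing order of that bound and stops
    as soon as the bound can no longer beat the best exact distance.
    Returns (best_distance, best_index, number_of_exact_computations).
    """
    # Pass 1: scan the approximations only.
    bounds = sorted((lower_bound(i), i) for i in range(num_candidates))

    best_dist, best_idx, exact_calls = math.inf, None, 0
    # Pass 2: refine with the original series, most promising first.
    for lb, i in bounds:
        if lb >= best_dist:
            break                      # every remaining candidate is pruned
        d = exact_distance(i)
        exact_calls += 1
        if d < best_dist:
            best_dist, best_idx = d, i
    return best_dist, best_idx, exact_calls
```

Sorting by lower bound is one design choice among several; it keeps the sketch short and makes the early exit correct whenever the bound really is a lower bound.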

Page 27: Fair Use Agreement

Outline

• Speeding Up Search
• Scaling Up To Large Databases
• Experimental Evaluation
• Related Work
• Conclusions

Page 28: Fair Use Agreement

Datasets Used

• motion capture data
    • from 124 sensors placed on human actors

• mixed bag
    • time series coming from: medicine, manufacturing, environmental monitoring, economics, sensor data

• experimented with time series databases of:
    • size 5,000 – 80,000 time series
    • length 64 – 1,024 points

Page 29: Fair Use Agreement

Main Memory Experiments

• assume database fits in memory
• measure pruning power:
    • fraction of times each approach calls distance function

• our technique:
    • 1 order of magnitude faster than CD-criterion

[Figure: pruning power of LB_Keogh and CD-criterion; tick labels 64, 128, 256 and 1.05, 1.10, 1.20; value axis 0 to 0.25]

Page 30: Fair Use Agreement

Main Memory Experiments

• assume database fits in memory
• measure pruning power:
    • fraction of times each approach calls distance function

• our technique:
    • 1 order of magnitude faster than CD-criterion
    • 3 orders of magnitude faster than brute force

[Figure: pruning power of LB_Keogh, CD-criterion, and brute force; tick labels 64, 128, 256 and 1.05, 1.10, 1.20; value axis 0 to 0.25]

Page 31: Fair Use Agreement

Disk-Based Experiments

• comparison of:
    • brute force
    • FastScan
    • RtreeProbe

[Figure: running time in seconds (axis 0 to 25) for LinearScan, FastScan, and RtreeProbe; tick labels 64, 128, 256 and 1.05, 1.10, 1.20]

Page 32: Fair Use Agreement

Disk-Based Experiments

• comparison of:
    • FastScan
    • RtreeProbe

[Figure: running time in seconds (axis 0 to 80) for LinearScanLB, FastScan, RtreeBF, and RtreeProbe; tick labels 64, 128, 256, 512, 1024]

Page 33: Fair Use Agreement

Disk-Based Experiments

• comparison of:
    • FastScan
    • RtreeProbe

[Figure: running time in seconds (axis 0 to 0.18) for LinearScanLB, FastScan, RtreeBF, and RtreeProbe; tick labels 5000, 10000, 20000, 40000, 80000]

Page 34: Fair Use Agreement

Case Study

video

Page 35: Fair Use Agreement

Outline

• Speeding Up Search
• Scaling Up To Large Databases
• Experimental Evaluation
• Related Work
• Conclusions

Page 36: Fair Use Agreement

Related Work

• Dynamic Time Warping (DTW)
    • [Yi & Faloutsos’00][Keogh’02][Zhu & Shasha’03][Fung & Wong’03]

• Longest Common SubSequence (LCSS)
    • [Das et al.’97][Vlachos et al.’03]

• uniform scaling
    • [Argyros & Ermopoulos’03]

Page 37: Fair Use Agreement

Outline

• Speeding Up Search
• Scaling Up To Large Databases
• Experimental Evaluation
• Related Work
• Conclusions

Page 38: Fair Use Agreement

Conclusions

• studied utility of uniform scaling similarity matching
    • applications in: motion capture libraries, music retrieval, historical handwritten archives

• introduced first lower bounding technique
• proposed indexing method for bounding envelopes
    • suitable for very large time series databases
• experimentally evaluated efficiency of technique
• demonstrated quality of results with real motion capture data

Page 39: Fair Use Agreement

Outline
