Mining for Empty Rectangles in Large Data Sets
Jeff Edmonds
Jarek Gryz
Dongming Liang
Renee Miller
Matrix representation
π_{A,B}(R ⋈ S):
A  B
3  6
1  7
3  8
0-1 matrix (columns A = 1, 2, 3; rows B = 6, 7, 8):
B = 6:  0 0 1
B = 7:  1 0 0
B = 8:  0 0 1
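The matrix representation can be sketched in a few lines. This is only an illustration: the function name `to_zero_one_matrix` and its parameters are hypothetical, and the tuples and value ranges are read off the slide's figure.

```python
def to_zero_one_matrix(pairs, a_values, b_values):
    """Return a 0-1 matrix with one row per B value and one column per A value.

    matrix[row][col] is 1 exactly when (a_values[col], b_values[row]) is one
    of the given pairs, and 0 otherwise.
    """
    present = set(pairs)
    return [[1 if (a, b) in present else 0 for a in a_values] for b in b_values]


# Tuples of the projected join as read off the slide (an assumption).
pairs = [(3, 6), (1, 7), (3, 8)]
matrix = to_zero_one_matrix(pairs, a_values=[1, 2, 3], b_values=[6, 7, 8])
for row in matrix:
    print(*row)
```

The value ranges are passed in explicitly because a value such as A = 2 can belong to the domain without appearing in any tuple.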
Find All Maximal 0-Rectangles
[Figure: the same table and 0-1 matrix for π_{A,B}(R ⋈ S), with 0-rectangles marked in the matrix.]
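To make the problem statement concrete, here is a brute-force reference that lists every maximal 0-rectangle of a small matrix. It is not the algorithm of this talk, only a slow checker. The names `is_zero_rectangle` and `brute_force_maximal` are hypothetical, and rectangles are written as (x1, y1, x2, y2) with 0-based inclusive corners, which is a convention chosen here.

```python
def is_zero_rectangle(matrix, x1, y1, x2, y2):
    """True if every cell in columns x1..x2 and rows y1..y2 is 0."""
    return all(matrix[y][x] == 0
               for y in range(y1, y2 + 1)
               for x in range(x1, x2 + 1))


def brute_force_maximal(matrix):
    """List all 0-rectangles that cannot grow by one row or column."""
    height, width = len(matrix), len(matrix[0])
    result = []
    for y1 in range(height):
        for y2 in range(y1, height):
            for x1 in range(width):
                for x2 in range(x1, width):
                    if not is_zero_rectangle(matrix, x1, y1, x2, y2):
                        continue
                    # The four one-step extensions: left, up, right, down.
                    grown = [(x1 - 1, y1, x2, y2), (x1, y1 - 1, x2, y2),
                             (x1, y1, x2 + 1, y2), (x1, y1, x2, y2 + 1)]
                    extendable = any(
                        a >= 0 and b >= 0 and c < width and d < height
                        and is_zero_rectangle(matrix, a, b, c, d)
                        for a, b, c, d in grown)
                    if not extendable:
                        result.append((x1, y1, x2, y2))
    return result


example = [[0, 0, 1],
           [1, 0, 0],
           [0, 0, 1]]
reference = sorted(brute_force_maximal(example))
print(reference)
```

Checking one-step extensions is enough: a 0-rectangle contained in a larger one can always be grown by a single row or column while staying inside the larger one.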
Example
[Figure: 0-1 matrix for π_{A,B}(R ⋈ S) over Car and Year, with years 95, 96, 97 on one axis and cars BMW Z3, Honda L2, Toyota 6A, … on the other.]
First BMW Z3 series cars were made in 1997.
Relation to Previous Work
Problem: find all maximal empty rectangles
• Previous work ([Lui, Ku, Hsu] & [Orlowski], [Namaad, Hsu, Lee]): between points in the real plane
• Our work: within a 0-1 matrix
Purpose:
• Previous work: machine learning, computational geometry
• Our work: query optimization
# of maximal 0-rectangles:
• Previous work: O((# 1’s)^2)
• Our work: O(# 0’s)
Relation to Previous Work
Time:
• Previous work ([Lui, Ku, Hsu] & [Orlowski], [Namaad, Hsu, Lee]): O(# 1’s log(# 1’s) + # rectangles) = O(|X||Y|)
• Our work: O(# 0’s) = O(|X||Y|)
Space:
• Previous work: O(|X||Y|)
• Our work: O(min(|X|, |Y|)); only two rows of matrix kept in memory
Relation to Previous Work
Practical Implementation:
• Previous work ([Lui, Ku, Hsu] & [Orlowski], [Namaad, Hsu, Lee]): intensive random memory access
• Our work: requires a single scan of the sorted data
Scalable:
• Previous work: scales badly
• Our work: scales well wrt # of tuples in join, # of maximal rectangles, and # of values |X| & |Y|
Practical?
• IBM paid us $25,000 to patent it!
Structure of Algorithm
loop y = 1..|Y|
  loop x = 1..|X|
    • Output all maximal 0-rectangles with <x,y> as bottom-right corner
    • Maintain the loop invariant
Timing: O(1) amortized time per <x,y>
[Figure: 0-1 matrix with axes X and Y and the current position <x,y> marked *.]
Designing an Algorithm
• Define Problem
• Define Loop Invariants
• Define Measure of Progress
• Define Step
• Define Exit Condition
• Maintain Loop Inv
• Make Progress
• Initial Conditions
• Ending
Define the Loop Invariant
• We have read the matrix up to <x,y> and cannot reread the matrix.
• We must output all maximal 0-rectangles with <x,y> as bottom-right corner.
• What must we remember?
[Figure: 0-1 matrix with axes X and Y and the current position <x,y> marked *.]
Stack of steps
[Figure: 0-1 matrix with axes X and Y and the current position <x,y> marked *, showing the steps (x_1,y_1), (x_2,y_2), …, (x_r,y_r) of the staircase and the positions x* and y*.]
Constructing Maximal Rectangles
[Figure: 0-1 matrix with the staircase steps (x_1,y_1), …, (x_r,y_r) and the current position <x,y> marked *.]
The rectangles they define are labelled:
• Too narrow
• Maximal
• Too short
Constructing staircase(x,y) from staircase(x-1,y)
Case 1
[Figure: the staircase at <x-1,y> and at <x,y>, each marked *.]
Case 2
[Figure: the staircase at <x-1,y> with steps (x_1,y_1), …, (x_r,y_r); each step is labelled Delete or Keep, and the rectangles are labelled Too narrow, Maximal, or Too short.]
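The Delete and Keep labels suggest a stack update, which can be sketched as follows. This is one possible reading, not necessarily the authors' bookkeeping: it assumes the staircase is stored as a stack of steps (left column, height) with strictly increasing heights, and that the length of the run of 0s ending at <x,y> in column x is already known. The names `advance_staircase`, `steps`, and `zero_run` are hypothetical.

```python
def advance_staircase(steps, x, zero_run):
    """Update the stack of steps in place for column x; return deleted steps.

    steps    -- list of (left_column, height), heights strictly increasing
    zero_run -- number of consecutive 0s in column x ending at the current row
    """
    deleted = []
    left = x
    # Delete: steps taller than the new column cannot reach column x.
    while steps and steps[-1][1] > zero_run:
        left, height = steps.pop()
        deleted.append((left, height))
    # Keep: all remaining steps. Create at most one new step, which
    # inherits the left edge of the last deleted step (or starts at x).
    if zero_run > 0 and (not steps or steps[-1][1] < zero_run):
        steps.append((left, zero_run))
    return deleted


steps = []
deleted_per_column = [advance_staircase(steps, x, run)
                      for x, run in enumerate([1, 3, 2])]
print(steps)
print(deleted_per_column)
```

When the cell at column x holds a 1, `zero_run` is 0, every step is deleted, and the staircase becomes empty.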
Constructing x* & y*
[Figure: 0-1 matrix with the staircase steps (x_1,y_1), …, (x_r,y_r), the current position <x,y> marked *, and the positions x* and y*.]
Location of last 1 seen in each column
[Figure: 0-1 matrix with axes X and Y and the current position <x,y> marked *.]
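Remembering the last 1 seen in each column is enough to recover how far the 0s extend upward from any cell of the current row. A minimal sketch, assuming 0-based row indices and rows supplied one at a time; the names `zero_runs` and `last_one` are hypothetical.

```python
def zero_runs(rows, width):
    """For each row, yield the length of the run of 0s ending at each cell.

    Only one number per column is kept: the index of the last row in which
    that column held a 1 (-1 if no 1 has been seen yet).
    """
    last_one = [-1] * width
    for y, row in enumerate(rows):
        for x, bit in enumerate(row):
            if bit:
                last_one[x] = y
        # A cell holding a 1 gets run length 0, because last_one[x] == y.
        yield [y - last_one[x] for x in range(width)]


example = [[0, 0, 1],
           [1, 0, 0],
           [0, 0, 1]]
runs = list(zero_runs(example, 3))
for line in runs:
    print(line)
```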
Structure of Algorithm
loop y = 1..|Y|
  loop x = 1..|X|
    • Construct staircase(x,y)
    • Output all maximal 0-rectangles with <x,y> as bottom-right corner
Timing: O(1) amortized time per <x,y>
[Figure: 0-1 matrix with axes X and Y and the current position <x,y> marked *.]
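The loop structure above can be sketched end to end as a generator that reads rows once and holds only the current and the next row. This is a sketch under stated assumptions, not the authors' implementation: it uses a standard monotonic stack of (left column, height) steps, reports a rectangle when its step is deleted at the next column (so its bottom-right corner is <x-1,y>), and tests downward extension against the rightmost 1 of the next row. Indices are 0-based, rectangles are (x1, y1, x2, y2) with inclusive corners, and all names are hypothetical.

```python
def maximal_zero_rectangles(rows, width):
    """Yield every maximal 0-rectangle as (x1, y1, x2, y2), 0-based inclusive."""
    rows = iter(rows)
    zero_run = [0] * width          # run of 0s ending at the current row
    current = next(rows, None)
    y = 0
    while current is not None:
        following = next(rows, None)
        # Below the last row, pretend there is a row of 1s.
        below = following if following is not None else [1] * width
        steps = []                  # (left_column, height), heights increasing
        last_one_below = -1         # rightmost 1 in the next row, left of x
        for x in range(width + 1):  # x == width is a sentinel column
            if x < width:
                zero_run[x] = 0 if current[x] else zero_run[x] + 1
                run = zero_run[x]
            else:
                run = 0
            left = x
            while steps and steps[-1][1] > run:
                left, height = steps.pop()
                # Columns left..x-1 cannot grow left, right, or up.
                # The rectangle is maximal if a 1 below blocks it as well.
                if last_one_below >= left:
                    yield (left, y - height + 1, x - 1, y)
            if run > 0 and (not steps or steps[-1][1] < run):
                steps.append((left, run))
            if x < width and below[x]:
                last_one_below = x
        current = following
        y += 1


example = [[0, 0, 1],
           [1, 0, 0],
           [0, 0, 1]]
found = sorted(maximal_zero_rectangles(example, 3))
print(found)  # [(0, 0, 1, 0), (0, 2, 1, 2), (1, 0, 1, 2), (1, 1, 2, 1)]
```

Each column pushes at most one step and each step is popped at most once, so the work per row is linear in the width.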
Timing
• Delete: the only work that is not constant time
[Figure: the staircase at <x,y> with steps (x_1,y_1), …, (x_r,y_r); rectangles labelled Too narrow, Maximal, or Too short, and the deleted steps marked Delete.]
Amortized # of steps deleted (per <x,y>) = # of steps created (per <x,y>) ≤ 1
[Figure: the staircase at <x-1,y>, marked *.]
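The amortized bound can be spelled out as a sum over all positions. This is a sketch, assuming (as the slide states) that at most one step is created per <x,y> and that a step can only be deleted after it has been created. The symbols d(x,y) and c(x,y), for the number of steps deleted and created at <x,y>, are introduced here for illustration.

```latex
\[
\sum_{y=1}^{|Y|}\sum_{x=1}^{|X|} d(x,y)
\;\le\;
\sum_{y=1}^{|Y|}\sum_{x=1}^{|X|} c(x,y)
\;\le\;
|X|\,|Y|
\]
```

The total deletion work is therefore O(|X||Y|), which is O(1) amortized per <x,y>.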
Number of Maximal Rectangles
# of maximal 0-rectangles:
• ≤ O((# 1’s)^2) [Namaad, Hsu, Lee]
• ≤ Running time of alg = O(# 0’s)