Spatial (or N-Dimensional) Search in a Relational World Jim Gray


Spatial (or N-Dimensional) Search in a Relational World

Jim Gray


Equations Define Subspaces

• For (x,y) above the line: ax + by > c

• Reverse the space by negating: -ax + -by > -c

• Intersect 3 volumes:
    a1x + b1y > c1
    a2x + b2y > c2
    a3x + b3y > c3

[Figure: the line ax + by = c on x-y axes, with intercepts x = c/a and y = c/b.]
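The three bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example triangle are hypothetical and do not come from the slides; the test `a*x + b*y > c` and the sign-flip reversal are taken directly from them.

```python
def in_half_space(a, b, c, x, y):
    """True when (x, y) lies strictly on the 'above' side: a*x + b*y > c."""
    return a * x + b * y > c


def reverse(a, b, c):
    """Reverse the half-space by negating every coefficient."""
    return (-a, -b, -c)


def in_intersection(half_spaces, x, y):
    """A point is in the intersection when it is inside every half-space."""
    return all(in_half_space(a, b, c, x, y) for a, b, c in half_spaces)


# Hypothetical example: the triangle x > 0, y > 0, x + y < 1,
# with the last constraint written as -x - y > -1.
triangle = [(1, 0, 0), (0, 1, 0), (-1, -1, -1)]

print(in_intersection(triangle, 0.25, 0.25))  # inside the triangle
print(in_intersection(triangle, 1.0, 1.0))    # outside the triangle
```

Storing each constraint as a coefficient tuple is what lets the later slides treat constraints as table rows.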


Domain is Union of Convex Hulls

• Simple volumes are unions of convex hulls.

• Higher order curves also work

• Complex volumes have holes, and their holes have holes (that is harder).

[Figure: a shape labeled "Not a convex hull"; shapes joined with "+".]


Now in Relational Terms

create table HalfSpace (
    domainID    int not null               -- domain name
        foreign key references Domain(domainID),
    convexID    int not null,              -- grouping a set of ½ spaces
    halfSpaceID int identity(1,1),         -- a particular ½ space
    x           float not null,            -- the (a,b,..) parameters
    y           float not null,            -- defining the ½ space
    z           float not null,
    c           float not null,            -- the constant ("c" above)
    primary key (domainID, convexID, halfSpaceID)
)

(x,y,z) is inside a convex if it is inside all lines of the convex.
Equivalently, (x,y,z) is inside a convex if it is NOT OUTSIDE ANY line of the convex.

select convexID                            -- return the convex hulls
from HalfSpace                             -- from the constraints
where @x * x + @y * y + @z * z < c         -- point outside the line?
group by all convexID                      -- consider all the lines of a convexID
having count(*) = 0                        -- count outside == 0
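The "not outside any line" query can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not the author's implementation: SQLite has no `group by all`, so the sketch swaps in a conditional sum over every row of each convex, which gives the same answer. The function name `convexes_containing` and the two sample cubes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table HalfSpace (
        domainID    integer not null,
        convexID    integer not null,
        halfSpaceID integer not null,
        x real not null, y real not null, z real not null,
        c real not null,
        primary key (domainID, convexID, halfSpaceID)
    )""")


def cube(x0, x1):
    """Half-spaces (x, y, z, c) for x0 < x < x1, 0 < y < 1, 0 < z < 1."""
    return [(1, 0, 0, x0), (-1, 0, 0, -x1),
            (0, 1, 0, 0), (0, -1, 0, -1),
            (0, 0, 1, 0), (0, 0, -1, -1)]


# Hypothetical data: convex 1 spans 0 < x < 1, convex 2 spans 2 < x < 3.
for convex_id, planes in ((1, cube(0, 1)), (2, cube(2, 3))):
    for half_space_id, (a, b, cz, c) in enumerate(planes, start=1):
        conn.execute("insert into HalfSpace values (1, ?, ?, ?, ?, ?, ?)",
                     (convex_id, half_space_id, a, b, cz, c))


def convexes_containing(conn, px, py, pz):
    """Return convexIDs for which the point is outside zero half-spaces."""
    rows = conn.execute("""
        select convexID
        from HalfSpace
        group by convexID
        having sum(case when :px * x + :py * y + :pz * z < c
                        then 1 else 0 end) = 0
        order by convexID""", {"px": px, "py": py, "pz": pz})
    return [row[0] for row in rows]


print(convexes_containing(conn, 0.5, 0.5, 0.5))  # point inside convex 1
print(convexes_containing(conn, 1.5, 0.5, 0.5))  # point between the cubes
```

The conditional sum plays the role of `group by all`: every convex appears in the grouping even when none of its rows is an "outside" row.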


The Algebra is Simple (Boolean)

@domainID    = spDomainNew (@type varchar(16), @comment varchar(8000))
@convexID    = spDomainNewConvex (@domainID int)
@halfSpaceID = spDomainNewConvexConstraint (@domainID int, @convexID int, @x float, @y float, @z float, @l float)
@returnCode  = spDomainDrop(@domainID)

select * from fDomainsContainPoint(@x float, @y float, @z float)

Once constructed, domains can be manipulated with the Boolean operations.

@domainID = spDomainOr  (@domainID1 int, @domainID2 int, @type varchar(16), @comment varchar(8000))
@domainID = spDomainAnd (@domainID1 int, @domainID2 int, @type varchar(16), @comment varchar(8000))
@domainID = spDomainNot (@domainID1 int, @type varchar(16), @comment varchar(8000))
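The slides do not show how the Boolean procedures are implemented. As a rough illustration of why the algebra is simple, the sketch below models a domain as a list of convexes and a convex as a list of 2-D half-spaces `(a, b, c)` meaning `a*x + b*y > c`. All names here (`domain_or`, `domain_and`, `domain_not`, `square`) are hypothetical in-memory stand-ins, not the stored procedures above.

```python
from itertools import product


def in_convex(convex, p):
    """Inside a convex: inside every half-space (an empty list is all space)."""
    return all(a * p[0] + b * p[1] > c for a, b, c in convex)


def in_domain(domain, p):
    """Inside a domain: inside at least one of its convexes."""
    return any(in_convex(convex, p) for convex in domain)


def domain_or(d1, d2):
    """Union: simply keep the convexes of both domains."""
    return d1 + d2


def domain_and(d1, d2):
    """Intersection: pair every convex of d1 with every convex of d2."""
    return [c1 + c2 for c1, c2 in product(d1, d2)]


def domain_not(d):
    """Complement via De Morgan: outside a convex means inside at least one
    reversed half-space. Boundary points belong to neither side because the
    comparison is strict."""
    result = [[]]  # one unconstrained convex: the whole space
    for convex in d:
        reversed_halves = [[(-a, -b, -c)] for a, b, c in convex]
        result = domain_and(result, reversed_halves)
    return result


def square(lo, hi):
    """Hypothetical helper: the open square lo < x < hi, lo < y < hi."""
    return [(1, 0, lo), (-1, 0, -hi), (0, 1, lo), (0, -1, -hi)]


A = [square(0, 2)]
B = [square(1, 3)]
print(in_domain(domain_and(A, B), (1.5, 1.5)))  # in the overlap
print(in_domain(domain_not(A), (3.0, 3.0)))     # outside A
```

Note that `domain_and` and `domain_not` multiply the number of convexes, which is one reason fast evaluation of many half-space rows matters.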


What! No Bounding Box?

• Bounding box limits search: a subset of the convex hulls.

• If the query runs at 3M half-spaces/sec, there is no need for a bounding box unless you have more than 10,000 lines.

• But if you have a lot of half-spaces, then a bounding box is good.
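A quick back-of-the-envelope check of the 10,000-line threshold, using only the rate quoted on the slide. The function name is hypothetical.

```python
def scan_time_ms(n_half_spaces, rate_per_sec=3_000_000):
    """Milliseconds to test every half-space row at the quoted scan rate."""
    return 1000.0 * n_half_spaces / rate_per_sec


print(round(scan_time_ms(10_000), 2))     # a few milliseconds
print(round(scan_time_ms(1_000_000), 2))  # a third of a second
```

At 10,000 lines a full scan costs roughly 3 ms, so pruning with a bounding box buys little; at a million lines the scan is about a third of a second and pruning starts to pay off.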


A Different Problem

• Table-valued function: find points near a point
  – Select * from fGetNearbyEq(ra,dec,r)

• Use Hierarchical Triangular Mesh www.sdss.jhu.edu/htm/
  – Space filling curve, bounding triangles…
  – Standard approach

• 13 ms/call… So 70 objects/second.

• Too slow, so precompute neighbors: Materialized view.

• At 70 objects/sec it takes 6 months to compute a billion objects.
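The "6 months" figure follows from the numbers on the slide. A short check, assuming 30-day months (that convention is an assumption made here, not something the slide states):

```python
MS_PER_CALL = 13               # from the slide
OBJECTS_PER_SEC = 70           # the slide's rounded rate
N_OBJECTS = 1_000_000_000      # "a billion objects"

calls_per_sec = 1000 / MS_PER_CALL             # upper bound implied by 13 ms/call
days = N_OBJECTS / OBJECTS_PER_SEC / 86_400    # 86,400 seconds per day
months = days / 30                             # assumed 30-day months

print(round(calls_per_sec, 1), round(days, 1), round(months, 1))
```

This gives about 165 days, or roughly five and a half months, which the slide rounds to six.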


Zone Based Spatial Join

• Divide space into zones
• Key points by Zone, offset (on the sphere this needs a wrap-around margin.)
• Point search: look in a few zones at a limited offset, ra ± r: a bounding box that has 1-π/4 false positives
• All inside the relational engine
• Avoids "impedance mismatch"
• Can "batch" all-all comparisons
• 33x faster and parallel: 6 days, not 6 months!

[Figure: search circle of radius r cut by a zone boundary at zoneMax. Labels: r, ra-zoneMax, zoneMax, x, Ra ± x, and √(r²+(ra-zoneMax)²) cos(radians(zoneMax)).]
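The 1-π/4 false-positive rate is the share of a square bounding box that lies outside its inscribed circle. The sketch below checks this numerically on a flat grid; it ignores spherical effects, and the function name and grid size are arbitrary choices made for illustration.

```python
import math


def box_false_positive_fraction(n=200):
    """Fraction of grid points in the square [-1, 1]^2 that lie outside
    the unit circle, using the centres of an n-by-n grid of cells."""
    outside = 0
    for i in range(n):
        for j in range(n):
            x = -1 + (2 * i + 1) / n
            y = -1 + (2 * j + 1) / n
            if x * x + y * y >= 1:
                outside += 1
    return outside / (n * n)


exact = 1 - math.pi / 4  # area of box minus circle, over area of box
print(round(exact, 4), round(box_false_positive_fraction(), 4))
```

Both numbers come out near 0.21, which is the roughly 21% of box candidates that a final distance test has to discard.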


In SQL

select o1.objID                                   -- find objects
from zone o1                                      -- in the zoned table
where o1.zoneID between                           -- where zone #
          floor((@dec-@r)/@zoneHeight) and        -- overlaps the circle
          floor((@dec+@r)/@zoneHeight)
  and o1.ra between @ra - @r and @ra + @r         -- quick filter on ra
  and o1.dec between @dec-@r and @dec+@r          -- quick filter on dec
  and sqrt( power(o1.cx-@cx,2)
          + power(o1.cy-@cy,2)
          + power(o1.cz-@cz,2) ) < @r             -- careful filter on distance

The careful filter eliminates the ~21% (= 1-π/4) false positives of the bounding box.
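The zone query can be exercised end to end with sqlite3. The sketch makes several simplifying assumptions that are not in the slides: coordinates are treated as a flat plane (no cos(dec) correction, no wrap-around, no unit vectors cx, cy, cz), the distance test compares squared distances to avoid relying on SQLite math functions, and the zone bounds are computed in Python rather than with `floor` in SQL. `ZONE_HEIGHT`, `build_zone_table`, and `nearby` are hypothetical names.

```python
import math
import sqlite3

ZONE_HEIGHT = 0.5  # assumed zone height, same units as dec


def build_zone_table(points):
    """Load (ra, dec) points into a table keyed by zone, then ra."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        create table zone (
            zoneID integer not null,
            ra     real not null,
            dec    real not null,
            objID  integer not null,
            primary key (zoneID, ra, objID)
        )""")
    conn.executemany(
        "insert into zone values (?, ?, ?, ?)",
        [(math.floor(dec / ZONE_HEIGHT), ra, dec, obj_id)
         for obj_id, (ra, dec) in enumerate(points)])
    return conn


def nearby(conn, ra, dec, r):
    """Zone filter, then bounding box, then exact (planar) distance."""
    params = {
        "zlo": math.floor((dec - r) / ZONE_HEIGHT),
        "zhi": math.floor((dec + r) / ZONE_HEIGHT),
        "ra": ra, "dec": dec, "r": r,
    }
    rows = conn.execute("""
        select ra, dec
        from zone
        where zoneID between :zlo and :zhi
          and ra  between :ra  - :r and :ra  + :r
          and dec between :dec - :r and :dec + :r
          and (ra - :ra) * (ra - :ra) + (dec - :dec) * (dec - :dec) < :r * :r
        order by ra, dec""", params)
    return rows.fetchall()


# Hypothetical data: a 7 x 7 integer grid of points.
grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
conn = build_zone_table(grid)
print(nearby(conn, 0.0, 0.0, 1.2))
```

With r = 1.2 the bounding box admits nine grid points, and the distance test drops the four corner points, leaving the centre and its four axis neighbours.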


Summary

• SQL is a set oriented language

• You can express constraints as rows

• Then you
  – Can evaluate LOTS of predicates per second
  – Can do set algebra on the predicates.

• Benefits from SQL parallelism

• SQL == Prolog?