Thinking in Gi*(d) calculation with Map-Reduce
2010-3-29
• Preprocessing
  – Generate Data Table
    • Divide the domain into cells and count the number of points in every cell;
    • Accumulate cells into quads;
    • Put all points into quads (I/O-intensive operation? Need Map-Reduce?)
  – Generate Index Table: O(n^(2½))?
    • For every quad, increase its boundary step by step until it covers the whole domain.
      – In every step, calculate which quads intersect the enlarged boundary (need a spatial index?);
      – Store the deduplicated index items into the index table.
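The preprocessing steps above can be sketched roughly as follows. This is only an illustration under several assumptions that the slides do not state: points are binned by x and y only, cells and quads are keyed by integer (column, row) pairs, only non-empty quads are tracked, and "increasing the boundary by one step" is read as growing a square ring by one quad in every direction. The names `count_cells`, `accumulate_quads` and `build_index` are hypothetical, and the ring search is a brute-force scan rather than a spatial index.

```python
from collections import Counter

def count_cells(points, cell_size):
    """Count points per cell; a cell is keyed by its integer (col, row)."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts

def accumulate_quads(cell_counts, cells_per_quad):
    """Sum cell counts into quads of cells_per_quad x cells_per_quad cells."""
    quads = Counter()
    for (cx, cy), n in cell_counts.items():
        quads[(cx // cells_per_quad, cy // cells_per_quad)] += n
    return quads

def build_index(quad_ids):
    """For each quad, record which quads are first reached at each step.

    A quad belongs to step s when its Chebyshev distance from the source
    quad is exactly s, so every quad appears once (no duplicates).
    """
    quad_set = set(quad_ids)
    index = {}
    for qx, qy in quad_set:
        rings, found, step = {}, 1, 0  # found starts at 1: the quad itself
        while found < len(quad_set):   # stop once the whole domain is covered
            step += 1
            ring = sorted(
                q for q in quad_set
                if max(abs(q[0] - qx), abs(q[1] - qy)) == step
            )
            if ring:
                rings[step] = ring
                found += len(ring)
        index[(qx, qy)] = rings
    return index
```

For example, `build_index(accumulate_quads(count_cells(points, 1.0), 2))` chains the three steps. The brute-force scan in `build_index` revisits every quad at every step, which is one reason the slide asks whether a spatial index is needed.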
• Calculation of Gi*(d)
  – Algorithm of Gi*(d) in M-R (?)
    • Count how many neighbor quads should be used, according to the index table;
    • Copy the current quad to the nodes where its neighbor quads reside;
    • Do map tasks to calculate Gi*(d) on all neighboring nodes;
    • Do a reduce task to calculate Gi*(d).
  – C/C++ should be used in the Gi*(d) calculation
  – GPU may be helpful in the calculation.
  – Hotspot cells/quads should reside in memory / on most of the nodes
  – How to accelerate the calculation by tuning MR parameters / Gi*(d) algorithm parameters?
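As a rough illustration of the map and reduce steps, the sketch below simulates them in a single process. It assumes the standardized form of the Getis-Ord statistic with binary weights (w_ij = 1 if quad j lies within distance d of quad i, including j = i, else 0):

Gi*(d) = (Σj w_ij x_j − X̄ Σj w_ij) / (S · sqrt((n Σj w_ij² − (Σj w_ij)²) / (n − 1)))

where x_j is the point count of quad j, X̄ is the mean count and S = sqrt(Σj x_j²/n − X̄²). The function names and the `neighbors_within_d` mapping are hypothetical; the mapping is assumed to be symmetric and to have been derived from the index table. The slides recommend C/C++ for the real computation; Python is used here only for readability.

```python
import math
from collections import defaultdict

def map_quad(quad_id, count, neighbors):
    """Map: send this quad's count to itself and to every neighbor within d."""
    yield quad_id, (count, 1)      # Gi* includes the quad itself (j == i)
    for nb in neighbors:
        yield nb, (count, 1)       # binary weight w_ij = 1

def reduce_quad(pairs):
    """Reduce: sum the weighted counts and the weights received by one quad."""
    sum_wx = sum(c for c, _ in pairs)
    sum_w = sum(w for _, w in pairs)
    return sum_wx, sum_w

def gi_star(counts, neighbors_within_d):
    """Compute Gi*(d) for every quad from per-quad counts (needs n >= 2)."""
    n = len(counts)
    if n < 2:
        raise ValueError("Gi*(d) needs at least two quads")
    mean = sum(counts.values()) / n
    # max() guards against a tiny negative value from floating-point rounding
    s = math.sqrt(max(0.0, sum(c * c for c in counts.values()) / n - mean * mean))
    grouped = defaultdict(list)    # stands in for the shuffle phase
    for q, c in counts.items():
        for key, value in map_quad(q, c, neighbors_within_d.get(q, ())):
            grouped[key].append(value)
    result = {}
    for q, pairs in grouped.items():
        sum_wx, sum_w = reduce_quad(pairs)
        # with binary weights, the sum of squared weights equals sum_w
        denom = s * math.sqrt((n * sum_w - sum_w ** 2) / (n - 1))
        result[q] = (sum_wx - mean * sum_w) / denom if denom > 0 else float("nan")
    return result
```

The global mean and S are computed up front here; in a real job they would presumably come from a separate pass or a counter, since every reducer needs them.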
Structure of Tables
• DATA_TABLE
  – Row: Quad_id
  – Family: data
    • Count: points in Quad
    • Body
      – point info: point1/point2/point3/……
      – Each point record: x/y/z (3 floating-point numbers, 12 bytes)
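The 12-byte point record can be illustrated with Python's `struct` module. This is a minimal sketch assuming 32-bit little-endian floats and assuming that the Body is simply the concatenation of the point records; the slides specify neither the byte order nor the exact Body layout, and `pack_points` / `unpack_points` are hypothetical helper names.

```python
import struct

# x, y, z as three 32-bit floats = 12 bytes ("<" = little-endian, an assumption)
POINT = struct.Struct("<3f")

def pack_points(points):
    """Serialize (x, y, z) tuples into one Body value of 12 bytes per point."""
    return b"".join(POINT.pack(x, y, z) for x, y, z in points)

def unpack_points(body):
    """Read back the (x, y, z) tuples from a packed Body value."""
    return [POINT.unpack_from(body, off) for off in range(0, len(body), POINT.size)]
```

With this layout the Count column is redundant with `len(body) // 12`, but keeping it separately allows counts to be read without fetching the point data.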
• INDEX_TABLE
  – Row: Quad_id
  – Family: border
    • XS
    • XE
    • YS
    • YE
  – Family: D
    • D1
    • D2
    • …
    • Dn
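One possible in-memory picture of an INDEX_TABLE row is a mapping from "family:column" names to values. The sketch below rests on assumptions: the slides do not say what D1 … Dn hold, so they are read here as the lists of quads reached at expansion steps 1 … n, and XS/XE/YS/YE are read as the start and end of the quad's x and y extent. `make_index_row` is a hypothetical helper, not an API of any particular table store.

```python
def make_index_row(quad_id, xs, xe, ys, ye, rings):
    """Build one INDEX_TABLE row as (row key, {"family:column": value}).

    rings[0] is assumed to hold the quads reached at step 1 (column D1),
    rings[1] those reached at step 2 (column D2), and so on.
    """
    row = {
        "border:XS": xs,  # assumed: start of the quad's x extent
        "border:XE": xe,  # assumed: end of the quad's x extent
        "border:YS": ys,
        "border:YE": ye,
    }
    for step, quad_ids in enumerate(rings, start=1):
        row[f"D:D{step}"] = list(quad_ids)
    return quad_id, row
```

Because the number of D columns varies per quad, a column-family layout fits better than a fixed-width schema.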
Storage model
• Data distribution strategies
  – Evenly distributed across all nodes
  – Locality distributed
• Data cache strategies
  – ??
  – ??
• Application model
  – Batch processing of Gi*(d) (per cell / per quad)
  – Interactive processing of Gi*(d) (per point)
• Support for different storage strategies
Figure: two layouts of IDs 0 … 999 over node0 … node9 (figure labels: "Evenly distributed", "Locality distributed")

node0: 0 1 … 99
node1: 100 101 … 199
…
node9: 900 901 … 999

node0: 0 10 … 990
node1: 1 11 … 991
…
node9: 9 19 … 999
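The two layouts in the figure correspond to two simple ID-to-node mappings, sketched below. The function names are hypothetical, and the IDs are assumed to be quad IDs. The extracted text does not make clear which layout carries which label; a plausible reading is that contiguous ranges are the "locality distributed" layout (adjacent IDs share a node) and the strided layout is the "evenly distributed" one.

```python
def contiguous_node(quad_id, num_quads, num_nodes):
    """Contiguous ranges: node0 gets 0..99, node1 gets 100..199, and so on."""
    per_node = -(-num_quads // num_nodes)  # ceiling division
    return quad_id // per_node

def strided_node(quad_id, num_nodes):
    """Strided: node0 gets 0, 10, 20, ...; node1 gets 1, 11, 21, ...."""
    return quad_id % num_nodes
```

With 1000 IDs and 10 nodes, `contiguous_node(150, 1000, 10)` gives node 1 while `strided_node(150, 10)` gives node 0. The contiguous mapping keeps adjacent IDs together, which may reduce copying between nodes if adjacent IDs are also spatial neighbors, whereas the strided mapping spreads a dense region over many nodes.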