Bit Sequential (bSQ) Data Model and
Peano Count Trees (P-trees)
Department of Computer Science
North Dakota State University, USA
(the bSQ and P-tree technology is patented by NDSU)
Background on Spatial Data
Pixel – a point in a space
Band – feature attribute of the pixels
Value – usually one byte (0~255)
Images have different numbers of bands
– TM4/5: 7 bands (B, G, R, NIR, MIR, TIR, MIR2)
– TM7: 8 bands (B, G, R, NIR, MIR, TIR, MIR2, PC)
– TIFF: 3 bands (B, G, R)
– Ground data: individual bands (Yield, Moisture, Nitrate level, Temperature, Elevation, …)
RSI dataset example
(Figures: TIFF image; Yield Map)
RSI data can be viewed as a collection of pixels. Each pixel has a value for each feature attribute.
For example, the RSI dataset above has 320 rows and 320 columns of pixels (102,400 pixels) and 4 feature attributes (B, G, R, Y). The (B, G, R) feature bands are in the TIFF image, and the Y feature is color-coded in the Yield Map.
Spatial Data Formats
Existing formats
– BSQ (Band Sequential)
– BIL (Band Interleaved by Line)
– BIP (Band Interleaved by Pixel)
New format
– bSQ (bit Sequential)
Spatial Data Formats (Cont.)
Example: a 2 × 2 image with two bands
BAND-1
254 (1111 1110)   127 (0111 1111)
 14 (0000 1110)   193 (1100 0001)
BAND-2
 37 (0010 0101)   240 (1111 0000)
200 (1100 1000)    19 (0001 0011)
BSQ format (2 files)
Band 1: 254 127 14 193
Band 2: 37 240 200 19
BIL format (1 file)
254 127 37 240 14 193 200 19
BIP format (1 file)
254 37 127 240 14 200 193 19
bSQ format (16 files, one column per file)
B11 B12 B13 B14 B15 B16 B17 B18 B21 B22 B23 B24 B25 B26 B27 B28
 1   1   1   1   1   1   1   0   0   0   1   0   0   1   0   1
 0   1   1   1   1   1   1   1   1   1   1   1   0   0   0   0
 0   0   0   0   1   1   1   0   1   1   0   0   1   0   0   0
 1   1   0   0   0   0   0   1   0   0   0   1   0   0   1   1
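The four layouts above can be reproduced with a few lines of Python. This is a minimal sketch: the function names (`to_bsq`, `to_bil`, `to_bip`, `to_bsq_bits`) are made up for illustration, and "files" are modeled as in-memory lists rather than anything written to disk.

```python
# Two 2x2 bands from the example above, stored row by row.
band1 = [[254, 127], [14, 193]]
band2 = [[37, 240], [200, 19]]
bands = [band1, band2]

def to_bsq(bands):
    """BSQ: one flat list ("file") per band, pixels in raster order."""
    return [[v for row in band for v in row] for band in bands]

def to_bil(bands):
    """BIL: one file; for each image row, that row from every band in turn."""
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]

def to_bip(bands):
    """BIP: one file; for each pixel, its value in every band in turn."""
    return [v for pixel in zip(*to_bsq(bands)) for v in pixel]

def to_bsq_bits(bands, n_bits=8):
    """bSQ: one bit file per (band, bit position), most significant bit first."""
    files = {}
    for b, flat in enumerate(to_bsq(bands), start=1):
        for j in range(1, n_bits + 1):
            shift = n_bits - j  # j = 1 is the most significant bit
            files[f"B{b}{j}"] = [(v >> shift) & 1 for v in flat]
    return files

bsq = to_bsq(bands)
bil = to_bil(bands)
bip = to_bip(bands)
bits = to_bsq_bits(bands)
print(bil)          # BIL ordering
print(bip)          # BIP ordering
print(bits["B11"])  # first column of the bSQ table
```

Each entry of `bits` corresponds to one column of the bSQ table, read top to bottom.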
bSQ Format
Split each band into eight separate files, one for each bit position.
Reasons for using the bSQ format
– Different bits contribute to the value differently.
– bSQ format facilitates the representation of a precision hierarchy (from 1-bit up to 8-bit precision).
– bSQ format facilitates the creation of an efficient data structure (the P-tree), the P-tree algebra and the T-cube.
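To make the precision hierarchy concrete: keeping only the first k bit files of a band is equivalent to truncating every value to its k most significant bits. The sketch below illustrates this for a single value; the helper name `truncate_to_precision` is an assumption for illustration only.

```python
def truncate_to_precision(value, k, n_bits=8):
    """Keep only the k most significant bits of an n_bits-wide value."""
    mask = ((1 << k) - 1) << (n_bits - k)
    return value & mask

# 193 = 1100 0001 viewed at 1-bit up to 8-bit precision.
levels = [truncate_to_precision(193, k) for k in range(1, 9)]
print(levels)

# Bit j (1 = most significant) contributes 2**(8 - j) to the value.
weights = [2 ** (8 - j) for j in range(1, 9)]
print(weights)
```

The high-order bits carry most of the value, which is why coarse precision levels remain informative.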
The “tabular” formats (inverted list)
BSQ and bSQ are “tabular” formats
– BSQ consists of a separate table for each feature band
– bSQ consists of a separate table for each bit of each band
One can view it this way:
– The data set is initially one “relation” or table, R(K1, …, Kk, A1, A2, …, An), where K1, …, Kk are the structure attributes and each Ai is a feature attribute.
• The structure attributes of a 2-D image are the X and Y coordinates of the pixels (the rows of R).
• The feature attributes are the bands: B, G, R, NIR, …
• In BSQ we separate each feature into a separate file and suppress the structure attributes altogether (under the assumption that the pixels are always arranged in raster order).
• In bSQ we separate each bit of each feature into a separate file (same raster order assumption).
Peano Count Tree (P-tree)
A P-tree represents spatial bSQ data bit by bit in a recursive quadrant-by-quadrant arrangement.
A P-tree is a lossless representation of the original data.
A P-tree is a compressed structure.
A P-tree is “count pre-computed”.
An example of Ptree
Terms: Peano or Z-ordering, Pure (Pure-1/Pure-0) quadrant, Root Count, Level, Fan-out, QID (Quadrant ID)
8 × 8 bit file:
1 1 1 1 1 1 0 0
1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
0 1 1 1 1 1 1 1
P-tree (count of 1-bits per quadrant; children in Peano order; pure quadrants are not expanded):
55
  16
  8
    3: 1 1 1 0
    0
    4
    1: 0 0 1 0
  15
    4
    4
    3: 1 1 0 1
    4
  16
Quadrants are numbered 0 1 2 3 in Peano (Z) order: 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right.
QID example: the pixel (7, 1) = (111, 001) in binary has QID 10.10.11 = 2.2.3.
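The construction of the count tree for the 8 × 8 example can be sketched recursively. This is an illustrative sketch, not the patented implementation: nodes are modeled as `(count, children)` tuples, and `build_ptree` and `qid` are hypothetical helper names.

```python
BITS = [
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1],
]

def build_ptree(bits, top=0, left=0, size=None):
    """Return (count, children); children is [] for a pure quadrant or a pixel."""
    if size is None:
        size = len(bits)
    if size == 1:
        return (bits[top][left], [])
    half = size // 2
    # Peano (Z) order: upper left, upper right, lower left, lower right.
    children = [build_ptree(bits, top + dr, left + dc, half)
                for dr in (0, half) for dc in (0, half)]
    count = sum(child[0] for child in children)
    if count == 0 or count == size * size:
        return (count, [])  # pure-0 or pure-1: the count alone is enough
    return (count, children)

def qid(row, col, levels=3):
    """Quadrant ID: interleave the row and column bits, one digit per level."""
    digits = []
    for shift in range(levels - 1, -1, -1):
        digits.append(2 * ((row >> shift) & 1) + ((col >> shift) & 1))
    return ".".join(str(d) for d in digits)

root = build_ptree(BITS)
print(root[0])                     # root count
print([c[0] for c in root[1]])     # counts of the four top-level quadrants
print(qid(7, 1))                   # QID of pixel (7, 1)
```

Because pure quadrants are stored as a single count, the tree is both compressed and lossless: every pixel value can be recovered by walking down its QID.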
P-tree variation – PM-tree
A Peano Mask tree (PM-tree) uses masks instead of counts: 1 denotes pure-1, 0 denotes pure-0 and m denotes mixed. It provides an efficient way of ANDing.
1 1 1 1 1 1 0 0
1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
0 1 1 1 1 1 1 1
PM-tree (children in Peano order):
m
  1
  m
    m: 1 1 1 0
    0
    1
    m: 0 0 1 0
  m
    1
    1
    m: 1 1 0 1
    1
  1
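A PM-tree for the same bit file can be built with the same recursion as a count tree, storing only a mask per node. In this sketch (names and representation are assumptions for illustration) a pure node is the integer 0 or 1 and a mixed node is a list of its four children.

```python
BITS = [
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1],
]

def build_pmtree(bits, top=0, left=0, size=None):
    """Return 0 (pure-0), 1 (pure-1) or a list of four children (mixed)."""
    if size is None:
        size = len(bits)
    if size == 1:
        return bits[top][left]
    half = size // 2
    # Peano (Z) order: upper left, upper right, lower left, lower right.
    children = [build_pmtree(bits, top + dr, left + dc, half)
                for dr in (0, half) for dc in (0, half)]
    if all(c == 0 for c in children):
        return 0
    if all(c == 1 for c in children):
        return 1
    return children

def mask(node):
    """Mask symbol of a node: 1, 0 or 'm'."""
    return "m" if isinstance(node, list) else node

tree = build_pmtree(BITS)
print(mask(tree))                    # root mask
print([mask(c) for c in tree])       # masks of the four top-level quadrants
print([mask(c) for c in tree[1]])    # masks under the upper-right quadrant
```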
Ptree Algebra
And
Or
Complement
Other (XOR, etc.)
Ptree:
55
  16
  8
    3: 1 1 1 0
    0
    4
    1: 0 0 1 0
  15
    4
    4
    3: 1 1 0 1
    4
  16
Complement:
9
  0
  8
    1: 0 0 0 1
    4
    0
    3: 1 1 0 1
  1
    0
    0
    1: 0 0 1 0
    0
  0
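The complement of a count tree can be computed node by node: each count becomes the quadrant size minus the count. The sketch below applies this to the example tree; the `(count, children)` tuple representation and the name `complement` are assumptions for illustration.

```python
def leaf_bits(bits):
    """Four single-pixel nodes for a mixed 2x2 quadrant."""
    return [(b, []) for b in bits]

# The example P-tree, written as (count, children); pure quadrants have no children.
ptree = (55, [
    (16, []),
    (8, [(3, leaf_bits([1, 1, 1, 0])), (0, []), (4, []),
         (1, leaf_bits([0, 0, 1, 0]))]),
    (15, [(4, []), (4, []), (3, leaf_bits([1, 1, 0, 1])), (4, [])]),
    (16, []),
])

def complement(node, n_pixels):
    """Replace every count by (quadrant size - count); the structure is unchanged."""
    count, children = node
    return (n_pixels - count,
            [complement(child, n_pixels // 4) for child in children])

comp = complement(ptree, 64)
print(comp[0])                       # root count of the complement
print([c[0] for c in comp[1]])       # top-level quadrant counts
```

For a PM-tree the same operation would swap the 0 and 1 masks and leave m nodes mixed.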
Ptree ANDing Operation
PM-tree1:
m
  1
  m
    m: 1 1 1 0
    0
    1
    m: 0 0 1 0
  m
    1
    1
    m: 1 1 0 1
    1
  1
PM-tree2:
m
  1
  0
  m
    1
    1
    1
    m: 0 1 0 0
  0
Result:
m
  1
  0
  m
    1
    1
    m: 1 1 0 1
    m: 0 1 0 0
  0
Depth-first pure-1 path codes
PM-tree1: 0 100 101 102 12 132 20 21 220 221 223 23 3
PM-tree2: 0 20 21 22 231
PM-tree1 & PM-tree2:
PM-tree1       PM-tree2   RESULT
0              0          0
20             20         20
21             21         21
220 221 223    22         220 221 223
23             231        231
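The AND of the two PM-trees above can be sketched as a simultaneous descent: a pure-0 operand yields 0, a pure-1 operand yields the other operand, and two mixed nodes are ANDed child by child. The nested-list representation and the names `pm_and` and `pure1_paths` are assumptions for illustration; this sketch works on trees directly rather than on the path-code lists shown in the table.

```python
# Pure nodes are 0 or 1; a mixed node is a list of four children in Peano order.
tree1 = [1, [[1, 1, 1, 0], 0, 1, [0, 0, 1, 0]], [1, 1, [1, 1, 0, 1], 1], 1]
tree2 = [1, 0, [1, 1, 1, [0, 1, 0, 0]], 0]

def pm_and(a, b):
    """AND two PM-trees of the same size."""
    if a == 0 or b == 0:
        return 0
    if a == 1:
        return b
    if b == 1:
        return a
    children = [pm_and(x, y) for x, y in zip(a, b)]
    # Two mixed quadrants may have no 1-bits in common.
    if all(c == 0 for c in children):
        return 0
    return children

def pure1_paths(node, prefix=""):
    """Depth-first list of quadrant paths that end in a pure-1 node."""
    if node == 1:
        return [prefix]
    if node == 0:
        return []
    return [path for q, child in enumerate(node)
            for path in pure1_paths(child, prefix + str(q))]

result = pm_and(tree1, tree2)
print(pure1_paths(tree1))
print(pure1_paths(tree2))
print(pure1_paths(result))
```

Whole quadrants are resolved without visiting their pixels whenever either operand is pure, which is where the efficiency of ANDing comes from.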
Basic, Value and Tuple Ptrees
Basic Ptrees (i.e., P11, P12, …, P18, P21, …, P28, …, P71, …, P78)
  ↓ AND
Value Ptrees (e.g., P1,001 = P11’ AND P12’ AND P13, where ’ denotes the complement)
  ↓ AND
Tuple Ptrees (e.g., P001,010,111 = P1,001 AND P2,010 AND P3,111)
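The relationship between basic, value and tuple P-trees can be illustrated on flat bit vectors. This is a simplification: uncompressed lists stand in for P-trees, the three 3-bit bands are made-up toy data, and the helper names are hypothetical. The AND and complement steps mirror the formulas above.

```python
def basic_bits(values, n_bits):
    """One bit vector per bit position, most significant bit first."""
    return [[(v >> (n_bits - 1 - j)) & 1 for v in values] for j in range(n_bits)]

def value_bits(basic, value):
    """AND the basic vectors, complementing those where `value` has a 0 bit."""
    n_bits = len(basic)
    result = [1] * len(basic[0])
    for j, plane in enumerate(basic):
        want = (value >> (n_bits - 1 - j)) & 1
        result = [r & (p if want else 1 - p) for r, p in zip(result, plane)]
    return result

def tuple_bits(value_vectors):
    """AND one value vector per band."""
    return [int(all(column)) for column in zip(*value_vectors)]

# Toy data: three bands, four pixels, 3-bit values (not from the slides).
band1 = [0b001, 0b101, 0b001, 0b011]
band2 = [0b010, 0b010, 0b111, 0b010]
band3 = [0b111, 0b000, 0b111, 0b111]

p1_001 = value_bits(basic_bits(band1, 3), 0b001)   # P11' AND P12' AND P13
p2_010 = value_bits(basic_bits(band2, 3), 0b010)
p3_111 = value_bits(basic_bits(band3, 3), 0b111)
p_tuple = tuple_bits([p1_001, p2_010, p3_111])
print(p1_001, p2_010, p3_111, p_tuple)
```

The sum of a value or tuple vector plays the role of the root count of the corresponding P-tree.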
Algorithm
Build the set of confident rules, C (initially empty), as follows:
– Start with 1-bit values, 2 bands;
– then 1-bit values and 3 bands; …
– then 2-bit values and 2 bands;
– then 2-bit values and 3 bands; …
– . . .
– At each stage defined above, do the following:
• Find all confident rules by rolling up the T-cube along each potential consequent set using summation.
• Compare these sums with the support threshold to isolate rule support sets with the minimum support.
• Compare the normalized T-cube values (divide by the rolled-up sum) with the minimum confidence level to isolate the confident rules.
• Place any new confident rule in C, but only if the rank is higher than any of its generalizations already in C.
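One stage of the procedure above (two bands, one consequent band) can be sketched as follows. Several things here are assumptions for illustration: the T-cube is modeled as a `Counter` of value pairs, the support threshold is taken as an absolute pixel count, the pixel data is made up, and the rank comparison against generalizations already in C is left out because the slides do not define it.

```python
from collections import Counter

def confident_rules(pixels, antecedent, consequent, min_support, min_confidence):
    """Rules (a, c, confidence) meaning: band[antecedent] = a => band[consequent] = c."""
    # T-cube for the two chosen bands: count of pixels per value pair.
    tcube = Counter((p[antecedent], p[consequent]) for p in pixels)
    # Roll up along the consequent band by summation.
    rolled = Counter()
    for (a, _), n in tcube.items():
        rolled[a] += n
    rules = []
    for (a, c), n in tcube.items():
        if rolled[a] < min_support:
            continue                      # antecedent not supported enough
        confidence = n / rolled[a]        # normalized T-cube value
        if confidence >= min_confidence:
            rules.append((a, c, confidence))
    return sorted(rules)

# Toy data: six pixels, two bands, 1-bit values.
pixels = [(1, 1), (1, 1), (1, 1), (1, 0), (0, 0), (0, 1)]
rules = confident_rules(pixels, antecedent=0, consequent=1,
                        min_support=3, min_confidence=0.7)
print(rules)
```

Repeating this for every choice of consequent, then for more bands and more bits per value, follows the staging order listed above.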