robyn-booker
Closed and Iceberg Cubes
Reduction necessity
• Data cube produces large outputs
– Input: 1,015,367 tuples (39MB)
– Output: 210,343,580 tuples (8GB), about 200 times larger
• Two methods to reduce outputs
– Iceberg cube
– Closed cube
Closed Iceberg cube
Cells and Measures
• Cell
– In an n-dimensional data cube, a cell c = (a1, a2, …, an : m), where m is a measure, is called a k-dimensional group-by cell if and only if exactly k (k <= n) of the values {a1, a2, …, an} are not * (i.e., all).
– Further denote M(c) = m and V(c) = (a1, a2, …, an).
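As a minimal sketch of the definition above, a cell can be modelled as a tuple of values plus a measure, with the string "*" standing for "all". The names Cell, STAR and dimensionality are illustrative assumptions and do not come from the slides.

```python
from typing import NamedTuple, Tuple

STAR = "*"  # assumed marker for the "all" value


class Cell(NamedTuple):
    values: Tuple[str, ...]  # V(c) = (a1, ..., an)
    measure: int             # M(c) = m


def dimensionality(cell: Cell) -> int:
    """Return k, the number of values in V(c) that are not *."""
    return sum(1 for v in cell.values if v != STAR)


# (a1, *, c1, * : 2) has two non-* values, so it is a 2-dimensional group-by cell.
c = Cell(("a1", STAR, "c1", STAR), 2)
k = dimensionality(c)
```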
Notion of Cover
• Cover
– Given two cells c = (a1, a2, …, an : m) and c’ = (a1’, a2’, …, an’ : m’), we write V(c) <= V(c’) if, for each ai (i = 1, …, n) which is not *, ai’ = ai.
– A cell c is said to be covered by another cell c’ if, for every cell c’’ such that V(c) <= V(c’’) <= V(c’), M(c’’) = M(c’).
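The cover relation can be sketched directly from the definition, reading the condition on c’’ as "for every c’’". This is an illustration under stated assumptions, not a practical algorithm: the helper names leq, between and is_covered are hypothetical, and measure is assumed to be a dictionary from value tuples to M(.).

```python
from itertools import product

STAR = "*"  # assumed marker for the "all" value


def leq(v, w):
    """V(c) <= V(c'): every non-* value of v appears in w at the same position."""
    return all(a == STAR or a == b for a, b in zip(v, w))


def between(v, w):
    """Return every value tuple u with v <= u <= w (assumes leq(v, w))."""
    # Where v and w agree the value is fixed; where v has * and w has a
    # concrete value, u may take either one.
    choices = [(a,) if a == b else (a, b) for a, b in zip(v, w)]
    return product(*choices)


def is_covered(v, w, measure):
    """True if the cell with values v is covered by the distinct cell with values w."""
    if v == w or not leq(v, w):
        return False
    return all(measure[u] == measure[w] for u in between(v, w))


# Hypothetical measures for two cells, both with count 2.
measure = {
    ("a1", STAR, "c1", STAR): 2,
    ("a1", "b1", "c1", STAR): 2,
}
covered = is_covered(("a1", STAR, "c1", STAR), ("a1", "b1", "c1", STAR), measure)
```

Here the only tuples between the two cells are the cells themselves, and both have measure 2, so the first cell is covered by the second.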
Closed and iceberg cells
• Closed cell
– A cell is called a closed cell if it is not covered by any other cell.
• Closed iceberg cell
– A closed cell which satisfies the iceberg constraint.
Closed and iceberg cells (contd.)
• Let the measure be count, and the iceberg constraint be count>=2.
• Cell1 = (a1, b1, c1, * : 2) and cell2 = (a1, *, *, * : 3) are closed iceberg cells.
• Cell3 = (a1, *, c1, * : 2) and cell4 = (a1, b2, c2, d2 : 1) are not: the former is covered by cell1, whereas the latter does not satisfy the iceberg constraint.
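The example can be checked with a brute-force sketch. The slides do not show a base table, so the three rows below are a hypothetical table chosen only so that the counts match the four cells above. All function names are illustrative, and the code enumerates the full cube, which is only feasible for toy data.

```python
from collections import Counter
from itertools import product

STAR = "*"  # assumed marker for the "all" value

# Hypothetical base table, chosen so the counts match the example cells.
rows = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d3"),
    ("a1", "b2", "c2", "d2"),
]

# Full cube with count: each row contributes to all 2^n of its generalisations.
cube = Counter()
for row in rows:
    for mask in product((False, True), repeat=len(row)):
        cube[tuple(STAR if hide else v for v, hide in zip(row, mask))] += 1


def leq(v, w):
    """V(c) <= V(c'): every non-* value of v appears in w at the same position."""
    return all(a == STAR or a == b for a, b in zip(v, w))


def covered_by(v, w):
    """True if v is covered by the distinct cell w, checked over all cells in between."""
    if v == w or not leq(v, w):
        return False
    in_between = product(*[(a,) if a == b else (a, b) for a, b in zip(v, w)])
    return all(cube[u] == cube[w] for u in in_between)


def is_closed(v):
    return not any(covered_by(v, w) for w in cube)


def is_closed_iceberg(v, min_count=2):
    """Closed cell that also satisfies the iceberg constraint count >= min_count."""
    return cube[v] >= min_count and is_closed(v)


cell1 = ("a1", "b1", "c1", STAR)
cell2 = ("a1", STAR, STAR, STAR)
cell3 = ("a1", STAR, "c1", STAR)
cell4 = ("a1", "b2", "c2", "d2")
results = [is_closed_iceberg(c) for c in (cell1, cell2, cell3, cell4)]
```

On this table, cell1 and cell2 pass both tests, cell3 fails because it is covered by cell1, and cell4 fails the count >= 2 constraint.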
Methods of computation
• Top-down method
• Bottom-up method
• Methods for computing cubes, such as BUC, Multiway array aggregation and Star-Cubing, are explained in detail in the next resource in this module.