Data Cube and OLAP Server Madhavi Gundavarapu

Outline

• What is Data Analysis?

• Steps in Data Analysis

• SQL-92 Aggregate Functions

• Limitations of GROUP BY

• OLAP Server

• CUBE Operator

• ROLLUP Operator

What is Data Analysis?

• User issues a query, receives a response and formulates the next query based on the response

• This process repeats until the user gets the required result

• Fundamentally an iterative process

[Figure: the data analysis loop, in which each query returns a response until the exact response is obtained]

Why Data Analysis?

• Search for unusual patterns of data

• Summarize data values

• Extract statistical information

• Contrast one category with another

• Provide a consolidated view of enterprise data buried in OLTP databases

– Help decision makers understand business trends

• Derive intelligible results from ad hoc, voluminous and scattered data

Steps in Data Analysis

• Formulate query

• Extract aggregated data

• Visualize results

• Analyze

[Figure: the analysis cycle (Formulate, Extract, Visualize, Analyze) beside a sample chart of Red and Blue values for 1990, 1991, 1992, and ALL, on a 0 to 200 scale in bands of 50]

Overview of SQL-92

• SQL has several aggregate operators:

– sum(), count(), avg(), min(), max()

• The basic idea is:

– Combine all values in a column into a single scalar value

• Syntax

SELECT sum(units)

FROM inventory;
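The five aggregates can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the original slides: the inventory rows and the location and nation columns are made-up illustration data.

```python
import sqlite3

# Hypothetical inventory rows (location, nation, units); illustration only.
ROWS = [
    ("Austin", "USA", 10),
    ("Boston", "USA", 25),
    ("Boston", "USA", 5),
    ("Lyon", "France", 40),
]

def scalar_aggregates(rows):
    """Return (sum, count, avg, min, max) of the units column."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE inventory (location TEXT, nation TEXT, units INTEGER)")
    con.executemany("INSERT INTO inventory VALUES (?, ?, ?)", rows)
    # Each aggregate folds the whole column into a single scalar value.
    result = con.execute(
        "SELECT sum(units), count(units), avg(units), min(units), max(units) "
        "FROM inventory"
    ).fetchone()
    con.close()
    return result

print(scalar_aggregates(ROWS))
```

The query returns exactly one row however many rows the table holds, which is what "a single scalar value" means for each aggregate.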

Overview of SQL-92 (contd.): Distinct Clause

• DISTINCT

– Allows aggregation over distinct values

– Example

SELECT COUNT(DISTINCT locations) FROM inventory;
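To see the effect of DISTINCT, the sketch below counts the same column with and without it. It uses Python's sqlite3 module and a one-column inventory table; the sample values are assumptions for illustration.

```python
import sqlite3

def count_locations(values):
    """Return (count of all values, count of distinct values) for one column."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE inventory (locations TEXT)")
    con.executemany("INSERT INTO inventory VALUES (?)", [(v,) for v in values])
    row = con.execute(
        "SELECT COUNT(locations), COUNT(DISTINCT locations) FROM inventory"
    ).fetchone()
    con.close()
    return row

# "Boston" appears twice, so the two counts differ.
print(count_locations(["Austin", "Boston", "Boston", "Lyon"]))
```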

Overview of SQL-92 (contd.): GROUP BY Clause

• Group By allows aggregates over table sub-groups

• Result is a new table

• Syntax:

SELECT location, sum(units)

FROM inventory

WHERE nation = 'USA'

GROUP BY location;

[Figure: a table whose attribute column holds A, A, A, B, B, B, B, B, C, C, C, C, C, D, D is collapsed by SUM() into one row each for A, B, C, and D]
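A runnable sketch of grouping, again with sqlite3 and made-up inventory rows. It also separates the two kinds of filter: WHERE removes rows before they are grouped, while HAVING removes whole groups after aggregation and so may only test grouping columns or aggregates. The min_total parameter is a hypothetical addition for the demonstration.

```python
import sqlite3

# Hypothetical inventory rows (location, nation, units); illustration only.
ROWS = [
    ("Austin", "USA", 10),
    ("Boston", "USA", 25),
    ("Boston", "USA", 5),
    ("Lyon", "France", 40),
]

def units_by_location(rows, nation, min_total=0):
    """Sum units per location for one nation, keeping groups of at least min_total."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE inventory (location TEXT, nation TEXT, units INTEGER)")
    con.executemany("INSERT INTO inventory VALUES (?, ?, ?)", rows)
    result = con.execute(
        "SELECT location, sum(units) FROM inventory "
        "WHERE nation = ? "          # row filter, applied before grouping
        "GROUP BY location "
        "HAVING sum(units) >= ? "    # group filter, applied after aggregation
        "ORDER BY location",
        (nation, min_total),
    ).fetchall()
    con.close()
    return result

print(units_by_location(ROWS, "USA"))
print(units_by_location(ROWS, "USA", min_total=20))
```

The result is itself a table with one row per group, as the slide states.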

Limitations of GROUP BY

• Users want CrossTabs

– GROUP BY is limited to 0-D and 1-D aggregates

• Users want sub-totals and totals

– drill-down & roll-up reports

[Figure: cross-tab of AIR, HOTEL, FOOD, and MISC rows against days of the week (M, T, W, T, F, S, S), with sum totals]
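One way to feel this limitation is to build a cross-tab with totals using plain GROUP BY only. In the sketch below (sqlite3, with a hypothetical expenses table and made-up amounts), a two-dimensional cross-tab already needs four separate queries glued together with UNION ALL, and the string 'ALL' has to be written in by hand.

```python
import sqlite3

# Hypothetical expense rows (day, category, amount); illustration only.
EXPENSES = [
    ("Mon", "AIR", 300),
    ("Mon", "HOTEL", 120),
    ("Tue", "HOTEL", 120),
    ("Tue", "FOOD", 45),
]

# One GROUP BY per level of the cross-tab: cells, row totals, column totals, grand total.
CROSSTAB_SQL = """
SELECT day, category, sum(amount) FROM expenses GROUP BY day, category
UNION ALL
SELECT day, 'ALL', sum(amount) FROM expenses GROUP BY day
UNION ALL
SELECT 'ALL', category, sum(amount) FROM expenses GROUP BY category
UNION ALL
SELECT 'ALL', 'ALL', sum(amount) FROM expenses
"""

def crosstab_cells(rows):
    """Return {(day, category): total}, including the 'ALL' sub-total cells."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE expenses (day TEXT, category TEXT, amount INTEGER)")
    con.executemany("INSERT INTO expenses VALUES (?, ?, ?)", rows)
    cells = {(day, cat): total for day, cat, total in con.execute(CROSSTAB_SQL)}
    con.close()
    return cells

cells = crosstab_cells(EXPENSES)
print(cells[("Mon", "ALL")], cells[("ALL", "HOTEL")], cells[("ALL", "ALL")])
```

With N dimensions the same approach needs 2^N unioned queries, which is the repetition that a single cube-style operator removes.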

Multidimensional Data

• Measure Attributes

• Dimension Attributes

• Example

Item-name  Color   Size   Number
Skirt      Dark    Large  10
Skirt      Pastel  Large  20
Skirt      White   Large  15
…          …       …      …

Model  Year  Color  Sales
Chevy  1990  Red    5
Chevy  1990  White  87
Chevy  1990  Blue   62
…      …     …      …

OLAP System

• On-Line Analytical Processing System

• Interactive system

• Permits analysts to view summaries of multidimensional data

• On-Line indicates

– No long waits to see result of a query

– Response times within a few seconds for new summaries

• View data at different levels of granularity

SQL:1999 OLAP Extensions

• SQL-92 aggregation is limited to GROUP BY, with no built-in cross-tabs, sub-totals, or totals

• SQL:1999 standard defines CUBE and ROLLUP as generalizations of the GROUP BY clause

CUBE: Relational Aggregate Operator

• N-dimensional generalization of simple aggregate functions

[Figure: The Data Cube and the Sub-Space Aggregates. Four panels of increasing dimensionality: Aggregate (Sum); Group By with total (By Color: RED, WHITE, BLUE, plus Sum); Cross Tab (By Make: Chevy, Ford; By Color; Sum); Data Cube (CHEVY, FORD; 1990 to 1993; RED, WHITE, BLUE; with faces By Make, By Year, By Color, By Make & Year, By Make & Color, By Color & Year, and Sum)]

CUBE : The Idea

• 0-dimensional aggregate (sum(), max(), ...)

– a1, a2, ...., aN, f()

• Super-aggregate over 1-dimensional sub-cubes

– ALL, a2, ...., aN, f()

– a1, ALL, a3, ...., aN, f()

– ...

– a1, a2, ...., ALL, f()

• Super-aggregate over 2-dimensional sub-cubes

– ALL, ALL, a3, ...., aN, f()

– ...

– a1, a2, ...., ALL, ALL, f()

An Example

Chevy Sales Cross Tab

Chevy        1990  1991  1992  Total (ALL)
black          50    85   154          289
white          40   115   199          354
Total (ALL)    90   200   353          643

SELECT model, year, color, sum(sales) as sales

FROM sales

WHERE model in ('Chevy')

AND year BETWEEN 1990 AND 1992

GROUP BY CUBE (model, year, color);

CUBE Contd.

SELECT model, year, color, sum(sales) as sales

FROM sales

WHERE model in ('Chevy')

AND year BETWEEN 1990 AND 1992

GROUP BY CUBE (model, year, color);

• Computes union of 8 different groupings:

– {(model, year, color), (model, year), (model, color), (year, color), (model), (year), (color), ()}
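The eight groupings are simply all subsets of the three CUBE columns, so N columns give 2^N groupings. A short Python sketch that enumerates them (the function name cube_groupings is made up for illustration):

```python
from itertools import combinations

def cube_groupings(columns):
    """Return every subset of columns, largest first, preserving column order."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]

groupings = cube_groupings(("model", "year", "color"))
for grouping in groupings:
    print(grouping)
print(len(groupings))  # 8, i.e. 2**3
```

The empty tuple printed last corresponds to the grand total, where no column is grouped on.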

Example Contd.

SALES

Model  Year  Color  Sales
Chevy  1990  red        5
Chevy  1990  white     87
Chevy  1990  blue      62
Chevy  1991  red       54
Chevy  1991  white     95
Chevy  1991  blue      49
Chevy  1992  red       31
Chevy  1992  white     54
Chevy  1992  blue      71
Ford   1990  red       64
Ford   1990  white     62
Ford   1990  blue      63
Ford   1991  red       52
Ford   1991  white      9
Ford   1991  blue      55
Ford   1992  red       27
Ford   1992  white     62
Ford   1992  blue      39

DATA CUBE (rows added by the CUBE operator)

Model  Year  Color  Sales
ALL    ALL   ALL      941
Chevy  ALL   ALL      508
Ford   ALL   ALL      433
ALL    1990  ALL      343
ALL    1991  ALL      314
ALL    1992  ALL      284
ALL    ALL   red      233
ALL    ALL   white    369
ALL    ALL   blue     339
Chevy  1990  ALL      154
Chevy  1991  ALL      198
Chevy  1992  ALL      156
Ford   1990  ALL      189
Ford   1991  ALL      116
Ford   1992  ALL      128
Chevy  ALL   red       90
Chevy  ALL   white    236
Chevy  ALL   blue     182
Ford   ALL   red      143
Ford   ALL   white    133
Ford   ALL   blue     157
ALL    1990  red       69
ALL    1990  white    149
ALL    1990  blue     125
ALL    1991  red      106
ALL    1991  white    104
ALL    1991  blue     104
ALL    1992  red       58
ALL    1992  white    116
ALL    1992  blue     110
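The cube can be recomputed directly from the 18 SALES rows. The sketch below is one simple way to do it in Python, not the algorithm an OLAP server would necessarily use: every base row is added to each of the 2^3 cells it belongs to, obtained by keeping or replacing each attribute with the string "ALL".

```python
from collections import defaultdict
from itertools import product

# The SALES rows from the slide: (model, year, color, sales).
SALES = [
    ("Chevy", 1990, "red", 5), ("Chevy", 1990, "white", 87), ("Chevy", 1990, "blue", 62),
    ("Chevy", 1991, "red", 54), ("Chevy", 1991, "white", 95), ("Chevy", 1991, "blue", 49),
    ("Chevy", 1992, "red", 31), ("Chevy", 1992, "white", 54), ("Chevy", 1992, "blue", 71),
    ("Ford", 1990, "red", 64), ("Ford", 1990, "white", 62), ("Ford", 1990, "blue", 63),
    ("Ford", 1991, "red", 52), ("Ford", 1991, "white", 9), ("Ford", 1991, "blue", 55),
    ("Ford", 1992, "red", 27), ("Ford", 1992, "white", 62), ("Ford", 1992, "blue", 39),
]

def data_cube(rows):
    """Map each (model, year, color) cell, with 'ALL' wildcards, to its sales total."""
    cube = defaultdict(int)
    for model, year, color, sales in rows:
        # Keep or generalize each attribute: 2 * 2 * 2 = 8 cells per base row.
        for cell in product((model, "ALL"), (year, "ALL"), (color, "ALL")):
            cube[cell] += sales
    return dict(cube)

cube = data_cube(SALES)
print(cube[("Chevy", 1990, "ALL")])  # 154
print(cube[("ALL", 1990, "ALL")])    # 343
print(cube[("ALL", "ALL", "blue")])  # 339
print(len(cube))                     # 48 cells: 18 base rows plus 30 with ALL
```

The cell count follows from the attribute cardinalities: (2 + 1) models times (3 + 1) years times (3 + 1) colors, where the extra 1 is the ALL value.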

GROUPING Function

• SQL:1999 uses NULL to represent both ALL and regular null values

• GROUPING function

– Can be applied to an attribute

– Returns 1 if NULL value represents ALL

– Returns 0 in all other cases

GROUPING Example

SELECT model, year, color, sum(sales) as sales,

GROUPING(model) as model_flag,

GROUPING(year) as year_flag,

GROUPING(color) as color_flag

FROM sales

WHERE model in ('Chevy')

AND year BETWEEN 1990 AND 1992

GROUP BY CUBE (model, year, color);
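Why the flags matter can be shown with a small emulation in Python, where None plays the role of SQL NULL. This is a sketch of the idea only, not of how a database implements GROUPING; the two input rows are made up, and the second one has a genuinely unknown color.

```python
from itertools import product

def cube_with_grouping(rows):
    """Return {(model, year, color, model_flag, year_flag, color_flag): total}.

    A flag of 1 means the None in that position stands for ALL;
    a flag of 0 means the value, even if None, came from the data.
    """
    totals = {}
    for model, year, color, sales in rows:
        for flags in product((0, 1), repeat=3):
            values = tuple(
                None if flag else value
                for value, flag in zip((model, year, color), flags)
            )
            key = values + flags
            totals[key] = totals.get(key, 0) + sales
    return totals

# Hypothetical rows; the second has a real NULL in the color column.
ROWS = [("Chevy", 1990, "red", 5), ("Chevy", 1990, None, 7)]
result = cube_with_grouping(ROWS)

print(result[("Chevy", 1990, None, 0, 0, 0)])  # 7: color really is NULL
print(result[("Chevy", 1990, None, 0, 0, 1)])  # 12: NULL stands for ALL colors
```

Both keys show None in the color position, and only color_flag tells the two rows apart.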

Rollup and Drill down

• Allow analysts to view data at any desired level of granularity

• Rollup

– Operation of moving from finer-granularity data to a coarser granularity

• Drill Down

– Operation of moving from coarser-granularity data to a finer granularity

– Cannot be generated from coarse-granularity data

– Has to be computed from original data
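The asymmetry between the two operations can be sketched in a few lines of Python. Rolling up is just summing finer totals, so it needs nothing but those totals; the reverse is impossible because a single coarse number does not say how it was split. The roll_up helper is a made-up name, and the input values are the 1990 totals by color from the data cube example.

```python
from collections import defaultdict

def roll_up(fine_totals, keep):
    """Sum finer-grained totals into coarser ones.

    fine_totals maps key tuples to totals; keep lists the key positions to retain.
    """
    coarse = defaultdict(int)
    for key, total in fine_totals.items():
        coarse[tuple(key[i] for i in keep)] += total
    return dict(coarse)

# Totals by (year, color) for 1990, from the data cube example.
BY_YEAR_COLOR = {(1990, "red"): 69, (1990, "white"): 149, (1990, "blue"): 125}

by_year = roll_up(BY_YEAR_COLOR, keep=(0,))
print(by_year)  # {(1990,): 343}
```

Given only the 343, there is no way to recover 69, 149, and 125, which is why drill-down has to go back to the original data or to stored finer aggregates.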

ROLLUP Operator

• Rollup example

SELECT model, year, color, sum(sales) as sales

FROM sales

WHERE model in ('Chevy')

AND year BETWEEN 1990 AND 1992

GROUP BY ROLLUP (model, year, color);

• Only 4 groupings are generated

– {(model, year, color), (model, year), (model), ()}
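The four groupings are the prefixes of the ROLLUP column list, so N columns give N + 1 groupings and the column order matters. A small Python sketch (rollup_groupings is a made-up name):

```python
def rollup_groupings(columns):
    """Return the prefixes of columns, longest first."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

groupings = rollup_groupings(("model", "year", "color"))
for grouping in groupings:
    print(grouping)
print(len(groupings))  # 4, i.e. 3 + 1
```

Reordering the list, for example to (color, year, model), would produce sub-totals by color and by color and year instead.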

Summary

• SQL-92 has limited functionality to support OLAP operations

• SQL:1999 has introduced extensions to address these limitations

– Provides the CUBE and ROLLUP operators and the GROUPING function

Questions