Data Warehousing
Lecture-11: Multidimensional OLAP (MOLAP)
Virtual University of Pakistan

Ahsan Abdullah
Assoc. Prof. & Head, Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: [email protected]

OLAP Implementations

1. MOLAP: OLAP implemented with a multi-dimensional data structure.

2. ROLAP: OLAP implemented with a relational database.

3. HOLAP: OLAP implemented as a hybrid of MOLAP and ROLAP.

4. DOLAP: OLAP implemented for desktop decision support environments.

MOLAP Implementations

OLAP has historically been implemented using a multi-dimensional data structure or "cube".

Dimensions are key business factors for analysis:
- Geographies (city, district, division, province, ...)
- Products (item, product category, product department, ...)
- Dates (day, week, month, quarter, year, ...)

Very high performance is achieved by O(1) time lookup into the "cube" data structure to retrieve pre-aggregated results.
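
As a rough illustration of the O(1) lookup idea, the sketch below pre-aggregates a handful of invented fact rows into a hash table keyed by (city, product, month). The `ALL` marker, the `build_cube` helper and the sample data are assumptions made for this example, not part of any particular MOLAP product.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical marker meaning "aggregated over this dimension"

# Toy fact rows: (city, product, month, units sold). All values are invented.
facts = [
    ("Karachi", "Milk", "Jan", 10),
    ("Karachi", "Bread", "Jan", 5),
    ("Lahore", "Milk", "Jan", 7),
    ("Lahore", "Milk", "Feb", 3),
]

def build_cube(rows):
    """Pre-aggregate every combination of 'specific value' or ALL per dimension."""
    cube = defaultdict(int)
    for city, item, month, units in rows:
        # Each fact contributes to 2 * 2 * 2 = 8 aggregate cells.
        for key in product((city, ALL), (item, ALL), (month, ALL)):
            cube[key] += units
    return dict(cube)

cube = build_cube(facts)

# Each query is a single hash lookup (O(1) on average), however many facts were loaded.
print(cube[("Lahore", "Milk", ALL)])  # Milk sold in Lahore over all months
print(cube[(ALL, "Milk", "Jan")])     # Milk sold in Jan over all cities
print(cube[(ALL, ALL, ALL)])          # grand total
```

The cost has simply moved to load time: all the work is done in `build_cube`, which is why queries are fast and loading is slow.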

MOLAP Implementations

No standard query language for querying MOLAP
- No SQL!

Vendors provide proprietary languages allowing business users to create queries that involve pivots, drilling down, or rolling up.
- E.g. MDX of Microsoft
- Languages generally involve extensive visual (click and drag) support.
- Application Programming Interfaces (APIs) are also provided for probing the cubes.

Aggregations in MOLAP

Sales volume as a function of (i) product, (ii) time, and (iii) geography.

A cube structure is created to handle this.

Dimensions: Product, Geography, Time

Hierarchical summarization paths:
- Product: Industry > Category > Product
- Geography: Province > Division > District > City > Zone
- Time: Year > Quarter > Month / Week > Day

[Figure: sales cube with axes Product (Milk, Bread, Eggs, Butter, Jam, Juice), Geog (N, E, W, S) and Time (w1 to w6); sample cell values shown: 1213, 458, 23, 10.]
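
A summarization path can be pictured as a chain of lookup tables that re-key one dimension and sum the cells that merge. The sketch below follows the Product path (Product to Category to Industry); the category and industry names, the cell values and the `roll_up_product` helper are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical hierarchy tables for the Product dimension.
PRODUCT_TO_CATEGORY = {
    "Milk": "Dairy",
    "Butter": "Dairy",
    "Bread": "Bakery",
    "Juice": "Beverages",
}
CATEGORY_TO_INDUSTRY = {"Dairy": "Food", "Bakery": "Food", "Beverages": "Food"}

# Invented base cells: (product, zone, week) -> sales volume.
base = {
    ("Milk", "N", "w1"): 12,
    ("Milk", "S", "w1"): 8,
    ("Butter", "N", "w1"): 5,
    ("Bread", "N", "w2"): 20,
    ("Juice", "E", "w2"): 10,
}

def roll_up_product(cells, mapping):
    """Re-key the first dimension through `mapping` and sum the cells that merge."""
    out = defaultdict(int)
    for (member, geog, time), value in cells.items():
        out[(mapping[member], geog, time)] += value
    return dict(out)

by_category = roll_up_product(base, PRODUCT_TO_CATEGORY)
by_industry = roll_up_product(by_category, CATEGORY_TO_INDUSTRY)

print(by_category[("Dairy", "N", "w1")])  # Milk + Butter in zone N, week w1
print(by_industry[("Food", "N", "w2")])
```

The same pattern would apply to the Geography and Time paths, each with its own mapping tables.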

Cube operations

Drill down: get more details
- e.g., given summarized sales as above, find the breakdown of sales by city within each region, or within Sindh

Rollup: summarize data
- e.g., given sales data, summarize sales for last year by product category and region

Slice and dice: select and project
- e.g., sales of soft-drinks in Karachi during last quarter

Pivot: change the view of data
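
The four operations above can be sketched on a tiny in-memory cube. This is only an illustration under stated assumptions: the cells and figures are invented, and `aggregate`, `dice` and `pivot` are hypothetical helper names rather than the API of any OLAP tool.

```python
from collections import defaultdict

# Invented cells at the most detailed level.
cells = [
    {"province": "Sindh", "city": "Karachi", "category": "Soft drinks", "quarter": "Q4", "sales": 120},
    {"province": "Sindh", "city": "Hyderabad", "category": "Soft drinks", "quarter": "Q4", "sales": 40},
    {"province": "Sindh", "city": "Karachi", "category": "Juices", "quarter": "Q4", "sales": 70},
    {"province": "Punjab", "city": "Lahore", "category": "Soft drinks", "quarter": "Q3", "sales": 90},
]

def aggregate(rows, dims):
    """Sum sales grouped by `dims`: fewer dims is a roll-up, more dims is a drill-down."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["sales"]
    return dict(totals)

def dice(rows, **conditions):
    """Select the sub-cube whose dimension values match every condition."""
    return [r for r in rows if all(r[d] == v for d, v in conditions.items())]

def pivot(totals):
    """Swap the two axes of a two-dimensional result: (a, b) becomes (b, a)."""
    return {(b, a): v for (a, b), v in totals.items()}

# Roll-up: sales summarized by province.
by_province = aggregate(cells, ("province",))

# Drill-down: breakdown by city within Sindh.
by_city_in_sindh = aggregate(dice(cells, province="Sindh"), ("city",))

# Slice and dice: soft drinks in Karachi during Q4.
karachi_soft_q4 = aggregate(
    dice(cells, city="Karachi", category="Soft drinks", quarter="Q4"), ()
)

# Pivot: the same numbers viewed quarter-first instead of category-first.
by_category_quarter = aggregate(cells, ("category", "quarter"))
by_quarter_category = pivot(by_category_quarter)
```

Note that this sketch recomputes each answer from the detailed cells; a MOLAP engine would instead read the corresponding pre-aggregated cells.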

Querying the cube

[Figure: three bar charts of sales for 2001 and 2002, labelled Drill-Down and Roll-Up. (1) Yearly totals for Juices and Soda Drinks; (2) the same categories by quarter (Q1 to Q4); (3) individual products (OJ, RK, 8UP, PK, MJ, BU, AJ) by quarter.]

Querying the cube: Pivoting

[Figure: two bar charts. (1) Sales of Juices and Soda Drinks by year (2001, 2002); (2) pivoted view with products on the axis (Orange juice, Mango juice, Apple juice, Rola-Kola, 8-UP, Bubbly-UP, Pola-Kola) and 2001, 2002 as the series.]

MOLAP evaluation

Advantages of MOLAP:

Instant response (pre-calculated aggregates).

Impossible to ask a question without an answer (every combination along the cube's dimensions is already computed).

Value-added functions (ranking, % change).
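
Value-added functions such as ranking and percentage change are simple calculations layered on top of the stored aggregates. A minimal sketch, using invented yearly totals and hypothetical helper names:

```python
# Invented yearly totals per product category.
sales = {
    "Juices": {2001: 20000, 2002: 25000},
    "Soda Drinks": {2001: 30000, 2002: 27000},
}

def pct_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100.0

def rank_by(year):
    """Categories ordered from highest to lowest sales in `year`."""
    return sorted(sales, key=lambda category: sales[category][year], reverse=True)

changes = {c: pct_change(v[2001], v[2002]) for c, v in sales.items()}
ranking_2002 = rank_by(2002)
```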

MOLAP evaluation

Drawbacks of MOLAP:

Long load time (pre-calculating the cube may take days!).

Very sparse cube (wasted space) when dimension cardinality is high (sometimes even in the low hundreds), e.g. the number of heaters sold in Jacobabad or Sibi, both very hot cities where most such cells would be empty.
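
To get a feel for the sparsity problem, the back-of-the-envelope sketch below compares the number of cells a dense cube must allocate with the number that actually hold data. The cardinalities and the populated-cell count are assumptions chosen only for illustration.

```python
from math import prod

# Hypothetical dimension cardinalities (all numbers are invented).
cardinalities = {"product": 500, "city": 300, "day": 365}

# A dense array needs one cell for every combination of dimension values.
dense_cells = prod(cardinalities.values())

# Assumed number of (product, city, day) combinations with any sales at all.
populated_cells = 2_000_000

density = populated_cells / dense_cells
print(f"dense cells: {dense_cells:,}")
print(f"populated:   {populated_cells:,} ({density:.1%} of the cube)")
```

Under these assumed numbers, more than 96% of the allocated cells would be empty, and this is for the base cube alone, before any higher-level aggregates are added.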

MOLAP Implementation issues

Maintenance issue: Every data item received must be aggregated into every cube (assuming "to-date" summaries are maintained). Lot of work.

Storage issue: As dimensions get less detailed (e.g., year vs. day) cubes get much smaller, but storage consequences for building hundreds of cubes can be significant. Lot of space.
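
The maintenance cost can be sketched as follows: with one summary cube per subset of dimensions, a single incoming fact has to be folded into every one of them. The dimension names, the `apply_fact` helper and the sample fact are hypothetical, and the sketch ignores hierarchy levels (day vs. year), which would multiply the number of cubes further.

```python
from itertools import combinations

DIMS = ("product", "city", "day")

def all_groupings(dims):
    """Every subset of the dimensions; each subset is one summary cube."""
    return [g for r in range(len(dims) + 1) for g in combinations(dims, r)]

# One dictionary of cells per summary cube, keyed by the grouping it represents.
cubes = {grouping: {} for grouping in all_groupings(DIMS)}

def apply_fact(cubes, fact, measure="units"):
    """Fold one incoming fact into every summary cube; return the number of cells touched."""
    touched = 0
    for grouping, cells in cubes.items():
        key = tuple(fact[d] for d in grouping)
        cells[key] = cells.get(key, 0) + fact[measure]
        touched += 1
    return touched

touched = apply_fact(
    cubes, {"product": "Milk", "city": "Lahore", "day": "2002-01-05", "units": 3}
)
print(touched)  # one update per summary cube: 2 ** len(DIMS)
```

With n dimensions there are 2^n such groupings, so both the per-fact work and the number of cubes to store grow quickly as dimensions are added.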

Partitioned Cubes

To overcome the space limitation of MOLAP, the cube is partitioned.

The divide-and-conquer cube partitioning approach helps alleviate the scalability limitations of a MOLAP implementation.

One logical cube of data can be spread across multiple physical cubes on separate (or the same) servers.

Ideal cube partitioning is completely invisible to end users.

Performance degradation does occur in case of a join across partitioned cubes.
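
A minimal sketch of the idea, assuming the logical cube is partitioned by product line and each physical cube is just a dictionary: the `PartitionedCube` class and its methods are hypothetical. The user-facing query never names a partition, but a query that spans product lines has to visit every partition, which is where the extra cost comes from.

```python
class PartitionedCube:
    """One logical cube backed by several physical cubes (plain dicts here)."""

    def __init__(self, partition_names):
        self.partitions = {name: {} for name in partition_names}

    def load(self, product_line, city, month, sales):
        """Route the cell to the physical cube that owns this product line."""
        cells = self.partitions[product_line]
        cells[(city, month)] = cells.get((city, month), 0) + sales

    def total(self, city, month, product_line=None):
        """Return (sales, partitions_visited) for one city and month."""
        if product_line is not None:
            names = [product_line]          # answer lives in a single partition
        else:
            names = list(self.partitions)   # must combine results from all partitions
        sales = sum(self.partitions[n].get((city, month), 0) for n in names)
        return sales, len(names)

cube = PartitionedCube(["Men's clothing", "Children clothing", "Bed linen"])
cube.load("Men's clothing", "Lahore", "Jan", 100)
cube.load("Bed linen", "Lahore", "Jan", 40)

print(cube.total("Lahore", "Jan", product_line="Bed linen"))  # one partition visited
print(cube.total("Lahore", "Jan"))                            # all partitions visited
```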

Partitioned Cubes: What Do They Look Like?

[Figure: sales data cube partitioned at a major cotton products sale outlet. Axes: Time, Geography, Product. Partitions shown: Men's clothing, Children clothing, Bed linen.]

Virtual Cubes

Used to query two dissimilar cubes by creating a third "virtual" cube by a join between two cubes.

Logically similar to a relational view, i.e. linking two (or more) cubes along common dimension(s).

Biggest advantage is saving in space by eliminating storage of redundant information.

Example: Joining the store cube and the list price cube along the product dimension, to calculate the sale price without redundant storage of the sale price data.
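
The example above can be sketched as a view that stores nothing and computes each cell on demand. The slide does not spell out the measures held in each cube, so this sketch assumes the store cube holds units sold per (store, product), the list price cube holds a price per product, and the derived measure is units multiplied by list price. The `VirtualCube` class and all data are hypothetical.

```python
# Invented physical cubes.
store_cube = {  # (store, product) -> units sold (assumed measure)
    ("Store-1", "Shirt"): 10,
    ("Store-1", "Bed sheet"): 4,
    ("Store-2", "Shirt"): 6,
}
list_price_cube = {"Shirt": 500.0, "Bed sheet": 1200.0}  # product -> list price

class VirtualCube:
    """Joins two cubes on the product dimension at query time; holds no cells of its own."""

    def __init__(self, store_cube, price_cube):
        self.store_cube = store_cube
        self.price_cube = price_cube

    def sale_value(self, store, product):
        """Derived measure for one cell: units sold times list price."""
        return self.store_cube[(store, product)] * self.price_cube[product]

    def cells(self):
        """Materialize the whole view only when explicitly asked for."""
        return {key: self.sale_value(*key) for key in self.store_cube}

virtual = VirtualCube(store_cube, list_price_cube)
print(virtual.sale_value("Store-1", "Shirt"))
```

The trade-off mirrors a relational view: no extra storage, at the price of doing the join each time the derived measure is requested.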

Summary