50
OLAP – On Line Analytical Processing

Updated Olap Basic Concepts

Embed Size (px)

Citation preview

Page 1: Updated Olap Basic Concepts

OLAP – On Line Analytical Processing

Page 2: Updated Olap Basic Concepts

2

Objectives:

At the end of this session, you will be able to:

> Define On Line Analytical Processing

> Understand the need for OLAP and applications of OLAP in BI

> Describe the various OLAP solutions and Architecture

> Comparison of different OLAP architectures

> Evaluation parameters to be considered for selecting an OLAP tool

Session Objectives

Page 3: Updated Olap Basic Concepts

3

> OLAP (On Line Analytical Processing) applications - designed

for online ad-hoc data access and analysis.

> Data organized into multiple dimensions.

> Access to analytical content such as time series and trend

analysis views and summary level information.

> A set of functionality that attempts to facilitate

multidimensional analysis.

> Offers drill-down, drill-across and slice and dice capabilities.

What is OLAP?

Page 4: Updated Olap Basic Concepts

4

• On Line No piles of paper,

please!

• Analytical Establish patterns

• Processing Data-based

• Fast Analysis of Shared

Multidimensional Information

OLAP - Fast Analysis

Page 5: Updated Olap Basic Concepts

5

• Dimensions can we think in ?

E.g. analysis by branch, product, agent, year !!!

2 or 3

• Types of values we can handle ?

E.g. Sales, Profit, Cost

1 or 2

• How many levels can we handle ?

E.g. number of products we can analyze

Need for OLAP

Page 6: Updated Olap Basic Concepts

6

Many parameters affect a Measure (value)

e.g Sales influenced by product, region, time,

distribution channel, etc.,

Linear analysis = reports

Many totals are at one level

Difficult to identify the key parameters

Need for OLAP

Page 7: Updated Olap Basic Concepts

7

OLAP in an Enterprise

Page 8: Updated Olap Basic Concepts

8

Departments:

Finance

Marketing

Sales

Manufacturing

Analytical Capabilities:

> Used by analysts and managers.

> Offers aggregated view of the data, such as total revenues by

customer profile, by product line, by geographical regions.

Uses of OLAP

Page 9: Updated Olap Basic Concepts

9

> Provides the decision support front-end for data warehousing.

> Advanced statistical, financial, and analytical calculations.

> Appropriate tools to access data from a relational database.

> Appropriate tools to access or manage multidimensional data.

Functionality of OLAP Tools

Page 10: Updated Olap Basic Concepts

10

OLAP analytical features

> Multi-dimensional views of data

> Calculation intensive capabilities

> Time intelligence

The OLAP Calculation engine in OLAP tools have a wide range of

built-in calculations such as:

> Ratios

> Time calculations

> Statistics

> Ranking

> Custom formulas/algorithms

> Forecasting and modeling

Features of OLAP Applications

Page 11: Updated Olap Basic Concepts

Evolution of OLAP

Page 12: Updated Olap Basic Concepts

12

Star Schema

> A Star Schema is a dimensional model created by mapping data entities from operational systems

> It has a central table (fact table) that links all the other tables (dimension tables) together

> Dimension: The same category of information. For example, year, month, day, and week are all part of the Time Dimension.

> Measure: The property that can be summed or averaged using pre computed aggregates.

Page 13: Updated Olap Basic Concepts

13

Facts and Measures

> Facts or Measures are the Key Performance

Indicators of an enterprise

> Factual data about the subject area

> Numeric, summarized

Net ProfitSale

s Rev

enue

Gross Margin

ProfitabilityCost

Page 14: Updated Olap Basic Concepts

14

Dimension

> Dimensions put measures in perspective

> What, when and where qualifiers to the measures

> Dimensions could be products, customers, time, geography

etc.

Sales

Rev

enue

(Mea

sure

) What was sold ? Whom was it sold to ? When was it sold ? Where was it sold ?

Page 15: Updated Olap Basic Concepts

15

Star Schema

Page 16: Updated Olap Basic Concepts

16

Star Schema Example

Page 17: Updated Olap Basic Concepts

17

Star Schema with Sample Data

Page 18: Updated Olap Basic Concepts

Cube

– Multi dimensional databases store information in the form of cubes.

– A cube is a collection of facts and related dimensions stored together in arrays.

Sales

HR

CUBE

Geography

Time

Product

Page 19: Updated Olap Basic Concepts

19

> Hierarchy: A hierarchy defines the navigating path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.

> Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).

> Members: The individual category values (for example, 2002 or 21Jan2002).

> Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.

> Cells: These are the intersection of one member for every dimension and store the data for measures.

Basic Terminology of a Cube

Page 20: Updated Olap Basic Concepts

20

Basic Terminology of a Cube

> Dimensions consist of– Dimension Name

– Level

– Hierarchy

– Member

Time

1999 2000 2001

Q1 Q2 Q3 Q4 Q1 Q2Q3 Q4

YEAR

QUARTER

LevelOf

Detail

Page 21: Updated Olap Basic Concepts

21

Aggregates

sale prodId storeId date amtp1 s1 1 12p2 s1 1 11p1 s3 1 50p2 s2 1 8p1 s1 2 44p1 s2 2 4

Add up amounts for day 1 In SQL: SELECT sum(amt) FROM SALE WHERE date = 1

81

Page 22: Updated Olap Basic Concepts

22

Add up amounts by day In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

ans date sum1 812 48

sale prodId storeId date amtp1 s1 1 12p2 s1 1 11p1 s3 1 50p2 s2 1 8p1 s1 2 44p1 s2 2 4

Aggregates

Page 23: Updated Olap Basic Concepts

23

Another Example

Add up amounts by day, product In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date, prodId

sale prodId date amtp1 1 62p2 1 19p1 2 48

drill-down

rollup

sale prodId storeId date amtp1 s1 1 12p2 s1 1 11p1 s3 1 50p2 s2 1 8p1 s1 2 44p1 s2 2 4

Page 24: Updated Olap Basic Concepts

24

Aggregates

>Operators: sum, count, max, min, median and avg

>“Having” clause

>Using dimension hierarchy– average by region (within store)– maximum by month (within date)

Page 25: Updated Olap Basic Concepts

25

The MOLAP Cube

sale prodId storeId amtp1 s1 12p2 s1 11p1 s3 50p2 s2 8

s1 s2 s3p1 12 50p2 11 8

Fact table view: Multi-dimensional cube:

dimensions = 2

Page 26: Updated Olap Basic Concepts

26

3-D Cube

dimensions = 3

Multi-dimensional cube:Fact table view:

sale prodId storeId date amtp1 s1 1 12p2 s1 1 11p1 s3 1 50p2 s2 1 8p1 s1 2 44p1 s2 2 4

day 2 s1 s2 s3p1 44 4p2 s1 s2 s3

p1 12 50p2 11 8

day 1

Page 27: Updated Olap Basic Concepts

27

Example

Store

Product

Time

M T W Th F S S

Juice

Milk

Coke

Cream

Soap

Bread

NYSF

LA

10

34

56

32

12

56

56 units of bread sold in LA on M

Dimensions:Time, Product, Store

Attributes:Product (upc, price, …)Store ……

Hierarchies:Product Brand …Day Week QuarterStore Region Country

roll-up to week

roll-up to brand

roll-up to region

Page 28: Updated Olap Basic Concepts

28

Cube Aggregation: Roll-up

day 2 s1 s2 s3p1 44 4p2 s1 s2 s3

p1 12 50p2 11 8

day 1

s1 s2 s3p1 56 4 50p2 11 8

s1 s2 s3sum 67 12 50

sump1 110p2 19

129

. . .

drill-down

rollup

Example: computing sums

Page 29: Updated Olap Basic Concepts

29

Aggregation Using Hierarchies

region A region Bp1 56 54p2 11 8

store

region

country

(store s1 in Region A;stores s2, s3 in Region B)

day 2 s1 s2 s3p1 44 4p2 s1 s2 s3

p1 12 50p2 11 8

day 1

Page 30: Updated Olap Basic Concepts

30

Slicing

day 2 s1 s2 s3p1 44 4p2 s1 s2 s3

p1 12 50p2 11 8

day 1

s1 s2 s3p1 12 50p2 11 8

TIME = day 1

In SQL: SELECT * FROM SALE WHERE date = 1

Page 31: Updated Olap Basic Concepts

OLAP Solutions and Architecture

Page 32: Updated Olap Basic Concepts

32

Online Analytical Processing (OLAP) can be done on:

> Relational databases

> Multidimensional databases

OLAP products are grouped into three categories:

> Relational OLAP (ROLAP)

> Multidimensional OLAP (MOLAP)

> Hybrid OLAP (HOLAP)

OLAP - Classification

Page 33: Updated Olap Basic Concepts

33

Multi-dimensional OLAP

MOLAP is a technology which uses a multi-dimensional

database that stores data as n-dimensional cube

Geography

Age G

roup

Bra

nd

MOLAP

Page 34: Updated Olap Basic Concepts

34

Architecture of MOLAP

Data Mart Server

•RDBMS•Connectivity Middleware

MOLAP Server

•MDDBMS/Data Cube•MOLAP Application

Desktop Systems

MOLAP Client Tools

LANLAN

non-live connection•Used for updating the MOLAP data cube only

IntranetInternet

Thin Clients

•WWW Browser

RouterRouterFirewallFirewall

Cube Size Crit

ical

Cube Size Crit

ical

Issues:• Size of Data Cube• Cubes deployment• Size of Update Data Set

Page 35: Updated Olap Basic Concepts

35

Oracle's Oracle Express Server

Cognos - Powerplay Transformer

Essbase (Hyperion Software)

Holos (Seagate Software)

MOLAP Products

Page 36: Updated Olap Basic Concepts

36

Data Mart Server

•RDBMS•Connectivity Middleware

ROLAP Server

•ROLAP Application

Desktop SystemsDesktop Systems

ROLAP Client Client ToolsTools

LANLAN

IntranetInternet

Thin Clients

•WWW Browser

Router /Router /FirewallFirewallIssues:

• Aggregate Awareness• Response Time•Network Capacity

Architecture of ROLAP

Page 37: Updated Olap Basic Concepts

37

Brio Query Enterprise

Business Objects

Metacube

DSS Server

Information Advantage

ROLAP Products

Page 38: Updated Olap Basic Concepts

38

ROLAP Server

•ROLAP Application

Desktop SystemsDesktop Systems

HOLAP Client Client ToolsTools

LANLAN

Router/FirewallIssues:

•Cube elements•Integration with RDBMS

MOLAP Server

•MDDBMS/Data Cube•MOLAP Application

Architecture of HOLAP

Page 39: Updated Olap Basic Concepts

39

Holos (Seagate Software)

Microsoft SQL Server OLAP Services

Pilot Software's Pilot Decision Support Suite

SAS

HOLAP Products

Page 40: Updated Olap Basic Concepts

MOLAP Vs ROLAP

Page 41: Updated Olap Basic Concepts

41

ArchitecturalFeatures

MOLAP ROLAP

Number of Dimensions Ten or Less Unlimited

Support for Large numberof users

Limited support Good

Scalability Poor Good

ComplexMultidimensional analysis

Easier to achieve Difficult to achieve

Volume of Data storage Up to 50 GB Hundreds ofGigabytes andTerabytes

Storage of Information Through cubes SQL result sets

User Interface &functionality

Good Normal

Common accesslanguage

NA SQL

Nature of Data Stores summarizeddata

Stores Detailed aswell as summarizeddata

Comparison of Architectures

Page 42: Updated Olap Basic Concepts

42

Parameters MOLAP ROLAPApplication design Essentially the

definition ofdimensional modeland calculation rules

It uses two-dimensional tablesthat are stored inRDBMSs. (Data isstored in Starschema or Snowflake schema.)

Aggregation techniques Measures are pre-calculated and storedat each hierarchysummary level duringload time

Summary tables areimplemented in therelational database

Multidimensionalanalysis

Drill down, Drill up,Drill across and Slicing /Dicing

Drill down, Drill up,Slicing and Dicing

Query performance Instant response Slower

Value added functions Supports complexfunctions like%change, ranking etc.,

Limited value addedfunctions

User – definedcalculations

Calculated from cubes Calculated (On thefly )from thedatabase

Strength and Weakness of MOLAP/ROLAP

Page 43: Updated Olap Basic Concepts

43

Parameters MOLAP ROLAPProcessing Over headfor large input data sets

High Low

Support for frequentupdates

Cannot handlefrequent update ofcubes

Suitable for frequentupdates

Resource requirements High Low

Industry standard No current standards SQL standard

Access to the databasethrough ODBC

The databases haveproprietary API and donot provide accessthrough ODBC.

Provides accessthrough ODBC

Strength and Weakness of MOLAP/ROLAP

Page 44: Updated Olap Basic Concepts

OLAP Tool Selection

Page 45: Updated Olap Basic Concepts

45

Parameters FeaturesOpenness Openness to standard reporting tools.

Ad hoc reporting Ad hoc query performance and reportingcapabilities

read-write Multi user read-write applications

Integration Integration with the organization’s enterprisewide environment

Cost Cost of ownership, training, and installation.

Compatibility Compatibility with the enterprise computingenvironment

Database Database size capacity of product

Scalability Ability of tool to scale to the required numberof dimensions

Analysis of detaildata

Ability of tool to support analysis againstatomic data sets

Parameters to be Considered for an OLAP Tool Selection

Page 46: Updated Olap Basic Concepts

46

Parameters FeaturesRDBMSintegration

Ability of the OLAP tool to integrate directlywith relational databases and non-numericrelational data.

Run timecalculations

Ability to perform calculations at run-time

Data Loading Data loading performance of the OLAPproduct.

Key features Key features offered by the tool such aswrite-back, allocation calculations,sophisticated currency conversions, printedreport quality, spreadsheet interface etc.

Integrationwith othersystems

Integration with other related systems, suchas e-mail, data warehouses.

Deploymentarchitectures

Ability to support various deployments suchas stand-alone, high speed client/server,intranet, extranet, Internet

Parameters to be Considered for an OLAP Tool Selection

Page 47: Updated Olap Basic Concepts

47

Features MOLAP ROLAPCalculation intensity, complexity Data Sparsity Database Update Data Volatility Volume of Data Development time, learning curve

Standards, interoperability

Query response time Consistency, Reliability Data Loading time Security Network impact Vendor Stability

Which is Preferred ?

Page 48: Updated Olap Basic Concepts

48

> Offers Fast, flexible data summarization and analysis.

> OLAP servers are a superior technology for BI applications.

> Ability to summarize data in multiple ways and view trends over

time.

> OLAP servers and relational databases can work in harmony.

OLAP - Summary

Page 49: Updated Olap Basic Concepts

49

In this session, We have

> Understood the need for OLAP and significance of

Multidimensional analysis in a Data Warehouse.

> Discussed about the evolution of OLAP.

> Explained architectures, characteristics as well as the merits and

demerits of various OLAP solutions.

Session Summary

Page 50: Updated Olap Basic Concepts

Thank you