Copyright © 2006, SAS Institute Inc. All rights reserved. SAS ® 9 OLAP Server Jochen Kirsten Technology Manager Storage SAS EMEA



Categories of data storage

• Multidimensional
• Relational


What does SAS OLAP Server do?

SAS OLAP Server is a multidimensional data store designed from the outset to provide quick access to presummarized data, generated from vast amounts of detailed data.

Why is SAS OLAP Server important?

Decision makers need fast access to accurate information. They expect instantaneous access to summaries of vast amounts of data, so that timely decisions can be based on knowledge instead of gut feelings.


Key features include:
• A multithreaded MDX query engine
• Parallel storage
• A graphical user interface for designing OLAP data sources
• Special features that facilitate real-world use, metadata management, and cube optimization

SAS OLAP Server is a standards-compliant OLAP data source that uses multidimensional expressions (MDX) to query and navigate through multidimensional information.


Special features

Real-world use

• Time dimension for time-based calculations
• Geographic dimension for GIS support
• Parallel drill hierarchies
• Support for ragged and unbalanced hierarchies

Centralized maintenance
• All metadata is stored in the SAS Metadata Server
• SAS Metadata Server maintains centralized security information
• SAS Management Console is used to administer the OLAP Server

Open architecture
• SAS OLAP Server runs on all major hardware platforms
• SAS OLAP Server stores its data in UNICODE
• SAS OLAP Server can be accessed using Java or OLE DB for OLAP


SQL can do it – or can it?

[Figure: SQL compared with MDX. Labels: “Pure” relational queries (SQL); OLAP-type features added by database systems or DOLAP applications; “Pure” multidimensional queries (MDX)]


OLAP vs. SQL example

Show me the number of active network ports for the 5 largest customers at each month end over the last 12 months.

… this is what it looks like in SQL:


OLAP vs. SQL Example (cont.)

from ca_status i4
where trunc (i4.status_time) <= add_months (trunc (sysdate, 'MM'), -2) - 1

union all

select distinct i5.customer_id, i5.status_time, i5.ustate, i5.urole, i5.login, i5.ugroup,
       add_months (trunc (sysdate, 'MM'), -3) - 1
from ca_status i5
where trunc (i5.status_time) <= add_months (trunc (sysdate, 'MM'), -3) - 1

union all

select distinct i6.customer_id, i6.status_time, i6.ustate, i6.urole, i6.login, i6.ugroup,
       add_months (trunc (sysdate, 'MM'), -4) - 1
from ca_status i6
where trunc (i6.status_time) <= add_months (trunc (sysdate, 'MM'), -4) - 1

) j
where exists (select 'x'
              from nicetec.ca_status_last v,
                   nicetec.import_log l
              where v.customer_id = j.customer_id
                and v.import_id = l.import_id
                and l.import_time >= least (j.monat, sysdate - 1)
              -- the user has been modified this month or later
              -- last_import only shows the date of the last status change
              -- which was not deleted at the point in time marked x
) ) where rk=1


OLAP vs. SQL example (cont.)

SQL cannot handle inter-row calculations

In order to overcome this limitation, several intermediate steps are required to compensate

Intermediate steps can either be sub-select statements (memory consumption) or temporary tables (storage consumption)

SQL can handle dimensions (star schema) but it is not able to deal with hierarchies
• SQL does not fulfill a major requirement of analysis
• SQL and OLAP problems are two individual domains


Same result in MDX

TopCount([Customers].[AllCustomers].Children, 5,
         (Measures.[ActiveNetworkPorts],
          ClosingPeriod(Time.[YMD].[Day], Time.CurrentMember)))
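To make the semantics of the MDX expression above more concrete, the following Python sketch mimics it on a handful of made-up rows. For one month it takes each customer's value on the last day that has data (a simplified stand-in for the role of ClosingPeriod) and then keeps the N customers with the largest values (the role of TopCount). The sample rows and the helper names closing_values and top_count are illustrative assumptions and are not part of SAS OLAP Server.

```python
from datetime import date

# Hypothetical daily snapshots: (customer, day, active network ports).
rows = [
    ("Acme", date(2006, 1, 30), 120),
    ("Acme", date(2006, 1, 31), 125),
    ("Bolt", date(2006, 1, 31), 300),
    ("Core", date(2006, 1, 29), 80),
    ("Core", date(2006, 1, 31), 90),
    ("Acme", date(2006, 2, 28), 130),
    ("Bolt", date(2006, 2, 27), 310),
    ("Bolt", date(2006, 2, 28), 280),
    ("Core", date(2006, 2, 28), 95),
]


def closing_values(rows, year, month):
    """Per customer, the measure on the last day of the month that has data."""
    latest = {}
    for customer, day, ports in rows:
        if (day.year, day.month) != (year, month):
            continue
        if customer not in latest or day > latest[customer][0]:
            latest[customer] = (day, ports)
    return {customer: ports for customer, (day, ports) in latest.items()}


def top_count(values, n):
    """The n (member, value) pairs with the largest values, descending."""
    return sorted(values.items(), key=lambda kv: kv[1], reverse=True)[:n]


january = top_count(closing_values(rows, 2006, 1), 2)
february = top_count(closing_values(rows, 2006, 2), 2)
print(january)
print(february)
```

In a cube the month-end lookup is a navigation step along the Time hierarchy, whereas in this flat-row sketch (as in the SQL version) it has to be recomputed from the detail rows.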


Who uses OLAP?

Everybody who needs to do fast analysis of shared multidimensional information (FASMI)
• Finance departments
• Marketing departments
• Manufacturing sector
• Sales departments
• …


Basic terminology of a cube

Dimensions consist of
• Dimension Name
• Level
• Hierarchy
• Member

[Figure: tree for the Time dimension. Year members: 1999, 2000, 2001. Quarter members beneath them: Q1, Q2, Q3, Q4]
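The four terms can be pictured as a small data structure. The sketch below is only an illustration in Python: the class names, the hierarchy name "YQ", and the member naming scheme ("1999 Q1") are assumptions made for this example, not SAS definitions. A dimension has a name and one or more hierarchies, a hierarchy orders its levels from coarse to fine, and every member belongs to exactly one level.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    level: str
    children: list = field(default_factory=list)


@dataclass
class Hierarchy:
    name: str
    levels: list  # ordered from coarsest to finest
    roots: list   # members of the coarsest level

    def members_at(self, level):
        """All members belonging to one level, in tree order."""
        found, stack = [], list(reversed(self.roots))
        while stack:
            member = stack.pop()
            if member.level == level:
                found.append(member.name)
            stack.extend(reversed(member.children))
        return found


def year(name):
    """Build a YEAR member with four QUARTER members beneath it."""
    quarters = [Member(f"{name} Q{i}", "QUARTER") for i in range(1, 5)]
    return Member(name, "YEAR", quarters)


# Hypothetical Time dimension with a single Year > Quarter hierarchy.
time_dim = {
    "dimension": "Time",
    "hierarchies": [
        Hierarchy("YQ", ["YEAR", "QUARTER"],
                  [year("1999"), year("2000"), year("2001")]),
    ],
}

hierarchy = time_dim["hierarchies"][0]
print(hierarchy.members_at("YEAR"))
print(len(hierarchy.members_at("QUARTER")))
```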



Basic terminology of a cube

[Figure: the Time dimension tree with its two levels labelled. YEAR: 1999, 2000, 2001. QUARTER: Q1, Q2, Q3, Q4]


Basic terminology of a cube

[Figure: the Time dimension tree (years 1999, 2000, 2001; quarters Q1 to Q4), annotated with the label "Level of Detail"]



Navigation in multidimensional data

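[Figure: a geography hierarchy with Europe at the top, the countries France and Switzerland beneath it, and the cities Basel, Geneva and Zurich beneath Switzerland. The figure is annotated with the MDX navigation functions .CurrentMember, .PrevMember, .Parent, .Children and .LastChild]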
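The navigation functions named on this slide can be imitated on a plain tree. The Python sketch below is an illustration only: the Member class is hypothetical, the ordering of France before Switzerland is an assumption, and prev_member is implemented as "previous member on the same level", which is a simplified reading of the MDX function.

```python
class Member:
    """A node in a hierarchy with the links that MDX-style navigation needs."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        return 0 if self.parent is None else self.parent.depth() + 1

    def root(self):
        return self if self.parent is None else self.parent.root()

    def _members_at(self, depth):
        """All descendants (or self) that sit `depth` steps below this node."""
        if depth == 0:
            return [self]
        found = []
        for child in self.children:
            found.extend(child._members_at(depth - 1))
        return found

    @property
    def last_child(self):
        return self.children[-1] if self.children else None

    @property
    def prev_member(self):
        """Previous member on the same level, or None for the first one."""
        level = self.root()._members_at(self.depth())
        index = level.index(self)
        return level[index - 1] if index > 0 else None


# Assumed ordering: France comes before Switzerland under Europe.
europe = Member("Europe")
france = Member("France", europe)
switzerland = Member("Switzerland", europe)
basel = Member("Basel", switzerland)
geneva = Member("Geneva", switzerland)
zurich = Member("Zurich", switzerland)

current = switzerland
print(current.parent.name)                 # .Parent
print([c.name for c in current.children])  # .Children
print(current.last_child.name)             # .LastChild
print(current.prev_member.name)            # .PrevMember
```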


Unbalanced hierarchy

[Figure: organization chart with the members CEO, COO, Executive Secretary, Director of Comms. and Comms. Specialist, whose branches end at different depths]


Ragged hierarchy

[Figure: geography tree with the members America, United States, California, San Francisco and Washington DC. United States appears twice in the drawing]
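The difference between the two hierarchy types can be checked mechanically. The Python sketch below uses the member names from the two slides, but the tree shapes, the level names, and the reporting lines are assumptions made for illustration, since the drawings themselves are not reproduced here. It treats a hierarchy as unbalanced when its leaves sit at different depths, and as ragged when a path skips a level.

```python
# Levels of the hypothetical geography hierarchy, coarsest first.
GEO_LEVELS = ["Continent", "Country", "State", "City"]

# Each leaf is described by its path; None marks a level with no member.
geo_paths = [
    {"Continent": "America", "Country": "United States",
     "State": "California", "City": "San Francisco"},
    {"Continent": "America", "Country": "United States",
     "State": None, "City": "Washington DC"},
]

# Assumed reporting lines, written as child -> parent.
org_parent = {
    "COO": "CEO",
    "Executive Secretary": "CEO",
    "Director of Comms.": "COO",
    "Comms. Specialist": "Director of Comms.",
}


def is_ragged(paths, levels):
    """True if any path skips a level that lies above its deepest member."""
    for path in paths:
        filled = [path[level] is not None for level in levels]
        deepest = max(i for i, present in enumerate(filled) if present)
        if not all(filled[:deepest]):
            return True
    return False


def leaf_depths(parent_of):
    """Depth of every leaf in a child -> parent mapping (root has depth 0)."""
    parents = set(parent_of.values())
    depths = {}
    for leaf in (m for m in parent_of if m not in parents):
        depth, node = 0, leaf
        while node in parent_of:
            node = parent_of[node]
            depth += 1
        depths[leaf] = depth
    return depths


def is_unbalanced(parent_of):
    """True if the leaves do not all sit at the same depth."""
    return len(set(leaf_depths(parent_of).values())) > 1


print(is_ragged(geo_paths, GEO_LEVELS))
print(leaf_depths(org_parent))
print(is_unbalanced(org_parent))
```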


Star schema

A Star Schema is a dimensional model created by mapping data entities from operational systems

It has a central table (fact table) that links all the other tables (dimension tables) together


Sample star schema

SALEDTL (Sales fact detail tbl)
  TIMEKEY   Time dim. key
  CUSTKEY   Customer dim. key
  PRDKEY    Product dim. key
  PONO      PO number
  POLINNO   PO line number
  QTYSOLD   Quantity sold
  UNITPRIC  Unit price
  SALEAMT   Sales amount

CUSTDIM (Customer dim tbl)
  CUSTKEY   Customer dim. key
  CUSTNO    Customer loyalty num
  CUSTLNAM  Customer last name
  CUSTFNAM  Customer first names
  CUSTADDR  Customer address
  CUSTPOST  Customer postal code
  CUSTREGN  Customer region
  CUSTCNTR  Customer country

PRDDIM (Product dim tbl)
  PRDKEY    Product dim. key
  SKU       Stock keeping unit
  PRDDSC    Product description
  PRGCOD    Product group code
  PRGDSC    Product group desc.
  BRNDNAM   Brand name
  COLRDSC   Colour description

TIMEDIM (Time dim tbl)
  TIMEKEY   Time dim. key
  YYMM      Calendar month
  YYWW      Calendar week
  JULDY     Julian day number
  FYR       Fiscal year
  CYR       Calendar year
  MTHNO     Month number
  WK        Week number
  DOW       Day of week number
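As a rough illustration of how the fact table links the dimension tables, the following Python sketch builds a cut-down version of this schema in an in-memory SQLite database and sums sales by calendar year and customer country. Table and column names are taken from the slide. The chosen subset of columns, the data types, and all inserted rows are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE TIMEDIM (TIMEKEY INTEGER PRIMARY KEY, CYR INTEGER, MTHNO INTEGER);
CREATE TABLE CUSTDIM (CUSTKEY INTEGER PRIMARY KEY, CUSTLNAM TEXT, CUSTCNTR TEXT);
CREATE TABLE PRDDIM  (PRDKEY INTEGER PRIMARY KEY, PRDDSC TEXT, BRNDNAM TEXT);
CREATE TABLE SALEDTL (
    TIMEKEY  INTEGER REFERENCES TIMEDIM,
    CUSTKEY  INTEGER REFERENCES CUSTDIM,
    PRDKEY   INTEGER REFERENCES PRDDIM,
    QTYSOLD  INTEGER,
    UNITPRIC REAL,
    SALEAMT  REAL
);
-- Made-up rows, only to have something to query.
INSERT INTO TIMEDIM VALUES (1, 2005, 12), (2, 2006, 1);
INSERT INTO CUSTDIM VALUES (10, 'Muster', 'Switzerland'), (11, 'Dupont', 'France');
INSERT INTO PRDDIM  VALUES (100, 'Router', 'BrandA'), (101, 'Switch', 'BrandB');
INSERT INTO SALEDTL VALUES
    (1, 10, 100, 2, 50.0, 100.0),
    (2, 10, 101, 1, 30.0, 30.0),
    (2, 11, 100, 3, 50.0, 150.0);
""")

# The fact table is joined to each dimension through its key column.
rows = con.execute("""
    SELECT t.CYR, c.CUSTCNTR, SUM(f.SALEAMT)
    FROM SALEDTL f
    JOIN TIMEDIM t ON t.TIMEKEY = f.TIMEKEY
    JOIN CUSTDIM c ON c.CUSTKEY = f.CUSTKEY
    GROUP BY t.CYR, c.CUSTCNTR
    ORDER BY t.CYR, c.CUSTCNTR
""").fetchall()
print(rows)
```

Every analysis question becomes a join from SALEDTL to the dimensions it needs, followed by a GROUP BY on the chosen dimension columns.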


OLAP concepts

MOLAP (multidimensional OLAP)
• MOLAP is the default storage technology used by SAS OLAP Server. MOLAP uses SAS' own storage (SPDE) to store aggregations in a format that is optimized for multidimensional data structures.

ROLAP (relational OLAP)
• ROLAP uses a relational database as storage for multidimensional data. In most cases a star schema would be the foundation and aggregations would be linked in where appropriate.

HOLAP (hybrid OLAP)
• HOLAP is a combination of MOLAP and ROLAP that combines the benefits of both worlds.


MOLAP in SAS9 OLAP Server

By default SAS9 OLAP Server uses MOLAP to store aggregations.

Cube designers can specify which file systems should be used to store these aggregations.

The MOLAP storage option uses libraries that are optimized for accessing multidimensional data. In their fundamental structure they resemble SPDE libraries, but MOLAP uses advanced clustering and indexing to provide the fastest possible access to data.


ROLAP in SAS9 OLAP Server

In certain situations (very high cardinalities for certain levels) it can be worthwhile to use relational databases to store the aggregations.

You can select “Cube will also use aggregated data from other tables” in the initial screen, and then point each aggregation to the respective table in a relational database system.

ROLAP can be used to make sure that data is kept within a single storage entity (relational database) for data security reasons.

Make sure you take all performance impacts into consideration when choosing ROLAP: your RDBMS and your network now determine performance.


HOLAP in SAS9 OLAP Server

HOLAP is the combination of ROLAP and MOLAP
• Your OLAP data source is based on a fundamental aggregation (NWAY) stored in a relational database system
• All aggregations defined for a HOLAP cube can either be generated by the OLAP server (MOLAP) or point to pre-calculated aggregation tables stored in a relational database system

Using HOLAP
• Cubes can compensate for large levels (with high cardinalities) by storing them in relational databases, while providing fast access to standard business views (aggregations) stored in MOLAP
• The NWAY crossing must be stored in ROLAP
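The per-aggregation storage decision described on this slide can be sketched as a simple rule. The Python example below is purely illustrative: the level names, the cardinalities, and the cut-off ROLAP_THRESHOLD are invented for the example and do not correspond to any SAS setting. The only rules taken from the slide are that the NWAY crossing goes to ROLAP and that aggregations touching very large levels are candidates for relational storage.

```python
# Hypothetical level cardinalities (number of distinct members per level).
cardinality = {"Year": 10, "Month": 120, "Country": 40, "Customer": 5_000_000}

# The NWAY aggregation crosses every level; the others are business views.
nway = frozenset(cardinality)
aggregations = [
    nway,
    frozenset({"Year", "Country"}),
    frozenset({"Month", "Country"}),
    frozenset({"Month", "Customer"}),
]

ROLAP_THRESHOLD = 1_000_000  # assumed cut-off, not a SAS setting


def choose_storage(aggregation):
    """Pick a storage type for one aggregation of a HOLAP cube."""
    if aggregation == nway:
        return "ROLAP"  # per the slide, the NWAY crossing is stored in ROLAP
    widest = max(cardinality[level] for level in aggregation)
    return "ROLAP" if widest >= ROLAP_THRESHOLD else "MOLAP"


plan = {tuple(sorted(a)): choose_storage(a) for a in aggregations}
for levels, storage in plan.items():
    print(levels, "->", storage)
```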


Structure of the OLAP environment

Querying a Cube
• MDX is sent to the OLAP Server
• A result set is passed back to the client

Creating a Cube
• PROC OLAP creates a cube (used by Cube builder)
• PROC OLAP is executed on a Workspace Server

[Figure: data flow diagram with the labels MDX Query, Result Set, PROC OLAP and Write Cube Data]


SAS applications around the OLAP Server

SAS Management Console

• Register Servers

• Register Libraries

• Register Users

• Assign User Rights

SAS OLAP Cube Studio

• Register Tables

• Register OLAP Schemas

• Design Cubes

• Create Cubes


Scalability in SAS OLAP Server

• One thread for every query
• One aggregation selection for every region affected
• Parallel storage
• Configurable cube cache


Using parallel storage for cubes

[Figure: an OLAP cube stored under a single physical path. Labels: Fundamental Cube Data (NWAY), Additional Aggregations, Index]


Benefits of parallel storage

[Figure: two storage layouts compared. In the first, one file system holds all data. In the second, data is spread across multiple file systems]


Basic cube types - MOLAP

MOLAP cubes can be based on a single flat table or on a star schema

MOLAP stores all cube data inside the SAS OLAP storage facility, optimized for multidimensional data

[Figure: a MOLAP cube held in OLAP storage, built from detailed data in a single SAS data set OR from detailed data stored in a star schema]


Basic cube types - HOLAP

HOLAP cubes can be based on a single flat table or on a star schema

HOLAP stores cube data wherever it is appropriate:
• SAS Storage
• Flat files
• RDBMS

Storage can be defined for every aggregation individually

[Figure: a HOLAP cube split across two stores. OLAP Storage (MOLAP) holds aggregations with low cardinality dimensions; RDBMS (ROLAP) holds aggregations with high cardinality dimensions]


Parallel drill hierarchies

Time Dimension

Hierarchy1: Time by Month
Year > Quarter > Month

Hierarchy2: Time by Week
Year > Week > Weekday
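The idea of two parallel drill paths over the same dimension can be illustrated with plain dates. In the Python sketch below, the function names by_month and by_week are hypothetical, and the use of the ISO 8601 week calendar for the second hierarchy is an assumption, since the slide does not say how weeks are numbered.

```python
from datetime import date


def by_month(day):
    """Path through the Year > Quarter > Month hierarchy."""
    quarter = (day.month - 1) // 3 + 1
    return (day.year, f"Q{quarter}", day.month)


def by_week(day):
    """Path through the Year > Week > Weekday hierarchy (ISO 8601 calendar)."""
    iso_year, iso_week, iso_weekday = day.isocalendar()
    return (iso_year, f"W{iso_week:02d}", iso_weekday)


for day in (date(2006, 3, 15), date(2006, 1, 1)):
    print(day, by_month(day), by_week(day))
```

The second date shows why the two hierarchies cannot share a single tree: 1 January 2006 belongs to calendar year 2006 in the month hierarchy but to ISO week 52 of 2005 in the week hierarchy.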


Member properties

Location Dimension

Country   Population   City
France    57.000.000   Lyon, Nancy
Germany   82.000.000   Cologne, Berlin
Italy     58.000.000   Venice, Milano


Multiple languages in one cube

[Figure: French (FR), English (UK) and Russian (RU) data feeding one cube]

• All languages are loaded into one single UNICODE cube.
• The cube is queried using standard MDX.
• The session reports the client language to the server.


Accessing SAS OLAP Server

Java Clients
• SAS Web Report Studio
• SAS Visual Data Explorer
• SAS Information Delivery Portal

Microsoft Clients
• SAS Enterprise Guide
• Microsoft Excel
• ProClarity


Work in progress …

User / admin cancellation of run-away queries

Incremental cube updates

Allow for cubes with dimensions of very large cardinality
• 2**64 per hierarchy level

Security GUI in SMC

Support for visual and security based totalling capabilities
