Introduction to OLAP and Analysis Services from Microsoft (ONLY for INTERNAL USE)
Josef Schiefer, IBM Watson Research Center, [email protected]


Page 1: Introduction to OLAP  and Analysis Services from Microsoft (ONLY for INTERNAL USE)

Introduction to OLAP and Analysis Services from Microsoft (ONLY for INTERNAL USE)

Josef Schiefer
IBM Watson Research Center
[email protected]


Slide 2

What is OLAP?
Online Analytical Processing: term coined by E. F. Codd in a 1993 paper commissioned by Arbor Software*
Generally synonymous with earlier terms such as Decision Support, Business Intelligence, Executive Information Systems
OLAP = Multidimensional Database
MOLAP: Multidimensional OLAP (Arbor Essbase, Oracle Express)
ROLAP: Relational OLAP (Informix MetaCube, MicroStrategy DSS Agent)


Slide 3

OLAP is FASMI
Fast Analysis of Shared Multidimensional Information
Nigel Pendse, Richard Creeth - The OLAP Report


Slide 4

Cubes
A cube stores information in a multidimensional structure and is the central object in a multidimensional database.
Each cube contains a set of dimensions and measures.
Dimensions are derived from the tables and columns in your data that provide the categories you want to analyze.
Measures are the quantitative data derived from your data columns.
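
To make the cube idea concrete, here is a minimal sketch in Python. It is not how Analysis Services stores data; it only models a cube as a mapping from dimension members to a measure. The fact rows and the names `build_cube` and `slice_cube` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, month, sales_amount).
FACTS = [
    ("Cola", "N", 1, 120.0),
    ("Cola", "S", 1, 80.0),
    ("Milk", "N", 1, 50.0),
    ("Milk", "N", 2, 70.0),
]


def build_cube(facts):
    """Sum the sales measure for every (product, region, month) cell."""
    cube = defaultdict(float)
    for product, region, month, sales in facts:
        cube[(product, region, month)] += sales
    return dict(cube)


def slice_cube(cube, product=None, region=None, month=None):
    """Total the measure over all cells that match the given members.

    A dimension left as None is not restricted.
    """
    wanted = (product, region, month)
    return sum(
        value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cube = build_cube(FACTS)
print(slice_cube(cube, product="Milk"))        # all Milk sales
print(slice_cube(cube, region="N", month=1))   # region N in month 1
```

Product, region, and month play the role of dimensions here, and the summed sales amount is the measure.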


Slide 5

Dimensions
The dimensions you build should be distinct categories you want to add to cubes in your OLAP database.
Example: geography, time, or employee dimensions represented in the picture


Slide 6

Cube and Dimensions
[Figure: data cube with axes Product (Gel, Juice, Cola, Milk, Cream, Soap), Region (W, S, N), and Month (1 to 7)]
Dimensions: Product, Region, Time
Hierarchical summarization paths:

Product     Region     Time
Industry    Country    Year
Category    Region     Quarter
Product     City       Month, Week
            Office     Day


Slide 7

Dimensions and Hierarchy
Dimensions are the categories used to organize or describe analysis information.
Dimensions are used to navigate the information and to summarize the details into more aggregate data.
Frequently used dimensions include time periods, geography, products, organization, and so on.
Often dimensions are hierarchical (World - Continents - Countries)
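
Summarizing along a hierarchy can be sketched as a repeated roll-up from children to parents. The example below is illustrative only: the country-to-continent mapping, the sales figures, and the function name `roll_up` are assumptions, not part of the deck.

```python
# Hypothetical geography hierarchy: country -> continent.
CONTINENT_OF = {"Austria": "Europe", "Germany": "Europe", "Japan": "Asia"}

# Hypothetical sales measure at the most detailed (country) level.
SALES_BY_COUNTRY = {"Austria": 10, "Germany": 40, "Japan": 25}


def roll_up(values, parent_of):
    """Sum each member's value into its parent one level up the hierarchy."""
    totals = {}
    for member, value in values.items():
        parent = parent_of[member]
        totals[parent] = totals.get(parent, 0) + value
    return totals


# Countries -> Continents -> World
by_continent = roll_up(SALES_BY_COUNTRY, CONTINENT_OF)
world = roll_up(by_continent, {continent: "World" for continent in by_continent})

print(by_continent)
print(world)
```

Each call moves one level up, so navigating from detail to summary is the same operation applied again.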


Slide 8

Measures = Numerical Values
Measures are the quantitative data in an OLAP database.
Values such as sales, budget, and cost are all examples of measures.
Measure values are organized in data cubes according to dimensions.


Slide 9

Aggregations
Aggregations greatly improve query efficiency and response time.
A cube can hold a number of aggregations.
The number of aggregations depends on several factors: the size of the data, the amount of storage space you allocate for aggregation storage, the mode of storage you select, and how much you want to optimize the aggregation design.
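
The trade-off between response time and storage can be illustrated with a small sketch that pre-computes one aggregation per combination of dimensions. This is a toy model under assumed data, not the aggregation design algorithm used by Analysis Services; `FACTS`, `aggregate`, and `aggregations` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "region", "month")

# Hypothetical detail rows with a single "sales" measure.
FACTS = [
    {"product": "Cola", "region": "N", "month": 1, "sales": 120},
    {"product": "Cola", "region": "S", "month": 1, "sales": 80},
    {"product": "Milk", "region": "N", "month": 1, "sales": 50},
    {"product": "Milk", "region": "N", "month": 2, "sales": 70},
]


def aggregate(facts, group_by):
    """Sum sales per combination of members of the group_by dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in group_by)
        totals[key] += row["sales"]
    return dict(totals)


# Pre-compute every possible aggregation: one per subset of the dimensions.
aggregations = {
    group_by: aggregate(FACTS, group_by)
    for size in range(len(DIMENSIONS) + 1)
    for group_by in combinations(DIMENSIONS, size)
}

# A query becomes a dictionary lookup instead of a scan over the facts.
print(aggregations[("region",)][("N",)])

# The price is storage: count the cells held across all aggregations.
stored_cells = sum(len(cells) for cells in aggregations.values())
print(len(aggregations), stored_cells)
```

With three dimensions there are already eight aggregations, which is why a real design usually stores only some of them.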


Slide 10

Primary OLAP Problems
Rigid, inflexible architectures
– MOLAP or ROLAP
Significant scalability problems
– Data explosion and sparsity
– Poor distributed client/server implementation
Separation of data warehousing from OLAP tools
– Lack of integration between user tools and OLAP
Difficult to prototype, develop, deploy
– Time and expense
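
Data explosion and sparsity can be put into rough numbers. The sketch below uses the member counts of the example cube earlier in this deck (6 products, 3 regions, 7 months); the number of populated cells and the hierarchy depths are assumptions chosen for illustration.

```python
from math import prod

# Member counts from the example cube: 6 products x 3 regions x 7 months.
CARDINALITIES = {"product": 6, "region": 3, "month": 7}

# Assumed number of hierarchy levels per dimension (not counting "All").
LEVELS = {"product": 3, "region": 4, "time": 4}


def dense_cell_count(cardinalities):
    """Cells a fully dense cube would need at the detail level."""
    return prod(cardinalities.values())


def sparsity(populated_cells, cardinalities):
    """Fraction of detail cells that hold no data."""
    return 1 - populated_cells / dense_cell_count(cardinalities)


def possible_aggregations(levels):
    """Each dimension can be summarized at any of its levels or at 'All'."""
    return prod(depth + 1 for depth in levels.values())


print(dense_cell_count(CARDINALITIES))
print(round(sparsity(21, CARDINALITIES), 3))  # assuming 21 populated cells
print(possible_aggregations(LEVELS))
```

Even this tiny cube has many more possible cells and aggregations than populated detail cells, and both counts grow multiplicatively with every added dimension.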


Slide 11

MS-AS: Architecture
MOLAP: aggregations and details managed in an efficient multidimensional store
ROLAP: aggregations created in relational store
HOLAP: different things to different vendors
– Aggregations: details in relational, aggregations in MOLAP store
– Partitions: single logical cube physically divided into multiple MOLAP and ROLAP partitions
– Virtual cubes: "view-like" join of multiple MOLAP and ROLAP cubes
Microsoft Analysis Services is optimized for all OLAP architectures and offers seamless integration
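
The partition idea, one logical cube divided into several physical pieces, can be sketched as follows. The split by year, the data, and the function `query_logical_cube` are hypothetical; the sketch ignores storage modes and only shows that a query sees one cube while touching only the partitions it needs.

```python
# Hypothetical partitions of one logical sales cube, split by year.
# Each partition maps (product, region) to a sales total.
PARTITIONS = {
    2000: {("Cola", "N"): 100, ("Milk", "N"): 40},
    2001: {("Cola", "N"): 120, ("Cola", "S"): 80},
}


def query_logical_cube(partitions, product, years=None):
    """Sum sales for a product, reading only the partitions for `years`.

    With years=None every partition is read, as if the cube were undivided.
    """
    selected = partitions if years is None else {y: partitions[y] for y in years}
    return sum(
        value
        for cells in selected.values()
        for (cell_product, _region), value in cells.items()
        if cell_product == product
    )


print(query_logical_cube(PARTITIONS, "Cola"))                # all years
print(query_logical_cube(PARTITIONS, "Cola", years=[2001]))  # one partition
```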


Slide 12

MS-AS: Scalability
MS-AS offers major innovations
– Data explosion managed by partial pre-aggregation
– Automatic elimination of sparse storage
Partitioned cubes
– parallel query processing across clustered servers
– fine tuning of aggregations, to better manage performance and disk space trade-offs
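
Partial pre-aggregation means storing only some aggregations and deriving the rest on demand. A minimal sketch, assuming a simple "smallest covering aggregation" rule: the stored data and the names `pick_source` and `answer` are illustrative and do not describe the actual MS-AS algorithm.

```python
from collections import defaultdict

# Hypothetical stored aggregations: dimension names -> {members: sales}.
# Only two of the eight possible aggregations are materialized.
STORED = {
    ("product", "region", "month"): {
        ("Cola", "N", 1): 120,
        ("Cola", "S", 1): 80,
        ("Milk", "N", 1): 50,
        ("Milk", "N", 2): 70,
    },
    ("product", "region"): {
        ("Cola", "N"): 120,
        ("Cola", "S"): 80,
        ("Milk", "N"): 120,
    },
}


def pick_source(stored, needed):
    """Return the smallest stored aggregation that covers all needed dimensions."""
    candidates = [dims for dims in stored if set(needed) <= set(dims)]
    return min(candidates, key=lambda dims: len(stored[dims]))


def answer(stored, needed):
    """Derive the requested aggregation from the cheapest stored one."""
    source = pick_source(stored, needed)
    positions = [source.index(d) for d in needed]
    totals = defaultdict(int)
    for key, value in stored[source].items():
        totals[tuple(key[p] for p in positions)] += value
    return source, dict(totals)


source, by_region = answer(STORED, ("region",))
print(source, by_region)
```

The region totals are computed from the three-cell (product, region) aggregation rather than from the four detail cells; the saving grows with the size of the cube.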


Slide 13

MS-AS: Scalability
Cooperative client/server query management and caching
– network traffic minimized
– server queries processed efficiently
Microsoft Data Cube Service
– desktop component ships with next release of Office
– used with Excel, Access, and Web
– supports local, offline usage


Slide 14

Microsoft Data Cube Service
Basic architecture:
– Cache query results and metadata, not disk pages.
Algorithms deduce missing data and transform queries
– Aggregation
– Filtering
– Combination
Instant reply to cached queries
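
The caching idea on this slide can be sketched as a client-side cache of query results that answers a coarser query by aggregating a finer cached result. Everything here is an assumption made for illustration: `ResultCache`, `fake_server`, and the data are invented, and the real component's interfaces are not shown in the deck.

```python
DIMS = ("product", "region")

# Hypothetical server-side data: (product, region, sales).
FACTS = [("Cola", "N", 120), ("Cola", "S", 80), ("Milk", "N", 120)]


def fake_server(group_by):
    """Stand-in for a server round trip: aggregate FACTS by group_by."""
    positions = [DIMS.index(d) for d in group_by]
    totals = {}
    for row in FACTS:
        key = tuple(row[p] for p in positions)
        totals[key] = totals.get(key, 0) + row[-1]
    return totals


class ResultCache:
    """Caches query results and derives coarser results from finer ones."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._results = {}  # group_by tuple -> {members: value}
        self.server_calls = 0

    def query(self, group_by):
        if group_by in self._results:  # exact hit: instant reply
            return self._results[group_by]
        for dims, cells in self._results.items():
            if set(group_by) <= set(dims):  # derive by aggregation
                positions = [dims.index(d) for d in group_by]
                result = {}
                for key, value in cells.items():
                    short = tuple(key[p] for p in positions)
                    result[short] = result.get(short, 0) + value
                break
        else:  # nothing usable in the cache: ask the server
            self.server_calls += 1
            result = self._fetch(group_by)
        self._results[group_by] = result
        return result


cache = ResultCache(fake_server)
cache.query(("product", "region"))   # goes to the server
by_region = cache.query(("region",)) # derived locally from the cached result
print(by_region, cache.server_calls)
```

Only the first query reaches the server; the second is answered from cached results, which is the effect the slide calls minimizing network traffic.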


Slide 15

MS Data Cube Benefits
Efficient distribution of query and calculation processing across client & server
Single component spans Microsoft desktop and server platforms & products
Unifies the MD data access story across Excel, MS-AS, and SQL Server
Enables Microsoft to establish industry standard for MD data access
Basis for MS-AS and Excel mobile story


Slide 16

MS-AS: Integration
Microsoft Analysis Services integrates the maintenance of OLAP with the underlying data warehouse
– Design the DW structure
– Create the DW tables/cubes
– Populate the DW tables/cubes
– Maintain by incremental loads
– Optimize by actual usage patterns
– Manage users, scripts, usage, metadata
– Multiple data sources (not just SQL Server)


Slide 17

MS-AS: Integration
OLE DB for OLAP & ADO MD
– based upon existing data access technology
– establishes industry standard for MD data access
OLE DB/ODBC enable MS-AS to access data in all major RDBMSs
Third party client applications


Slide 18

OLAP Problem: Complexity
OLAP products are traditionally difficult to configure, develop, and deploy
– Arcane tools
– Heavy consulting
– Poor integration


Slide 20

3 Tier Architecture & Components
[Figure: three-tier architecture diagram]

Client Tier (DCube, OLE DB for OLAP, ADO MD)
•Excel
•ActiveX Controls
•Third Party Applications
Client tier functions:
•Data selection & navigation
•Presentation and charting
•What-if formulas
•Client side caching
•Desktop object model
•Offline usage

Data Warehouse Tier (MOLAP, ROLAP, HOLAP)
•MS-AS Server
•SQL Server
MS-AS Server functions:
•Multidimensional calcs
•MOLAP/ROLAP/HOLAP data modeling/aggregations
•Security
•Metadata management
•Server side caching
•Administrative tools
•Server object model
•Query distribution

OLTP Source Tier (connected via OLE DB and DTS)
•RDBMSs


Slide 21

Let's go to the demonstration ...