Upload
john-chaney
View
19
Download
2
Embed Size (px)
DESCRIPTION
Introduction to OLAP and Analysis Services from Microsoft (ONLY for INTERNAL USE). Josef Schiefer IBM Watson Research Center [email protected]. What is OLAP?. Online Analytical Processing - coined by EF Codd in 1994 paper contracted by Arbor Software * - PowerPoint PPT Presentation
Citation preview
Introduction to OLAP and Analysis Services from Microsoft(ONLY for INTERNAL USE)
Josef SchieferIBM Watson Research [email protected]
Slide 2
What is OLAP? Online Analytical Processing -
coined by EF Codd in 1994 paper contracted by Arbor Software*
Generally synonymous with earlier terms such as Decisions Support, Business Intelligence, Executive Information System
OLAP = Multidimensional Database
MOLAP: Multidimensional OLAP (Arbor Essbase, Oracle Express)
ROLAP: Relational OLAP (Informix MetaCube, Microstrategy DSS Agent)
Slide 3
OLAP is FASMI Fast Analysis Shared Multidimensional Information
Nigel Pendse, Richard Creath - The OLAP Report
Slide 4
Cubes A cube stores information in a multidimensional
structure and is the central object in a multidimensional database.
Each cube contains a set of dimensions and measures.
Dimensions are derived from the tables and columns in your data that provide the categories you want to analyze.
Measures are the quantitative data derived from your data columns
Slide 5
Dimensions The dimensions you build should be
distinct categories you want to add to cubes in your OLAP database.
example: geography, time, or employee dimensions represented in the picture
Slide 6
Cube and Dimensions
Month1 2 3 4 76 5
Prod
uct
Gel
JuiceColaMilk
Cream
Soap
Regio
n
WS N
Dimensions: Product, Region, Time
Product Region TimeIndustry Country Year
Category Region Quarter Product City Month Week Office Day
Hierarchical summarization paths
Slide 7
Dimensions and Hierarchy Dimensions are the categories used to organize or describe analysis information
Dimensions are used to navigate the information and to summarize the details into more aggregate data.
Frequently used dimensions include time periods, geography, products, organization, and so on.
Often dimensions are hierarchical (World - Continents - Countries)
Slide 8
Measures =numercial Values
Measures are the quantitative data in an OLAP database. For example, values such as sales, budget, cost, and so
on, are all examples of measures. Measure values are organized in data cubes according to
dimensions
Slide 9
Aggregations Aggregations greatly improve query
efficiency and response time. A cube can hold a number of aggregations.
The aggregation amount is based on several factors - the size of the data, the amount of storage space you allocate for aggregation storage, the mode of storage you select, and how much you want to optimize the aggregations design.
Slide 10
Primary OLAP Problems Rigid, inflexible architectures
– MOLAP or ROLAP Significant scalability problems
– Data explosion and sparsity– Poor distributed client/server implementation
Separation of data warehousing from OLAP tools– Lack of integration between user tools and OLAP
Difficult to prototype, develop, deploy– Time and expense
Slide 11
MS-AS: Architecture
MOLAPMOLAP: aggregations & details managed in an efficient multidimensional store
ROLAPROLAP: aggregations created in relational store
HOLAPHOLAP: different things to different vendors– AggregationsAggregations: details in relational,
aggregations in MOLAP store– PartitionsPartitions: single logical cube physically
divided into multiple MOLAP and ROLAP partitions
– Virtual cubesVirtual cubes: “view-like” join of multiple MOLAP and ROLAP cubes
Microsoft Analysis Services are optimized for all OLAP architectures and offers seamless integration
Slide 12
MS-AS: Scalability MS-AS offer major innovation
– Data explosion managed by partial pre-aggregation
– Automatic elimination of sparse storage
Partitioned cubes– parallel query processing across clustered
servers– fine tuning of aggregations, to better manage
performance and disk space trade-offs
Slide 13
MS-AS: Scalability Cooperative client/server query
management and caching– network traffic minimized– server queries processed efficiently
Microsoft Data Cube Service– desktop component ships with next release of
Office– used with Excel, Access, and Web– supports local, offline usage
Slide 14
Microsoft Data Cube Service Basic architecture:
– Cache query results and metadata, not disk pages.
Algorithms deduce missing data and transform queries– Aggregation– Filtering– Combination
Instant reply to cached queries
Slide 15
MS Data Cube Benefits Efficient distribution of query and
calculation processing across client & server Single component spans Microsoft desktop
and server platforms & products Unifies the MD data access story across
Excel, MS-AS, and SQL Server Enables Microsoft to establish industry
standard for MD data access Basis for MS-AS and Excel mobile story
Slide 16
MS-AS: Integration The Microsoft Analysis Services integrate the
maintenance of OLAP with the underlying data warehouse– Design the DW structure– Create the DW tables/cubes– Populate the DW tables/cubes– Maintain by incremental loads– Optimize by actual usage patterns – Manage users, scripts, usage, metadata– Multiple data sources (not just SQLS)
Slide 17
MS-AS: Integration OLE DB for OLAP & ADO MD
– based upon existing data access technology– establishes industry standard for MD data
access
OLE DB/ODBC enable MS-AS to access data in all major RDBMs
Third party client applications
Slide 18
OLAP Problem: Complexity
OLAP products are traditionally difficult to configure, develop, and deploy– Arcane tools– Heavy consulting– Poor integration
Slide 20
3 Tier Architecture & Components
DCubeDCubeOLE DB for OLAPOLE DB for OLAP
Client TierClient Tier•Excel•ActiveX Controls•Third Party Applications
ADO MDADO MD •Data selection & navigation•Presentation and charting•What-if formulas•Client side caching•Desktop object model•Offline usage
MOLAP ROLAPHOLAP
Data Warehouse Data Warehouse TierTier
MS-AS ServerMS-AS Server
•MS-AS Server•SQL Server
•Multidimensional calcs•MOLAP/ROLAP/HOLAP data Modeling/aggregations•Security•Metadata management•Server side caching•Administrative tools•Server object model•Query distribution
OLTP Source TierOLE DB
DTS
•RDBMs
Slide 21
Let‘s go to the demonstration ...