BI Semantic Model, Albert van Dok, SQL Zaterdag, 12 November 2011


BI Semantic Model

Albert van Dok
SQL Zaterdag
12 November 2011


Agenda

• Background
• Life Before BISM
• What is BISM
• BISM Positioning
• Questions


Background

• From data towards information
• By nature, the demand for (new) information and insights will always evolve
• Connecting and integrating (new) data sources is an essential part
• Preparing data for use
  • Data cleansing
  • Define relationships
  • Data enrichment
  • Add calculations
  • Versioning


The goal is not always easy to achieve

Applications
• Analytical solutions
• Operational reports
• Dashboards & Scorecards
• Data Mining

Requirements
• Quick delivery
• Integration of data by business user
• Ad hoc reports
• Excellent performance
• Flexible

Issues
• Operational reports from an analytical system
• Wrong use of tools or BI tools not flexible
• (Performance) problems
• Long implementation times
• Highly dependent on IT


BI across the enterprise


Life before BISM

[Diagram: data sources (OLTP and others), DW, datamarts and data models, MOLAP cubes, and the reporting tools and OLAP browsers that consume them]


Life before BISM

[Diagram: data sources (OLTP and others), DW, datamarts and data models, MOLAP cubes, reporting tools and OLAP browsers, now with the UDM added]


Life before BISM

[Diagram: data sources (OLTP and others), DW, datamarts and data models, MOLAP cubes, reporting tools and OLAP browsers, with Analysis Services hosting the UDM and exposing it over XMLA]

UDM layers shown in the diagram:
• Security
• Cache
• End-user model
  • Translations
  • Actions
  • KPI…
• Calculations
• Basic dimensional model
  • Cube & Dimensions
  • Storage & Caching policies
  • Linked Objects
• Data source view


The UDM in SSAS 2008 R2

[Diagram: clients that query the UDM, all through MDX]
• Excel 2010
• Reporting Services 2008 R2 & Report Builder 3
• SharePoint 2010: Excel Services, PerformancePoint Services, Visio Services
• 3rd party SSAS clients

Besides its advantages, the UDM:
• Is often too complex for simple reporting purposes
• Has a steep learning curve
• Uses MDX, which is different from SQL…
• Must be implemented by a BI professional
• Needs a small investment just to start


The holy grail: Self Service BI

• New paradigm
  • “Business intelligence for the masses”
  • “Managed self-service business intelligence”
• Put simple, powerful BI tools in the hands of “knowledge workers”
  • Familiar tools: Excel
  • People who own the data
    • Excel spreadsheet, Access database or SharePoint list data
• Reality: Office power users


New kid on the block: PowerPivot

PowerPivot for Excel
• Free add-in for Excel
• Runs as 32-bit or 64-bit, and needs lots of RAM…
• Contains the VertiPaq engine (SSAS running in process with Excel)

PowerPivot for SharePoint
• Comes with SQL Server 2008 R2 x64
• SharePoint 2010 extension
• VertiPaq running on the server side
• For sharing and managing PowerPivot applications


PowerPivot

PowerPivot has its own semantic model, which can be seen as BISM v1
• enables connecting data from various data sources
• add relations between tables
• add calculations, in two places:
  • in tables: calculated columns (DAX)
  • over the whole model: calculated measures (DAX)
• works in cached (VertiPaq) mode

Covers personal and team BI segments
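The difference between the two places for calculations can be sketched in Python. This is only an analogy, not DAX: the `Sale` table, its values and the function names are hypothetical. A calculated column is evaluated once per row, while a measure is an aggregate evaluated over whatever rows the current filter leaves in scope.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    region: str
    quantity: int
    unit_price: float


# Hypothetical sample rows standing in for one table in the model.
sales = [
    Sale("North", 2, 10.0),
    Sale("North", 1, 25.0),
    Sale("South", 4, 5.0),
]


def calculated_column(rows):
    """Evaluated row by row; yields one stored value per row."""
    return [row.quantity * row.unit_price for row in rows]


def total_amount_measure(rows, region=None):
    """Evaluated over the rows that survive the current filter."""
    return sum(
        row.quantity * row.unit_price
        for row in rows
        if region is None or row.region == region
    )


amounts = calculated_column(sales)
print(amounts)                               # one value per row
print(total_amount_measure(sales))           # no filter: whole table
print(total_amount_measure(sales, "North"))  # filtered to one region
```

The column values are fixed once computed; the measure gives a different answer each time the filter changes.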


What is VertiPaq

• In-memory column-based database
• Very high data compression
• Doesn’t require the process of designing and building aggregations and other tuning
• Supports partitioning and paging on large data sizes
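The slides do not say how the compression is achieved. As a generic illustration of why column-based storage compresses well, the sketch below shows two techniques commonly used in column stores, dictionary encoding and run-length encoding. The function names and sample values are assumptions for illustration, not a description of the actual engine.

```python
def dictionary_encode(values):
    """Replace each value by a small integer id into a list of distinct values."""
    dictionary = []
    ids = {}
    encoded = []
    for value in values:
        if value not in ids:
            ids[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(ids[value])
    return dictionary, encoded


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


# Hypothetical column values; repeated names shrink to small integer ids.
names = ["Bob", "Sue", "Ann", "Jim", "Liz", "Dave", "Sue", "Bob", "Jim"]
dictionary, encoded = dictionary_encode(names)
print(dictionary)
print(encoded)

# A sorted, low-cardinality column collapses into a handful of runs.
countries = ["BE", "BE", "DE", "NL", "NL", "NL"]
print(run_length_encode(countries))
```

Both techniques work because a single column holds values of one type with many repeats, which is not true of a row that mixes ids, names and amounts.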


Relational Database

[Diagram: a row store holds the Customers rows on 8K-byte disk pages, three rows per page. Pages are read into memory (the DBMS buffer pool), and from there 64-byte cache lines travel through the L2 and L1 caches to the CPU.]

Disk page 1: 1 Bob … $3,000 | 2 Sue … $500 | 3 Ann … $1,700
Disk page 2: 4 Jim … $1,500 | 5 Liz … $0 | 6 Dave … $9,000
Disk page 3: 7 Sue … $1,010 | 8 Bob … $50 | 9 Jim … $1,300

Select id, name, BalDue from Customers where BalDue > $500

Query summary:
• 3 pages read from disk
• Up to 9 L1 and L2 cache misses (one per tuple)

Don’t forget that:
• An L2 cache miss can stall the CPU for up to 200 cycles


Columnstore Database

[Diagram: a column store holds each column of Customers contiguously on its own 8K-byte pages. The diagram follows only the BalDue pages into memory, so every 64-byte cache line passing through the L2 and L1 caches to the CPU contains BalDue values only.]

Id: 1 2 3 4 5 6 7 8 9
Name: Bob Sue Ann Jim Liz Dave Sue Bob Jim
BalDue: 3000 500 1700 1500 0 9000 1010 50 1300
Street: …

Takeaways:
• Each cache miss brings only useful data into the cache
• Processor stalls reduced by up to a factor of 8 (if BalDue values are 8 bytes) or 16 (if BalDue values are 4 bytes)

Caveats:
• Not to scale! An 8K byte page of BalDue values will hold 1000 values (not 5)
• Not showing disk I/Os required to read id and Name columns

Select id, name, BalDue from Customers where BalDue > $500
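A toy Python sketch of the two layouts, using the nine sample rows from the slides, may help make the comparison concrete. The function names are hypothetical, and Python lists do not model real cache behaviour; the point is only that the column layout lets the predicate scan touch the BalDue values alone, and that a 64-byte cache line holds 8 or 16 such values.

```python
CACHE_LINE_BYTES = 64  # cache-line size used on the slides

# Row layout: each tuple keeps all columns of one customer together.
rows = [
    (1, "Bob", 3000), (2, "Sue", 500), (3, "Ann", 1700),
    (4, "Jim", 1500), (5, "Liz", 0), (6, "Dave", 9000),
    (7, "Sue", 1010), (8, "Bob", 50), (9, "Jim", 1300),
]

# Column layout: one contiguous list per column.
columns = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "bal_due": [r[2] for r in rows],
}


def query_row_store(table, threshold):
    """Visit every full tuple just to test its BalDue value."""
    return [(i, n, b) for (i, n, b) in table if b > threshold]


def query_column_store(cols, threshold):
    """Scan only bal_due, then fetch id and name at the matching positions."""
    hits = [pos for pos, bal in enumerate(cols["bal_due"]) if bal > threshold]
    return [(cols["id"][p], cols["name"][p], cols["bal_due"][p]) for p in hits]


def values_per_cache_line(value_bytes):
    """How many column values fit in one cache line."""
    return CACHE_LINE_BYTES // value_bytes


print(query_column_store(columns, 500))
print(values_per_cache_line(8), values_per_cache_line(4))
```

Both functions return the same rows; the factors 8 and 16 in the takeaways are simply 64 bytes divided by the value width.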


An example

Assume:
• Customer table has 10M rows, 200 bytes/row (2GB total size)
• Id and BalDue values are each 4 bytes long, Name is 20 bytes

Query:
Select id, Name, BalDue from Customer where BalDue > $1000

Row store execution:
• Scan 10M rows (2GB) @ 80MB/sec = 25 sec.

Column store execution:
• Scan 3 columns, each with 10M entries
• 280MB @ 80MB/sec = 3.5 sec. (id 40MB, Name 200MB, BalDue 40MB)

About a 7X performance improvement for this query!! But we can do even better using compression
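The arithmetic behind these numbers can be checked with a few lines of Python. The constants are the ones stated on the slide; the variable names are made up for this sketch, and megabytes are taken as decimal (1 MB = 1,000,000 bytes) to match the slide's round figures.

```python
ROWS = 10_000_000
ROW_BYTES = 200
SCAN_MB_PER_SEC = 80
COLUMN_BYTES = {"id": 4, "Name": 20, "BalDue": 4}
MB = 1_000_000  # decimal megabytes, matching the slide's rounding


def scan_seconds(total_bytes):
    """Time to scan the given number of bytes at the assumed scan rate."""
    return total_bytes / MB / SCAN_MB_PER_SEC


# Row store: every byte of every row is scanned.
row_store_bytes = ROWS * ROW_BYTES
# Column store: only the three columns the query references are scanned.
column_store_bytes = ROWS * sum(COLUMN_BYTES.values())

row_seconds = scan_seconds(row_store_bytes)
column_seconds = scan_seconds(column_store_bytes)
speedup = row_seconds / column_seconds

print(row_store_bytes // MB, "MB in", row_seconds, "s")
print(column_store_bytes // MB, "MB in", column_seconds, "s")
print(round(speedup, 1), "x faster")
```

The speedup is just the ratio of bytes scanned (2000 MB against 280 MB), so a narrower query or a wider table would increase it further.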


Demo

PowerPivot


We are not there yet

Although PowerPivot for Excel is great, it has certain limitations
• Limit of 2 GB
• No support for partitions
• Queries the VertiPaq cache
• Daily scheduled data refresh in SharePoint
• Access to the workbook

PowerPivot and Analysis Services are two different products, hence two models
• PowerPivot targets business users; the model is managed in Excel
• Analysis Services targets BI professionals and IT; the model is managed on the server

“Can we have one model which integrates both worlds and seamlessly transitions BI applications from Personal BI to Team BI to Organizational/Professional BI?”


And now there is BISM…


What is coming in Denali

BISM v2
• One model for all
  • reporting, analysis, dashboards, scorecards
  • personal, team, corporate BI
• Has a relational and multidimensional API
• Supports both the cached (MOLAP & VertiPaq) and the pass-through (real-time) mode
  • only SQL Server data sources for now
• Pass-through
  • no additional database
  • data stays as is in the original structures
  • ideal for real-time analysis


Why does this work

In “Denali” every cube automatically becomes a BI Semantic Model

To create a BI Semantic Model you create one of:
• a multidimensional model
• a tabular model
• a PowerPivot workbook

Every model looks like cubes/dimensions/measure groups/data sources/data source views under the covers
• They share a common Analysis Services file format
• It is this shared underlying structure that makes the BI Semantic Model work


BISM Data model

Hybrid model supporting multidimensional and tabular data models
• Developed using a multidimensional or a tabular project
• Choice depends on application needs and skill set

Tabular
• Familiar model, easier to build, faster time to solution
• Some advanced concepts (e.g. many-to-many) are not available natively in the model… calculations are needed to simulate these
• Easy to wrap a model over a raw database or warehouse for reporting & analytics

Multidimensional
• Sophisticated model, higher learning curve
• Advanced concepts baked into the model and optimized (parent-child, many-to-many, attribute relationships, key vs. name, etc.)
• Ideally suited for OLAP type apps (e.g. planning, budgeting, forecasting) that need the power of the multidimensional model


BISM Business Logic & Queries

• Represents the intelligence or semantics in the model
• Defines entities and relations between them
• User-oriented

DAX
• Based on Excel formulas and relational concepts: easy to get started
• Complex solutions require a steeper learning curve: row/filter context, Calculate, etc.
• Calculated columns enable new scenarios; however, there are no named sets or calc members

MDX
• Based on understanding of multidimensional concepts: higher initial learning curve
• Complex solutions require a steeper learning curve: CurrentMember, overwrite semantics, etc.
• Ideally suited for apps that need the power of multidimensional calculations: scopes, assignments, calc members


BISM Data Access

This layer integrates data from multiple sources – relational databases, business applications, flat files, OData feeds, etc.

Two modes: cached and pass-through
• Cached: pulls in data from all the sources and stores it in a compressed data structure
  • MOLAP and VertiPaq
• Pass-through: pushes query processing and business logic down to the data source
  • ROLAP and DirectQuery
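The trade-off between the two modes can be illustrated with a small Python sketch. It uses an in-memory SQLite database as a stand-in for the source system; the class names, table and values are hypothetical, and the cached copy here is a plain list rather than a compressed structure.

```python
import sqlite3


def make_source():
    """Hypothetical source system: an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("North", 20.0), ("North", 25.0), ("South", 20.0)],
    )
    conn.commit()
    return conn


class CachedModel:
    """Pulls the data in once and answers queries from its own copy."""

    def __init__(self, source):
        self.source = source
        self.rows = []
        self.refresh()

    def refresh(self):
        self.rows = self.source.execute(
            "SELECT region, amount FROM sales"
        ).fetchall()

    def total(self, region):
        return sum(amount for r, amount in self.rows if r == region)


class PassThroughModel:
    """Keeps no copy; every query is pushed down to the source."""

    def __init__(self, source):
        self.source = source

    def total(self, region):
        row = self.source.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
            (region,),
        ).fetchone()
        return row[0]


source = make_source()
cached = CachedModel(source)
live = PassThroughModel(source)

source.execute("INSERT INTO sales VALUES ('North', 5.0)")
print(cached.total("North"))  # answered from the copy taken earlier
print(live.total("North"))    # answered by the source, includes the new row
```

The cached model only sees the new row after `refresh()` is called, which is the price paid for answering queries without touching the source.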


Analysis Services ‘Denali’ - UDM

[Diagram: clients connected to the UDM. Four connections are labelled MDX and one is labelled “MDX?”.]
• Excel 2010
• Reporting Services “Denali”
• SharePoint 2010: Excel Services, Reporting Services, PerformancePoint Services, Visio Services
• 3rd party SSAS clients
• SharePoint 2010: Power View


Analysis Services ‘Denali’ - BISM

[Diagram: clients connected to the BISM. Four connections are labelled MDX, two are labelled DAX and one is labelled “DAX?”.]
• Excel 2010
• Reporting Services “Denali”
• SharePoint 2010: Excel Services, Reporting Services, PerformancePoint Services, Visio Services
• 3rd party SSAS clients (shown twice)
• SharePoint 2010: Power View


[Diagram: PowerPivot workbook, BISM, Excel 2010]


Denali’s new features in BISM

BISM in ‘Denali’ includes:
• hierarchies, KPIs, parent-child, drillthrough, perspectives
• additional DAX functions (RankX, DistinctCount, GroupBy, Lookup)
• security (role-based with Active Directory, column/row based)

BISM does not include:
• some of the UDM features
  • scripts, actions, translations, role-playing dimensions
  • object model
  • write-back
• other
  • real-time for non-SQL Server data sources
  • MDX query support for real-time


Demo

BISM and the tabular model


Advantages of BISM

• Relatively simple model
• Fast response
• Flexible
• DAX calculations are similar to Excel formulas
• More understandable and user-friendly to the majority of people
• Same model across all scenarios
• Easily scale from personal BI to corporate BI
• Faster development than in UDM
• Prototyping by end-users
• Easier changes of model
• Reduction of cost in developing the full BI solution


Positioning of BISM

• MOLAP is much more complex than PowerPivot, but it offers greater scalability
• ROLAP is even more limited, but it scales above 50 TB
• PowerPivot models that are to be shared can grow up to 2 GB, which is the limit set by SharePoint. Otherwise, memory is the only limit
• BISM comes in the middle and fills the space between MOLAP and PowerPivot
• For the space way above 50 TB there are the new ColumnStore indexes (in the relational engine)

[Chart: Scalability (vertical axis, marked at 2 GB, 100 GB, 5 TB and 50 TB) against Usability (horizontal axis), positioning PowerPivot, BISM, MOLAP, ROLAP and ColumnStore. Source: Thomas Kejser, SQLCAT]


Current Limitations in “Denali”

Two projects for building a BI Semantic Model
• The future plan is to integrate these into 1 model
  • Use VertiPaq as an SSAS storage
  • Use MDX scripts in tabular projects

DAX queries are not supported in multidimensional projects
• and thereby neither is Power View, which uses DAX to retrieve data from the model


Analysis Services Architecture


Beyond Denali

BI Semantic Model features
• Role-playing dimensions
• Translations
• Actions
• MDX Scripts
• Real-time over Oracle, Teradata, DB2…

Programmability
• BISM object model
• MDX query support for real-time
• Write-back


Wrapup

• BISM is not a replacement for UDM
• DAX is not a replacement for MDX
• Column store databases offer blazing fast performance
• Every model has its advantages
• BI architects must decide when to apply which model
• BISM v2 is not complete, expect changes!


Questions

Mail to [email protected]