Many-To-Many in DAX
Alberto FerrariSenior ConsultantSQLBI
DBI413
Many-to-many Relationships
Many to Many Account – Customer
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Dim_Date
PK ID_Date
Date
Dim_Type
PK ID_Type
Type
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
No support for M2M in Tabular
Multidimensional handles M2MDirectly in the enginePerformance may suffer
Tabular does not handle M2MOnly plain vanilla 1:N relationshipsOnly on one columnSeems not to be enough…
But… hey, we have DAX!
Agenda
Classical many-to-many patternCascading many-to-many Survey Data ModelBasket Analysis with many-to-many relationshipsPerformance of many-to-many in DAX
Classical M2M
We will start looking at the final resultThen, we dive into the DAX code
Demo – Classical M2M
The M2M DAX Pattern
LeveragesCALCULATERow ContextsFilter ContextsAutomatic transformation of Row Context into Filter Context using CALCULATE
Next slide: the formulaKeep it in mindIt will appear quite often from now on
The DAX Formula
The pattern of many-to-many is always the same (Later we will see an optimized version)
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
AmountM2M :=
CALCULATE ( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
What the formula should perform
Filter on Dim_Customer
Filter on Dim_Account
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
How the formula works
This filter is applied by theTabular data model
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
How the formula works
This filter is applied by theTabular data model
FILTER: For each row in Dim_AccountCALCULATE: filters the bridge
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
How the formula works
This filter is applied by theTabular data model
FILTER: For each row in Dim_AccountCALCULATE: filters the bridge
Only rows in Dim_Account whereCOUNTROWS () > 0 survive the FILTER
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), FILTER( Dim_Account, CALCULATE( COUNTROWS( Bridge_AccountCustomer ) ) > 0 ))
Bridge_AccountCustomer
PK,FK1 ID_AccountPK,FK2 ID_Customer
Dim_Account
PK ID_Account
Account
Dim_Customer
PK ID_Customer
CustomerName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
How the formula works
This filter is applied by theTabular data model
FILTER: For each row in Dim_AccountCALCULATE: filters the bridge
Only rows in Dim_Account whereCOUNTROWS () > 0 survive the FILTER
SUM is evaluated only for the accounts filtered from the customers
The Karma of CALCULATE
AmountM2M_Wrong := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ))
AmountM2M_Correct := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ))
AmountM2M :=
CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ))
Wrong formula in action
Customer
Mark
Paul
Robert
Luke
Customer Account
Mark Mark
Paul Paul
Robert Robert
Luke Luke
Mark Mark-Robert
Robert Mark-Robert
Mark Mark-Paul
Paul Mark-Paul
Account
Mark
Paul
Robert
Luke
Mark-Robert
Mark-Paul
All the rows in the Account table survived the FILTER
AmountM2M :=
CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ))
Many-to-many in action
Customer
Mark
Paul
Robert
Luke
Customer Account
Mark Mark
Paul Paul
Robert Robert
Luke Luke
Mark Mark-Robert
Robert Mark-Robert
Mark Mark-Paul
Paul Mark-Paul
Account
Mark
Paul
Robert
Luke
Mark-Robert
Mark-Paul
Two rows in the Account table survived the FILTER
The Karma of CALCULATE
AmountM2M_Wrong := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ))
AmountM2M_Correct := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ))
DAX Formula with SUMMARIZE
The formula performs much better using SUMMARIZE instead of COUNTROWS.
SUMMARIZE returns the ID_Account that are visible in the bridge with a single step, transforming them in DimAccount column references
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), SUMMARIZE( Bridge_AccountCustomer, DimAccount[ID_Account] ))
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), SUMMARIZE( Bridge_AccountCustomer, DimAccount[ID_Account] ))
SUMMARIZE in action
Customer
Mark
Paul
Robert
Luke
Customer Account
Mark Mark
Paul Paul
Robert Robert
Luke Luke
Mark Mark-Robert
Robert Mark-Robert
Mark Mark-Paul
Paul Mark-Paul
Account
Mark
Paul
Robert
Luke
Mark-Robert
Mark-Paul
The context is movedin a single Vertipaqoperation
DAX Formula with Direct Table Filtering
Behave and performs the same as SUMMARIZEMore compactSomewhat harder to understand
AmountM2M :=
CALCULATE( SUM( Fact_Transaction[Amount] ), Bridge_AccountCustomer)
Many-to-many with SUMMARIZE
The formula isMuch faster, no iteration neededEasier to codeSlightly harder to understand
Using COUNTROWS can still be useful If you want to find missing valuesTo use more complex relationships
Both formulas are worth learning
Many to Many – First Impressions
ComplexityNot in the data modelOnly in the formula
Need to understandCALCULATERelationships in Tabular
Cascading M2M
Many-to-many with more than one bridge table
Cascading many-to-many
Cascading Many To Many
Filter on Dim_Category
Filter on Dim_Account
The pattern is the same, butthis time we need to jumptwo steps to complete ourtask
CALCULATE( SUM( Fact_Transaction[Amount] ), CALCULATETABLE ( SUMMARIZE ( BridgeAccountCustomer, DimAccount[ID_Account] ), SUMMARIZE ( Bridge_CustomerCategory, DimCustomer[ID_Customer] ) ))
Cascading Many To Many
Cascading Many To Many
Generic FormulaWorks with any number of stepsBe carefulStart with the farthest tableMove one step a time towards the fact tableOne SUMMARIZE for each step
Complexity: M x N (geometric…)
Cascading Alternative
Cascading many to many flattenedDim_Date
PK ID_Date
DateDim_Account
PK ID_Account
Account
Dim_Type
PK ID_Type
Type
Dim_Customer
PK ID_Customer
CustomerName
Dim_Category
PK ID_Category
CategoryName
Fact_Transaction
PK ID_Transaction
FK1 ID_AccountFK3 ID_TypeFK2 ID_Date Amount
Bridge_AccountCustomerCategory
FK1 ID_AccountFK3 ID_CustomerFK2 ID_Category
Need for someETL here
The bridge now «feels»three different filters, butthe formula becomes a classical many to manyand runs faster
Flattened Cascading Many To Many
Flattened Data ModelFaster than the cascading oneSimpler formulaNeeds some ETL
A view is enough most of the timeWorth a try in Multidimensional too…
Survey
A first example of the usage of the many-to-many pattern
Using M2M on Surveys
Survey Data Model
Facts:• One customer answers many questions• One question is answered by many customers• We want to use a PivotTable to query this
Survey Scenario
Question:
«What is the yearly income of consultants?»
In other words…
• Take the customers who answered «Consultant» at the «Job» question
• Slice them using the answer to the question «Yearly Income»
Survey Analytical Data Model
Two «Filter» Dimensions• One for «Job» = «Consultants»• One for «Yearly Income»
Question1 = «Job»Answer1 = «Consultant»
Question2 = «Yearly Income»
Filter1
PK ID_Answer
Question Answer
Filter2
PK ID_Answer
Question Answer
Answers
FK1,FK2 ID_AnswerFK3 ID_Customer
Customers
PK ID_Customer
Customer Category
Survey: The Final Result
Value = Customer CountThe fact table, this time, acts as the bridge
Filter1
PK ID_Answer
Question Answer
Filter2
PK ID_Answer
Question Answer
Answers
FK1,FK2 ID_AnswerFK3 ID_Customer
Customers
PK ID_Customer
Customer Category
Survey Analytical Data Model
No relationship between the fact table and the two filter dimensions because there is only one value for ID_Answer
Survey: The DAX Formula
IF ( HASONEVALUE(Filter1[ID_Answer]) && HASONEVALUE(Filter2[ID_Answer]), CALCULATE ( CALCULATE ( COUNTROWS (Customers), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter2[ID_Answer])) > 0 ) ), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter1[ID_Answer])) > 0 ) ))
Filter1
PK ID_Answer
Question Answer
Filter2
PK ID_Answer
Question Answer
Answers
FK1,FK2 ID_AnswerFK3 ID_Customer
Customers
PK ID_Customer
Customer Category
Additional conditions toset the relationships withtwo tables in DAX only
Survey - Conclusions
Very powerful and compact data modelDuplicates only dimensionsDifferent from the same pattern in Multidimensional
Handles any number of questionsWe have shown two, but it is not a limit
Interesting topicsFact table as the bridgeRelationships set in DAXCan be queried with a simple PivotTable
Basket Analysis
A pretty complex scenario, solved with the many-to-many pattern
Basket Analysis Demo
Basket Analysis: The Scenario
Of all the customers who have bought a Mountain Bike, how many have never bought a mountain tire tube?
Basket Analysis in SQL
Two iterations over the fact table needed
SELECTCOUNT (DISTINCT A.CustomerKey)
FROMSales A INNER JOIN Sales B
ON A.CustomerKey = B.CustomerKey WHERE
A.ProductModel = ‘MOUNTAIN TIRE TUBE' AND A.Year <= 2004AND
B.ProductModel = ‘MOUNTAIN-100' AND B.Year <= 2004
Look at the query plan…
This is the fact table…Do you really like to self-join it?
Basket Analysis: The Data Model
Of all the customers who have bought a Mountain Bike, how many have never bought a mountain tire tube?
We can filter «Mountain Tire Tube» with this table but… where do we filter «Mountain Bike»?
Basket Analysis: The Data Model
2° filter: «Mountain Bike»
1° filter: «Mountain Tire Tube»
The Final Result
1° filter: «Mountain Bike»
2° filter: «Mountain Tire Tube»
HavingProduct = Bought BothNotHavingProduct = Bought Bike, No Tire
Inactive Relationships
Inactive Relationships
The formula for HavingProducts
=COUNTROWS ( CALCULATETABLE ( FILTER ( Customers; CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsBought[ProductKey]) ) > 0 && CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsToCheck[ProductKey]) ) > 0 ); FILTER ( ALL (OrderDate); OrderDate[FullDate] <= MAX (OrderDate[FullDate]) ) ))
Not HavingProducts
=COUNTROWS ( CALCULATETABLE ( FILTER ( Customers; CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsBought[ProductKey]) ) > 0 && CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsToCheck[ProductKey]) ) = 0 ); FILTER ( ALL (OrderDate); OrderDate[FullDate] <= MAX (OrderDate[FullDate]) ) ))
Nice, and the many-to-many?
=COUNTROWS ( CALCULATETABLE ( FILTER ( Customers; CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsBought[ProductKey]) ) > 0 && CALCULATE ( COUNTROWS (Sales); USERELATIONSHIP ( Sales[ProductKey]; ProductsToCheck[ProductKey]) ) = 0 ); FILTER ( ALL (OrderDate); OrderDate[FullDate] <= MAX (OrderDate[FullDate]) ) ))
1° M2M Pattern
2° M2M Pattern
Leveraging Tabular Features
Multiple relationships between tablesRole DimensionsRole Keys
Active / Inactive relationshipsUSERELATIONSHIP
New filter functionSelects a model relationship to use in calculations
Querying M2M
Well… what about performance of m2m in Tabular? Let us see them!
M2M Performance
Audience: The Data Model
Audience: The Numbers
2,700,000
176,000
140230
1,440
9,000
4,000,000,000
Many to Many - Conclusions
Learn DAX. Seriously… learn it!Study and learn many-to-many with COUNTROWSThen, implement with SUMMARIZE, or Table Filtering
If SUMMARIZE is not enoughRevert back to COUNTROWS
Many powerful patterns with many-to-manyVery fast in Tabular when compared with Multidimensional
Related Content
DBI305 Developing and Managing a BI Semantic Model in Analysis Services
DBI413 Many-to-Many Relationships in BISM Tabular
DBI319 BISM: Multidimensional vs. Tabular
DBI62-HOL Optimizing a MS SQL Server 2012 Tabular BI Semantic Model
Book Signing at Book Store on Wednesday, June 13th, 12:30pm-1:00pm
Track Resources
@sqlserver@ms_teched
mvaMicrosoft Virtual Academy
SQL Server 2012 Eval Copy
Get Certified!
Hands-On Labs
Resources
Connect. Share. Discuss.
http://northamerica.msteched.com
Learning
Microsoft Certification & Training Resources
www.microsoft.com/learning
TechNet
Resources for IT Professionals
http://microsoft.com/technet
Resources for Developers
http://microsoft.com/msdn
Complete an evaluation on CommNet and enter to win!
MS Tag
Scan the Tagto evaluate thissession now onmyTechEd Mobile
Required Slide *delete this box when your slide is finalized
Your MS Tag will be inserted here during the final scrub.
Who’s speaking?
BI Expert and ConsultantFounders of www.sqlbi.com
Problem SolvingComplex Project AssistanceDataWarehouse Assesments and DevelopmentCourses, Trainings and Workshops
Book WriterMicrosoft Business Intelligence PartnerSSAS Maestro – MVP – MCP
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to
be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS
PRESENTATION.