Dimensional Design V26.6.02


  • 8/9/2019 Dimensional Design V26.6.02

    Data Modeling

    Levels of modeling

    [Diagram: Business Process, Conceptual, Logical Model, Physical Model]

    Levels of modeling

    • Conceptual modeling
        - Describe data requirements from a business point of view, without technical details
    • Logical modeling
        - Refine conceptual models
        - Data-structure oriented, platform independent
    • Physical modeling
        - Detailed specification of what is physically implemented using a specific technology

    Conceptual Model

    • A conceptual model shows data through business eyes.
    • All entities which have business meaning.
    • Important relationships.
    • Few significant attributes in the entities.
    • Few identifiers or candidate keys.

    Logical Model

    • Replaces many-to-many relationships with associative entities.
    • Defines a full population of entity attributes.
    • May use non-physical entities for domains and sub-types.
    • Establishes entity identifiers.
    • Has no specifics for any RDBMS or configuration.
    • ER Diagram, Key-Based Modeling, Fully Attributed Model

    Physical Model

    • A physical data model may include
        - Referential integrity
        - Indexes
        - Views
        - Alternate keys and other constraints
        - Tablespaces and physical storage objects

    Dimensional Modeling

    • Dimensional modeling is the pillar of a data warehouse.
    • It comes when the business requirements are clear and KPIs have been identified.
    • Typically the logical data model is delivered in the analysis phase, and the physical data model comes in the design stage of the DW project life cycle.

    The Business Processes

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    • A series of interrelated business processes which contribute to increased product value for the customer, and to profit for the enterprise - Porter 1985

    The Business Processes

    • Businesses constantly strive to optimize each process in the value chain
    • Optimization requires measuring and analyzing the effectiveness of each process as well as the value chain as a whole

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    OLTP Systems

    [Diagram: OLTP systems - Manufacturing and Process Control; Sales Order Entry and Campaign Management; Customer Support and Relationship Management; Shipping and Inventory Management]

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    OLTP Data Model

    • Focus of OLTP design
        - Individual data elements
        - Data relationships
    • Design goals
        - Accurately model business
        - Remove redundancy

    Why OLTP Design Is Bad for a DW

    • Complex
    • Unfamiliar to business people
    • Incomplete history
    • Slow query performance

    The Solution - Dimensional Model

    • Logical modeling technique
        - For designing relational database structures
    • Addresses OLTP design shortcomings
        - For use in analytic systems
    • First developed in the early 1980s
        - Packaged goods industry
    • Popularized by Ralph Kimball, PhD.
        - 1996 book: 'The Data Warehouse Toolkit'

    Dimensional Modeling Basics

    Typical Business Requirements

    Process-oriented business questions:
    • "I need to see overall gross margin by category"
    • "What are outstanding receivables by ()* account?"
    • "How do inventory levels compare with sales by product and warehouse?"
    • "What is the return rate for each supplier?"

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    Key Performance Indicators

    Process-oriented business measures:
    • gross margin
    • receivables
    • inventory levels
    • sales
    • return rate

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    Process Measurement - What We See

    Coffee Maker Fulfillment Report
    Brand: Captain Coffee

    Product                  Units Sold   Units Shipped   % Shipped
    Standard Coffee Maker         5,000           3,800         76%
    Thermal Coffee Maker          2,400           1,632         68%
    Deluxe Coffee Maker           2,073           1,658         80%
    All Products                  9,473           7,090         75%

    (Callout on the Units Sold, Units Shipped and % Shipped columns: Facts)

    • Measures
        - Metrics or indicators by which people evaluate a business process
        - Referred to as "Facts"
    • Examples
        - Margin
        - Inventory Amount
        - Sales Dollars
        - Receivable Dollars
        - Return Rate

    Process Perspectives - Across What

    Coffee Maker Fulfillment Report
    Brand: Captain Coffee

    Product                  Units Sold   Units Shipped   % Shipped
    Standard Coffee Maker         5,000           3,800         76%
    Thermal Coffee Maker          2,400           1,632         68%
    Deluxe Coffee Maker           2,073           1,658         80%
    All Products                  9,473           7,090         75%

    (Callout on the Brand and Product labels: Dimensions)

    • Dimensions
        - The parameters by which measures are viewed
        - Used to break out, filter or roll up measures
        - Often found after the word "by" in a business question
        - Descriptive business terms
    • Examples
        - Product
        - Warehouse
        - Customer
        - Supplier

    Dimensional Model

    • Definition
        - Logical data model used to represent the measures and dimensions that pertain to one or more business subject areas
        - Dimensional Model = Star Schema
    • Serves as basis for the design of a relational database schema
    • Can easily translate into a multi-dimensional database design if required
    • Overcomes OLTP design shortcomings

    Dimensional Model Advantages

    • Understandable
    • Systematically represents history
    • Reliable join paths
    • High performance query
    • Enterprise scalability

    Data Warehousing Schema

    [Diagram: Star Schema - a central Facts table joined to the Store, Time and Product dimensions]

    • Fewer tables
        - Denormalized
        - Consolidated
    • Dimensional
        - Familiar to users
        - Facts go in the fact tables
        - Dimensions in dimension tables
    • Increases understandability
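    A star schema of this shape can be sketched with Python's built-in sqlite3 module. The table names, column names and sample rows below (store_dim, time_dim, product_dim, sales_facts) are illustrative assumptions, not taken from the deck.

```python
import sqlite3

def build_star(conn):
    """Create a tiny star schema: three dimension tables around one fact table."""
    conn.executescript("""
        CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, store TEXT, region TEXT);
        CREATE TABLE time_dim    (time_key INTEGER PRIMARY KEY, month TEXT, year INTEGER);
        CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product TEXT, category TEXT);
        CREATE TABLE sales_facts (
            store_key     INTEGER REFERENCES store_dim(store_key),
            time_key      INTEGER REFERENCES time_dim(time_key),
            product_key   INTEGER REFERENCES product_dim(product_key),
            sales_dollars REAL,
            PRIMARY KEY (store_key, time_key, product_key)
        );
    """)
    conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?)",
                     [(1, "Downtown", "East"), (2, "Airport", "West")])
    conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                     [(1, "January", 1997), (2, "February", 1997)])
    conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                     [(1, "Standard Coffee Maker", "Coffee Makers"),
                      (2, "Kettle", "Kettles")])
    conn.executemany("INSERT INTO sales_facts VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 100.0), (1, 2, 1, 150.0),
                      (2, 1, 2, 80.0), (2, 2, 1, 70.0)])

def sales_by(conn, dim_table, key, attribute):
    """Roll up sales_dollars by one dimension attribute (the 'by' in a question)."""
    sql = (f"SELECT d.{attribute}, SUM(f.sales_dollars) "
           f"FROM sales_facts f JOIN {dim_table} d ON f.{key} = d.{key} "
           f"GROUP BY d.{attribute} ORDER BY d.{attribute}")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
build_star(conn)
print(sales_by(conn, "store_dim", "store_key", "region"))
print(sales_by(conn, "product_dim", "product_key", "category"))
```

    Every question takes the same form: one join from the fact table to a dimension, then a GROUP BY on a descriptive attribute.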

    Fewer Join Paths - Faster Query Performance

    • Star schema joins
        - Defined during schema design, not at runtime
        - Business people can easily understand these relationships
        - One-to-many relations between dimensions and facts
        - Referential integrity always enforced

    Other Advantages

    • Deterministic query patterns
    • Star schema query optimization supported by all major RDBMS vendors

    Subject Area Models - Data Marts

    [Diagram: OLTP systems - Manufacturing and Process Control; Sales Order Entry and Campaign Management; Customer Support and Relationship Management; Shipping and Inventory Management]

    [Diagram labels: Subject area E/R model; Subject area dimensional model]

    [Diagram: value chain - Operations, Sales and Marketing, Customer Services, Product Development]

    Enterprise Models

    [Diagram labels: Enterprise scope E/R model; Enterprise scope dimensional model]

    Dimensional Modeling Design Details

    Dimension Keys - Generated for DW

    [Diagram: three Dimension tables, each with its own key]

    • Synthetic keys
        - Each table is assigned a unique primary key, specifically generated for the data warehouse
        - Primary keys from source systems may be present in the dimension, but are not used as primary keys in the star schema

    Dimension Attributes

    [Diagram: three Dimension tables, each with a Key followed by attributes]

    • Dimension attributes
        - Specify the way in which measures are viewed: rolled up, broken out or summarized
        - Often follow the word "by", as in "Show me Sales by Region and Quarter"
        - Frequently referred to as 'Dimensions'

    Fact Tables

    [Diagram: Fact Table with columns fact1, fact2, fact3]

    • Process measures
        - Start by assigning one fact table per business subject area
        - Fact tables store the process measures (Facts)
        - Compared to dimension tables, fact tables usually have a very large number of rows

    Fact Table Primary Key

    [Diagram: Fact Table with columns key, key, key, fact1, fact2, fact3]

    • Every fact table
        - Multi-part primary key added
        - Made up of foreign keys referencing dimensions

    Fact Table Sparsity

    • Sparsity
        - Term used to describe the very common situation where a fact table does not contain a row for every combination of every dimension table row for a given time period
        - Because fact tables contain a very small percentage of all possible combinations, they are said to be sparsely populated, or sparse
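    As a rough illustration, sparsity can be measured by comparing the rows actually present with the number of possible key combinations. The helper name, the dimension sizes and the sales pattern below are hypothetical.

```python
from math import prod

def sparsity(fact_keys, *dimension_keys):
    """Return (possible, actual, fill_ratio) for one time period."""
    possible = prod(len(keys) for keys in dimension_keys)  # every key combination
    actual = len(set(fact_keys))                           # combinations that occurred
    return possible, actual, actual / possible

products = range(1, 101)  # 100 hypothetical products
stores = range(1, 21)     # 20 hypothetical stores

# Hypothetical day: each store sells only the products numbered 10, 20, ..., 100.
facts = [(p, s) for p in products for s in stores if p % 10 == 0]

possible, actual, fill = sparsity(facts, products, stores)
print(possible, actual, fill)
```

    Here only 200 of the 2,000 possible product/store combinations have a row, so the fact table for that day is 10% populated.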

    Fact Table Grain

    • Grain
        - The level of detail represented by a row in the fact table
        - Must be identified early
        - Cause of greatest confusion during the design process
    • Example
        - Each row in the fact table represents the daily item sales total

    Designing a Star Schema

    • Five initial design steps
    • Based on Kimball's six steps
    • Start designing in order
    • Re-visit and adjust over project life

    Fact Table Details

    Example Fact Table Records

    Sales Facts

    |------- Primary Key --------|   |------ Facts ------|
    time_key  model_key  dealer_key     revenue   quantity
           1          1           1     7584.27          2
           1          2           1    15226.37          3
           1          3           1     2836.15          1
           1          4           1   132675.22          4
           1          5           1    43789.45          1
           1          1           2    35678.98          1
           1          3           2    57864.78          2
           1          5           2      926.67          2

    Example: Additive Facts

    [Diagram: star schema]
    Sales Facts: model_key, dealer_key, time_key, revenue, quantity
    Model:       model_key, brand, category, line, model
    Time:        time_key, year, quarter, month, date
    Dealer:      dealer_key, region, state, city, dealer

    Types of Facts

    • Semi-additive
        - Can be summed across most dimensions but not all
        - Examples: inventory quantities, account balances, or personnel counts
        - Anything that measures a "level"
        - Must be careful with ad-hoc reporting
        - Often aggregated across the "forbidden dimension" by averaging
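    A minimal sketch of the difference, assuming a hypothetical inventory fact keyed by day and dealer: summing across dealers within one day is valid, while across time (the "forbidden dimension") the daily totals are averaged.

```python
from collections import defaultdict

# Hypothetical rows: (time_key, dealer_key, inventory on hand)
inventory_facts = [
    (1, "A", 10), (1, "B", 20),
    (2, "A", 12), (2, "B", 18),
    (3, "A", 8),  (3, "B", 22),
]

def inventory_by_day(rows):
    """Summing across dealers within a single day is valid."""
    totals = defaultdict(int)
    for time_key, _dealer, qty in rows:
        totals[time_key] += qty
    return dict(totals)

def average_inventory(rows):
    """Across time, average the daily totals instead of summing them."""
    daily = inventory_by_day(rows)
    return sum(daily.values()) / len(daily)

naive_total = sum(qty for _, _, qty in inventory_facts)  # 90: not a real stock level

print(inventory_by_day(inventory_facts))
print(average_inventory(inventory_facts), naive_total)
```

    The naive sum of 90 counts the same stock three times; the average of 30 is the figure a report across the three days should show.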

    Example: Semi-additive Facts

    [Diagram: star schema]
    Sales Facts: model_key, dealer_key, time_key, inventory
    Model:       model_key, brand, category, line, model
    Time:        time_key, year, quarter, month, date
    Dealer:      dealer_key, region, state, city, dealer

    Types of Facts

    • Non-Additive
        - Cannot be summed across any dimension
        - All ratios are non-additive
        - Break down to fully additive components, store them in the fact table
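    The following sketch shows why the additive components are stored rather than the ratio. The two dealer rows are made-up numbers; the rate is recomputed from summed components at query time.

```python
# Hypothetical rows: (dealer, revenue, margin_amt)
sales_facts = [
    ("A", 100.0, 10.0),    # margin rate 10%
    ("B", 900.0, 270.0),   # margin rate 30%
]

def margin_rate(rows):
    """Correct: sum the additive components first, then take the ratio."""
    revenue = sum(r for _, r, _ in rows)
    margin = sum(m for _, _, m in rows)
    return margin / revenue

def average_of_rates(rows):
    """Misleading: averaging stored ratios ignores how much revenue each row carries."""
    return sum(m / r for _, r, m in rows) / len(rows)

print(margin_rate(sales_facts))       # 280 / 1000
print(average_of_rates(sales_facts))  # (10% + 30%) / 2
```

    The overall margin rate is 28%, while the average of the two stored rates gives 20%.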

    Example: Non-Additive Facts

    Margin_rate is non-additive
    Margin_rate = margin_amt / revenue

    [Diagram: star schema]
    Sales Facts: model_key, dealer_key, time_key, revenue, margin_amt
    Model:       model_key, brand, category, line, model
    Time:        time_key, year, quarter, month, date
    Dealer:      dealer_key, region, state, city, dealer

    Unit Amounts

    • Unit price, unit cost, etc.
        - Are numeric, but not measures
        - Store the extended amounts, which are additive
        - Unit amounts may be useful as dimensions for "price point analysis"
        - May store unit values to save space

    Factless Fact Table

    • A fact table with no measures in it
    • Nothing to measure...
    • Except the convergence of dimensional attributes
    • Sometimes store a "1" for convenience
    • Examples: Attendance, Customer Assignments, Coverage
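    An attendance table is one way to picture this. In the sketch below the table and column names are hypothetical; the only non-key column is the constant 1 kept for convenience, so that SUM works like COUNT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance_facts (
        date_key         INTEGER,
        student_key      INTEGER,
        class_key        INTEGER,
        attendance_count INTEGER DEFAULT 1,  -- the convenience "1"
        PRIMARY KEY (date_key, student_key, class_key)
    );
""")
# Each row only records that a student, a class and a date came together.
conn.executemany(
    "INSERT INTO attendance_facts (date_key, student_key, class_key) VALUES (?, ?, ?)",
    [(1, 101, 7), (1, 102, 7), (2, 101, 7), (2, 103, 8)],
)

def attendance_by_class(conn):
    """Count attendance events per class by summing the constant 1."""
    return conn.execute(
        "SELECT class_key, SUM(attendance_count) FROM attendance_facts "
        "GROUP BY class_key ORDER BY class_key"
    ).fetchall()

print(attendance_by_class(conn))
```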

    Dimension Table Details

    Example Dimension Tables

    Dealer: dealer_key, region, state, city, dealer
    Model:  model_key, brand, category, line, model
    Time:   time_key, year, quarter, month, date

    Example Dimension Table Records

    Time Dimension

    Synthetic Key   |------------ Attributes ------------|
    time_key        year   quarter   month     date
    1               1997   Q1        January   1/15/97
    2               1997   Q1        January   1/16/97
    3               1997   Q1        January   1/17/97
    15              1997   Q2        April     4/1/97
    777             1998   Q4        October   10/13/98

    Example Dimension Table Records

    Dealer Dimension

    dealer_key   region   state   city   dealer

    Example Snowflake Schema

    Sales Facts: model_key, dealer_key, time_key, revenue, quantity

    Model:    model_key, model, line_key
    Line:     line_key, line, category_key
    Category: category_key, category, brand_key
    Brand:    brand_key, brand

    Day:      date_key, date, month_key
    Month:    month_key, month, quarter_key
    Quarter:  quarter_key, quarter, year_key
    Year:     year_key, year

    Dealer:   dealer_key, dealer, city_key
    City:     city_key, city, state_key
    State:    state_key, state, region_key
    Region:   region_key, region

    Slowly Changing Dimensions

    • Dimension source data may change over time
    • Relative to fact tables, dimension records change slowly
    • Allows dimensions to have multiple 'profiles' over time to maintain history
    • Each profile is a separate record in a dimension table

    Slowly Changing Dimension Example

    • Example: A woman gets married
    • Possible changes to customer dimension
        - Last Name
        - Marriage Status
        - Address
        - Household Income
    • Existing facts need to remain associated with her single profile
    • New facts need to be associated with her married profile

    Designing Loads to

    Customer Dimension Table

    Column Name       SCD Type
    Customer Key      N/A
    Customer ID       1
    Name              1
    Marital Status    1
    Home Income       1

    Type 1 Example

    [Figure: OLTP and Star Schema panels, before and after the event "Sue Gets Married". Tables shown: Customer OLTP (CustID, Name, Marital Status, Home Income), Customer Dim (CustKey, CustID, Name, Marital Status, Home Income), Sales Facts (CustKey, DayKey, Sales), Day Dim (DayKey, Business Date). Before: Customer OLTP and Customer Dim row CustKey 1 both show Sue Jones, status S. After: Customer OLTP shows Sue Smith, status M, with a new home income, and the same Customer Dim row (CustKey 1) is overwritten with those values. Sales Facts rows reference CustKey 1 throughout.]

    Type 1 Example

    • Observations
        - Customer history is not maintained in the OLTP system
        - Customer history is not maintained in the star schema
        - Sue only has one customer 'profile' in the customer dimension table
        - Sue's sales facts across all history are associated with her married profile
        - Sales facts that were associated with Sue's single profile have been lost
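    The Type 1 behavior can be sketched in a few lines. The customer ID "C-1", the sales amounts and the helper names below are invented for illustration; the point is only that the dimension row is overwritten in place.

```python
# Hypothetical dimension (keyed by surrogate key) and fact rows.
customer_dim = {1: {"cust_id": "C-1", "name": "Sue Jones", "marital_status": "S"}}
sales_facts = [{"cust_key": 1, "day_key": 1, "sales": 50.0}]

def apply_type1(dim, cust_id, changes):
    """Type 1: overwrite attributes in place on the row with this source-system ID."""
    for row in dim.values():
        if row["cust_id"] == cust_id:
            row.update(changes)

def sales_by_status(facts, dim):
    """Roll up sales by the marital status currently stored in the dimension."""
    totals = {}
    for fact in facts:
        status = dim[fact["cust_key"]]["marital_status"]
        totals[status] = totals.get(status, 0.0) + fact["sales"]
    return totals

# Sue gets married: the single profile is overwritten, then a new sale arrives.
apply_type1(customer_dim, "C-1", {"name": "Sue Smith", "marital_status": "M"})
sales_facts.append({"cust_key": 1, "day_key": 2, "sales": 75.0})

print(sales_by_status(sales_facts, customer_dim))
```

    All 125.0 of sales now reports under status M, including the 50.0 sold while she was single.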

    Designing Loads to

    Customer Dimension Table

    Column Name       SCD Type
    Customer Key      N/A
    Customer ID       2
    Name              2
    Marital Status    2
    Home Income       1

    Type 2 Example

    [Figure: OLTP and Star Schema panels, before and after the event "Sue Gets Married". Tables shown: Customer OLTP (CustID, Name, Marital Status, Home Income), Customer Dim (CustKey, CustID, Name, Marital Status, Home Income), Sales Facts (CustKey, DayKey, Sales), Day Dim (DayKey, Business Date). Before: Customer OLTP and Customer Dim row CustKey 1 both show Sue Jones, status S. After: Customer OLTP shows Sue Smith, status M; Customer Dim keeps the Sue Jones row (CustKey 1) and gains a second row for Sue Smith, status M, with the same CustID.]

    Type 2 Example

    • Type 2 Observations
        - Customer history is not maintained in the OLTP system
        - Customer history is maintained in the star schema
        - Sue has two 'profiles' in the customer dimension
        - Sue's sales facts may be analyzed for when she was single, when she was married, and across all history by using the customer ID field
        - Home income was updated in the new profile record
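    A Type 2 load can be sketched as "expire the current row, append a new one with a fresh surrogate key". The `current` flag, the customer ID "C-1" and the amounts below are assumptions made for the example; the deck does not say how the current profile is marked.

```python
# Hypothetical dimension rows and fact rows.
customer_dim = [{"cust_key": 1, "cust_id": "C-1", "name": "Sue Jones",
                 "marital_status": "S", "current": True}]
sales_facts = [{"cust_key": 1, "day_key": 1, "sales": 50.0}]

def apply_type2(dim, cust_id, changes):
    """Type 2: keep the old profile, add a new row with the next surrogate key."""
    old = next(r for r in dim if r["cust_id"] == cust_id and r["current"])
    old["current"] = False
    new_row = {**old, **changes,
               "cust_key": max(r["cust_key"] for r in dim) + 1,
               "current": True}
    dim.append(new_row)
    return new_row["cust_key"]

def sales_by(facts, dim, attribute):
    """Roll up sales by any dimension attribute via the surrogate key."""
    by_key = {r["cust_key"]: r for r in dim}
    totals = {}
    for fact in facts:
        value = by_key[fact["cust_key"]][attribute]
        totals[value] = totals.get(value, 0.0) + fact["sales"]
    return totals

# Sue gets married: new facts point at the new profile, old facts stay put.
new_key = apply_type2(customer_dim, "C-1", {"name": "Sue Smith", "marital_status": "M"})
sales_facts.append({"cust_key": new_key, "day_key": 2, "sales": 75.0})

print(sales_by(sales_facts, customer_dim, "marital_status"))
print(sales_by(sales_facts, customer_dim, "cust_id"))
```

    Grouping by marital status separates the single and married periods, while grouping by the source-system customer ID spans both profiles.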

    Slowly Changing Dimension Advice

    • When in doubt, design type 2
    • When a slowly changing dimension speeds up: why not move it into a fact?

    Degenerate Dimensions

    • Dimensions with no other place to go
    • Stored in the fact table
    • Are not facts
    • Common examples include invoice numbers or order numbers

    Drilling

    [Figure: Quarterly Auto Sales Summary, two reports. Summary report: Units Sold and Revenue by Region (Northeast, Southeast, Central, Northwest, Southwest). Detail report: Units Sold and Revenue by Region and State (regions Northeast, Southeast; states Maine, New York, Massachusetts, Florida, Georgia, Virginia).]

    • Rolling up
        - Removing dimensional detail
        - Rolls up a measure
        - Has nothing to do with how you drilled down

    Drilling

    • Drilling across
        - A query that involves more than one fact table
        - Not necessarily an action that changes how a user is looking at the data
        - Best resolved by multiple SQL passes
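    The multi-pass approach can be sketched as follows: each fact table is summarized in its own query at a common grain, and the results are merged afterwards. The two fact tables and their rows are hypothetical; the naive single join is included only to show the double counting it causes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_facts    (product_key INTEGER, time_key INTEGER, units_sold INTEGER);
    CREATE TABLE shipment_facts (product_key INTEGER, time_key INTEGER, units_shipped INTEGER);
    INSERT INTO sales_facts    VALUES (1, 1, 30), (1, 2, 20), (2, 1, 10);
    INSERT INTO shipment_facts VALUES (1, 2, 40), (2, 2, 5), (2, 3, 5);
""")

def one_pass(conn, table, measure):
    """One SQL pass: summarize a single fact table by product."""
    sql = f"SELECT product_key, SUM({measure}) FROM {table} GROUP BY product_key"
    return dict(conn.execute(sql).fetchall())

def drill_across(conn):
    """Run one pass per fact table, then merge the results on the shared key."""
    sold = one_pass(conn, "sales_facts", "units_sold")
    shipped = one_pass(conn, "shipment_facts", "units_shipped")
    return {key: (sold.get(key, 0), shipped.get(key, 0))
            for key in sorted(set(sold) | set(shipped))}

def naive_join(conn):
    """Joining the two fact tables directly repeats rows and inflates the sums."""
    sql = ("SELECT s.product_key, SUM(s.units_sold), SUM(h.units_shipped) "
           "FROM sales_facts s JOIN shipment_facts h ON s.product_key = h.product_key "
           "GROUP BY s.product_key ORDER BY s.product_key")
    return conn.execute(sql).fetchall()

print(drill_across(conn))
print(naive_join(conn))
```

    The merged passes give 50 sold and 40 shipped for product 1; the direct join reports 80 shipped because the single shipment row is matched to both sales rows.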

    Aggregate Schemas

    Aggregate Designs - Performance Angle

    • Aggregates
        - Pre-stored fact summaries
        - Along one or more dimensions
        - The most effective tool for improving performance
    • Examples
        - Summary of sales by region, by product, by category
        - Monthly sales

    Aggregate Background

    • Aggregate rationale
        - Improve end user query performance
        - Reduce required CPU cycles
        - Powerful cost saving tool
    • Restrictions
        - Additive facts only
        - Must use dimensional design

    Aggregate Guidelines

    • Don't start with aggregates
    • Design and build based on usage
    • Sooner or later you'll need to build aggregates

    Aggregate Types - Examples

    • Summary Tables
    • Materialized Views

    Aggregate Types

    • Separate Tables
        - Separate fact table for every aggregate
        - Separate dimension table for every aggregate dimension
        - Same number of fact records as level field tables
    • Advantage
        - Removes possibility of double counting
        - Schema clarity

    Separate Tables

    One-Way Aggregate

    Mthly Sales Facts Agg: month_key, product_key, market_key, Quantity, Amount
    Month:                 month_key, Year, Fiscal Period, Month

    Sales Facts:           time_key, product_key, market_key, Quantity, Amount
    Time:                  time_key, Year, Fiscal Period, Month, Day, Day of Week

    Product:               product_key, Category, Brand, Product, Diet Indicator
    Market:                market_key, Region, District, State, City

    Separate Tables

    Two-Way Aggregate

    Mnthly Cat Sales Facts Agg: month_key, category_key, market_key, Quantity, Amount
    Month:                      month_key, Year, Fiscal Period, Month
    Category:                   category_key, Category

    Sales Facts:                time_key, product_key, market_key, Quantity, Amount
    Time:                       time_key, Year, Fiscal Period, Month, Day, Day of Week
    Product:                    product_key, Category, Brand, Product, Diet Indicator

    Market:                     market_key, Region, District, State, City

    Aggregate Pitfalls

    • Sparsity failure
        - Term used to describe the result of building too many aggregate fact tables that do not summarize enough rows
        - When sparsity failure occurs, a relatively small star schema can grow (in terms of disk size) thousands of times
        - Sparsity failure = aggregate explosion

    Aggregate Design Guidelines

    • Rule of twenty
        - To avoid aggregate explosion
        - Make sure each aggregate record summarizes 20 or more lower-level records
    • Remember
        - Total number of possible fact tables in any given dimensional model = Cartesian product of all levels in all the dimensions
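    Both points can be checked with simple arithmetic. In the sketch below the dimension hierarchies (including an "all" level in each) and the row counts are assumptions, and the rule of twenty is applied to the average compression ratio rather than to every individual aggregate record.

```python
from math import prod

def possible_fact_tables(levels_per_dimension):
    """Count the ways to pick one level from each dimension (base grain included)."""
    return prod(len(levels) for levels in levels_per_dimension.values())

def passes_rule_of_twenty(base_rows, aggregate_rows, threshold=20):
    """True when each aggregate row summarizes, on average, >= threshold base rows."""
    return aggregate_rows > 0 and base_rows / aggregate_rows >= threshold

# Hypothetical hierarchies; "all" means the dimension is summarized away entirely.
levels = {
    "time":    ["day", "month", "quarter", "year", "all"],
    "product": ["product", "brand", "category", "all"],
    "market":  ["city", "state", "district", "region", "all"],
}

print(possible_fact_tables(levels))                 # 5 * 4 * 5 combinations
print(passes_rule_of_twenty(1_000_000, 40_000))     # 25 base rows per aggregate row
print(passes_rule_of_twenty(1_000_000, 400_000))    # 2.5 base rows per aggregate row
```

    With only three small dimensions there are already 100 candidate tables, which is why aggregates are chosen selectively rather than built exhaustively.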

    Aggregate Deployment

    • Incremental
    • Based on usage
    • Transparent to users
    • Typically warehouse DBA responsibility

    Multiple Fact Tables

    • Different business processes usually require different fact tables
    • There are also several cases where a single business process will require multiple fact tables
        - Core and custom
        - Snapshot and transaction
        - Coverage
        - Aggregates

    Different Business Processes

    • Different business processes usually require different fact tables
    • In practice, it may be hard to identify what a "process" is
    • Sometimes you can spot different processes because measures are recorded
        - With different dimensions
        - At differing grains

    Different Dimensions or Grain

    Shipment Facts: time_key, product_key, shipper_key, market_key, Quantity, Weight
    Sales Facts:    time_key, product_key, market_key, Quantity, Amount

    Product:        product_key, Category, Brand, Product, Diet Indicator
    Shipper:        shipper_key, name, type, mode, address
    Time:           time_key, Year, Fiscal Period, Month, Day, Day of Week
    Market:         market_key, Region, District, State, City

    Different Dimensions or Grain

    • Don't take shortcuts with grain
        - The 'not applicable' dimension value
        - Using a 'not applicable' row in a dimension confuses the grain and can introduce reporting difficulty

    Different Points in Time

    • Sometimes, it is not easy to identify the discrete business processes
    • All measures may have the same dimensionality or grain
    • Different measures are recorded at different times
        - Quantity sold is not recorded at the same time as quantity shipped

    Different Timing

    • Building a single fact table would require recording zero or null for measures that are not applicable at a point in time
    • Reports would contain a confusing combination of zeros, nulls, and absence of data

    Different Timing - One Fact Table

    Sales and Shipment Facts: time_key, product_key, market_key, Quantity_sold, Amount_sold, Quantity_shipped, Amount_shipped
    (Callout: Initially will be null)

    Time:    time_key, Year, Fiscal Period, Month, Day, Day of Week
    Market:  market_key, Region, District, State, City
    Product: product_key, Category, Brand, Product, Diet Indicator

    Identifying Different Processes

    • Look at the measures in question
    • Sort them into fact tables based on
        - Dimensions
        - Grain
        - Differing timings of events measured

    Core and Custom Schemas

    • There is a set of dimension attributes and measures shared in all cases
    • Depending on the value in a dimension, certain extra dimension attributes or measures are recorded
        - Heterogeneous products
        - Types of customers

    Core and Custom

    Account Facts:          time_key, product_key, branch_key, customer_key, Balance, Transaction_count
    Checking Account Facts: time_key, checking_key, branch_key, customer_key, Balance, Transaction_count, ...custom checking facts

    Product:          product_key, ...
    Checking Account: checking_key, ...custom checking attributes
    Customer:         customer_key, ...
    Time:             time_key, ...
    Branch:           branch_key, ...

    Core and Custom

    # Core fact table and dimensions

        All attributes shared no matter what

        Appropriate for analysis across entire subject area

    # Custom fact table and/or dimensions

        Contain attributes specific to a particular dimension value (e.g. "Checking")

        Only appropriate when the business question is limited to that particular dimension value

        Should repeat shared facts to minimize need to access two fact tables
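The core and custom split can be illustrated with a small in-memory SQLite database. This is a sketch under stated assumptions: the table and column names follow the diagram, but `checks_written` is a hypothetical stand-in for the unnamed custom checking facts, and all rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Core fact table: facts shared by every account type.
CREATE TABLE account_facts (
    time_key INTEGER, product_key INTEGER, branch_key INTEGER,
    customer_key INTEGER, balance REAL, transaction_count INTEGER
);
-- Custom fact table: repeats the shared facts and adds a checking-only fact.
-- checks_written is a hypothetical example of a custom checking fact.
CREATE TABLE checking_account_facts (
    time_key INTEGER, checking_key INTEGER, branch_key INTEGER,
    customer_key INTEGER, balance REAL, transaction_count INTEGER,
    checks_written INTEGER
);
""")

# product_key 1 = checking, 2 = savings (made-up sample rows).
conn.executemany(
    "INSERT INTO account_facts VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 10, 100, 500.0, 12), (1, 2, 10, 100, 2000.0, 3)],
)
conn.execute(
    "INSERT INTO checking_account_facts VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, 7, 10, 100, 500.0, 12, 4),
)

# Question across the whole subject area: use the core table.
total_balance = conn.execute(
    "SELECT SUM(balance) FROM account_facts"
).fetchone()[0]

# Question limited to checking: the custom table alone answers it,
# because the shared facts are repeated there.
checking_row = conn.execute(
    "SELECT balance, transaction_count, checks_written "
    "FROM checking_account_facts"
).fetchone()

print(total_balance)
print(checking_row)
```

Because the custom table repeats `balance` and `transaction_count`, the checking-only question never has to touch the core table.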

    Coverage Schema

    # A star schema usually measures events that happen

    # Relationships between the dimensions involved are not captured if events do not happen

    # A coverage table fills the gap

        What did not sell that was on promotion?

        Who was assigned to that customer?

    # Usually "factless"

    Measuring What

    [Diagram: sales star schema]

    Sales Facts: time_key, product_key, customer_key, rep_key, quantity, sales_dollars

    Time: time_key, Year, Fiscal Period, Month, Day, Day of Week

    Product: product_key, Category, Brand, Product, SKU

    Customer: customer_key, Name, Company, Account, Phone_num

    Salesrep: rep_key, rep_name, rep_phone, Region, District, State, City

    # The Sales Facts table does not reveal who is assigned to a customer if they do not sell

    Coverage Table

    # Customer_coverage_facts shows who is assigned to a customer at a point in time

    [Diagram: customer coverage star schema]

    Customer Coverage Facts: time_key, customer_key, rep_key

    Time: time_key, Year, Fiscal Period, Month, Day, Day of Week

    Customer: customer_key, Name, Company, Account, Phone_num

    Salesrep: rep_key, rep_name, rep_phone, Region, District, State, City
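A coverage question such as "who was assigned to a customer but did not sell?" can be answered by comparing the factless coverage table with the sales fact table. The sketch below uses an in-memory SQLite database; the table and column names follow the diagrams, while the rows and the choice of a left join are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_facts (
    time_key INTEGER, product_key INTEGER, customer_key INTEGER,
    rep_key INTEGER, quantity INTEGER, sales_dollars REAL
);
-- Factless coverage table: one row per assignment, no measures.
CREATE TABLE customer_coverage_facts (
    time_key INTEGER, customer_key INTEGER, rep_key INTEGER
);
""")

# Made-up sample data: reps 1 and 2 are both assigned on day 1,
# but only rep 1 records a sale.
conn.executemany(
    "INSERT INTO customer_coverage_facts VALUES (?, ?, ?)",
    [(1, 100, 1), (1, 200, 2)],
)
conn.execute(
    "INSERT INTO sales_facts VALUES (?, ?, ?, ?, ?, ?)",
    (1, 55, 100, 1, 3, 29.97),
)

# Assignments with no matching sale on the same day.
assigned_without_sale = conn.execute("""
    SELECT c.customer_key, c.rep_key
    FROM customer_coverage_facts AS c
    LEFT JOIN sales_facts AS s
      ON s.time_key = c.time_key
     AND s.customer_key = c.customer_key
     AND s.rep_key = c.rep_key
    WHERE s.rep_key IS NULL
    ORDER BY c.customer_key
""").fetchall()

print(assigned_without_sale)
```

The sales fact table alone could not produce this answer, because the assignment of rep 2 to customer 200 never generated a sales row.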

    Snapshot and Transaction

    # Viewing a single process multiple ways

    # Transactions

        The changes to what is being measured

    # Snapshot

        The status at a point in time

    # Example

        Changes to inventory

        Current status of inventory
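The relationship between the two views in the inventory example can be sketched as a running total: the snapshot for a day is the sum of all transactions up to and including that day. This is an illustrative Python sketch only. It assumes inventory starts at zero and that transaction amounts are signed; the `build_snapshots` helper and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory transactions: (day, product, location, amount).
# Positive amounts add stock, negative amounts remove it.
transactions = [
    (1, "widget", "WH1", 100),
    (1, "widget", "WH1", -30),
    (2, "widget", "WH1", -20),
    (2, "gadget", "WH1", 50),
]


def build_snapshots(transactions, days):
    """Return {day: {(product, location): quantity_on_hand}} for each day."""
    by_day = defaultdict(list)
    for day, product, location, amount in transactions:
        by_day[day].append(((product, location), amount))

    on_hand = defaultdict(int)
    snapshots = {}
    for day in days:
        for key, amount in by_day[day]:
            on_hand[key] += amount
        # Copy the running totals so later days do not alter this snapshot.
        snapshots[day] = dict(on_hand)
    return snapshots


snapshots = build_snapshots(transactions, days=[1, 2])
print(snapshots[1])
print(snapshots[2])
```

Storing both tables trades space for convenience: the transaction table answers "what changed?", while the snapshot table answers "how much is on hand?" without re-summing history.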

    Snapshot

    # How much is on hand today?

    # How much was on hand yesterday?

    [Diagram: inventory snapshot star schema]

    Inventory Snapshot: time_key, product_key, location_key, quantity_on_hand

    Time: time_key, Year, Fiscal Period, Month, Day, Day of Week

    Product: product_key, Category, Brand, Product, SKU

    Location: location_key, Warehouse, WH_code, City, State

    Transaction

    # How did inventory change today?

    # How much product was returned due to failed inspection?

    [Diagram: inventory transaction star schema]

    Inventory Transactions: time_key, product_key, location_key, transaction_type_key, transaction_amount

    Time: time_key, Year, Fiscal Period, Month, Day, Day of Week

    Product: product_key, Category, Brand, Product, SKU

    Location: location_key, Warehouse, WH_code, City, State

    Transaction type: transaction_type_key, transaction_type_code, transaction_type, transaction_category

    Aggregate Tables

    # Aggregate table

        A fact table that summarizes another fact table

        Created for performance reasons

        Covered in previous section

    Multiple Fact Table Summary

    # Different processes need different tables

    # Identified with

        Grain

        Dimensionality

        Timing

    # Same process may need multiple fact tables

        Heterogeneous attributes

        Coverage

        Snapshot and transaction

        Aggregates

    Architected Data Marts

    Data Mart

    # Meaning of the term "data mart" has shifted over the last several years...

    Data Mart Architecture 1>>?

    [Diagram components: Operational Systems, E.T.L. Software, Data Warehouse, E.T.L. Software, Data Marts, Query & Reporting Software, Analysis Users]

    Data Mart Architecture 1>>@

    [Diagram components: Operational Systems, E.T.L. Software, Data Marts]

    [Diagram components: Operational Systems, E.T.L. Software, Data Warehouse, Data Mart, Query & Reporting Software, Analysis Users]

    Data Mart

    # Warehouse Subject Area

        Incremental warehouse development

        Centralized architecture

        Not new

        Well-suited to star schemas

    "Stovepipe" Data Marts

    # "Stovepipe" data marts

        Inconsistent and overlapping data

        Difficult and costly to maintain

        Redundant data load

        Can't drill across

        Integration requires starting over

    # Dimensions not conformed

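    [Diagram: three unconnected star schemas, each with its own dimension tables. Store Sales Facts with Product and Time (Day); Shipments Facts with Product, Time (Day), and Warehouse; Warehouse Inventory Facts with Product and Month.]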

    Conformed Dimensions

    # Definition

        A dimension is conformed when multiple fact tables share that dimension

    Conformed Dimensions

    # Description

        Shared common dimensions

        Integrates logical design

        Ensures consistency between data marts

        Allows incremental development

        Independent of physical location

        Some re-work may be required

    Conformed Dimensions

    # Advantages

        Enables an incremental development approach

        Easier and cheaper to maintain

        Drastically reduces extraction and loading complexity

        Answers business questions that cross data marts

        Supports both centralized and distributed architectures
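A business question that crosses data marts is typically answered by a "drill across" query: each fact table is summarized separately to the same level of a shared dimension, and the results are then merged. The sketch below shows one way to do that in an in-memory SQLite database. The simplified tables, the `brand` attribute, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One product dimension shared by both fact tables (conformed).
CREATE TABLE product (product_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE sales_facts (product_key INTEGER, quantity_sold INTEGER);
CREATE TABLE shipment_facts (product_key INTEGER, quantity_shipped INTEGER);
""")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [(1, "Acme"), (2, "Acme"), (3, "Zenith")])
conn.executemany("INSERT INTO sales_facts VALUES (?, ?)",
                 [(1, 10), (2, 5), (3, 7)])
conn.executemany("INSERT INTO shipment_facts VALUES (?, ?)",
                 [(1, 8), (3, 7), (3, 2)])

# Drill across: summarize each fact table to the same level (brand)
# separately, then merge the two result sets on the shared attribute.
drill_across = conn.execute("""
    WITH sold AS (
        SELECT p.brand, SUM(f.quantity_sold) AS quantity_sold
        FROM sales_facts AS f JOIN product AS p USING (product_key)
        GROUP BY p.brand
    ),
    shipped AS (
        SELECT p.brand, SUM(f.quantity_shipped) AS quantity_shipped
        FROM shipment_facts AS f JOIN product AS p USING (product_key)
        GROUP BY p.brand
    )
    SELECT sold.brand, sold.quantity_sold, shipped.quantity_shipped
    FROM sold JOIN shipped ON shipped.brand = sold.brand
    ORDER BY sold.brand
""").fetchall()

print(drill_across)
```

Summarizing before merging avoids double counting that a direct join of the two fact tables would cause. The inner join in the final step keeps only brands present in both results; a production query would more likely use an outer join.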

    Interlocking Star Schemas

    [Diagram: interlocking star schemas joined through conformed dimensions. Fact tables: Sales Facts, Shipment Facts, Inventory Facts. Dimensions: Store, Product, Time, Warehouse, Month.]

    Kimball's Data Warehouse Bus

    [Bus matrix]

    Dimensions (columns): Store, Product, Day, Warehouse, Month

    Fact tables (rows): Sales Facts, Shipment Facts, Inventory Facts
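A bus matrix can be held as a simple mapping from fact tables to the dimensions they use, which makes it easy to list the dimensions that need to be conformed. In the sketch below the row and column names come from the slide, but the marked cells did not survive extraction, so the dimension sets assigned to each fact table are assumptions. The `conformed_dimensions` helper is hypothetical.

```python
# Fact tables (rows) and dimensions (columns) come from the slide.
# Which cells are marked is an assumption made for illustration.
bus_matrix = {
    "Sales Facts": {"Store", "Product", "Day"},
    "Shipment Facts": {"Product", "Day", "Warehouse"},
    "Inventory Facts": {"Product", "Warehouse", "Month"},
}


def conformed_dimensions(matrix):
    """Return {dimension: sorted fact tables} for dimensions used by 2+ fact tables."""
    usage = {}
    for fact_table, dimensions in matrix.items():
        for dimension in dimensions:
            usage.setdefault(dimension, []).append(fact_table)
    return {d: sorted(f) for d, f in usage.items() if len(f) > 1}


shared = conformed_dimensions(bus_matrix)
for dimension in sorted(shared):
    print(dimension, "->", ", ".join(shared[dimension]))
```

Dimensions used by only one fact table, such as Store and Month under these assumed cells, do not appear in the result.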

    When to Conform

    # Two approaches

        Up-front

        As-you-go

        Both approaches work

    # Choose the approach that works for you

    Conform Up Front

    [Diagram: Cross Enterprise Analysis; Create First ... Subject Areas; Conform all Dimensions; then Finalize Design & Build Subject Area 1, followed by the same step for each later subject area]

    Conform As-You-Go

    [Diagram: a "Design & Build Subject" step followed by three further "Build Subject" steps]

    Data extracted from source system (OLTP/Flat Files), depending on the updating of records

    Temporary

    # Rationale for dimensional modeling

    # Dimensional modeling basics

    # Dimensional modeling details

    # Fact table details

    # Dimension table details

    # Design process

    # Aggregate schemas

    # Multiple fact tables

    # Architected data marts