Data Warehouse Database Design Methods
For Technical IT Audience
Peter Nolan
www.peternolan.com
Agenda
DW design methods
Star Schemas
  Why, What, Features?
Time Variant + Stability Analysis
  Why and What?
DW Design Methods
Three different methods widely used
Choice of use depends on many factors
Star Schema
  Transaction and ROLAP
Time Variance + Stability Analysis
  Changes to non-txn data, e.g. Customer
Third Normal Form
  Volumes of changes are low
Why Use Star Schemas?
Business people understand them!!
  Matches the business model
  Often intuitively obvious
Easy to query
  Supports complex questions easily
  No wrong answers due to join problems
Excellent performance on star-schema-aware databases
  Oracle, Informix, DB2
What Does a Star Schema Look Like?
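[Diagram: a central fact table surrounded by five dimension tables]
Fact table (centre of the star)
  Keys: Time, Demogrph, Accounts, Transactn, Branch
  Measures: Amount, Cost, No Txns
Dimension tables
  Time: Description, Day, Month, Year
  Demogrph: Age, Sex, MaritalStatus, Geocode, Postcode, Household
  Accounts: Status, Category, Description
  Branch: Name, Postcode, City, Region, State
  Transactn: Code, Group, Debit/Credit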
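The star layout on this slide can be sketched in a few lines, assuming Python's built-in sqlite3 module as a stand-in for a real warehouse database. The table names (time_dim, branch_dim, txn_fact), the key columns and the sample rows are all hypothetical; only two of the five dimensions are modelled to keep the sketch short.

```python
import sqlite3

# Hypothetical names and made-up sample rows, loosely following the slide's diagram.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim   (time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE branch_dim (branch_key INTEGER PRIMARY KEY, name TEXT, city TEXT, region TEXT, state TEXT);
CREATE TABLE txn_fact (
    time_key   INTEGER REFERENCES time_dim(time_key),
    branch_key INTEGER REFERENCES branch_dim(branch_key),
    amount     REAL,
    cost       REAL,
    no_txns    INTEGER
);
""")
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?)",
                 [(1, 15, 1, 1999), (2, 15, 2, 1999)])
conn.executemany("INSERT INTO branch_dim VALUES (?, ?, ?, ?, ?)",
                 [(1, "Main St", "Sydney", "Metro", "NSW"),
                  (2, "High St", "Melbourne", "Metro", "VIC")])
conn.executemany("INSERT INTO txn_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 100.0, 1.5, 4),
                  (1, 2, 250.0, 2.0, 5),
                  (2, 1, 50.0, 0.5, 1)])

# Every dimension joins to the fact table on a single integer key,
# so there is only one way to write each join.
rows = conn.execute("""
SELECT b.state, t.month, SUM(f.amount), SUM(f.no_txns)
FROM txn_fact f
JOIN time_dim t   ON t.time_key = f.time_key
JOIN branch_dim b ON b.branch_key = f.branch_key
WHERE t.year = 1999
GROUP BY b.state, t.month
ORDER BY b.state, t.month
""").fetchall()

for row in rows:
    print(row)
```

The query reads the way a business question is asked: total amount and transaction count by state and month for one year.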
Some Useful Features
Multi-level summaries defined in control table
  New summaries require NO code changes
Generated keys for customers and accounts
Demographic & Product Grouping for customers and accounts
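One way to read "summaries defined in control table" is that the summary levels are data, not code. The slides do not show the actual mechanism, so the following is only a minimal sketch under that assumption; the control table layout, the column names and the build_summaries helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows with made-up values.
detail = [
    {"year": 1999, "month": 1, "state": "NSW", "amount": 100.0},
    {"year": 1999, "month": 1, "state": "VIC", "amount": 250.0},
    {"year": 1999, "month": 2, "state": "NSW", "amount": 50.0},
]

# Hypothetical control table: each entry names a summary and its grouping columns.
summary_control = [
    {"name": "by_state_month", "group_by": ("year", "month", "state")},
    {"name": "by_year", "group_by": ("year",)},
]

def build_summaries(rows, control):
    """Build one summary per control entry; the code never names a specific level."""
    summaries = {}
    for entry in control:
        totals = defaultdict(float)
        for row in rows:
            key = tuple(row[col] for col in entry["group_by"])
            totals[key] += row["amount"]
        summaries[entry["name"]] = dict(totals)
    return summaries

summaries = build_summaries(detail, summary_control)
print(summaries["by_year"])
```

Under this reading, adding a new summary means adding one entry to summary_control; build_summaries itself stays unchanged.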
Why Use Integer Keys?
Performance
  Integers are the fastest data type to operate on
Space and throughput
  Integers are shorter than account numbers
  Disk savings in tables and indexes
  Speed from disk to processor
Flexibility
  Allows multi-level summary tables
  No IT involvement to create new summaries!!
Large Stars!!
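A generated integer key can be sketched as a simple lookup that hands out the next integer the first time a natural key is seen. This is an illustration only: the KeyGenerator class and the account number format are hypothetical, and a real warehouse would normally persist the mapping in a table.

```python
class KeyGenerator:
    """Maps natural keys (e.g. account numbers) to small generated integers."""

    def __init__(self):
        self._keys = {}   # natural key -> generated integer key
        self._next = 1    # next integer to hand out

    def key_for(self, natural_key):
        # Reuse the existing key if this natural key was seen before.
        if natural_key not in self._keys:
            self._keys[natural_key] = self._next
            self._next += 1
        return self._keys[natural_key]


gen = KeyGenerator()
first = gen.key_for("AU-0001-778812")    # hypothetical account number
second = gen.key_for("AU-0001-990034")
again = gen.key_for("AU-0001-778812")
print(first, second, again)
```

The fact table then stores the short integer in place of the longer account number, which is where the space and throughput savings on the slide come from.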
Why Use TV + SA?
"Be able to show me the value of any field at any time in the past"
Archival databases
When people really do not know what they want
Not easy to query
  Cannot give this to business people
What Does TV + SA Look Like?
Each 3NF entity passed to the DW is split into 3 (or more) entities based on stability analysis and access analysis. An element of time is added to the key.
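[Diagram: one source entity split into three entities by volatility]
Entity 1: Natural Key, Date From, Date To, Highly Volatile Data Elements
Entity 2: Natural Key, Date From, Date To, Medium Volatile Data Elements
Entity 3: Natural Key, Date From, Date To, Low Volatile Data Elements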
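The split into volatility groups, each carrying Date From and Date To, can be sketched as follows. This is a minimal illustration under assumptions the slides do not state: rows are valid from Date From up to but not including Date To, open-ended rows use a far-future Date To, and the entity names, attribute names and sample values are hypothetical. Only two of the three groups are shown.

```python
from datetime import date

OPEN_END = date.max  # assumption: open-ended rows carry a far-future Date To

# Hypothetical rows: (natural_key, date_from, date_to, attributes)
customer_high_volatility = [
    ("C1", date(1998, 1, 1), date(1998, 7, 1), {"balance_band": "low"}),
    ("C1", date(1998, 7, 1), OPEN_END, {"balance_band": "high"}),
]
customer_low_volatility = [
    ("C1", date(1998, 1, 1), OPEN_END, {"date_of_birth": "1960-05-17"}),
]

def as_of(rows, natural_key, when):
    """Return the attributes valid for natural_key on the given date, or None."""
    for key, date_from, date_to, attrs in rows:
        if key == natural_key and date_from <= when < date_to:
            return attrs
    return None

def customer_as_of(natural_key, when):
    """Reassemble the full customer by looking up every volatility group."""
    merged = {}
    for table in (customer_high_volatility, customer_low_volatility):
        merged.update(as_of(table, natural_key, when) or {})
    return merged

march = customer_as_of("C1", date(1998, 3, 1))
july = customer_as_of("C1", date(1998, 7, 1))
print(march)
print(july)
```

Any field can be shown as of any past date, but every question needs a date-range lookup against each split entity, which illustrates why this design is not easy to query.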
Summary
DW design methods
Star Schemas
  Why, What, Features?
Time Variant + Stability Analysis
  Why and What?
Thank You for Your Time!