8/2/2019 Data Warehouse Development & Schemas
Data Warehouse Development & Schemas
Prof. Pawan Kumar MBA IV SEM (SEC-A) 8- 2

Data Warehouse Development
- Data warehouse development approaches
  - Inmon Model: EDW approach (top-down)
  - Kimball Model: Data mart approach (bottom-up)
- Which model is best?
  - There is no one-size-fits-all strategy to DW
  - One alternative is the hosted warehouse
- Data warehouse structure: The Star Schema vs. Relational
- Real-time data warehousing?

Inmon Model: The EDW Approach
- Top-down development
- Spiral development approach
- ERD based
- Inmon insisted that data should be organized into subject-oriented, integrated, non-volatile and time-variant structures.
- Detailed data is regularly extracted from the ODS and data marts, temporarily hosted in the staging area for aggregation and summarization, and then loaded into the data warehouse.
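
The extract, stage, aggregate and load flow described on this slide can be sketched roughly as follows. This is only an illustration under assumptions: the table names (`ods_sales`, `staging_sales`, `edw_daily_sales`), the columns and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the real systems.

```python
import sqlite3

def load_warehouse(conn):
    """Extract detail rows from a (hypothetical) ODS, stage them, aggregate, and load."""
    cur = conn.cursor()
    # Hypothetical ODS table holding detailed operational rows.
    cur.execute("CREATE TABLE ods_sales (sale_date TEXT, product TEXT, amount REAL)")
    cur.executemany(
        "INSERT INTO ods_sales VALUES (?, ?, ?)",
        [("2019-08-01", "pen", 2.0), ("2019-08-01", "pen", 3.0), ("2019-08-02", "ink", 5.0)],
    )
    # Staging area: a temporary copy of the extracted detail.
    cur.execute("CREATE TEMP TABLE staging_sales AS SELECT * FROM ods_sales")
    # Aggregate and summarize in staging, then load the result into the warehouse.
    cur.execute("CREATE TABLE edw_daily_sales (sale_date TEXT, product TEXT, total REAL)")
    cur.execute(
        """INSERT INTO edw_daily_sales
           SELECT sale_date, product, SUM(amount)
           FROM staging_sales
           GROUP BY sale_date, product"""
    )
    # The staging copy is only temporary, so it is dropped after the load.
    cur.execute("DROP TABLE staging_sales")
    conn.commit()
    return cur.execute(
        "SELECT * FROM edw_daily_sales ORDER BY sale_date, product"
    ).fetchall()

rows = load_warehouse(sqlite3.connect(":memory:"))
```

Here `rows` holds one summarized row per date and product, which is the only data that reaches the warehouse table in this sketch.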

Kimball Model: The Data Mart Approach
- Bottom-up approach that uses a bus structure.
- Plan big, build small.
- Subject-oriented or department-oriented data warehouses, such as marketing or sales.
- This model strikes a good balance between centralized and localized flexibility.
- This architecture makes the data warehouse more of a virtual reality than a physical reality.
- All data marts could be located on one server or on different servers across the enterprise, while the data warehouse would be a virtual entity, nothing more than the sum total of all the data marts.
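
One way to picture a warehouse that exists only as the sum of its data marts is a database view over mart tables that share a common dimension. The sketch below assumes two hypothetical marts (`mart_sales`, `mart_marketing`) and a shared `dim_product` table; none of these names come from the slides, and SQLite is used only for illustration.

```python
import sqlite3

kimball_conn = sqlite3.connect(":memory:")
kimball_conn.executescript("""
-- Shared dimension used by both marts (an assumption for this sketch).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
-- Two department-oriented data marts.
CREATE TABLE mart_sales (product_id INTEGER, revenue REAL);
CREATE TABLE mart_marketing (product_id INTEGER, ad_spend REAL);
INSERT INTO dim_product VALUES (1, 'pen'), (2, 'ink');
INSERT INTO mart_sales VALUES (1, 100.0), (2, 40.0);
INSERT INTO mart_marketing VALUES (1, 10.0), (2, 8.0);
-- The "warehouse" is only a view: no data is stored outside the marts.
CREATE VIEW virtual_dw AS
SELECT p.name, s.revenue, m.ad_spend
FROM dim_product p
JOIN mart_sales s ON s.product_id = p.product_id
JOIN mart_marketing m ON m.product_id = p.product_id;
""")
virtual_rows = kimball_conn.execute(
    "SELECT * FROM virtual_dw ORDER BY name"
).fetchall()
```

Querying `virtual_dw` reads from the marts at query time, so the warehouse has no physical storage of its own in this sketch.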

[Figure: DW Development Approaches, (Inmon Approach) and (Kimball Approach)]

Data Warehouse Schema Architecture
- Star schema
- Snowflake schema
- Fact constellation schema

Star schema
A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.
It contains two types of tables:
- Fact tables: A fact table typically has two types of columns: foreign keys to dimension tables, and measures, which contain numeric facts. A fact table can hold data at the detail level or at an aggregated level.
- Dimension tables: A dimension is a structure, usually composed of one or more hierarchies, that categorizes data. Dimension attributes are normally descriptive, textual values. Dimension tables are generally smaller than fact tables.
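
A minimal sketch of a simple star, assuming hypothetical table and column names (`fact_sales`, `dim_date`, `dim_product`) and made-up rows, might look like this. The fact table carries only foreign keys and numeric measures; the descriptive attributes live in the dimension tables.

```python
import sqlite3

star_conn = sqlite3.connect(":memory:")
star_conn.executescript("""
-- Dimension tables: small, descriptive, textual attributes.
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Fact table: foreign keys to the dimensions plus numeric measures.
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity INTEGER,
    amount REAL
);
INSERT INTO dim_date VALUES (1, 2019, 7), (2, 2019, 8);
INSERT INTO dim_product VALUES (1, 'pen', 'stationery'), (2, 'ink', 'stationery');
INSERT INTO fact_sales VALUES (1, 1, 3, 6.0), (2, 1, 2, 4.0), (2, 2, 1, 5.0);
""")
# A typical star query: join the fact table to each dimension, then aggregate.
star_rows = star_conn.execute("""
SELECT d.month, p.category, SUM(f.amount)
FROM fact_sales f
JOIN dim_date d ON d.date_id = f.date_id
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY d.month, p.category
ORDER BY d.month
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which is what keeps queries against a star short.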

[Figure: Star schema]

The main characteristics of the star schema:
- Simple structure -> easy-to-understand schema
- High query efficiency -> small number of tables to join
- Relatively long load time for dimension tables -> because of denormalization, redundant data can make the tables large
- The most commonly used schema in data warehouse implementations -> widely supported by a large number of business intelligence tools

Snowflake schema
The snowflake schema architecture is a more complex variation of the star schema used in a data warehouse, because the tables that describe the dimensions are normalized.
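
To illustrate what normalizing a dimension means, the sketch below splits a product dimension so that the category becomes its own table. All table names, columns and rows are hypothetical and chosen only for this example.

```python
import sqlite3

snow_conn = sqlite3.connect(":memory:")
snow_conn.executescript("""
-- The category is moved out of the product dimension into its own table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
INSERT INTO dim_category VALUES (1, 'stationery'), (2, 'books');
INSERT INTO dim_product VALUES (1, 'pen', 1), (2, 'ink', 1), (3, 'atlas', 2);
INSERT INTO fact_sales VALUES (1, 6.0), (2, 5.0), (3, 20.0);
""")
# Reaching the category now takes two joins instead of one.
snowflake_rows = snow_conn.execute("""
SELECT c.category, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category
ORDER BY c.category
""").fetchall()
```

The category text is stored once rather than repeated on every product row, at the cost of an extra join in each query that needs it.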

Fact constellation schema
- For each star schema it is possible to construct a fact constellation schema (for example, by splitting the original star schema into several star schemas, each of which describes facts at a different level of the dimension hierarchies).
- The fact constellation architecture contains multiple fact tables that share many dimension tables.
- The main shortcoming of the fact constellation schema is a more complicated design, because many variants for particular kinds of aggregation must be considered and selected. Moreover, dimension tables are still large.
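
A small sketch of two fact tables sharing one dimension is given below. The names (`fact_sales`, `fact_shipments`, `dim_product`) and the data are assumptions made for illustration, not taken from the slides.

```python
import sqlite3

const_conn = sqlite3.connect(":memory:")
const_conn.executescript("""
-- One dimension shared by two fact tables.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (product_id INTEGER, units_sold INTEGER);
CREATE TABLE fact_shipments (product_id INTEGER, units_shipped INTEGER);
INSERT INTO dim_product VALUES (1, 'pen'), (2, 'ink');
INSERT INTO fact_sales VALUES (1, 3), (1, 2), (2, 1);
INSERT INTO fact_shipments VALUES (1, 4), (2, 1), (2, 2);
""")
# Each fact table is aggregated separately and lined up on the shared dimension.
constellation_rows = const_conn.execute("""
SELECT p.name,
       (SELECT SUM(units_sold) FROM fact_sales s
         WHERE s.product_id = p.product_id),
       (SELECT SUM(units_shipped) FROM fact_shipments h
         WHERE h.product_id = p.product_id)
FROM dim_product p
ORDER BY p.name
""").fetchall()
```

The two fact tables are summed in separate subqueries rather than joined to each other, because joining them directly on `product_id` would multiply rows and inflate both totals. This is one instance of the extra design care the schema demands.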

[Figure: Fact constellation schema]

Risks in Implementing DW
- No mission or objective
- Quality of source data unknown
- Skills not in place
- Inadequate budget
- Lack of supporting software
- Source data not understood
- Weak sponsor
- Users not computer literate
- Political problems or turf wars
- Unrealistic user expectations
(Continued)

Risks in Implementing DW (Cont.)
- Architectural and design risks
- Scope creep and changing requirements
- Vendors out of control
- Multiple platforms
- Key people leaving the project
- Loss of the sponsor
- Too much new technology
- Having to fix an operational system
- Geographically distributed environment
- Team geography and language culture

Things to Avoid for Successful Implementation of DW
- Starting with the wrong sponsorship chain
- Setting expectations that you cannot meet
- Engaging in politically naive behavior
- Loading the warehouse with information just because it is available
- Believing that data warehousing database design is the same as transactional DB design
- Choosing a data warehouse manager who is technology oriented rather than user oriented
(see more on page 356)

Real-time DW (a.k.a. Active Data Warehousing)
- The use of real-time data updates for real-time analysis and real-time decision making is growing rapidly
  - Push vs. pull (of data)
- Concerns about real-time BI
  - Not all data should be updated continuously
  - Mismatch of reports generated minutes apart
  - May be cost prohibitive
  - May also be infeasible
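
The push vs. pull distinction can be sketched as follows. This is a toy model under assumptions: the `Source` and `PullWarehouse` classes and the event names are hypothetical and do not correspond to any particular product. With push, the source notifies the consumer on every change; with pull, the warehouse asks for new data on its own schedule.

```python
from typing import Callable, List

class Source:
    """Hypothetical operational system that records events."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self.subscribers: List[Callable[[str], None]] = []

    def record(self, event: str) -> None:
        self.events.append(event)
        # Push: the source notifies every subscriber as soon as data changes.
        for notify in self.subscribers:
            notify(event)

class PullWarehouse:
    """Pull: the warehouse asks the source for new rows when it decides to."""

    def __init__(self) -> None:
        self.rows: List[str] = []

    def poll(self, source: Source) -> None:
        # Copy only the events that have not been loaded yet.
        self.rows.extend(source.events[len(self.rows):])

source = Source()
pushed: List[str] = []
source.subscribers.append(pushed.append)  # push-style consumer
pulled = PullWarehouse()                  # pull-style consumer

source.record("order-1")
stale = list(pulled.rows)  # snapshot taken before the pull side has polled
pulled.poll(source)
source.record("order-2")   # arrives after the last poll
```

At this point the push-fed list already contains both orders while the pull-fed warehouse contains only the first, which is a small-scale version of two reports generated minutes apart disagreeing with each other.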

[Figure: Evolution of DSS & DW]

[Figure: Active Data Warehousing (by Teradata Corporation)]

Comparing Traditional and Active DW

Data Warehouse Administration
- Due to its huge size and its intrinsic nature, a DW requires especially strong monitoring in order to sustain its efficiency, productivity and security.
- The successful administration and management of a data warehouse entails skills and proficiency that go beyond what is required of a traditional database administrator.
  - Requires expertise in high-performance software, hardware, and networking technologies

End of the Chapter
Questions?