The Data Warehouse Environment



Agenda

• The Structure of the Data Warehouse

• Subject Orientation

• Day 1 – Day n Phenomenon

• Granularity

• Partitioning as a Design Approach

• Structuring data in the Data Warehouse

• Data Warehouse: The Standard Manual

• Auditing and the Data Warehouse

• Cost Justification


The Structure of the Data Warehouse

• Older level of detail

• Current level of detail

• A level of lightly summarized data

• A level of highly summarized data


Subject Orientation

• The data warehouse is oriented to the major subject areas of the corporation that have been defined in the high-level corporate data model.

• Typical subject areas include the following:

– Customer

– Product

– Transaction or activity

– Policy

– Claim

– Account


Day 1 – Day n Phenomenon


Granularity

• The single most important aspect of the design of a data warehouse is the issue of granularity.

• Indeed, the issue of granularity permeates the entire architecture that surrounds the data warehouse environment.

• Granularity refers to the level of detail or summarization of the units of data in the data warehouse.

• The more detail there is, the lower the level of granularity

• The less detail there is, the higher the level of granularity
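
To make the two ends of the granularity scale concrete, the sketch below rolls a few detailed records up into one summary row per customer per month. The `CallDetail` record, its fields, and the sample values are hypothetical and are not taken from the slides; this is one possible illustration rather than a prescribed design.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CallDetail:
    """One detailed record: much detail, low level of granularity (hypothetical layout)."""
    customer_id: str
    call_date: date
    minutes: int


def summarize_by_month(calls):
    """Roll detail up to one row per (customer, year, month): less detail, higher granularity."""
    totals = defaultdict(lambda: {"calls": 0, "minutes": 0})
    for call in calls:
        key = (call.customer_id, call.call_date.year, call.call_date.month)
        totals[key]["calls"] += 1
        totals[key]["minutes"] += call.minutes
    return dict(totals)


# Hypothetical sample data: four detailed rows.
detail = [
    CallDetail("C1", date(2002, 1, 3), 12),
    CallDetail("C1", date(2002, 1, 17), 5),
    CallDetail("C1", date(2002, 2, 2), 8),
    CallDetail("C2", date(2002, 1, 9), 30),
]
summary = summarize_by_month(detail)
```

The summary holds fewer rows than the detail, but it can no longer answer a question about an individual call. That trade-off between volume and the questions that can be answered is what the choice of granularity decides.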


Partitioning as a Design Approach

• The second major design issue for data in the warehouse, after granularity, is partitioning.

• Partitioning of data refers to the breakup of data into separate physical units that can be handled independently.

• Proper partitioning can benefit the data warehouse in several ways:

– Loading data

– Accessing data

– Archiving data

– Deleting data

– Monitoring data

– Storing data

• Properly partitioned data can grow and be managed gracefully; improperly partitioned data cannot.
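
As a rough illustration of "separate physical units that can be handled independently", the toy in-memory store below keeps each partition as its own list, so loading, accessing, archiving, and deleting each touch only one unit. The `PartitionedStore` class and its method names are invented for this sketch; a real warehouse would rely on files, tables, or the partitioning features of its DBMS.

```python
from collections import defaultdict


class PartitionedStore:
    """Toy store that keeps each partition as a separate unit (illustrative only)."""

    def __init__(self):
        self.partitions = defaultdict(list)  # partition key -> rows
        self.archive = {}                    # partition key -> archived rows

    def load(self, key, rows):
        """Load rows into a single partition without touching the others."""
        self.partitions[key].extend(rows)

    def scan(self, key):
        """Access only the partition that is needed."""
        return list(self.partitions.get(key, []))

    def archive_partition(self, key):
        """Move a whole partition to the archive in one step."""
        if key in self.partitions:
            self.archive[key] = self.partitions.pop(key)

    def drop_partition(self, key):
        """Delete a whole partition and return the number of rows removed."""
        return len(self.partitions.pop(key, []))


# Hypothetical usage: partitions keyed by year.
store = PartitionedStore()
store.load(2000, [{"id": 1}, {"id": 2}])
store.load(2001, [{"id": 3}])
store.archive_partition(2000)   # 2001 is untouched
rows_2001 = store.scan(2001)
```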


Partitioning of Data

• The purpose of partitioning current detail data is to break the data up into small, manageable physical units.

• Below are some of the tasks that cannot easily be performed when data resides in large physical units:

– Restructuring

– Indexing

– Sequential Scanning, if needed

– Reorganization

– Recovery

– Monitoring


Partitioning of data (cont’d)

• Data can be divided by many criteria, such as:

– By date

– By line of business

– By geography

– By organizational unit

– By all of the above

• The choice of how to partition the data is strictly up to the developer. As an example of how a life insurance company may choose to partition its data, consider the following physical units of data:

• 2000 health claims, 2001 health claims, 2002 health claims

• 1999 life claims, 2000 life claims, 2001 life claims, 2002 life claims

• 2000 casualty claims, 2001 casualty claims, 2002 casualty claims

• The insurance company has used two criteria to partition the data: date (that is, year) and type of claim.
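
The insurance example can be sketched as a partition key built from the two criteria, year and type of claim. The dictionary layout of a claim and the function names are assumptions made for illustration; only the unit names ("2001 health claims" and so on) come from the slide.

```python
from collections import defaultdict


def partition_name(claim):
    """Name the physical unit from year and claim type, e.g. '2001 health claims'."""
    return f"{claim['year']} {claim['type']} claims"


def partition_claims(claims):
    """Group claims into one unit per (year, type) combination."""
    units = defaultdict(list)
    for claim in claims:
        units[partition_name(claim)].append(claim)
    return dict(units)


# Hypothetical sample claims.
claims = [
    {"claim_id": 1, "year": 2000, "type": "health"},
    {"claim_id": 2, "year": 2001, "type": "health"},
    {"claim_id": 3, "year": 1999, "type": "life"},
    {"claim_id": 4, "year": 2001, "type": "health"},
]
units = partition_claims(claims)
```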


Partitioning of data (cont’d)

• Partitioning can be done in many ways:

– Partition at the system level

– Partition at the application level

• As a rule, it makes sense to partition data warehouse data at the application level.
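
One way to read "partition at the application level" is that application code, rather than the DBMS or operating system, decides which physical unit each record belongs to. The sketch below assumes that reading and routes records to separate CSV files; the file-naming scheme, field names, and helper functions are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["claim_id", "type", "year"]  # hypothetical record layout


def unit_path(base_dir, record):
    """Application code chooses the physical unit from the record's own values."""
    return Path(base_dir) / f"claims_{record['type']}_{record['year']}.csv"


def write_record(base_dir, record):
    """Append the record to its unit, writing a header when the unit is new."""
    path = unit_path(base_dir, record)
    is_new = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(record)
    return path


base_dir = tempfile.mkdtemp()
written = [
    write_record(base_dir, {"claim_id": 1, "type": "life", "year": 2001}),
    write_record(base_dir, {"claim_id": 2, "type": "life", "year": 2002}),
    write_record(base_dir, {"claim_id": 3, "type": "life", "year": 2001}),
]
unit_names = sorted(path.name for path in set(written))
```

Because each unit is an ordinary file that the application names and owns, it can be moved, archived, or deleted without involving the other units.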


Structuring data in the Data Warehouse

• There are many ways to structure data within the data warehouse. The most common are these:

– Simple cumulative

– Rolling summary

– Simple direct

– Continuous
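
The slide only names these structures, so the sketch below rests on how the first two are commonly described: a simple cumulative structure keeps one summary record per day, while a rolling summary keeps daily slots only for the current week and folds each completed week into a weekly total. Both descriptions, the function names, and the sample figures are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def simple_cumulative(transactions):
    """Summarize transactions into one record per day and keep every day."""
    daily = defaultdict(int)
    for day, amount in transactions:
        daily[day] += amount
    return dict(sorted(daily.items()))


def rolling_summary(daily_totals, days_per_week=7):
    """Fold each completed week into one weekly total; keep open days as they are."""
    weekly, current_week = [], []
    for total in daily_totals:
        current_week.append(total)
        if len(current_week) == days_per_week:
            weekly.append(sum(current_week))
            current_week = []
    return weekly, current_week


# Hypothetical sample data.
transactions = [
    (date(2002, 1, 1), 10),
    (date(2002, 1, 1), 5),
    (date(2002, 1, 2), 7),
]
daily = simple_cumulative(transactions)
weekly, open_days = rolling_summary([1, 1, 1, 1, 1, 1, 1, 2, 3])
```

Under these assumptions the rolling summary stores far fewer records than the simple cumulative structure, at the cost of losing daily detail once a week has been rolled up.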




