Data Warehouse Upasana Bhasin

Inmon vs. Kimball

W. H. Inmon’s approach

According to Bill Inmon, “A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process” (Ponniah, 2010). Inmon supports the top-down approach, in which a single enterprise-wide data warehouse is built first instead of collecting fragments of information. In this approach, the data warehouse is a centralized repository of data for the entire enterprise. Data is stored at the lowest level of granularity and should be available at both detailed and summarized levels through drill-down and roll-up operations. The information in the data warehouse is stored in third normal form (3NF). The data warehouse feeds a number of dependent data marts that source their information from it. In the top-down approach, data is extracted from operational data sources and loaded into the staging area, where it is validated to ensure accuracy. From there, the data is moved to the Operational Data Store (ODS). In a parallel process, data is also transported directly into the data warehouse so that it does not have to be extracted from the ODS. Data from the ODS is regularly extracted into the staging area for aggregation and summarization and is then loaded into the data warehouse. Once the data warehouse is loaded, the data marts extract data from it and apply their own transformations. After the data marts are loaded, the Online Analytical Processing (OLAP) environment becomes available to the users.
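
The top-down flow described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the sample rows, the validation rule and the function names (`validate`, `load_warehouse`, `build_mart`) are hypothetical, and the ODS step is left out for brevity.

```python
from collections import defaultdict

# Hypothetical rows extracted from an operational source system.
source_rows = [
    {"order_id": 1, "region": "East", "product": "Widget", "amount": 120.0},
    {"order_id": 2, "region": "East", "product": "Gadget", "amount": 80.0},
    {"order_id": 3, "region": "West", "product": "Widget", "amount": None},  # fails validation
    {"order_id": 4, "region": "West", "product": "Widget", "amount": 50.0},
]


def validate(rows):
    """Staging step: keep only rows with a usable, non-negative amount."""
    return [r for r in rows if r["amount"] is not None and r["amount"] >= 0]


def load_warehouse(rows):
    """Warehouse step: store validated rows at the lowest level of granularity."""
    return {r["order_id"]: r for r in rows}


def build_mart(warehouse, dimension):
    """Dependent data mart: summarize warehouse detail along one dimension."""
    totals = defaultdict(float)
    for row in warehouse.values():
        totals[row[dimension]] += row["amount"]
    return dict(totals)


staged = validate(source_rows)
warehouse = load_warehouse(staged)

# Each mart is derived from the central warehouse, never from the sources directly.
region_mart = build_mart(warehouse, "region")
product_mart = build_mart(warehouse, "product")
```

Because both marts are built from the same validated detail rows, their totals stay consistent with each other, which is the main argument for dependent data marts.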

R. Kimball’s approach

According to R. Kimball, “A data warehouse is nothing more than the union of all the constituent data marts.” Kimball supports the bottom-up approach, in which data marts are created first and contain data at the lowest level of granularity. The data marts are then joined together by conforming their dimensions. They are connected to the data warehouse through a bus structure that contains the elements common to the data marts, such as conformed dimensions and measures. In this approach, data from the operational systems is loaded into the staging area, where it is processed and consolidated, and is then moved to the ODS. Once the ODS is loaded with fresh data, the data is extracted to the staging area again, processed and moved to the data marts. The data in the data marts is then moved to the staging area, where it is summarized and loaded into the data warehouse. End users can access this data for analysis.
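
The role of a conformed dimension can be sketched with Python's built-in `sqlite3` module. The table names (`dim_product`, `fact_sales`, `fact_inventory`) and the sample values are assumptions made for illustration only; the point is that two separate data marts can be queried together because they share the same dimension keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension shared by both data marts.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);

    -- Fact table of a hypothetical sales data mart.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product,
        units_sold  INTEGER
    );

    -- Fact table of a hypothetical inventory data mart.
    CREATE TABLE fact_inventory (
        product_key   INTEGER REFERENCES dim_product,
        units_on_hand INTEGER
    );

    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 7);
    INSERT INTO fact_inventory VALUES (1, 40), (2, 3);
""")

# Drill across: aggregate each mart on its own, then join on the shared key.
drill_across = conn.execute("""
    SELECT p.name, s.sold, i.on_hand
    FROM dim_product AS p
    JOIN (SELECT product_key, SUM(units_sold) AS sold
          FROM fact_sales GROUP BY product_key) AS s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_on_hand) AS on_hand
          FROM fact_inventory GROUP BY product_key) AS i
      ON i.product_key = p.product_key
    ORDER BY p.name
""").fetchall()
```

Aggregating each fact table before joining avoids double counting when one mart has several rows per product and the other has only one.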

Key differences in approach

According to Inmon, a data warehouse is a collection of data that is architected as a whole and serves as a single, central store of data. He supports the top-down approach, in which traditional relational database tools are used to develop the data warehouse. The ER modeling technique is used in this approach, and information in the data warehouse is stored in third normal form. Data can be made accessible quickly if the warehouse is implemented in iterations. His approach to data modeling is subject-oriented. This method carries a high risk of failure, and end-user accessibility is low. It requires a high level of cross-functional skills, and the overall process is quite complex.

On the other hand, Kimball defines a data warehouse as the collection of all constituent data marts with conformed dimensions. In the bus architecture, the data warehouse bus integrates the data marts to create the data warehouse. He favors the bottom-up approach, which is user driven, comparatively easier to implement, supports multi-dimensional database design and ensures consistency of metadata. His approach to data modeling is process-oriented. In this method, star schemas are used to create denormalized dimensional models. The concept of ‘Conformed Dimensions’ is used to avoid data replication. There is no single source of information: each data mart has its own narrow view of the data and provides information to end users for business analysis. This method carries less risk of failure, and end-user accessibility is high. Its main disadvantage is that it causes data fragmentation. The overall process is quite simple to use.
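
The modeling difference between the two approaches (third normal form versus a denormalized star-schema dimension) can be illustrated with a small `sqlite3` sketch. The tables and values below are hypothetical; they show only that the same question needs a join in the normalized layout and a single-table read in the denormalized one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (3NF-style) layout: the category name is stored once.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id  INTEGER REFERENCES category
    );
    INSERT INTO category VALUES (1, 'Hardware');
    INSERT INTO product VALUES (10, 'Widget', 1), (11, 'Gadget', 1);

    -- Denormalized star-schema dimension: the category name repeats per row.
    CREATE TABLE dim_product (
        product_key   INTEGER PRIMARY KEY,
        product_name  TEXT,
        category_name TEXT
    );
    INSERT INTO dim_product VALUES
        (10, 'Widget', 'Hardware'),
        (11, 'Gadget', 'Hardware');
""")

# Normalized model: a join is needed to reach the category name.
normalized = conn.execute("""
    SELECT p.product_name, c.category_name
    FROM product AS p
    JOIN category AS c ON c.category_id = p.category_id
    ORDER BY p.product_name
""").fetchall()

# Dimensional model: the same answer comes from one table.
denormalized = conn.execute("""
    SELECT product_name, category_name
    FROM dim_product
    ORDER BY product_name
""").fetchall()
```

Both queries return the same rows. The normalized layout avoids repeating the category name, while the dimensional layout trades that redundancy for simpler queries.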

Key similarities/agreements in approach

In both approaches, data is collected from various sources into the staging area, where it is integrated, transformed and then loaded into the data warehouse. Both approaches give importance to the time attribute of data, and both use the ETL (extract, transform, load) process. Both Inmon and Kimball agree that stand-alone data marts are of minimal use for an enterprise-wide data warehouse. Whether using Inmon’s approach or Kimball’s, the data warehouse team needs to hire employees who have good soft skills and can use them effectively.

Find an article that discusses the Inmon vs. Kimball controversy, and write a brief critique of the article. State your opinion as to which approach should produce a better design, with supporting arguments. You should choose either Inmon or Kimball, not an “it depends” approach.

The article ‘Data Warehousing Battle of the Giants: Comparing the Basics of the Kimball and Inmon Models’ explains the nature and history of the data warehouse and helps in understanding both the Inmon and Kimball models in detail. It also provides a list of characteristics as a basis for determining which approach is appropriate for developing a data warehouse for a particular organization.

A number of factors have to be considered while developing a data warehouse, such as resources, user requirements and level of granularity. I would prefer Kimball’s approach for developing a data warehouse because each data mart contains information specific to a particular business area. Managing individual data marts is much faster and easier than managing a centralized data warehouse. The overall process is quite simple to use and involves less risk of failure. The use of dimensional modeling in Kimball’s approach helps deliver a high level of performance. This approach is also user driven, ensures consistency of metadata and offers high end-user accessibility.

REFERENCES

1. Ponniah, P. (2010). Data Warehousing Fundamentals, a Comprehensive Guide for IT Professionals. Wiley & Sons.

2. Kimball, R., & Ross, M. (2002). The Data Warehouse Toolkit. Wiley & Sons.

3. Inmon, W. (2005). Building the Data Warehouse. Wiley & Sons.

4. Retrieved September 20, 2010 from http://www.exforsys.com/tutorials/msas/data-warehouse-design-kimball-vs-inmon.html

5. Retrieved September 20, 2010 from http://www.bi-bestpractices.com/view-articles/4768

6. Retrieved September 20, 2010 from http://mydbaworld.wordpress.com/2009/07/23/bill-inmon-vs-ralph-kimball/

7. Retrieved September 20, 2010 from http://www.information-management.com/infodirect/19990901/1400-1.html
