12
Data Warehousing by Example A Day at the IPL in Kolkata Prepared By: RANJAN GANGULI M.En(CSE),University Institute Of Technology The University of Burdwan, West Bengal,India email:[email protected] Why? The purpose of this document is to present a ‘Best Practice’ approach to Data Warehouse design based on some experiences as there is no complete and consistent design methodology to design a data warehouse. Getting Started: Here, I use a trip to watch a Cricket match of IPL in Kolkata India, to show how we can apply our approach to a real-world situation and develop a design for a tailored Dimensional Model. This approach leads to the implementation of a Reference Data Architecture and the design of a Data Warehouse. A Canonical Data Model (CDM) is central to this and the Design Patterns based on a CDM. Best Practice suggests that when all the steps have been completed, each item produced should be reviewed and extended or modified as appropriate.

Data Warehousing by Example

Embed Size (px)

Citation preview

Page 1: Data Warehousing by Example

Data Warehousing by Example

A Day at the IPL in Kolkata

Prepared By: RANJAN GANGULI

M.En(CSE),University Institute Of Technology

The University of Burdwan, West Bengal,India

email:[email protected]

Why? The purpose of this document is to present a ‘Best Practice’ approach to Data Warehouse design

based on some experiences as there is no complete and consistent design methodology to design a

data warehouse.

Getting Started:

Here, I use a trip to watch a Cricket match of IPL in Kolkata India, to show how we can apply our approach to

a real-world situation and develop a design for a tailored Dimensional Model. This approach leads to the

implementation of a Reference Data Architecture and the design of a Data Warehouse.

A Canonical Data Model (CDM) is central to this and the Design Patterns based on a CDM.

Best Practice suggests that when all the steps have been completed, each item produced

should be reviewed and extended or modified as appropriate.

Page 2: Data Warehousing by Example

Additional Events:

During my trip to Kolkata, more events can be added as:

Event –I. Buy IPL ticket

Event-II. Ate food in a restaurant

Event-III. Watch the IPL match

The Approach:

The Approach is to follow these Steps: Step 1 – Identify the Events involved Step 2 – Define a Design Pattern based on the Event-driven Canonical Data Model Step 3 - Define a Message Format for the data in each Event Step 4 - Design a 3rd Normal Form Data Warehouse (DWH) and update it for each Event. Step 5 – Define the format for loading data into the DWH for each Message

The reason for all the work that we have done to get to this point is, of course, to produce

Business Intelligence (‘BI’).

Page 3: Data Warehousing by Example

Canonical Data Model

(A General Format)

Typical Events could be: i. A Customer makes a Purchase

ii. A Supplier makes a Delivery of Merchandise Typical Documents could be : i. A Sales Receipt

ii. A Contract Letter

iii. A Delivery Note People and Organisations are examples of the Roles played by Parties. Parties are often shown in Models produce d by professional Data Models. In that case, Best Practice usually dictates that Semantic Models are produced to help business users

understand how Customers and so on, are modelled as Parties and Roles.

Page 4: Data Warehousing by Example

(Design Pattern based Canonical Data Model)

Message Format:

Page 5: Data Warehousing by Example

EVENT-I( BUY TICKET FOR IPL)

Activity Performed:

Step-I: Identify the Events:

Event-I. Buy Ticket for IPL Cricket Match

Step-II: The Design Pattern Below shows how the Design Pattern applies to this Event.

(Please see the above CDM format)

EVENT

(PURCHASE A TICKET)

SERVICES

(CHECK-IN)

STAFF

(PURCHASE A TICKET)

DOCUMENTS

i. Ticket for train/bus

ii. Ticket for IPL

LOCATION

(EDEN GARDEN)

Kolkata

CUSTOMERS

CREDIT CARD

Page 6: Data Warehousing by Example

Step-III: Message Format

This shows the data items on the Ticket:-

EVENT

DATE LOCATION PRICE DETAILS

Purchase a ticket

30.01.2017

KOLKATA

2700

BLOCK-A ROW-123

Step-IV: The 3NF Data ware House

The benefit of adopting a Third-Normal Form ERD is that it enforces a ‘Single View of the Truth’. If we adopt a Dimensional Model it is not so easy to achieve this. This shows the design of the Data Warehouse (DWH) after the first Event of Purchasing a Ticket

CUSTOMERS

CUSTOMER SERVICE

DOCUMENT

STAFF SERVICES

LOCATION

ADDRESS

CREDIT CARD

Page 7: Data Warehousing by Example

Step-V:

Data Mart

This shows the Data Mart for Ticket Sales.

Page 8: Data Warehousing by Example

Similarly:

Event 2 – Get Lunch This shows how we handle the Second Event.

The Design Pattern

This shows how the Design Pattern applies to this Event.

Message Format:

EVENT

DATE LOCATION PRICE DETAILS

Buy Lunch

Date and Time

At Restaurant

Total Price

Rice, Chicken,Dal etc

BUY LUNCH MENU STAFF

CUSTOMERS CREDIT-CARD RESTAURENT

SALES RECEIPT

Page 9: Data Warehousing by Example

DATAWARE HOUSE

Restaurant Data Mart:

This shows the Data Mart for Restaurant data.

Customers Service

Customers

Address

Credit Card

Supplier

Services

Document Location

Page 10: Data Warehousing by Example

Event 3 – Watch the IPL Match:

The Design Pattern This shows how the Design Pattern applies to this Event. In this case, the Event is the Match between two Teams and the Outcome is very important It is quite common for Event to have an Outcome, but so far, it has not been important enough to justify appearing at the top level.

Message Format:

EVENT

DATE LOCATION PRICE DETAILS

Watch the IPL

Date and Time

At Eden Garden

Price of Ticket

Results/Outcome

Competition Cricket Venue Staff

Audience

(Customers)

9

Outcome

My Cricket

Page 11: Data Warehousing by Example

Data Warehouse : This shows the design of the Data Warehouse (DWH) after the third Event of watching the cricket match.

Data Mart : This shows the Data Mart for Cricket Competition Results data.

Customer

Customer Service

Documents Location Outcome

Services

Supplier

Staff

Staff

Staff

Page 12: Data Warehousing by Example

Combined Data Mart:

This shows the three Data Marts: I. IPL Match Results

II. Restaurant Orders

III. Ticket Sales