23
DATABASE SYSTEMS DATA WAREHOUSING I Asma Ahmad 29 th April, 2011

D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

Embed Size (px)

Citation preview

Page 1: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

DATABASE SYSTEMSDATA WAREHOUSING I

Asma Ahmad29th April, 2011

Page 2: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

THE EVOLUTION OF DATA WAREHOUSING

Since 1970s, organizations gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective services to the customer.

This resulted in accumulation of growing amounts of data in operational databases.

04/19/23

2

Page 3: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

THE EVOLUTION OF DATA WAREHOUSING

Organizations now focus on ways to use operational data to support decision-making, as a means of gaining competitive advantage.

However, operational systems were never designed to support such business activities.

Businesses typically have numerous operational systems with overlapping and sometimes contradictory definitions.

04/19/23

3

Page 4: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

THE EVOLUTION OF DATA WAREHOUSING

Organizations need to turn their archives of data into a source of knowledge, so that a single integrated / consolidated view of the organization’s data is presented to the user.

A data warehouse was deemed the solution to meet the requirements of a system capable of supporting decision-making, receiving data from multiple operational data sources.

04/19/23

4

Page 5: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

WHAT IS A DATA WAREHOUSE?WHAT IS A DATA WAREHOUSE? A complete repository of historical corporate

data extracted from transaction systems that is available for ad-hoc access by knowledge workers.

04/19/23

5

Page 6: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

WHAT IS A DATA WAREHOUSE?WHAT IS A DATA WAREHOUSE?

Transaction SystemManagement Information System (MIS)Could be typed sheets (NOT transaction

system) Ad-Hoc access

Dose not have a certain access pattern.Queries not known in advance.Difficult to write SQL in advance.

Knowledge workersTypically NOT IT literate (Executives,

Analysts, Managers).NOT clerical workers.Decision makers.

04/19/23

6

Page 7: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

ANOTHER VIEW OF A DWHANOTHER VIEW OF A DWH04/19/23

7

Page 8: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

DATA WAREHOUSING CONCEPTSA subject-oriented, integrated, time-

variant, and non-volatile collection of data in support of management’s decision making process (Inmon, 1993)

04/19/23

8

Page 9: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

SUBJECT-ORIENTED DATAWarehouse is organized around major

subjects of the enterprise (e.g. customers, products, sales) rather than major application areas (e.g. customer invoicing, stock control, product sales)

This is reflected in the need to store decision-support data rather than application-oriented data

04/19/23

9

Page 10: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

INTEGRATED DATAThe data warehouse integrates corporate

application-oriented data from different source systems, which often includes data that is inconsistent

The integrated data source must be made consistent to present a unified view of the data to the users

04/19/23

10

Page 11: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

TIME-VARIANT DATAData in the warehouse in only accurate

and valid at some point I time or over some time interval

Time-variance is also shown in the extended time that data is held, implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots

04/19/23

11

Page 12: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

NON-VOLATILE DATAData in the data warehouse is not updated

in real time but is refreshed from the operational system on the regular basis

New data is always added as a supplement to the database, rather then a replacement

04/19/23

12

Page 13: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

BENEFITS OF DATA WAREHOUSINGPotential high returns on investment

Competitive advantage

Increased productivity of cooperate decision-making

04/19/23

13

Page 14: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

CAUTION!CAUTION!

A Warehouse of Data is NOT a

Data Warehouse

04/19/23

14

Page 15: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

CAUTION!CAUTION!

Size is NOT Everything

04/19/23

15

Page 16: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

WHY A DATA WAREHOUSE?WHY A DATA WAREHOUSE? Businesses demand Intelligence (BI).

Complex questions from integrated data. “Intelligent Enterprise”

04/19/23

16

Page 17: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

WHY A DATA WAREHOUSE?WHY A DATA WAREHOUSE?

Businesses want much more…

What happened? Why it happened? What will happen? What is happening? What do you want to happen?

Stages of Data Warehouse

04/19/23

17

Page 18: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

HOW IS IT DIFFERENT?HOW IS IT DIFFERENT? Combines operational and historical

data. Don’t do data entry into a DWH, OLTP or ERP are

the source systems.

OLTP systems don’t keep history, cant get balance statement more than a year old.

DWH keep historical data, even of bygone customers. Why?

In the context of bank, want to know why the customer left?

What were the events that led to his/her leaving? Why?

Customer retention.

04/19/23

18

Page 19: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

HOW MUCH HISTORY?HOW MUCH HISTORY? Depends on:

Industry.Cost of storing historical data.Economic value of historical data.

Industries and historyTelecomm calls are much much more as

compared to bank transactions- 18 months.

Retailers interested in analyzing yearly seasonal patterns- 65 weeks.

Insurance companies want to do actuary analysis, use the historical data in order to predict risk- 7 years.

Hence, NOT a complete repository of data

04/19/23

19

Page 20: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

HOW MUCH HISTORY?HOW MUCH HISTORY?

Economic value of dataVs.

Storage cost

Data Warehouse acomplete repository of data?

04/19/23

20

Page 21: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

COMPARISONOLTP SYSTEMS

Holds Current Data Stores Detailed Data Data is dynamic Repetitive Processing High volume of transactions Predictable pattern of usage Transaction-driven Application-oriented Supports day-to-day decisions Serves a large number of

users

DATA WAREHOUSING SYSTEMS

Holds historical data Stores detailed and summarised

data Data is generally static Ad hoc, unstructured, heuristic proc. Medium to low volume of

transactions Unpredictable pattern of usage Analysis Driver Subject-oriented Supports strategic decisions Serves low number of managerial

users

04/19/23

21

Page 22: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

TYPICAL ARCHITECTURE OF A DW

04/19/23

22

Page 23: D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011

TYPICAL APPLICATIONSImpact on organization’s core business is to

streamline and maximize profitability.◦ Fraud detection.◦ Profitability analysis.◦ Direct mail/database marketing.◦ Credit risk prediction.◦ Customer retention modeling.◦ Yield management.◦ Inventory management.

ROI on any one of these applications can justify HW/SW & consultancy costs in most organizations.

04/19/23

23