Right In Time Presented By: Maria Baron Written By: Rajesh Gadodia Intelligent Enterprise Feb 7, 2004 Vol. 7, Iss. 2; pg 26

Page 1: Right In Time

Presented By: Maria Baron
Written By: Rajesh Gadodia
Intelligent Enterprise, Feb 7, 2004, Vol. 7, Iss. 2; pg 26

Page 2: Traditional Data Warehouse

- Central repository of transactional data spread across heterogeneous platforms and applications
- Focused on strategic reporting and analysis
- Loaded periodically (nightly, weekly, monthly)
- Information latency

Page 3: Evolution of The Data Warehouse

- First-generation
  - Reporting
- Second-generation
  - Analytic processing and data mining
  - Multidimensional tools for drill down
- New generation
  - Speed information cycle time
  - Minimize latency
  - Information on demand

Page 4: Why Real Time Data Warehousing?

- Active decision support
- Business activity monitoring (BAM)
- Alerting
- Efficiently execute business strategy
- Monitoring is completed in the background
- Positions information for use by downstream applications
- Can be built on top of existing data warehouse

Page 5: Traditional Vs. Real-Time Data Warehouse

Traditional Data Warehouse (EDW)

- Strategic
  - Passive
  - Historical trends
- Batch
  - Offline analysis
- Isolated
  - Not interactive
- Best effort
  - Guarantees neither availability nor performance

Page 6: Traditional Vs. Real-Time Data Warehouse

Real-Time Data Warehouse (RTDW)

- Tactical
  - Focuses on execution of strategy
- Real-Time
  - Information on Demand
  - Most up-to-date view of the business
- Integrated
  - Integrates data warehousing with business processes
- Guaranteed
  - Guarantees both availability and performance

Page 7: Real-Time Integration

- Goal of real-time data extraction, transformation and loading
  - Keep warehouse refreshed
  - Minimal delay
- Issues
  - How does the system identify what data has been added or changed since the last extract?
  - Performance impact of extracts on the source system
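
One common way to approach the first issue is a timestamp watermark: each extract remembers the latest change time it saw and the next extract asks only for rows changed after it. The presentation does not name this technique; the sketch below is a minimal illustration under that assumption, and the `orders` table, its `updated_at` column, and the `extract_changes` function are all hypothetical names.

```python
import sqlite3

def extract_changes(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark.

    Assumes (hypothetically) that every source row carries an
    updated_at value that increases whenever the row changes.
    """
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the old watermark.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark

# Demo against an in-memory stand-in for the source system.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 100), (2, 20.0, 105)]
)

batch1, watermark = extract_changes(source, 0)  # first extract sees both rows

# The source changes between extracts: one update, one new row.
source.execute("UPDATE orders SET amount = 25.0, updated_at = 110 WHERE id = 2")
source.execute("INSERT INTO orders VALUES (3, 30.0, 112)")

batch2, watermark2 = extract_changes(source, watermark)  # only the changes
```

Because each extract reads only rows past the watermark, the load it places on the source grows with the volume of changes rather than the size of the table, which bears on the second issue as well. This sketch does not detect deleted rows.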

Page 8: Real-Time Data Warehouse – Logical Architecture

Page 9: Techniques for real-time ETL

- Simulated real-time feed
  - Increase the frequency of batch runs
  - Most useful when information is not required to be ‘up to the minute’
  - Requires minimal changes to existing ETL infrastructure
  - Easy to implement

Page 10: Techniques for real-time ETL

- Trickle Feed
  - Allows continuous update of the RTDW as the data in the source system changes
  - Messaging infrastructure
  - Perpetually open data pipe
  - Also called streaming
  - Basic elements – Capture, Stage and Apply

Page 11: Techniques for real-time ETL

- Trickle feed (cont.)
  - Target and source databases must be configured
  - May require special gateways
  - Source – capture process: automatically capture changes to data or table structure
  - RTDW records changes as logical change records (LCRs) that are kept in a staging partition called the message queue
  - The message queue can be explicitly updated by user applications

Page 12: Techniques for real-time ETL

- Trickle feed
  - Role of Target database
    - A process takes the logical change records out of the message queue and applies changes to selected database objects
    - Rules are set in message queues to handle data transformation
  - Requires upfront development and can be complex to configure and manage
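
The Capture, Stage and Apply elements described on the trickle-feed slides can be sketched in a few lines. This is a toy, in-process illustration, not a real messaging product: the `ChangeRecord` class stands in for a logical change record, a `deque` stands in for the message queue, a dictionary stands in for the target database, and `to_cents` is a made-up transformation rule. All of these names are assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class ChangeRecord:
    """Hypothetical stand-in for a logical change record (LCR)."""
    table: str
    op: str       # "INSERT", "UPDATE" or "DELETE"
    key: int
    values: dict

message_queue = deque()  # Stage: changes wait here until applied

def capture(table, op, key, values=None):
    """Capture: record a source-side change as a change record."""
    message_queue.append(ChangeRecord(table, op, key, values or {}))

def to_cents(record):
    """Example transformation rule: store amounts as integer cents."""
    values = dict(record.values)
    if "amount" in values:
        values["amount_cents"] = round(values.pop("amount") * 100)
    return ChangeRecord(record.table, record.op, record.key, values)

def apply_changes(target, rules):
    """Apply: drain the queue in order and update the target tables."""
    applied = 0
    while message_queue:
        record = message_queue.popleft()
        for rule in rules:
            record = rule(record)
        rows = target.setdefault(record.table, {})
        if record.op == "DELETE":
            rows.pop(record.key, None)
        elif record.op == "INSERT":
            rows[record.key] = record.values
        else:  # UPDATE merges the changed columns into the existing row
            rows.setdefault(record.key, {}).update(record.values)
        applied += 1
    return applied

warehouse = {}  # toy target: {table name: {key: row values}}
capture("orders", "INSERT", 1, {"amount": 19.99})
capture("orders", "INSERT", 2, {"amount": 5.00})
capture("orders", "UPDATE", 1, {"amount": 24.50})
capture("orders", "DELETE", 2)
applied = apply_changes(warehouse, [to_cents])
```

Keeping the queue between capture and apply is what lets the source keep running while the target catches up; in a real deployment the queue would be a durable messaging layer rather than an in-memory structure.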

Page 13: Trickle Feed Architecture for Real-Time load

Page 14: Information Delivery

- Changes to traditional data warehouse
  - Need to accommodate continuous data trickle feeds intermixed with live user queries
  - Schema design
  - Active partition management
  - Data aggregation

Page 15: Designing an RTDW - Options

- Trickle And Flip
  - Copy of fact table is made and given a name that cannot be accessed by queries
  - As new data trickles in, it is appended to the copy of the fact table
  - At certain intervals, the trickle is halted, the copy of the fact table is itself copied, that second copy is renamed to the active fact table name (the old active fact table is deleted), and the process starts over
  - Poses scalability problems – may not keep up with the trickle depending on the size of the table
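
The trickle-and-flip cycle can be sketched with Python's built-in `sqlite3` module. This is a minimal single-connection illustration under assumed names (`sales_fact`, `sales_fact_copy`, `sales_fact_flip`, and the helper functions are all hypothetical); a production warehouse would use its own database's copy and rename facilities.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT in flip() controls the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sales_fact (sale_id INTEGER, amount REAL)")
conn.execute("INSERT INTO sales_fact VALUES (1, 10.0)")

# The copy that receives the trickle; queries are expected not to use it.
conn.execute("CREATE TABLE sales_fact_copy AS SELECT * FROM sales_fact")

def trickle(row):
    """Append newly arrived data to the copy, never to the active table."""
    conn.execute("INSERT INTO sales_fact_copy VALUES (?, ?)", row)

def flip():
    """Copy the trickle table, then swap that copy in as the active table."""
    conn.execute("BEGIN")
    conn.execute("CREATE TABLE sales_fact_flip AS SELECT * FROM sales_fact_copy")
    conn.execute("DROP TABLE sales_fact")
    conn.execute("ALTER TABLE sales_fact_flip RENAME TO sales_fact")
    conn.execute("COMMIT")

def active_row_count():
    """Row count as seen by a query against the active fact table."""
    return conn.execute("SELECT COUNT(*) FROM sales_fact").fetchall()[0][0]

trickle((2, 20.0))
trickle((3, 30.0))
before = active_row_count()  # queries do not see the trickled rows yet
flip()
after = active_row_count()   # after the flip they do
```

The full-table copy inside `flip()` is where the scalability concern shows up: its cost grows with the size of the fact table, not with the amount of new data.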

Page 16: Designing an RTDW - Options

- Table Partitioning
  - Allows for the creation of large tables that are handled internally by the database as a series of smaller ones, each with its own indexes
  - Can rope off partition so it isn’t visible to active queries
  - Problem: Determining criteria for partitioning

Page 17: Designing an RTDW - Options

- Real-Time partitions
  - Create new tables that resemble the active fact tables but are designed for quick updates
  - Interval tables – contain data from only the last update
  - Truly real-time
  - Can be accessed by analysts and other BI tools
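
One way to picture a real-time partition is a small, unindexed table with the same columns as the fact table, combined with it at query time. The sketch below uses `sqlite3` and makes several assumptions beyond the slides: the table and view names are hypothetical, the combined view is one possible way to expose both tables to BI tools, and the periodic merge into the main fact table is an assumed housekeeping step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Main fact table: indexed for analysis, loaded in batches.
    CREATE TABLE sales_fact (sale_id INTEGER, region TEXT, amount REAL);
    CREATE INDEX idx_sales_region ON sales_fact (region);

    -- Real-time partition: same columns, no indexes, so inserts stay cheap.
    CREATE TABLE sales_fact_rt (sale_id INTEGER, region TEXT, amount REAL);

    -- Queries read both tables through one view.
    CREATE VIEW sales_fact_all AS
        SELECT * FROM sales_fact
        UNION ALL
        SELECT * FROM sales_fact_rt;
""")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(1, "east", 100.0), (2, "west", 50.0)],
)

def trickle_in(row):
    """New data lands in the real-time partition only."""
    conn.execute("INSERT INTO sales_fact_rt VALUES (?, ?, ?)", row)

def merge_partition():
    """Assumed housekeeping: fold the partition into the main table."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sales_fact SELECT * FROM sales_fact_rt")
        conn.execute("DELETE FROM sales_fact_rt")

def total_by_region():
    """Analyst-style query over historical plus real-time rows."""
    return dict(conn.execute(
        "SELECT region, SUM(amount) FROM sales_fact_all "
        "GROUP BY region ORDER BY region"
    ))

trickle_in((3, "east", 25.0))
live_totals = total_by_region()    # already includes the new sale
merge_partition()
merged_totals = total_by_region()  # unchanged by the merge
rt_rows = conn.execute("SELECT COUNT(*) FROM sales_fact_rt").fetchall()[0][0]
```

Unlike trickle and flip, nothing is copied wholesale here; the cost of each update depends only on the size of the small partition.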

Page 18: Real-Time Partition

Page 19: Conclusion

- RTDWs have a distinct advantage for those businesses utilizing time-sensitive data
  - Call Centers
  - Performance indicators
  - Fraud detection
  - Yield management
  - Certain financial transactions