Oracle Change Data Capture. Jack Raitto, Development Manager, Oracle NEDC. NYOUG Long Island SIG, October 7, 2004.
Oracle Corporation
Oracle Change Data Capture
Jack Raitto, Development Manager, Oracle NEDC
NYOUG Long Island SIG, October 7, 2004
Capture your change data for FREE!*
[Figure: change capture cost, before vs. after]
* Zero additional license cost over Oracle10g EE; virtually zero source system processing cost
What is Oracle CDC?
Captures change data from operational system(s) as it occurs
Part of Extract / Transform / Load (ETL) process for DSS / Data warehouse, potentially other applications
Optimizes the extract phase
Unleashes SQL power for transformations
Provides management framework for change data
How was it done before (old way)?
Method                           Major issues
Application logging / triggers   Maintenance, transaction impacts
Timestamp / change key column    Application design & performance impact, no before image
Table differencing               Impractical for large tables, high transport costs, not timely
Log sniffing                     Not supported, does not track DB releases, security issues, rocket science
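To make the table differencing row concrete, here is a minimal Python sketch that compares two full snapshots of a table keyed by primary key. The function name, the snapshot layout, and the sample rows are illustrative assumptions, not part of any Oracle tool. Note that every row of both snapshots has to be read, and any change that was made and then undone between snapshots is invisible.

```python
def diff_snapshots(old, new):
    """Compare two full snapshots (dicts keyed by primary key).

    Returns a list of (operation, key, before_image, after_image) tuples.
    Every key in both snapshots is visited, which is why this approach
    scales poorly for large tables.
    """
    changes = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            changes.append(("I", key, None, new[key]))
        elif key not in new:
            changes.append(("D", key, old[key], None))
        elif old[key] != new[key]:
            changes.append(("U", key, old[key], new[key]))
    return changes


# Hypothetical snapshots of a customer table: CustNo -> (Last, First)
yesterday = {123: ("Smith", "Frank"), 124: ("Jones", "Mary"), 125: ("Stein", "Linda")}
today = {123: ("Smith", "Franklin"), 125: ("Stein", "Linda"), 126: ("Vine", "Abe")}

changes = diff_snapshots(yesterday, today)
for change in changes:
    print(change)
```

The sketch also shows the transport cost: both snapshots must be available in one place before any change can be detected.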
CDC Advantages
• Built in, custom fit, evolves with the database
• Delivers change data when you need it, where you need it
• Offers several tradeoffs between timely change delivery and source system overhead (sync, async HotLog, async AutoLog, etc.)
• Assumes complete change management responsibility
CDC Advantages (concl.)
• Captures all change data along with transaction information – see all changes a given transaction made and who made them
• Transactional consistency for changes across multiple source tables is guaranteed
• Transparently coordinates sharing of change data across users and applications
• You don’t need rocket scientists on your staff!
CDC Configurations
                     Sync CDC                              Async CDC HotLog    Async CDC AutoLog
Available            Oracle 9i EE, Oracle 10g SE           Oracle 10g EE       Oracle 10g EE
Source system cost   Transaction delay, system resources   System resources    Minimal (~2%)
Part of txn          YES                                   NO                  NO
Latency              Real time                             Near real time      Varies w/ topology, checkpoint & log switch interval
Systems              1                                     1                   2
How CDC Works: Sync CDC
Uses internal triggers to capture before and/or after images of new and updated rows
Has the same performance implications as capture via user triggers
Delivers change data in real time
Uses the same interface as async CDC
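Trigger-based capture can be imitated with ordinary user triggers. The following is a minimal sketch using Python's built-in sqlite3 module; it is an analogy, not Oracle's internal sync CDC mechanism. The table names cust and cust_ct, the columns, and the use of the operation codes I / UO / UN / D are illustrative assumptions. Because the triggers run inside the user's transaction, a rollback also discards the captured rows.

```python
import sqlite3


def capture_demo():
    """Capture before/after images with triggers (an analogy for sync CDC)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cust (custno INTEGER PRIMARY KEY, last TEXT);
        -- Hypothetical change table: operation code plus the row image.
        CREATE TABLE cust_ct (op TEXT, custno INTEGER, last TEXT);

        CREATE TRIGGER cust_ins AFTER INSERT ON cust BEGIN
            INSERT INTO cust_ct VALUES ('I', NEW.custno, NEW.last);
        END;
        CREATE TRIGGER cust_upd AFTER UPDATE ON cust BEGIN
            INSERT INTO cust_ct VALUES ('UO', OLD.custno, OLD.last);
            INSERT INTO cust_ct VALUES ('UN', NEW.custno, NEW.last);
        END;
        CREATE TRIGGER cust_del AFTER DELETE ON cust BEGIN
            INSERT INTO cust_ct VALUES ('D', OLD.custno, OLD.last);
        END;
    """)

    # Committed transaction: each statement fires its trigger.
    conn.execute("INSERT INTO cust VALUES (123, 'Smith')")
    conn.execute("UPDATE cust SET last = 'Smyth' WHERE custno = 123")
    conn.execute("DELETE FROM cust WHERE custno = 123")
    conn.commit()

    # Rolled-back transaction: the captured row disappears with it.
    conn.execute("INSERT INTO cust VALUES (124, 'Jones')")
    conn.rollback()

    rows = conn.execute(
        "SELECT op, custno, last FROM cust_ct ORDER BY rowid"
    ).fetchall()
    conn.close()
    return rows


rows = capture_demo()
for row in rows:
    print(row)
```

The extra inserts into the change table are work done inside each source transaction, which is the performance implication the slide refers to.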
Synchronous CDC
[Diagram: on a combined source / operational BI system, triggers on the Order and Customer source tables feed CDC change tables; the ETL process then upserts to load dimension tables and uses direct-path insert to load fact tables.]
How CDC Works: Async CDC
Relational interface to Streams
• Prepackaged Streams application
• Asynchronously captures change data from redo/archive logs
• Presents relational interface to change data stream
Can operate on source system (HotLog) or staging system (AutoLog)
Foundations of Async CDC
LogMiner: redo log inspection, debugging, auditing, reversing transactions
Streams: replication, message queuing, warehouse loading, event notification, data protection
Async CDC: change capture, change management, warehouse loading
Asynchronous CDC HotLog
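[Diagram: on a combined source / operational BI system, LogMiner reads the active redo log for the Order and Customer source tables and Streams delivers the changes to CDC change tables; the ETL process then upserts to load dimension tables and uses direct-path insert to load fact tables.]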
Asynchronous CDC AutoLog
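[Diagram: on the source database, redo logs for the Order and Customer tables are archived by the Arch process; the archived redo logs are made available to the data warehouse / staging system, where LogMiner and Streams feed CDC change tables; the ETL process then upserts to load dimension tables and uses direct-path insert to load fact tables.]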
Using CDC: Publish/Subscribe
Publisher supplies, subscribers consume change data
Model allows sharing of change data across users and applications
Coordinates retention / purge of change data
Prevents application from accidentally processing change data more than once
Guarantees transactional consistency of change data across source tables via change sets
Using CDC: Publish/Subscribe
Publisher: Change Data Publication
CustNo  Last   First
123     Smith  Frank
124     Jones  Mary
125     Stein  Linda
126     Vine   Abe
127     Block  Greg

Subscriber 1: Subscription
CustNo  Last   First
123     Smith  Frank
124     Jones  Mary
125     Stein  Linda

Subscriber 2: Subscription
CustNo  Last   First
125     Stein  Linda
126     Vine   Abe
127     Block  Greg

Published columns
Table  Column  Type
Cust   CustNo  number
Cust   Last    varchar
Cust   First   varchar
Publisher Concepts
Change source
• Defines the source system to CDC
Change set
• Collection of source tables for which transactionally consistent change data is needed
Change table
• Container to receive change data
• Is published to subscribers
Publisher Concepts
Source Database: HQ
Source table: sh.sales (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD, QUANTITY_SOLD)
Source table: sh.promotions (PROMO_ID, PROMO_SUBCAT, PROMO_CAT, PROMO_COST)
Staging Database: DW
Change Source: HQ_SRC
Change Set: SH_SET
Change table: sales_ct (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD)
Change table: promo_ct (PROMO_ID, PROMO_SUBCAT, PROMO_CAT)
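The publisher hierarchy in this example (change source, then change set, then change tables that capture a subset of each source table's columns) can be modelled as a small data structure. This is a hedged Python sketch for orientation only: the class names, the add_change_table method, and the column check are assumptions made for illustration, not the DBMS_CDC_PUBLISH API.

```python
from dataclasses import dataclass, field

# Source table columns as listed in the example above.
SOURCE_COLUMNS = {
    "sh.sales": ("PROD_ID", "CUST_ID", "PROMO_ID", "AMOUNT_SOLD", "QUANTITY_SOLD"),
    "sh.promotions": ("PROMO_ID", "PROMO_SUBCAT", "PROMO_CAT", "PROMO_COST"),
}


@dataclass
class ChangeTable:
    name: str
    source_table: str
    columns: tuple


@dataclass
class ChangeSet:
    name: str
    change_source: str
    change_tables: list = field(default_factory=list)

    def add_change_table(self, table):
        """Register a change table; it may only capture existing source columns."""
        source_cols = SOURCE_COLUMNS[table.source_table]
        missing = [c for c in table.columns if c not in source_cols]
        if missing:
            raise ValueError(f"{table.name}: unknown source columns {missing}")
        self.change_tables.append(table)


sh_set = ChangeSet(name="SH_SET", change_source="HQ_SRC")
sh_set.add_change_table(
    ChangeTable("sales_ct", "sh.sales", ("PROD_ID", "CUST_ID", "PROMO_ID", "AMOUNT_SOLD"))
)
sh_set.add_change_table(
    ChangeTable("promo_ct", "sh.promotions", ("PROMO_ID", "PROMO_SUBCAT", "PROMO_CAT"))
)

# Source columns that the publisher chose not to capture.
uncaptured = {
    t.source_table: [c for c in SOURCE_COLUMNS[t.source_table] if c not in t.columns]
    for t in sh_set.change_tables
}
print(uncaptured)
```

In the example, QUANTITY_SOLD and PROMO_COST exist in the source tables but are left out of the change tables.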
Publish Package
DBMS_CDC_PUBLISH
• CREATE / ALTER / DROP_AUTOLOG_CHANGE_SOURCE
• CREATE / ALTER / DROP_CHANGE_SET
• CREATE / ALTER / DROP_CHANGE_TABLE
• PURGE
• PURGE_CHANGE_SET
• PURGE_CHANGE_TABLE
• DROP_SUBSCRIPTION
Using Change Data: Subscribers
The subscriber creates a subscription from an available publication
The subscription provides a moving window (view) to the change data
Subscriptions go against a single change set and are therefore transactionally consistent
When all subscribers have advanced past old change data, CDC automatically and efficiently purges it
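The moving window and the purge rule can be illustrated with a toy model. This Python sketch is an assumption-laden simplification: the class, its fields, and the integer SCN values are hypothetical, and the method names extend_window and purge_window merely echo the DBMS_CDC_SUBSCRIBE procedure names without reproducing their behavior in detail. Change rows are only discarded once every subscriber's window has moved past them.

```python
class ChangeSetModel:
    """Toy model of one change set shared by several subscribers."""

    def __init__(self):
        self.rows = []        # (scn, payload) in commit order
        self.high_water = 0   # highest SCN published so far
        self.windows = {}     # subscriber -> [low, high]; sees low < scn <= high

    def publish(self, scn, payload):
        self.rows.append((scn, payload))
        self.high_water = max(self.high_water, scn)

    def subscribe(self, name):
        # A new subscription starts with an empty window at the current position.
        self.windows[name] = [self.high_water, self.high_water]

    def extend_window(self, name):
        # Move the upper edge forward to include all change data so far.
        self.windows[name][1] = self.high_water

    def view(self, name):
        low, high = self.windows[name]
        return [payload for scn, payload in self.rows if low < scn <= high]

    def purge_window(self, name):
        # The subscriber is done with its window: move the lower edge up.
        window = self.windows[name]
        window[0] = window[1]
        self._purge()

    def _purge(self):
        # Drop rows that every subscriber has already advanced past.
        floor = min(low for low, _ in self.windows.values())
        self.rows = [(scn, p) for scn, p in self.rows if scn > floor]


cs = ChangeSetModel()
cs.subscribe("a")
cs.subscribe("b")
cs.publish(10, "cust 123")
cs.publish(20, "cust 124")

cs.extend_window("a")
seen_a_first = cs.view("a")          # a sees both rows
cs.publish(30, "cust 125")
seen_a_stable = cs.view("a")         # window is fixed until extended again

cs.purge_window("a")
rows_after_a = len(cs.rows)          # b has not consumed anything yet

cs.extend_window("b")
seen_b = cs.view("b")
cs.purge_window("b")
rows_after_b = len(cs.rows)          # rows at SCN 10 and 20 are now purged

cs.extend_window("a")
seen_a_second = cs.view("a")         # a never sees the earlier rows again
```

Because each subscriber only ever moves its window forward, the same change row is not handed to the same subscriber twice.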
Subscriber Concepts
Staging Database: DW
Change Set: SH_SET
Publication on sh.sales (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD)
Publication on sh.promotions (PROMO_ID, PROMO_SUBCAT, PROMO_CAT)
Subscription: sales_promo_list
Subscriber view: spl_sales
Subscriber view: spl_promos
Subscriber View
Subscriber view: spl_sales
OPERATION$  CSCN$   USERNAME$  PROD_ID  CUST_ID  PROMO_ID  Meaning
I           587322  GRIFFIN    12784    12       0         Insert
UO          587482  SLOAN      12784    12       0         Update before
UN          587482  SLOAN      12784    12       42        Update after
I           594312  BRIGGS     14899    302      42        Insert
I           602311  GRIFFIN    12498    12       55        Insert
D           711413  SLOAN      138922   7934     0         Delete
I           796122  BRIGGS     77741    712      55        Insert
I           796122  BRIGGS     13846    712      55        Insert
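A consumer of this view typically groups rows by commit SCN and pairs each update's before image (UO) with its after image (UN). The following Python sketch does that over the rows shown above. The helper names are hypothetical, and treating rows that share a CSCN$ value as one transaction is a simplifying assumption made for the example.

```python
from itertools import groupby

COLUMNS = ("PROD_ID", "CUST_ID", "PROMO_ID")

# (OPERATION$, CSCN$, USERNAME$, PROD_ID, CUST_ID, PROMO_ID), in CSCN$ order
ROWS = [
    ("I", 587322, "GRIFFIN", 12784, 12, 0),
    ("UO", 587482, "SLOAN", 12784, 12, 0),
    ("UN", 587482, "SLOAN", 12784, 12, 42),
    ("I", 594312, "BRIGGS", 14899, 302, 42),
    ("I", 602311, "GRIFFIN", 12498, 12, 55),
    ("D", 711413, "SLOAN", 138922, 7934, 0),
    ("I", 796122, "BRIGGS", 77741, 712, 55),
    ("I", 796122, "BRIGGS", 13846, 712, 55),
]


def by_commit_scn(rows):
    """Group change rows by CSCN$; rows must already be ordered by CSCN$."""
    return {cscn: list(group) for cscn, group in groupby(rows, key=lambda r: r[1])}


def changed_columns(before, after):
    """Compare a UO row with its UN row and return {column: (old, new)}."""
    return {
        col: (old, new)
        for col, old, new in zip(COLUMNS, before[3:], after[3:])
        if old != new
    }


txns = by_commit_scn(ROWS)
update_before, update_after = txns[587482]
delta = changed_columns(update_before, update_after)
print(delta)
print(len(txns[796122]), "rows committed together by", txns[796122][0][2])
```

Here the update by SLOAN changed only PROMO_ID, and the two final inserts by BRIGGS share one commit SCN.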
Subscriber Package
DBMS_CDC_SUBSCRIBE
• CREATE_SUBSCRIPTION
• SUBSCRIBE
• ACTIVATE_SUBSCRIPTION
• EXTEND_WINDOW
• PURGE_WINDOW
• DROP_SUBSCRIPTION
Security
Sync publisher must have SELECT access to the source table
Async publisher must have EXECUTE_CATALOG_ROLE privilege
Publisher uses GRANT and REVOKE on change tables to control subscriber access
Performance Benchmark*
Objectives:
• Determine impact on transaction time
• Determine latency
Source system: Oracle 10g R1 Beta, SunFire 4800 SMP 8 x 900 MHz / 16 GB w/ striped 8 x Sun StorEdge T3 arrays (9 x 36.4 GB each)
Customer insurance quote OLTP application run at Oracle, 250 concurrent users / 175 TPS, system “warmed up” (steady state)
Mixture of Inserts, Updates, Deletes, Singleton Selects, Cursor Fetches, Rollbacks / Commits, savepoints
Capture changes on all tables
* Your mileage will vary!
Transaction Performance
[Chart: transaction performance for no CDC, Sync CDC (9i), HotLog CDC (10g), and AutoLog CDC (10g); vertical axis from 0.9 to 1.2]
Transaction elongated by 10%
Relative impact varies depending on other overhead
Transaction Performance
[Chart: transaction performance for no CDC, Sync CDC (9i), HotLog CDC (10g), and AutoLog CDC (10g); vertical axis from 0.9 to 1.2]
Transaction elongated by 8%
Can reduce elongation by adding RAC nodes / CPUs
Transaction Performance
[Chart: transaction performance for no CDC, Sync CDC (9i), HotLog CDC (10g), and AutoLog CDC (10g); vertical axis from 0.9 to 1.2]
Transaction elongation virtually eliminated
Change capture processing moved off system
HotLog Latency Performance
[Chart: % changes arrived (0 to 100) vs. seconds (0 to 3)]
About ½ the change data arrived in 1 second
Virtually all the change data arrived in 2 seconds
Summary
CDC assumes the burden of change capture for you
Change data is guaranteed consistent and complete
Change data can be shared across users and applications effortlessly
CDC delivers change data where you need it, when you need it, and with minimal overhead
For More Information
Oracle Data Warehousing Guide, 10gR1, Chapter 16
Oracle PL/SQL Packages and Types Reference, 10gR1, packages DBMS_CDC_*
http://www.oracle.com/technology/oramag/oracle/03-nov/o63tech_bi.html
http://www.oracle.com/technology/products/bi/db/10g/pdf/twp_dss_ontime_etl_10gr1_0304.pdf
http://www.rittman.net/archives/000901.html
http://www.nyoug.org/cdc.pdf (Oracle9i)
Questions?