Enterprise Data warehouse using Oracle and Ellucian products. GoldenGate real-time replication from 20 sources to one DW target
Data Replication: The power of filtering using GoldenGate
Presented by: Greg Turmel, Senior Database Administrator, Tennessee Board of Regents
1
Summit 2012: Supporting Student Success…
Simple Overview: Classic
[Diagram: classic capture. Source (Ellucian Banner 9): Arc/Redo -> capture -> trails -> pump -> trails -> replicat -> Target (Ellucian EDW 8.3 / 8.4). Stages: Classic Capture, Route, Transform, Delivery. CLI: Manager on both source and target; GUI: Director.]
2
Simple Overview: Integrated
[Diagram: integrated capture. Source (Ellucian Banner 9): Arc/Redo -> log mining server -> Extract -> trails -> (1: Direct or 2: Pump) -> trails -> replicat -> Target (Ellucian EDW 8.3 / 8.4). Stages: Integrated Capture, Route, Transform, Delivery. CLI: Manager on both source and target; GUI: Director.]
Legend:
GUI: Graphical User Interface
CLI: Command Line Interface
TCP/IP: Network Protocols
Arc/Redo: Transaction logging
Trails: Change Data Extraction files
Source: Banner
Target: Common Repository
3
Oracle Replication Whitepapers
[Figure: replication topologies: Unidirectional, Bidirectional, Peer-to-Peer]
Source: Oracle 11g Fusion Middleware: Using Oracle GoldenGate for Oracle Database, Whitepaper 1539665, 03/2012
4
“Due to the nature of Oracle GoldenGate functionality and operations, as an enterprise software (solution), it involves all layers of the technology stack from source to target database to networking.”
What does that mean?
GoldenGate is highly integrated with database software, OS processes, and even applications (for example, Ellucian Banner 8 – 9 change management). A deployment therefore has to account for the different versions of operating systems, storage systems, and applications used throughout the enterprise.
Prusinski, B., Phillips, S., and Chung, R. (2011). Expert Oracle GoldenGate. New York, NY: Apress.
Oracle Replication Whitepapers
5
Webcast: Oracle GoldenGate 11g Release 2 Launch Webcast (Sept 12 10AM PT)
Oracle Replication Webcast
6
Create a service account: ogg (GoldenGate service account – UNIX dba group user)
Create a database account: ggs (GoldenGate service account – DBA user)
Open firewall ports (access points) for ssh (22), ogg (78**), and database (1521)
Space: Separate tablespace for process synchronization (less than 100 MB)
Space: Separate multi-purpose file system space for the UNIX user
  /home/GoldenGate
  /home/GoldenGate/Installation_files
  /home/GoldenGate/Campus_extract_trails
  /home/GoldenGate/Campus_data_pump_exports (20+ GB)
  3000+ student system every 12 hours
  12000+ student system (expect 400 MB) every 12 hours (Committed)
Note: the purge policy allows fine-tuned planning, including purging files once they have been shipped
Replication: Campus Requirements
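The sizing figures above can be turned into a quick headroom estimate. The Python sketch below is illustrative only. It assumes trail data grows linearly at the committed rate quoted for a 12000+ student system (400 MB every 12 hours) and that a 20 GB file system is available to hold unshipped trails. The function names are hypothetical and are not part of GoldenGate.

```python
def backlog_mb(outage_hours: float, mb_per_window: float = 400.0,
               window_hours: float = 12.0) -> float:
    """Trail data (MB) that accumulates while the target is unreachable.

    Assumes a constant change rate of mb_per_window every window_hours.
    """
    return outage_hours * mb_per_window / window_hours


def hours_until_full(capacity_gb: float, mb_per_window: float = 400.0,
                     window_hours: float = 12.0) -> float:
    """Hours of unshipped trail data a file system can hold (1 GB = 1024 MB)."""
    return capacity_gb * 1024.0 * window_hours / mb_per_window


if __name__ == "__main__":
    # Backlog for a 200-hour outage at the assumed rate.
    print(f"backlog after 200 h: {backlog_mb(200):.0f} MB")
    # How long an assumed 20 GB file system lasts before Extract runs out of space.
    print(f"20 GB lasts about {hours_until_full(20):.1f} h")
```

Under these assumptions a 200-hour outage, like the one described later in this deck, stays well within a 20 GB allocation. A busier source or a smaller file system changes the answer, so the rate should be measured per campus.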
9
Defined list of Banner© ERP tables or even selected columns from those tables:
table ALUMNI.AABDUES;    table ALUMNI.AABDUES;
table ALUMNI.AABMINT;    table ALUMNI.AABMINT;
table ALUMNI.AABMSHP;    table ALUMNI.AABMSHP;
List of tables captured: Defined / limited by the Ellucian ODS/EDW product
Replication Filtering
10
Mapping column values across the two systems using key words and variables:
MAP ALUMNI.AABDUES, TARGET ALUMNI.AABDUES, INSERTALLRECORDS,
colmap (usedefaults,
  AABDUES_SURROGATE_ID = @GETENV ("GGHEADER", "BEFOREAFTERINDICATOR"),
  AABDUES_VERSION = @GETENV ("GGHEADER", "OPTYPE"),
  AABDUES_USER_ID = @GETENV ("GGHEADER", "OPTYPE"),
  AABDUES_VPDI_CODE = @STRCAT (@GETENV ("RECORD", "FILESEQNO"), @GETENV ("RECORD", "FILERBA")));
Mapping column values across the two systems:
MAP SATURN.TWGRWMRL, TARGET SATURN.TWGRWMRL, INSERTALLRECORDS,
colmap (usedefaults,
  TWGRWMRL_SURROGATE_ID = "",
  TWGRWMRL_VERSION = "",
  TWGRWMRL_USER_ID = "",
  TWGRWMRL_VPDI_CODE = "008863");   (FICE)
Replication Filtering
11
Add Oracle Database logging on the tables sending changes to redo/archive logs:
add trandata SATURN.STVVETC
add trandata SATURN.STVVOED
add trandata SATURN.STVVTAB
Structural changes to the tables (DDL, Data Definition Language) can also be replicated, but this requires modifying the source database so that DDL is captured and stored in redo.
Extraction process: Encrypted
Pump process: decrypt / encrypt / compress / transports
Table ALUMNI.*;
Replicat: Retrieve / decrypt / load / archive
Table ALUMNI.*;
Replication Filtering
12
Replicat table: data direct on SATURN into ODS 8.3 staging using eload
SATURN.SGBSTDN – success
Replicat table: data direct on SATURN into ODS 8.3 staging using SCN and datapump
SATURN.SGBSTDN – success
Replicat (Processing over on the TARGET: destination database server)
HANDLECOLLISIONS
sourcedefs ./dirdef/*_tables.defs
DISCARDFILE /u04/*_trails/rws.dsc, purge
DBOPTIONS DEFERREFCONST
Replication Filtering: Instantiation
13
14
Priority 1.
General
Fimsmgr
Taismgr
Payroll
Posnctl
Saturn
Faismgr
Alumni
Replication Filtering
Validating a replicat run using Oracle GoldenGate CLI:
Program     Status      Group       Lag at Chkpt   Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     RWS         00:00:00       00:00:09

REPLICAT RWS         Last Started 2012-05-02 15:37   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:04 ago)
Log Read Checkpoint  File /u04/ws_trails/ws000000
                     2012-05-02 15:48:15.995504  RBA 20707111

23> !

REPLICAT RWS         Last Started 2012-05-02 15:37   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:06 ago)
Log Read Checkpoint  File /u04/ws_trails/ws000000
                     2012-05-02 15:49:32.993561  RBA 20712107
Replication Filtering
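The two status snapshots above can be compared to see how far the replicat advanced between checks. The Python sketch below only does arithmetic on the timestamps and RBA (relative byte address) values shown in the output. It assumes both checkpoints refer to the same trail file, and the helper name is hypothetical.

```python
from datetime import datetime

# Timestamp layout used in the "Log Read Checkpoint" lines above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def checkpoint_delta(ts1: str, rba1: int, ts2: str, rba2: int):
    """Return (bytes advanced in the trail, seconds elapsed) between two checkpoints."""
    elapsed = (datetime.strptime(ts2, FMT) - datetime.strptime(ts1, FMT)).total_seconds()
    return rba2 - rba1, elapsed


# Values copied from the two status snapshots.
bytes_read, seconds = checkpoint_delta(
    "2012-05-02 15:48:15.995504", 20707111,
    "2012-05-02 15:49:32.993561", 20712107,
)

if __name__ == "__main__":
    print(f"{bytes_read} bytes in {seconds:.1f} s")
```

A growing RBA with a checkpoint lag of 00:00:00 suggests the replicat is keeping up with incoming changes. A static RBA on a busy source would be a reason to look at the process report.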
15
Results of 200 hour network outage
Scenario: Hosting site network outage – Juniper router issues
  School “A” extract continues collecting (Banner source)
  School “B” extract continues collecting (Banner source)
  Target “Z” replicat “A” abends after 60 minutes of auto-reconnection attempts
  Target “Z” replicat “B” abends after 60 minutes…
Network comes back online
  Target “Z” replicat “A” restarted – loads 200 hours of commits in 1.5 minutes
  Target “Z” replicat “B” restarted – ditto
Total time to re-sync both ODS pre-stage tables after the network was restored: 2 minutes (resync of 4400 tables established)
Down-line application tables are loading in real time.
Replication Filtering
16
Replication Security Design
1. Extract (Source)
  – uses protected UNIX service account
  – select table data using sql account with encrypted password
  – write to file encrypted, and SAN/NAS/ASM can be a protected source file system
  – file compressed in flight (writing file out to disk)
2. Pump (Source)
  – decrypts extract / encrypts extract file
  – uses secure file transfer protocol (SFTP / SSH)
3. Replicat (Target)
  – uses protected UNIX service account
  – uses secure file transfer protocol (SFTP / SSH)
  – decompresses / decrypts extract file and writes to protected target file system
  – uploads to database using encrypted password
  – uses SCN (change number for instantiation and synchronization between databases)
4. ARGOS (Reporting)
  – Campus report server integrated with MS Active Directory
  – Campus report server ADO string secured on MAPS server with password
  – Access accounts to CR Project tables will be built by module/schema/table
  – Oracle Virtual Private Database Identifiers for data retrieval/reporting
17
Alignment
Answering Right Questions
Global Impact (Tennessee)
Building a structure
Data Elements Defined
Addressed all concerns and Security Issues
Finding Solutions
Help putting the pieces together
Gold Star Effort
Educating customers, Management, stakeholders
Replication Security Questions
18
Replication Design Considerations
19
Data acquisition
Data separation – keeping the loads separated, processes separated
Isolated reporting – Quality assurance processes and campus oversight
Business Process Management – Defines the rules
Replication Project Planning
20
Data Replication Destination
21
Automation (with tunable parameters)
Delivers low-impact replication across heterogeneous systems
Moves committed transactions with minimal overhead
Reduces the impact of reporting on the OLTP system
Disaster recovery – data replication offers unique opportunities to refine plans
Data distribution, data synchronization, and high availability
Manage dissimilar Oracle or other database versions
Replicate and filter Oracle DDL operations between heterogeneous databases
22
Replication Benefits and Impact
Common reporting system is achievable very quickly
Campus #1 informed us that 400 Banner and 80 ODS reports are used
Campus #2 said they use 100 Banner and 1 ODS report
CR Project sub-system has Banner ODS stage tables used for KPI reports
Reduces reliance on separate data extracts, taking delivery from months to near real time
Dedicate a disk to this directory: dirtmp
Extract, Pump, Replicat, and Manager processes must operate as an operating system user that has privileges to read, write, and delete files and subdirectories in the Oracle GoldenGate directory
Replication Benefits and Impact
23
Supports up to 5,000 concurrent Extract and Replicat processes per instance of Oracle GoldenGate
Each Extract and Replicat process needs approximately 25-55 MB of memory
Physical memory used by any Oracle GoldenGate process is controlled by the operating system, not the Oracle GoldenGate program
In classic capture mode, the Extract process reads the redo logs directly
Disk space: 50-150 MB for installation and 40-100 MB for the working directories and binaries
In a cluster, place the binaries on a shared file system available to all cluster nodes
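As a rough worked example of the memory guideline above, the Python sketch below multiplies the 25-55 MB per-process range by a process count. The count of 160 is an assumption taken from the feed table later in this deck (20 sources with 8 replicats each on the target). The function name is hypothetical, and actual usage is governed by the operating system, as noted above.

```python
def memory_range_mb(process_count: int, low_mb: int = 25, high_mb: int = 55):
    """Return the (low, high) memory estimate in MB for a number of OGG processes."""
    return process_count * low_mb, process_count * high_mb


# Assumed process count: 20 sources x 8 replicats on the target server.
low, high = memory_range_mb(160)

if __name__ == "__main__":
    print(f"160 processes: {low}-{high} MB")
```

The result is a planning range only. It says nothing about memory needed by the database itself or by other workloads on the same host.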
24
Replication Benefits and Impact
http://docs.oracle.com/cd/E28323_01/doc.1121/e27278.pdf
Extract should not be stopped during a failure
Transaction data might be missed if the transaction logs are recycled or removed from the system before the data is completely captured
There must be enough disk space to hold the data or Extract will abend
Trail file clean up is configured and set according to the purge rules
Set with the PURGEOLDEXTRACTS parameter (Source or Target)
You will need to resynchronize target data if the outage outlasts disk capacity. (e.g. – instantiation using datapump or ogg eload)
25
Replication Benefits and Impact
CR Project - Data Transfer Feeds:
Order  Source  FICE/VPDI  Begin Extract  End Extract  Begin Replicat  End Replicat  Count
1      APSU    003478     1              8            1               8             8
2      ChSCC   003998     11             18           11              18            8
3      ClSCC   003999     21             28           21              28            8
4      CoSCC   003483     31             38           31              38            8
5      DSCC    006835     41             48           41              48            8
6      ETSU    003487     51             58           51              58            8
7      JSCC    004937     61             68           61              68            8
8      MSCC    006836     71             78           71              78            8
9      MTSU    003510     81             88           81              88            8
10     NeSCC   005378     91             98           91              98            8
11     NSCC    008145     101            108          101             108           8
12     PSCC    012693     111            118          111             118           8
13     RSCC    009914     121            128          121             128           8
14     STCC    010439     131            138          131             138           8
15     TBR     000001     141            148          141             148           8
16     TSU     003522     151            158          151             158           8
17     TTU     003523     161            168          161             168           8
18     UoM     003509     171            178          171             178           8
19     VSCC    009912     181            188          181             188           8
20     WSCC    008863     191            198          191             198           8
Total                                                                               160
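The begin and end numbers in the feed table follow a regular pattern: each source appears to get a block of ten process numbers, of which the first eight are used. That reading is inferred from the table rather than stated in the deck. The Python sketch below reproduces the ranges under that assumption, with a hypothetical function name.

```python
def process_range(order: int, block: int = 10, used: int = 8):
    """Return (begin, end) process numbers for the source at a given order.

    Assumes each source owns a block of `block` numbers and uses the first `used`.
    """
    begin = (order - 1) * block + 1
    return begin, begin + used - 1


if __name__ == "__main__":
    for order in (1, 2, 20):
        print(order, process_range(order))
    # Total processes across all 20 sources, matching the table total.
    total = sum(end - begin + 1 for begin, end in map(process_range, range(1, 21)))
    print("total:", total)
```

Leaving two unused numbers per block gives each campus room for an extra process without renumbering the sources that follow it.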
26
Replication Benefits and Impact
Supports multi-byte character data
The source and target database schema definition must be logically identical
The character sets between the two databases must be one of the following:
  o Identical / Equivalent
  o Target is superset of the source: [e.g.] UNICODE is a superset
  o Multi-byte data is supported when length semantics are in bytes or characters
Does not support negative dates
Supports the capture and replication of TIMESTAMP WITH TIME ZONE as a UTC offset (TIMESTAMP '2011-01-01 8:00:00 -8:00')
Binary or unprintable characters are not supported
Ignores any virtual column that is part of a unique key or index
27
Replication Benefits and Impact
Replication Benefits and Impact
28
Chancellor Morgan for THEC / Regents / Decision Support
Campus Presidents for System measurements
Vice Chancellor Wendy Thompson
  for Access and Diversity Research
  for Completion Delivery Unit (CDU)
Vice Chancellor Nichols
  for Complete College Tennessee Act (CCTA)
Interim Vice Chancellor Clark
  for Institutional Research
  for Tennessee Higher Education requests (THEC)
  for Legislative requests (Ad-Hoc information)
Campus Operations as necessary (gap analysis)
Identifying Customers and their Requirements
Supports the capture of direct‐load INSERT(s)
Supplemental logging must be enabled
Database must be in archive log mode
Does not capture from a view
Supports capture from the underlying tables of a view
Can replicate to a view as long as the view is inherently updatable
Materialized views created WITH ROWID are not supported
Truncates on materialized views are not supported.
Replication Limitations
29
Replication Challenges
Encrypted tables are not supported in classic capture
Supports the replication of sequence values in a uni-directional configuration
Merging / Converging data: from sub-system to Common Repository
Data Integrity: One way data migration (No Campus edits at target system)
Conflict management: managing logging and the error queue (resolution)
Avoiding a single point of failure – breaks in data stream/time stamps/outages
Change at the Banner Source impacts feeds when columns / structures change
Change includes Banner source DDL and ODS target baseline patching
Campus “A” patching different than Campus “B” – BPM priorities – impact/gap
30
Campus versioning impacts coding requirements on the CR Project server
Physical requirements: Allocated space at campus (direct cost)
[e.g.] /oback runs out of shared space, stopping replication
OGG Processes abend when they can’t write out the data captured
Indexing used for loading integrity vs. indexing used for reporting integrity
GoldenGate Replication Challenges
31
GoldenGate Replication Strategies
Strategies:
Secure access to Banner ERP data adding immediate value to Board of Regents, Presidents, Chancellor and Campus reporting
product issues with OGG©, Oracle STREAMS©, and Ellucian© ODS/EDW
Create and coordinate Banner© patch / apply (change management) processes
Evaluate break/fix for minimizing manual intervention (Automation)
Rolling out stable, integrated, mature products
Coordinating Source / Target Hardware Shutdown / Database Quiesced
32
Data Replication: Objective Statement
The Common Data Repository Project: Supporting the…
  Access and Diversity (Investigative Reporting)
  Completion Delivery Unit (CDU Project)
  Complete College Tennessee Act (2010) for a common student experience
  Executive Reporting needs for system wide decision support (Governance)
  Legislative Reporting – Requests for Information (Ad-Hoc queries)
  Race to the top – Collaboration events in measuring student success
  Tennessee Higher Education Commission reporting
  Workforce development efforts at the Board of Regents
The project (referred to as CDR or CR Project) is scoped to provide a real-time reporting system for the Regents, Campus Presidents, Chancellor Morgan and his Executive Staff.
TBR supports over 200,000 students. The goal is to have a common system aimed at providing the best possible service using modern technologies in a more efficient approach to information system resources and reporting.
33
GoldenGate Replication Questions
Questions?
34
Contact Information
Greg Turmel
Sr. Database Administrator
Office of Information Technology
Tennessee Board of Regents
1415 Murfreesboro Rd. #358
Nashville, TN. 37217
615.366.4467 (Office)
http://itinfo.tbr.edu (IT website)
http://twitter.com/datahaulr
http://www.slideshare.net/gturmel
http://www.linkedin.com/in/gturmel
35