Connect CDC SQData Architecture Version 4.0




© 2001, 2021 SQData. All rights reserved.

Version 4.0

Last Update: 1/8/2021


Contents

Introduction
    Architecture Summary
    Organization
    Terminology
    Documentation Conventions
    Related Documentation
Solutions Overview
    Simple Source to Target Replication
    High Availability
Configuration Options
    Framework
    Platform Specific Options
        z/OS
        Linux/AIX/UNIX
        Windows
        Cross Platform
Components
    Data Capture Agents
    Engines
    Control Center
    Utilities


Introduction

This is an introductory guide for anyone using Precisely's Connect CDC SQData data integration product to perform one or more of the following activities:

· Legacy data integration

· Data Replication

· Application integration

· Business event publishing

· Information exchange using JSON formatted data

· Complex data extraction, transformation and movement

· Data conversions/migrations

· High performance multi-platform bulk data transfer

This introductory section:

· Summarizes the components and features found in the Connect CDC SQData Architecture

· Describes how this document is organized

· Defines commonly used terms

· Defines documentation syntax conventions

· Identifies complementary documents


Architecture Summary

Precisely Connect CDC SQData provides a comprehensive data integration platform addressing multiple business needs, including continuous availability of critical applications, streamlined data warehouse propagation, detection and publishing of key business events, application integration, real-time business intelligence and legacy data migrations. Connect CDC SQData supports all major operating system platforms and the databases and file systems available on those platforms, including those considered by some to be "legacy" datastores.


Organization

The following sections describe the overall architecture of Precisely's Connect CDC SQData product, the platforms supported and the components that together will connect your business infrastructure.

· Solutions Overview

· Configuration Options

· Components


Terminology

Change Data Capture

Terms commonly used when discussing Change Data Capture:

Term         Meaning
Agent        Individual components of the Connect CDC SQData product architecture.
CDC          Abbreviation for Changed Data Capture.
Datastore    An object that contains data such as a hierarchical or relational database, VSAM file, flat file, etc.
Exit         A classification for changed data capture components where the implementation utilizes a subsystem exit in IMS, CICS, etc.
File         Refers to a sequential (flat) file.
JCL          An abbreviation for Job Control Language that is used to execute z/OS processes.
Platform     Refers to an operating system instance.
Record       A basic data structure usually consisting of fields in a file, topic or message. A row consisting of columns in a relational database table. Record may be used interchangeably with row or message.
Segment      A basic data structure consisting of fields in an IMS hierarchical database. Segments are records having parent and child relationships with other records defined by a Database Description (DBD).
Source       A datastore monitored for content changes by a Capture Agent.
SQDCONF      A Utility that manages configuration parameters used by some data capture components.
SQDXPARM     A Utility that manages a set of parameters used by some IMS and VSAM changed data capture components.
Table        Used interchangeably with relational datastore. A table represents a physical structure that contains data within a relational database management system.
Target       A datastore where information is being updated/written.

Apply Engine

Terms commonly used when discussing Apply or Replicator Engines:

Term         Meaning
Agent        Individual components of the Connect CDC SQData product architecture.
CDC          Abbreviation for Changed Data Capture. See the Data Capture Guide for additional information.
Alias        An alternative name given to an object such as a Datastore, Description, or a field or column within a Description.
Command      Reserved words that define the environment, describe source and target datastores, specify variable, field and column level attributes and control the processing of source and target datastores.
Column       An individual data element in a relational datastore. Used interchangeably with field.
Cursor       Similar in function to a Lookup, a cursor provides the means to explicitly execute virtually any SQL statement.
Database     Refers to a collection of tables/segments/records in a database management system.
Datastore    An object that contains data such as a file, database table, IMS segment or WebSphere MQ.
Description  Defines the structure/layout of source and target. May be a COBOL copybook or DDL describing a relational table.
Engine       Refers to the SQData Engine component.
Event        Any condition that can be identified by a change in the value of data that has been Captured.
Exit         In the Engine context, a program that performs functions not supported by the typical Integration/ETL tool but that can be called by the tool.
Field        An individual data element in a datastore. Used interchangeably with column in this manual.
File         Refers to a sequential (flat) file on any operating system platform.
Function     A built-in SQData routine that facilitates the transformation/manipulation of source data elements.
Join         Refers to an SQData join table that returns multiple values.
Message      A WebSphere MQ message (record). Commonly used with the terms record and row.
Parser       Refers to the SQData Parser component that validates an engine script and translates it into the form used by the SQData Engine component.
Platform     Refers to an operating system instance.
Propagation  Refers to capturing changed data and applying it to a target datastore(s).
Queue        WebSphere MQ, Kafka.
Record       A group of information within a file. Used interchangeably with row and message.
Row          A record within a relational database table. Used interchangeably with record and message.
Script       A collection of SQData commands and functions that describe the target Platform, source and target datastores and the sequence of instructions that control the transformation of data moving from source to target datastore.
Source       A datastore from which information is being extracted/read.
Table/View   Used interchangeably with relational datastore. A table or view represents a physical structure that contains data within a relational database management system.
Target       A datastore to which information is being updated/written.
Variable     A temporary user specified value that can be used to assist with data transformation and controlling SQData processing flow.


Documentation Conventions

The following conventions are used in command and configuration syntax and examples in this document.

Convention: Regular type
Explanation: Items in regular type must be entered literally using either lowercase or uppercase letters. Items in Bold type are usually "commands" or "Actions". Note that uppercase is often used in "z/OS" objects for consistency, just as lowercase is often used on other platforms where case may be either enforced or optional.
Examples: create
          CCSID
          /directory
          //SYSOUT DD *

Convention: <variable>
Explanation: Items between < and > symbols represent variables. You must substitute an appropriate numeric or text value for the variable.
Example: <file_name>

Convention: | Bar
Explanation: A vertical Bar indicates that a choice must be made among items in a list separated by bars.
Examples: 'yes' | 'no'
          JSON | AVRO

Convention: [ ] Brackets
Explanation: Brackets indicate that an item is optional. A choice may be made among multiple items contained in brackets.
Examples: [alias]
          OR [+ | -]

Convention: -- Double dash
Explanation: Double dashes "--" identify an option keyword. Some keywords may be abbreviated and preceded by a single dash "-". A double dash in some contexts can be used to indicate the start of a single line comment.
Examples: --service=<port>
          OR -s <port>
          OR --apply
          OR -- this is a comment

Convention: … Ellipsis
Explanation: An ellipsis indicates that the preceding argument or group of arguments may be repeated.
Example: [expression…]

Convention: Sequence number
Explanation: A sequence number indicates that a series of arguments or values may be specified. The sequence number itself must never be specified.
Example: field2

Convention: ' ' Single quotes
Explanation: Single quotation marks that appear in the syntax must be specified literally.
Example: IF CODE = 'a'
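The option-keyword conventions can be illustrated with a short Python sketch. The parser below is hypothetical: it borrows only the --service=<port>, -s <port>, --apply and [alias] spellings from the examples above and does not reproduce the command line of any Connect CDC SQData utility. The port value and the alias "orders" are arbitrary.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the notation, not a real SQData CLI."""
    parser = argparse.ArgumentParser(prog="example")
    # --service=<port>, with the abbreviated single-dash form -s <port>
    parser.add_argument("-s", "--service", type=int, metavar="<port>")
    # --apply is an option keyword that takes no value
    parser.add_argument("--apply", action="store_true")
    # [alias] is optional, so it may be omitted entirely
    parser.add_argument("alias", nargs="?", default=None)
    return parser


# Long form with "=" and short form with a space carry the same value.
long_form = build_parser().parse_args(["--service=5000", "--apply"])
short_form = build_parser().parse_args(["-s", "5000", "orders"])
print(long_form.service, long_form.apply, long_form.alias)
print(short_form.service, short_form.apply, short_form.alias)
```

Both invocations yield the same port value, which is the point of the abbreviated form; only the optional items differ.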


Related Documentation

Data Capture Guide and Reference manuals - These publications provide comprehensive information about the various data capture agents for IMS, Db2, VSAM, UDB (Db2/LUW) and Oracle Server. They also detail the activities required for the configuration, operation and management of each of the data capture agents.

Engine Reference - Two publications describe the power and functionality of the Apply and Replicator Engines. The Apply Engine Reference includes a detailed description of the structure and commands that make up the procedural scripting language used in Apply Engine command scripts.

Function Reference - Detailed reference to the syntax and use of the Connect CDC SQData Functions in Apply Engine command scripts.

Secure Communications Guide - Describes the security architecture, its component parts and the authentication process used by Connect CDC SQData client-server connections.

Control Center - Detailed reference for the web-based monitoring and control component.

Utility Guides - These publications describe each of the utilities, including the SQDCONF capture configuration utility, the SQDMON daemon control utility, the SQDUTIL multi-purpose utility and the z/OS Master Controller.

Messages and Codes - This publication describes the messages and associated codes issued by the Capture Agents and Utilities and the Parser and Apply and Replicator Engines in all operating environments including z/OS, UNIX, and Windows.

Installation Guide - Describes the installation and maintenance procedures for the z/OS and Multiplatform products.

Start Here / ReadMe Documents - Platform specific installation instructions that are also bundled with the distribution.

Quickstart Guides - Tutorial-style walkthroughs for some common configuration scenarios including Capture and Replication. z/OS Quickstarts make use of the ISPF interface. While each Quickstart can be viewed here within WebHelp, you may find it useful to print the PDF version of a Quickstart to use as a checklist.


Solutions Overview

The tools provided by Connect CDC SQData address many different business issues with a single installation on your source and target platforms. Common use cases include:

· Real-Time Replication - Synchronization of relational and non-relational data on multiple platforms.

· Real-Time Streaming to "Big Data" - Simplify propagation of business events and operational data directly into Kafka or Hadoop using JSON or AVRO.

· Active - Active Replication - Synchronization between two or more like databases, with near-real-time latency for load balancing and/or continuous operation and disaster recovery.

· Heterogeneous Data Replication - Change Data Capture, Transformation and Load/Apply are not restricted to identical database managers or operating platforms.

· Event Publishing - Key business events are captured and published, with near-real-time latency, to any type of downstream process or message queuing tool.

· Data Conversion / Migration - Content from any type of source datastore can be extracted, transformed and loaded to any other type of datastore, either all at once or using CDC, for phased implementations.

Simply stated, Connect CDC SQData offers the best customer value in the marketplace:

· Significantly lower Cost of Ownership - Compare the price to any combination of other third party integration tools.

· Supports all types of datastores - Initially designed for IMS and VSAM, Connect CDC SQData has a proven track record integrating Legacy datastores into heterogeneous databases. Fully supports CDC and Apply for all major relational databases and "Big Data" through open source interfaces to Kafka and Hadoop using industry standard JSON or AVRO formatting.

· Addresses Multiple Business Needs - Event Publishing, Heterogeneous replication, Streaming data for analytics, Data Warehouse ETL, Load Balancing, etc.

· High performance - Change Data Capture techniques are tuned for each type of source datastore.

· Exploits zIIP Engines on z/OS for Logstream I/O and Encryption for enhanced CPU cycle efficiency.

· Efficient Transient Storage - Landing captured data before it is applied to the target datastore is rarely required.

· Rapid Deployment - Short learning curve; powerful transformation capabilities eliminate custom programming of User Exits.

· Self-Correcting Synchronization - SmartApply technology provides conflict detection and resolution.


Simple Source to Target Replication

Installation of Connect CDC SQData on all platforms is a relatively simple process. Once the product is installed, a proof of concept (POC) to demonstrate the product's effectiveness is a straightforward process that Precisely's Sales Engineering group is ready to help you complete.


High Availability

Many customers use Connect CDC SQData as part of their High Availability solution, implementing two- and three-way replication between systems.

Connect CDC SQData components themselves support High Availability through fail-over managed by operations automation tools on all platforms.

A more specific example might be Db2 z/OS Capture on one LPAR and a Db2 Apply Engine on another LPAR. Either can fail over independently.


Configuration Options

Precisely's Connect CDC SQData supports multiple topologies enabling all conceivable combinations of capture and apply.


Framework

The following illustration depicts the core framework and, at a high level, each layer's configuration options. The symbols >: and >>: are used in the first layer to indicate the Storage and Transport path options of the respective Capture Agent type. In two cases, Transient Storage and Transport are merged as captured data is placed directly into a file.


Platform Specific Options

As operating systems have matured, many platform specific techniques for managing processing have been conceived. Connect CDC SQData can be configured on each supported platform to take advantage of those techniques, and it facilitates operation in heterogeneous environments using its own TCP/IP based communication capabilities.

In addition to the traditional platform specific methods of application installation, configuration and operation, interest in container technology has grown in recent years. While Connect CDC SQData is not provided as a containerized product, customers have created custom containers for various components. Precisely is container tool-chain agnostic and can assist you as you explore integrating containers into your infrastructure.

z/OS

This mainframe operating system provides many unique facilities that Connect CDC SQData exploits for performance, reliability and cost effectiveness:

· Operate as standard JOBS or Started Tasks

· z/OS Master Controller provides further sub-task flexibility, particularly for z/OS Apply Engines

· Console command interface for both JOBS and Started Tasks

· ISPF panel Interface for Configuration and Operation

· Uses System LogStreams when possible for performance and reliability

· zIIP Engine support for enhanced CPU cycle efficiency:

a. IMS Log Capture Log Stream I/O Operations

b. ZLOGC Publisher Log Stream I/O Operations for both IMS and VSAM change data capture

c. NACL-based Publisher encryption of CDC payload from z/OS to other platforms

· Transparently operates under IBM's Application Transparent Transport Layer Security (AT-TLS)

Linux/AIX/UNIX

Increasingly, open systems based on UNIX are the platform of choice for distributed and cluster-based operations:

· Change Data Capture components may run on small commodity-based systems and VMs, remotely capturing DB2/LUW and Oracle.

· Apply and Replicator Engines may run on small commodity-based systems and VMs, remotely writing to all Targets.

· Apply and Replicator Engines may operate as Kafka Producers when running on Linux, the only platform with an officially supported C language API.

· Apply and Replicator Engines may write directly to HDFS/Hadoop when running on Linux, the only platform with an officially supported C language API.


Windows

Apply and Replicator Engines may run on Windows and can run as a Service.

Cross Platform

A number of features are available on all platforms that facilitate management, control and security:

· SQDaemon provides cross platform authentication of communication and control between components.

· Captured data transported between platforms may be secured by either a VPN or the Publisher's NACL-based encryption, and by TLS 1.2 between z/OS and Linux.


Components

The components, developed on a common code base, function seamlessly with one another across environments, providing the consistency and reliability that is expected in an enterprise integration tool. At the core of the product are two components:

Data Capture Agents - The data capture layer consists of near-real-time and asynchronous data capture agents designed to be high performance and best of breed for their respective datastore types. Each Capture Agent is tightly coupled to the second and third layers of the framework, which provide fully configurable Transient Storage and forward Transport of the captured data.

Apply Engines - The multi-function Apply Engine is the fourth layer of the core framework. Based on an SQL-like scripting language, it performs all necessary data filtering, transformation and augmentation required to apply the changed data to a target datastore.

These components, together with a highly efficient transient storage and transport manager, provide a single high-performance, enterprise-class solution.

Finally, there are several Utilities that configure, manage and control the Capture Agents and Engines.
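The four layers named above (Capture, Transient Storage, Transport, Apply) can be pictured with a toy Python model. This is only a sketch of the flow of changed data between layers: the Change record, its field names, the in-memory queue and the dictionary target are all assumptions made for illustration and say nothing about how the product implements each layer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Change:
    """One captured change; the field names are illustrative assumptions."""
    table: str
    operation: str  # "insert", "update" or "delete"
    key: int
    values: dict


def capture(source_log):
    """Layer 1: a capture agent turns source log entries into change records."""
    for table, operation, key, values in source_log:
        yield Change(table, operation, key, values)


def apply_change(change, target):
    """Layer 4: apply one change to the target datastore (a nested dict here)."""
    rows = target.setdefault(change.table, {})
    if change.operation == "delete":
        rows.pop(change.key, None)
    else:
        # insert and update both merge the new values into the row image
        rows[change.key] = {**rows.get(change.key, {}), **change.values}


def run_pipeline(source_log, target):
    transient_storage = deque()           # Layer 2: holds captured changes
    for change in capture(source_log):
        transient_storage.append(change)
    while transient_storage:              # Layer 3: forward in capture order
        apply_change(transient_storage.popleft(), target)
    return target


log = [
    ("ORDERS", "insert", 1, {"status": "new"}),
    ("ORDERS", "update", 1, {"status": "shipped"}),
    ("ORDERS", "insert", 2, {"status": "new"}),
    ("ORDERS", "delete", 2, {}),
]
target = run_pipeline(log, {})
print(target)
```

Preserving capture order through storage and transport is what lets the target end up in the same state as the source after the last change is applied.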


Data Capture Agents

Changed Data Capture (CDC) agents are provided for all the major types of datastores including, for some, the option of near-real-time and/or asynchronous modes:

Data Capture Agent   Near Real Time   Async
IMS z/OS             Y                Y
Db2 z/OS             Y                Y
Oracle               Y                Y
UDB (Db2/LUW)        Y                Y
CICS / VSAM          Y                Y

Keyed Files Y

Y - indicates native support built in to the capture agent

Please refer to the Change Data Capture Guide and Capture Agent References for detailed information regarding the configuration, activation and operation of the data capture agents.


Engines

Connect CDC SQData provides two types of Engines.

The Apply Engine is controlled by an SQL-like scripting language capable of a wide range of operations, from replication of identical source and target structures using a single command to complex business rule based transformations. Connect CDC SQData commands and functions provide full procedural control of data filtering, mapping and transformation, including manipulation of data at its most elemental level if required.

The Replicator Engine is controlled by a simple configuration file that merely identifies source and target datastores. It automatically generates industry standard JSON or AVRO formatted data and provides a seamless interface with Confluent's Schema Registry to further simplify administration while boosting performance.
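The difference between the two styles can be sketched in Python. The first function stands in for the kind of filtering and mapping an Apply Engine script expresses, and the second for the automatic JSON formatting attributed to the Replicator Engine. Neither uses SQData syntax: the rule, the column names and the JSON envelope layout are invented for illustration, and chaining the two functions is done only to keep the example short.

```python
import json


def transform(row):
    """Apply-style step: filter and map one source row (rules are invented)."""
    if row["status"] == "test":
        return None  # filter: drop test rows
    # mapping: rename a column and change the case of a value
    return {"customer_id": row["id"], "city": row["city"].upper()}


def to_json_message(table, operation, row):
    """Replicator-style step: wrap a row in a JSON envelope.

    The envelope layout is an assumption, not the product's actual format.
    """
    envelope = {"table": table, "operation": operation, "data": row}
    return json.dumps(envelope, sort_keys=True)


source_rows = [
    {"id": 7, "city": "Oslo", "status": "live"},
    {"id": 8, "city": "Bergen", "status": "test"},
]
messages = [
    to_json_message("CUSTOMER", "insert", mapped)
    for mapped in map(transform, source_rows)
    if mapped is not None
]
print(messages)
```

Only the first row survives the filter, so a single JSON message is produced for it.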


Control Center

The web-based Control Center is a monitoring and control component that provides visibility into volume, speed and latency of the replication process. It also provides an optional notification function.
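As a rough illustration of what volume, speed and latency mean for a replication stream, the following Python sketch derives the three figures from pairs of capture and apply timestamps. The function name, the input shape and the metric definitions are assumptions made for this example; the Control Center's own calculations are not described here.

```python
from datetime import datetime, timedelta


def replication_metrics(events):
    """events: list of (captured_at, applied_at) datetime pairs."""
    if not events:
        return {"volume": 0, "records_per_second": 0.0, "avg_latency_seconds": 0.0}
    # latency: time between capture at the source and apply at the target
    latencies = [(applied - captured).total_seconds() for captured, applied in events]
    first_capture = min(captured for captured, _ in events)
    last_apply = max(applied for _, applied in events)
    elapsed = (last_apply - first_capture).total_seconds()
    return {
        "volume": len(events),
        "records_per_second": len(events) / elapsed if elapsed > 0 else 0.0,
        "avg_latency_seconds": sum(latencies) / len(latencies),
    }


# Four records captured one second apart, each applied two seconds later.
start = datetime(2021, 1, 8, 12, 0, 0)
events = [
    (start + timedelta(seconds=i), start + timedelta(seconds=i + 2))
    for i in range(4)
]
metrics = replication_metrics(events)
print(metrics)
```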


Utilities

Several Utility programs supplement the Connect CDC SQData base product:

Utility   Description
SQDAMAST  Master Task Controller (z/OS only) - manages multiple Engine instances under a single started task
SQDAEMON  Manages communications between components running on multiple platforms
SQDCONF   Creates, maintains and controls the configuration and operation of Capture and Storage agents
SQDUTIL   Performs Copy, Move, Dump and Clean functions for transient datastores associated with data capture. Also generates Public/Private key pairs used for validating communication between platforms.
SQDMON    Used to request status and control operation of agents through communication with the SQDAEMON.
SQDXPARM  Creates and maintains the configuration of selected z/OS capture agents.

Please refer to the Connect CDC SQData V4 Utility Guides for information regarding the operation of these utilities.


2 Blue Hill Plaza, Pearl River, NY 10965, USA

precisely.com
