Business Objects Data Integrator Training BODI Version XI3 1

BODI Training v1.1

Page 1: BODI Training v1.1

Business Objects Data Integrator

Training

BODI Version XI3

1

Page 2: BODI Training v1.1

Audience

• Application Developers

• Consultants

• Database Administrators working on data extraction, data warehousing, or data integration.

2/18/2011 2

Page 3: BODI Training v1.1

Assumptions

• You understand your source data systems, RDBMS, business intelligence and e-commerce messaging concepts.

• You are familiar with SQL (Structured Query Language).

• You are familiar enough with Microsoft Windows or UNIX platforms to use Data Integrator effectively.

Page 4: BODI Training v1.1

Business Objects Data Integration Platform

• The Data Integration Platform consists of

– Data Integrator: data movement and management server

– Rapid Marts: suite of packaged data caches for speedy delivery and integration of data

Page 5: BODI Training v1.1

Business Objects Data Integration Platform

Page 6: BODI Training v1.1

Rapid Mart SAP R/3 Modules

Account Payable ----> FI-Finance
Account Receivable ----> FI-Finance
Cost Center ----> CO-Controlling
Human Resources ----> HR-Human Resources
Inventory ----> MM-Materials Management
Plant Maintenance ----> PM-Plant Maintenance
Production Planning ----> PP-Production Planning
Project Systems ----> PS-Project Systems
Purchasing ----> SD-Sales and Distribution
Sales ----> SD-Sales and Distribution

Page 7: BODI Training v1.1

Data Integrator

• DI is a data movement and integration platform

Page 8: BODI Training v1.1

Data Integrator Architecture

Page 9: BODI Training v1.1

Data Integrator operating system platforms

• DI Designer runs on the following Windows platforms:
– NT
– 2000 Professional
– 2000 Server
– 2000 Advanced Server
– 2000 Datacenter Server
– XP

• All other DI components run on the above Windows platforms and the following UNIX platforms:
– Solaris 2.7 and 2.8 (Sun OS releases 5.7 and 5.8)
– HP-UX version 11.00 (PA-RISC 2.0), and 11.1
– IBM AIX 4.3.3.75 with maintenance level 4330-10, and AIX 5.1

Page 10: BODI Training v1.1

Data Integrator Components

• Standard components are:
– DI Job Server
– DI Engine
– DI Designer
– DI Repository
– DI Access Server
– DI Administrator
– DI Metadata Reports tool
– DI Web Server
– DI Service
– DI SNMP Agent

Page 11: BODI Training v1.1

Data Integrator Component Relationships

Page 12: BODI Training v1.1

Data Integrator Components

• DI Job Server
– Starts the data movement engine that integrates data from multiple heterogeneous sources, performs complex data transformations, and manages extractions and transactions.
– Can move data in either batch or real-time mode and uses distributed query optimization, multi-threading, in-memory caching, in-memory data transformations, and parallel pipelining to deliver high data throughput and scalability.

Page 13: BODI Training v1.1

Data Integrator Components

• DI Engine
– When DI jobs are executed, the Job Server starts DI engine processes to perform data extraction, transformation, and movement.
– DI engine processes use parallel pipelining and in-memory data transformations to deliver high data throughput and scalability.

Page 14: BODI Training v1.1

Data Integrator Components

• DI Designer
– Allows for defining data management applications, which consist of data mappings, transformations, and control logic.
– A development tool with a graphical user interface. It enables developers to create objects, then drag, drop, and configure them by selecting icons in flow diagrams, table layouts, and nested workspace pages.

Page 15: BODI Training v1.1

Data Integrator Components

• DI Repository
– A set of tables that hold user-created and predefined system objects, source and target metadata, and transformation rules. It is set up on an open client/server platform to facilitate the sharing of metadata with other enterprise tools. Each repository is stored on an existing RDBMS.
– Associated with one or more DI Job Servers.

Page 16: BODI Training v1.1

Data Integrator Components

– There are two types of repositories:
• A local repository is used by an application designer to store definitions of DI objects (like projects, jobs, work flows, and data flows) and source/target metadata.
• A central repository is an optional component that can be used to support multi-user development. The central repository provides a shared object library allowing developers to check objects in and out of their local repositories.

Page 17: BODI Training v1.1

Data Integrator Components

• DI Access Server
– The Access Server is a real-time, request-reply message broker that collects message requests, routes them to a real-time service, and delivers a message reply within a user-specified time frame.
– The Access Server queues messages and sends them to the next available real-time service across any number of computing resources. This approach provides automatic scalability because the Access Server can initiate additional real-time services on additional computing resources if traffic for a given real-time service is high.
– Multiple Access Servers can also be configured.

Page 18: BODI Training v1.1

Data Integrator Components

• DI Administrator
– Browser-based administration of DI resources, including:
• Scheduling, monitoring, and executing batch jobs
• Configuring, starting, and stopping real-time services
• Configuring Job Server, Access Server, and repository usage
• Configuring and managing adapters
• Managing users
• Publishing batch jobs and real-time services via Web services

Page 19: BODI Training v1.1

Data Integrator Components

• DI Metadata Reports tool
– Provides browser-based reports on DI metadata, which is stored in the repository. Reports are provided for:
• Repository summary
• Job analysis
• Execution statistics
• Impact analysis

Page 20: BODI Training v1.1

Data Integrator Components

• DI Web Server
– Supports browser access to the Administrator and the Metadata Reports tool.
– The Windows service name for this server is DI Web Server service.
– The UNIX equivalent is a daemon named the Tomcat server. This is the servlet engine used by the DI Web Server.

Page 21: BODI Training v1.1

Data Integrator Components

• DI Service
– The DI Service is installed when DI Job and Access Servers are installed. The DI Service starts Job Servers and Access Servers when you reboot your system.
– The Windows service name is DATA INTEGRATOR Service.
– The UNIX equivalent is a daemon named AL_JobService.

Page 22: BODI Training v1.1

Data Integrator Components

• DI SNMP Agent
– DI error events can be communicated using SNMP-supported applications for better error monitoring.
– The DI SNMP agent monitors and records information about the Job Servers and jobs running on the computer where the agent is installed.

Page 23: BODI Training v1.1

Data Integrator Management Tools

• License Server
– The License Server allows you to centrally control license validation for DI components and licensed extensions.

• Repository Manager
– The Repository Manager allows you to create, upgrade, and check the versions of local and central repositories.

• Server Manager
– The Server Manager allows you to add, delete, or edit the properties of Job Servers and Access Servers. It is automatically installed on each computer on which you install a Job Server or Access Server.

Page 24: BODI Training v1.1

Data Integrator Objects

• All “entities” you create, modify, or work with in DI Designer are called objects. The local object library shows objects such as source and target metadata, system functions, projects, and jobs.

• DI has two types of objects:
– Reusable objects
  • Have a single definition
  • All calls to the object refer to the object definition
  • Changes to the object definition propagate to all calls to the object
– Single-use objects
  • Defined only within the context of a single job or data flow, e.g. scripts

Page 25: BODI Training v1.1

Data Integrator Object Relationships

Page 26: BODI Training v1.1

Projects

• A reusable object that allows you to group jobs.
• The highest level of organization offered by DI.
• Used to group jobs that have schedules that depend on one another or that you want to monitor together.
• Only one project can be open at a time.
• Projects cannot be shared among multiple users.

Page 27: BODI Training v1.1

Jobs

• A job is the only object that is executed.
• The following objects can be included in a job definition:
– Data flows
  • Transforms
– Work flows
  • Scripts
  • Conditionals
  • While loops
  • Try/catch blocks

Page 28: BODI Training v1.1

Datastores

• Datastores represent connections between DI and databases or applications, directly or through adapters.

• They allow DI to access metadata from a database or application, and hence to read from or write to that database or application.

• DI datastores can connect to:
– Databases and mainframe file systems.
– Applications that have pre-packaged or user-written DI adapters.
– SAP R/3, SAP BW, PeopleSoft, J.D. Edwards One World, and J.D. Edwards World.

Page 29: BODI Training v1.1

File Formats

• DI can use data stored in files for data sources or data targets.

• File format objects can describe files in:
– Delimited format: characters such as commas or tabs separate each field
– Fixed width format: the column width is specified by the user
– SAP R/3 format
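To make the difference between the first two layouts concrete, here is a minimal Python sketch (not DI's file format editor) that reads the same records once as comma-delimited text and once as fixed-width text. The sample data, field names, and column widths are assumptions invented for the illustration.

```python
import csv
import io

# Hypothetical sample data; field names and widths are illustrative only.
DELIMITED = "1001,North,200\n1002,East,300\n"
FIXED_WIDTH = "1001North  200\n1002East   300\n"
FIELDS = ["id", "region", "sales"]
WIDTHS = [4, 7, 3]  # column widths chosen for this example


def parse_delimited(text, delimiter=","):
    """Split each line into fields on the delimiter character."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(FIELDS, row)) for row in reader]


def parse_fixed_width(text, widths=WIDTHS):
    """Slice each line at the given column widths and trim the padding."""
    rows = []
    for line in text.splitlines():
        values, start = [], 0
        for width in widths:
            values.append(line[start:start + width].strip())
            start += width
        rows.append(dict(zip(FIELDS, values)))
    return rows
```

Both functions return the same list of records, which is the point: the file format only describes how the fields are laid out in the file.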

Page 30: BODI Training v1.1

Data Flows

• Data flows extract, transform, and load data. Everything involved in reading sources, transforming data, and loading targets occurs inside a data flow.

• A data flow can be added to a job or a work flow.

• From inside a work flow, a data flow can send and receive information to and from other objects through input and output parameters.

Page 31: BODI Training v1.1

Data Flows

[Figure: a data flow receives input parameters, reads from source(s), applies data transformation operations, loads target(s), and returns output parameters.]

Page 32: BODI Training v1.1

Work Flows

• A work flow defines the decision-making process for executing data flows.

• The purpose of a work flow is to prepare for executing data flows and to set the state of the system after the data flows are complete.

• The following objects can be elements in work flows:
– Work flows
– Data flows
– Scripts
– Conditionals
– While loops
– Try/catch blocks

Page 33: BODI Training v1.1

Work Flows

[Figure: a work flow in which control operations run before and after a data flow.]

Page 34: BODI Training v1.1

Conditionals

• Conditionals are single-use objects used to implement if/then/else logic in a work flow.

• To define a conditional, you specify a condition and two logical branches:
– If: A Boolean expression that evaluates to TRUE or FALSE. You can use functions, variables, and standard operators to construct the expression.
– Then: Work flow elements to execute if the If expression evaluates to TRUE.
– Else: (Optional) Work flow elements to execute if the If expression evaluates to FALSE.

Page 35: BODI Training v1.1

Conditionals

[Figure: a work flow containing a conditional. If the process is successful (True), the Then branch runs a work flow; otherwise (False), the Else branch sends an e-mail.]

Page 36: BODI Training v1.1

While Loops

• The while loop is a single-use object that you can use in a work flow.

• The while loop repeats a sequence of steps as long as a condition is true.

Page 37: BODI Training v1.1

While Loops

[Figure: a while loop. While the condition Number != 0 is True, Step 1 and Step 2 run and the condition is checked again; when it is False, the loop exits.]

Page 38: BODI Training v1.1

Try / Catch Blocks

• A try/catch block is a combination of one try object and one or more catch objects that allow you to specify alternative work flows if errors occur while DI is executing a job.

• Try/catch blocks:
– “Catch” classes of exceptions “thrown” by DI, the DBMS, or the operating system
– Apply solutions that you provide
– Continue execution

• Try and catch objects are single-use objects.

Page 39: BODI Training v1.1

Try / Catch Blocks

• Categories of available exceptions are:
– Database access errors
– Email errors
– Engine abort errors
– Execution errors
– File access errors
– Microsoft connection errors
– Parser errors
– Predefined transform errors
– Repository access errors
– Resolver errors
– System exception errors
– User transform errors

Page 40: BODI Training v1.1

Scripts

• Scripts are single-use objects used to call functions and assign values to variables in a work flow.

• A script can contain the following statements:
– Function calls
– If statements
– While statements
– Assignment statements
– Operators

Page 41: BODI Training v1.1

Types Of Lookup Functions

• A lookup function retrieves a value from a table or file based on the values in a different source table or file.

• The lookup functions are:
– 1) Lookup
– 2) Lookup_Ext
– 3) Lookup_Seq
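The general idea behind a lookup can be sketched in a few lines of Python. This is only an illustration of the concept, not the signature or behavior of DI's Lookup functions; the helper `lookup_value`, the table contents, and the default value are all hypothetical.

```python
# Hypothetical lookup table: maps a region code to a region name.
REGION_TABLE = [
    {"code": "N", "name": "North"},
    {"code": "E", "name": "East"},
]


def lookup_value(table, result_column, default, compare_column, value):
    """Return result_column from the first row whose compare_column equals value."""
    for row in table:
        if row[compare_column] == value:
            return row[result_column]
    return default  # no matching row in the lookup table


# Source rows carry only the code; the lookup adds the descriptive name.
source_rows = [{"order": 1, "region_code": "E"}, {"order": 2, "region_code": "X"}]
enriched = [
    {**row, "region_name": lookup_value(REGION_TABLE, "name", "UNKNOWN", "code", row["region_code"])}
    for row in source_rows
]
```

Rows with no match in the lookup table receive the default value instead of being dropped.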

Page 42: BODI Training v1.1

Variables

• Variables are symbolic placeholders for values.

– Local Variables
  • Local variables are local to the work flow in which they are defined: a local variable defined in a work flow is available for use in any of the single-use objects in the work flow.
  • The value of the local variable can be passed as a parameter into another work flow or data flow called in the work flow.

Page 43: BODI Training v1.1

Variables

– Global Variables
  • Global variables are global within a job.
  • Once a name for a global variable is used in a job, that name becomes reserved for the job.
  • Global variables are exclusive within the context of the job in which they are created.
  • Setting parameters is not necessary when you use global variables.

Page 44: BODI Training v1.1

Parameters

• Parameters are expressions passed to a work flow or data flow when the work flow or data flow is called.

• Parameters can be defined to pass values into and out of work flows, data flows, and custom functions.

Page 45: BODI Training v1.1

Transforms

• The following transforms are available from the object library on the Transforms tab:
-- Case
-- Date_Generation
-- Effective_Date
-- Hierarchy_Flattening
-- History_Preserving
-- Key_Generation
-- Map_Operation
-- Merge
-- Pivot (Columns to Rows)
-- Query
-- Reverse Pivot (Rows to Columns)
-- Row_Generation
-- SQL
-- Table_Comparison

Page 46: BODI Training v1.1

Query Transform

• Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.
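Since the slide compares the Query transform to a SQL SELECT statement, the following Python sketch runs such a statement against an in-memory SQLite database. The `sales` table and its rows are assumptions invented for the example; in DI the equivalent column mapping and filtering would be configured in the transform editor rather than typed as SQL.

```python
import sqlite3

# In-memory database standing in for a source table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", 2001, 200), ("North", 2002, 300), ("East", 2001, 300)],
)

# SELECT list = output columns, WHERE = row filter, ORDER BY = sort order.
rows = conn.execute(
    "SELECT region, amount FROM sales WHERE year = ? ORDER BY region",
    (2001,),
).fetchall()
conn.close()
```

Only the rows that satisfy the condition reach the output, and only the selected columns are carried forward.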

Page 47: BODI Training v1.1

Query Transform

Page 48: BODI Training v1.1

Query Transform

[Figure: the Query transform editor, showing the input schema, output schema, and options areas.]

Page 49: BODI Training v1.1

Case Transform

• Specifies multiple paths in a single transform (different rows are processed in different ways).

• Simplifies branch logic in data flows by consolidating case or decision-making logic in one transform.

• Paths are defined in an expression table.
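The routing idea can be sketched in Python as follows. The expression table, the labels, and the rule that a row follows only the first matching expression are assumptions made for this illustration, not a description of how the Case transform is configured in DI.

```python
# Hypothetical expression table: (label, predicate) pairs checked in order.
EXPRESSIONS = [
    ("north", lambda row: row["region"] == "North"),
    ("large", lambda row: row["sales"] >= 500),
]


def route(rows, expressions, default_label="default"):
    """Send each row to the output path of the first expression it satisfies."""
    outputs = {label: [] for label, _ in expressions}
    outputs[default_label] = []
    for row in rows:
        for label, predicate in expressions:
            if predicate(row):
                outputs[label].append(row)
                break
        else:
            outputs[default_label].append(row)  # no expression matched
    return outputs


rows = [
    {"region": "North", "sales": 200},
    {"region": "East", "sales": 600},
    {"region": "West", "sales": 350},
]
paths = route(rows, EXPRESSIONS)
```

Each output path can then be connected to its own downstream processing, so different rows are handled in different ways.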

Page 50: BODI Training v1.1

Case Transform

Page 51: BODI Training v1.1

Case Transform

Page 52: BODI Training v1.1

SQL Transform

• Performs the indicated SQL query operation.

• Use this transform to perform standard SQL operations for things that cannot be performed using other built-in transforms.

Page 53: BODI Training v1.1

SQL Transform

Page 54: BODI Training v1.1

SQL Transform

Page 55: BODI Training v1.1

Merge Transform

• Combines incoming data sets, producing a single output data set with the same schema as the input data sets.
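A minimal Python sketch of this behavior is shown below, assuming that merging simply concatenates the rows (similar to SQL UNION ALL) and that each row is a dictionary whose keys form the schema. The function name and sample data are hypothetical.

```python
def merge(*data_sets):
    """Concatenate data sets that all share one schema."""
    schemas = {tuple(row.keys()) for data_set in data_sets for row in data_set}
    if len(schemas) > 1:
        raise ValueError("all input data sets must have the same schema")
    return [row for data_set in data_sets for row in data_set]


east = [{"region": "East", "sales": 300}]
west = [{"region": "West", "sales": 350}, {"region": "West", "sales": 800}]
merged = merge(east, west)
```

The output keeps every input row and has the same columns as the inputs.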

Page 56: BODI Training v1.1

Merge Transform

Page 57: BODI Training v1.1

Merge Transform

Page 58: BODI Training v1.1

Row_Gen Transform

• Produces a data set with a single column.

• The column values start from zero and increment by one to a specified number of rows.
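As a small illustration, the Python sketch below produces the same kind of single-column data set. The column name `row_id` is an assumption chosen for the example.

```python
def generate_rows(row_count, column_name="row_id"):
    """Return row_count single-column rows numbered 0, 1, 2, ..."""
    return [{column_name: n} for n in range(row_count)]


rows = generate_rows(3)
```

Here `rows` holds three rows whose values are 0, 1, and 2.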

Page 59: BODI Training v1.1

Row_Gen Transform

Page 60: BODI Training v1.1

Key_Generation Transform

• Generates new keys for new rows in a data set.

• The Key_Generation transform looks up the maximum existing key value from a table and uses it as the starting value to generate new keys.
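The logic described above can be sketched in Python as follows. The function name, the `id` key column, and the increment of 1 are assumptions for the illustration; the existing keys would come from the target table in practice.

```python
def generate_keys(existing_keys, new_rows, key_column="id", increment=1):
    """Assign keys to new_rows, continuing from the largest existing key."""
    next_key = max(existing_keys, default=0) + increment
    keyed = []
    for row in new_rows:
        keyed.append({**row, key_column: next_key})
        next_key += increment
    return keyed


# Keys already present in the (hypothetical) target table: 1, 2 and 7.
keyed = generate_keys([1, 2, 7], [{"name": "a"}, {"name": "b"}])
```

The new rows receive keys 8 and 9, so they cannot collide with keys already in the table.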

Page 61: BODI Training v1.1

Key_Generation Transform

Page 62: BODI Training v1.1

Key_Generation Transform

Page 63: BODI Training v1.1

Date_Generation Transform

• Produces a series of dates incremented as you specify.

• Use this transform to produce the key values for a time dimension target. From this generated sequence you can populate other fields in the time dimension (such as day_of_week) using functions in a query.
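A rough Python equivalent of generating a date series and deriving a day_of_week field from it is shown below. The function name, the date range, and the `date_key` column name are assumptions for the sketch.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def generate_dates(start, end, step_days=1):
    """Yield dates from start to end (inclusive), step_days apart."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)


# Each generated date becomes a key; other fields are derived from it.
time_dimension = [
    {"date_key": d, "day_of_week": DAY_NAMES[d.weekday()]}
    for d in generate_dates(date(2011, 2, 18), date(2011, 2, 20))
]
```

The derived column plays the role that a function call in a downstream query would play in DI.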

Page 64: BODI Training v1.1

Date_Generation Transform

Page 65: BODI Training v1.1

Date_Generation Transform

Page 66: BODI Training v1.1

Table_Comparison Transform

• Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT or UPDATE.

• The Table_Comparison transform allows you to detect and forward changes that have occurred since the last time a target was updated.
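The comparison can be sketched in Python as follows, assuming each data set is a list of dictionaries and a single key column identifies a row. The function name and sample data are hypothetical, and only the INSERT and UPDATE flags mentioned above are modeled.

```python
def compare_tables(source_rows, target_rows, key):
    """Flag source rows as INSERT (new key) or UPDATE (changed values)."""
    target_by_key = {row[key]: row for row in target_rows}
    changes = []
    for row in source_rows:
        existing = target_by_key.get(row[key])
        if existing is None:
            changes.append(("INSERT", row))
        elif existing != row:
            changes.append(("UPDATE", row))
        # Rows identical to the target produce no output.
    return changes


target = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}]
source = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Milan"},
    {"id": 3, "city": "Bern"},
]
changes = compare_tables(source, target, key="id")
```

Only the changed and new rows are forwarded, which keeps the load into the target small.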

Page 67: BODI Training v1.1

Table_Comparison Transform

Page 68: BODI Training v1.1

Map_Operation Transform

• Allows conversions between data manipulation operations.

• The Map_Operation transform allows you to change operation codes on data sets to produce the desired output.

– For example, if a row in the input data set has been updated in some previous operation in the data flow, you can use this transform to map the UPDATE operation to an INSERT. The result could be to convert UPDATE rows to INSERT rows to preserve the existing row in the target.
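The UPDATE-to-INSERT example can be sketched in Python, assuming each row travels with an operation code as an (operation, row) pair. The mapping table and function name are hypothetical.

```python
# Hypothetical mapping: UPDATE rows become INSERT rows; other codes pass through.
OPERATION_MAP = {"UPDATE": "INSERT"}


def map_operations(flagged_rows, mapping=OPERATION_MAP):
    """Rewrite the operation code of each (operation, row) pair."""
    return [(mapping.get(op, op), row) for op, row in flagged_rows]


flagged = [
    ("UPDATE", {"id": 2, "city": "Milan"}),
    ("INSERT", {"id": 3, "city": "Bern"}),
]
remapped = map_operations(flagged)
```

After remapping, both rows reach the target as inserts, so the existing row for id 2 is left in place.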

Page 69: BODI Training v1.1

Map_Operation Transform

Page 70: BODI Training v1.1

Table_Comparison & Map_Operation Transforms

Page 71: BODI Training v1.1

History_Preserving Transform

• The History_Preserving transform allows you to produce a new row in your target rather than updating an existing row.

• You can indicate in which columns the transform identifies changes to be preserved.

• If the value of certain columns changes, this transform creates a new row for each row flagged as UPDATE in the input data set.
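A simplified Python sketch of this behavior follows. It assumes rows arrive as (operation, row) pairs and that a change in any tracked column turns an UPDATE into an INSERT; details a real history-preserving load would also need, such as validity dates or current-row flags, are deliberately left out. All names and data are hypothetical.

```python
def preserve_history(flagged_rows, target_rows, key, compare_columns):
    """Turn UPDATE rows into INSERT rows when a tracked column changed."""
    current = {row[key]: row for row in target_rows}
    output = []
    for op, row in flagged_rows:
        if op == "UPDATE":
            old = current[row[key]]
            changed = any(old[c] != row[c] for c in compare_columns)
            output.append(("INSERT" if changed else "UPDATE", row))
        else:
            output.append((op, row))
    return output


target = [
    {"id": 1, "city": "Oslo", "phone": "111"},
    {"id": 2, "city": "Rome", "phone": "222"},
]
flagged = [
    ("UPDATE", {"id": 1, "city": "Bergen", "phone": "111"}),  # tracked column changed
    ("UPDATE", {"id": 2, "city": "Rome", "phone": "999"}),    # only untracked column changed
]
result = preserve_history(flagged, target, key="id", compare_columns=["city"])
```

Only the change to the tracked `city` column produces a new row; the phone change remains an ordinary update.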

Page 72: BODI Training v1.1

Pivot Transform (Columns to Rows)

• Creates a new row for each value in a column that you identify as a pivot column.

• The Pivot transform allows you to change how the relationship between rows is displayed.

• For each value in each pivot column, DI produces a row in the output data set.

• You can create pivot sets to specify more than one pivot column.

Page 73: BODI Training v1.1

Pivot Transform (Columns to Rows)

Input data set:

Region  Sales-2001  Sales-2002  Sales-2003
North   200         300         400
East    300         600         700
West    350         800         770
South   800         200         3750

Output data set (rows for North shown):

Region  Year  Sales
North   2001  200
North   2002  300
North   2003  400
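The columns-to-rows step shown in the tables above can be reproduced with a short Python sketch. The function name and the way pivot columns are passed (a dictionary from column name to the value written in the Year column) are assumptions made for the illustration.

```python
def pivot(rows, pivot_columns, header_column="Year", data_column="Sales"):
    """Emit one output row per pivot column of every input row."""
    output = []
    for row in rows:
        # Non-pivot columns are carried over unchanged.
        fixed = {k: v for k, v in row.items() if k not in pivot_columns}
        for column, header in pivot_columns.items():
            output.append({**fixed, header_column: header, data_column: row[column]})
    return output


wide = [{"Region": "North", "Sales-2001": 200, "Sales-2002": 300, "Sales-2003": 400}]
long_rows = pivot(wide, {"Sales-2001": 2001, "Sales-2002": 2002, "Sales-2003": 2003})
```

One input row with three pivot columns becomes three output rows, matching the North rows in the output table.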

Page 74: BODI Training v1.1

Reverse Pivot Transform (Rows to Columns)

• Creates one row of data from several existing rows.

• The Reverse Pivot transform allows you to combine data from several rows into one row by creating new columns.

• For each unique value in a pivot axis column and each selected pivot column, DI produces a column in the output data set.

Page 75: BODI Training v1.1

Reverse Pivot Transform (Rows to Columns)

Input data set:

Region  Year  Sales
North   2001  200
North   2002  300
North   2003  400

Output data set:

Region  2001  2002  2003
North   200   300   400
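The rows-to-columns step in the example above can be sketched in Python as follows. The function and parameter names are hypothetical; the axis column supplies the new column names and the data column supplies their values.

```python
def reverse_pivot(rows, group_column, axis_column, data_column):
    """Collapse rows sharing group_column into one row with a column per axis value."""
    output = {}
    for row in rows:
        merged = output.setdefault(row[group_column], {group_column: row[group_column]})
        merged[row[axis_column]] = row[data_column]
    return list(output.values())


long_rows = [
    {"Region": "North", "Year": 2001, "Sales": 200},
    {"Region": "North", "Year": 2002, "Sales": 300},
    {"Region": "North", "Year": 2003, "Sales": 400},
]
wide = reverse_pivot(long_rows, "Region", "Year", "Sales")
```

The three North rows collapse into a single row with one column per year.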

Page 76: BODI Training v1.1

Functions

• Functions operate on single values, such as values in specific columns in a data set.

• You can use functions in the following operations:

– Queries
– Scripts
– Conditionals

• You can use:
– Built-in functions (DI functions)
– Custom functions (user-defined functions)
– Database and application functions (functions specific to the DBMS)

Page 77: BODI Training v1.1

Procedures

• DI supports the use of stored procedures for Oracle, Microsoft SQL Server, Sybase, and DB2 databases.

• You can call stored procedures from the jobs you create and run in DI.

Page 78: BODI Training v1.1

Debugging

• Execute a job in the Data Scan mode

• View and analyze the output data in the Data Scan window

• Compare and analyze different data samples

Page 79: BODI Training v1.1

Debugging – Data Scan Mode

Page 80: BODI Training v1.1

Debugging – Analyzing The Output

[Figure: the Data Scan window, showing the object list, scan date and time, schema area, and data area.]

Page 81: BODI Training v1.1

Migration and Repositories

• The development process you use to create your ETL application involves three distinct phases: design, test, and production.

• Each phase may require a different computer in a different environment, and different security settings for each.

• To control the environment differences, each phase may require a different repository.

Page 82: BODI Training v1.1

Migration and Repositories

[Figure: objects are exported from the design repository to the test repository, and from the test repository to the production repository.]

Page 83: BODI Training v1.1

Migration and Repositories

• When moving objects from one phase to another, export jobs from your source repository to either a file or a database, then import them into your target repository.

Page 84: BODI Training v1.1

Exporting Objects to a Database

• You can export objects from the current repository to another repository.

• However, the other repository must be the same version as the current one.

• The export process allows you to change environment-specific information defined in datastores and file formats to match the new environment.

Page 85: BODI Training v1.1

Exporting/Importing Objects to/from a File

• You can also export objects to a file.

• If you choose a file as the export destination, DI does not provide options to change environment-specific information.

• Importing objects or an entire repository from a file overwrites existing objects with the same names in the destination repository. You must restart DI after the import process completes.

Page 86: BODI Training v1.1

Parallel Execution

• The maximum number of parallel DI engine processes is set in the Job Server options (Tools > Options > Job Server > Environment).

• This setting helps transforms run in parallel.

Page 87: BODI Training v1.1

Parallel Work Flows / Data Flows
