Informatica PowerCenter Training - Day 3

Informatica Power Center - Workflow Manager


Page 1: Informatica Power Center - Workflow Manager

Informatica PowerCenter Training - Day 3


Agenda day 3

Workflow Manager in Detail

Workflow Monitor

Error Logging

Labs


Workflow Manager – the Details


Workflow Manager - In Detail

The Workflow Manager is the key to loading the final product into the target database(s).

Used to manage how jobs run: their order and criteria

Used for scheduling job runs

Used to notify users when a job has completed or failed

Used to partition loads and perform performance tuning


Register Server

Similar to the Relational Connection dialog. The same parameters apply, with the exception of the new variables for workflow logs.


Assign to Workflows

While folders are closed, it is possible to assign a server to a particular session.

This dialog allows individual or globally selected sessions to be assigned to run on a particular server.


Links and Conditions

Definition

Links and their underlying conditions are what provide process control to the workflow. When an attached link condition resolves to TRUE, the attached object may begin processing. There can be no looping, and links can only execute once per workflow. However, more complex branching and decisions can be made by combining multiple links to a single object or by branching into decision-type paths. Each link has its own expression editor and can use upstream resolved object variables or user-defined variables for its own evaluation.

Screenshot callout: Link condition
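As a rough illustration of the link behavior described above, the following Python sketch models a workflow as tasks joined by links, where each link carries a condition over the upstream task's variables. It is not PowerCenter code: the `Link` class, the `runnable_tasks` helper, and the task names are hypothetical, and the sketch assumes a downstream task needs all of its incoming links to be TRUE.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for the resolved object variables of one task.
TaskVars = Dict[str, object]


@dataclass
class Link:
    source: str                             # upstream task name
    target: str                             # downstream task name
    condition: Callable[[TaskVars], bool]   # the link condition


def runnable_tasks(links: List[Link], finished: Dict[str, TaskVars]) -> List[str]:
    """Return the targets whose incoming links all resolve to TRUE.

    A link is only evaluated once its source task has finished.
    """
    incoming: Dict[str, List[Link]] = {}
    for link in links:
        incoming.setdefault(link.target, []).append(link)

    ready = []
    for target, target_links in incoming.items():
        if all(
            link.source in finished and link.condition(finished[link.source])
            for link in target_links
        ):
            ready.append(target)
    return sorted(ready)


# Hypothetical workflow: one staging session with three outgoing links.
links = [
    Link("s_load_stage", "s_load_dim", lambda v: v["Status"] == "SUCCEEDED"),
    Link("s_load_stage", "e_notify_failure", lambda v: v["Status"] == "FAILED"),
    Link("s_load_stage", "s_load_fact",
         lambda v: v["Status"] == "SUCCEEDED" and v["SrcSuccessRows"] > 0),
]
finished = {"s_load_stage": {"Status": "SUCCEEDED", "SrcSuccessRows": 0}}
print(runnable_tasks(links, finished))
```

With the staging session succeeding but reading zero rows, only the dimension load is released; the fact load stays blocked because its link condition also checks `SrcSuccessRows`.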


Links and Conditions

Object Variables

The default set of object variables from a session can provide more information than just a status of ‘Completed’. More complex evaluation can be done for ErrorCode, StartTime, SrcSuccessRows, etc.

In addition to the default object variables, user-defined variables can be created and populated via parameter files or changed in the workflow via Assignment tasks. Also, any upstream task that has completed can have its variables used in downstream link conditions.

Screenshot callout: Object Variables


Tasks

Local Tasks: Sessions, Commands, Email, Decision, Assignment, Timer, Control, Event Raise, Event Wait

Global (Reusable) Tasks: Sessions, Commands, Email

Tasks are the default units of work for building the workflow. Global tasks are reusable across workflows. Local tasks are independent and self-contained within workflows.


Sessions

Session -> Workflow Notification

Options can be set to treat conditional links attached to the object with AND/OR functionality. There is also a control option to fail the parent (container) if the task fails or does not run.

Disabling a task in a workflow allows the task to be skipped instead of having to remove it.

Screenshot callout: Updated parameters


Sessions - Continued

Components

The area where commands or email unique to this object can be defined. Alternatively, you can select a reusable task to use.

Screenshot callout: Choice of reusable or local command


Non-Reusable Commands

Components

Regardless of whether a command is reusable or non-reusable, it is necessary to name the object, since there is potential to promote it.

Screenshot callouts: Option for local or reusable; Name of command object


Non-Reusable Commands

Components

The Properties tab allows for error control for commands/tasks.

Screenshot callout: Error control for multiple commands/tasks


Sessions - Continued

Partitions

The new partitioning scheme allows for repartitioning after the Source Qualifier at almost any other transformation object in the mapping. There are four main partition types: Pass Through, Round Robin, Hash Auto Keys, and Hash User Keys.

Screenshot callouts: Add partition points; Change partition type


Session Partitions (Partition Points)

Partition points mark thread boundaries as well as divide the pipeline into stages.

The partition point at the source qualifier marks the boundary between the first (reader) and second (transformation) stages. The partition point at the Aggregator transformation marks the boundary between the second and third (transformation) stages. The partition point at the target instance marks the boundary between the third (transformation) and fourth (writer) stage.


Session Partitions (Partition Types)

Round-robin partitioning. The Informatica Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows.

Hash partitioning. The Informatica Server applies a hash function to a partition key to group data among partitions.

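Key range partitioning. You specify one or more ports to form a compound partition key. The Informatica Server passes data to each partition depending on the ranges you specify for each port.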

Pass-through partitioning. The Informatica Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
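To make the difference between the partition types concrete, here is a small Python sketch of how rows might be distributed under each scheme. It only illustrates the ideas above and does not reflect how the Informatica Server is implemented; the function names, the row layout, and the use of `crc32` as the hash function are all assumptions.

```python
from itertools import cycle
from typing import Dict, List, Sequence, Tuple
from zlib import crc32

Row = Dict[str, object]
Partitions = List[List[Row]]


def round_robin(rows: Sequence[Row], n: int) -> Partitions:
    """Deal rows out in turn so each partition gets about the same count."""
    parts: Partitions = [[] for _ in range(n)]
    for row, idx in zip(rows, cycle(range(n))):
        parts[idx].append(row)
    return parts


def hash_partition(rows: Sequence[Row], key: str, n: int) -> Partitions:
    """Send rows with equal key values to the same partition."""
    parts: Partitions = [[] for _ in range(n)]
    for row in rows:
        # crc32 is chosen only because it is deterministic across runs.
        idx = crc32(str(row[key]).encode("utf-8")) % n
        parts[idx].append(row)
    return parts


def key_range(rows: Sequence[Row], key: str,
              ranges: Sequence[Tuple[int, int]]) -> Partitions:
    """Route each row to the partition whose [start, end) range holds its key.

    Rows that match no range are dropped in this sketch.
    """
    parts: Partitions = [[] for _ in ranges]
    for row in rows:
        for idx, (start, end) in enumerate(ranges):
            if start <= row[key] < end:
                parts[idx].append(row)
                break
    return parts


def pass_through(parts: Partitions) -> Partitions:
    """Hand every partition to the next stage without redistributing rows."""
    return parts


items = ["bolt", "nut", "bolt", "washer", "nut", "bolt", "gear"]
rows = [{"id": i, "item": item} for i, item in enumerate(items, start=1)]

print([len(p) for p in round_robin(rows, 3)])
print([[r["id"] for r in p] for p in key_range(rows, "id", [(1, 4), (4, 8)])])
```

Round-robin balances row counts regardless of content, while hash partitioning keeps equal keys together, which is what grouping transformations such as a Sorter or Aggregator need.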


Partitions Defined

First stage. To read data from the three flat files concurrently, you must specify three partitions at the source qualifier. Accept the default partition type, pass-through.

Second Stage. Since the source files vary in size, each partition processes a different amount of data. Set a partition point at the Filter transformation, and choose round-robin partitioning to balance the load going into the Filter transformation.

Third Stage. To eliminate overlapping groups in the Sorter and Aggregator transformations, use hash auto-keys partitioning at the Sorter transformation. This causes the Informatica Server to group all items with the same description into the same partition before the Sorter and Aggregator transformations process the rows.

Fourth Stage. Since the target tables are partitioned by key range, specify key range partitioning at the target to optimize writing data to the target.


Command Tasks

Command

The command object can be created globally under the Task Developer. It can also be promoted here from within a mapping. The Command task is used to call shell commands during the workflow.

Screenshot callout: Created in Task Developer


Command Tasks

Command

The Properties section holds the option to either run all commands regardless of failures or run each command only if the previous command completes. The Commands tab is where the actual commands are created, one command per line.

Screenshot callout: Process control for multiple commands
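The two process-control behaviors for multiple commands can be sketched in Python as follows. This is only an analogy, not how the Command task is implemented: the `run_commands` function and its `stop_on_failure` flag are hypothetical names, and the "commands" are small Python subprocesses that exit with a chosen code so the example runs on any platform.

```python
import subprocess
import sys
from typing import List, Sequence


def run_commands(commands: Sequence[Sequence[str]],
                 stop_on_failure: bool) -> List[int]:
    """Run commands in order and return the exit codes of those that ran."""
    exit_codes: List[int] = []
    for argv in commands:
        code = subprocess.run(argv).returncode
        exit_codes.append(code)
        if stop_on_failure and code != 0:
            break  # skip the remaining commands
    return exit_codes


def exit_with(code: int) -> List[str]:
    """Portable stand-in for a shell command that exits with `code`."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


commands = [exit_with(0), exit_with(1), exit_with(0)]
print(run_commands(commands, stop_on_failure=False))  # all three commands run
print(run_commands(commands, stop_on_failure=True))   # stops after the failure
```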


Email Tasks

Email

The Email task is very similar to the Command task, since it can be either created in the Task Developer or promoted from a mapping. The Properties tab provides an expression editor for text creation using the built-in variables.

Screenshot callouts: Email text creation dialog; Built-in variables


Workflow Variables

Pre-defined Variables

This is the list of all pre-defined task-level variables available for evaluation.

Variable | Task Type | Datatype
Condition | Decision task | Integer
EndTime | All tasks | Date/time
ErrorCode | All tasks | Integer
ErrorMsg | All tasks | Nstring*
FirstErrorCode | Session task | Integer
FirstErrorMsg | Session task | Nstring*
PrevTaskStatus | All tasks | Integer
SrcFailedRows | Session task | Integer
SrcSuccessRows | Session task | Integer
StartTime | All tasks | Date/time
Status** | All tasks | Integer
TgtFailedRows | Session task | Integer
TgtSuccessRows | Session task | Integer
TotalTransErrors | Session task | Integer

* Variables of type Nstring can have a maximum length of 600 characters.

** Supported Status returns: ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, SUCCEEDED


Workflow Variables

User-defined Variables

Variables are created at the container level, much like in mappings (Workflows = Mappings, Worklets = Mapplets). Once created, values can be passed to objects within the same container for evaluation. (An Assignment task can modify or calculate variables.)

Screenshot callout: Edit Variables


Workflow Variables

User-defined Variables

A user-defined variable can assist in more complex evaluations. In the above example, an external parameter file contains the number of expected rows. This in turn is evaluated against the actual rows successfully read from an upstream session. $ signifies and is reserved for pre-defined variables. User-defined variables should maintain $$ naming.

Screenshot callouts: User-Defined Variables; Pre-Defined Variable
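The expected-rows check described above can be sketched in Python. This is an illustration only: the parameter file layout, the workflow name, the `$$ExpectedRows` variable, and both helper functions are hypothetical, chosen to mirror the idea of comparing a user-defined variable with a session's `SrcSuccessRows`.

```python
from typing import Dict


def read_parameters(text: str) -> Dict[str, str]:
    """Parse NAME=value lines, skipping blank lines and [section] headers."""
    params: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def link_condition(user_vars: Dict[str, str],
                   session_vars: Dict[str, int]) -> bool:
    """TRUE when the upstream session read exactly the expected row count."""
    expected = int(user_vars["$$ExpectedRows"])
    return session_vars["SrcSuccessRows"] == expected


# Hypothetical parameter file contents; the layout is an assumption.
parameter_file = """
[Training.WF:wf_load_customers]
$$ExpectedRows=1500
"""

user_vars = read_parameters(parameter_file)
print(link_condition(user_vars, {"SrcSuccessRows": 1500}))
print(link_condition(user_vars, {"SrcSuccessRows": 1499}))
```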


Assignment Task

Usage

The Assignment task allows the user to assign a value to a user-defined workflow variable. To use the Assignment task, first create it and add it to the workflow. Then configure it by assigning values or expressions to user-defined variables. The assigned value will then be used for the remainder of the workflow.

Screenshot callout: Edit Variables


Event Task

Usage

Event tasks are used to specify the sequence of task execution. The event is triggered based on the completion of a sequence of tasks. The Event-Raise and Event-Wait tasks are used together to work with events in a workflow.

Screenshot callout: Edit Events


Event Task

Usage

If using Event tags, an Event Raise is used in conjunction with an Event Wait. In the above example, two branches are executed in parallel. The second session of the lower branch will remain in stasis until the upper branch completes, triggering the event. The lower branch's Event Wait task recognizes the event and allows the second session to start.

Screenshot callouts: Event Raise; Event Wait
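The two-branch pattern above maps closely onto a thread synchronization event. The Python sketch below is an analogy under that assumption, not PowerCenter behavior: the branch functions and step names are hypothetical, and `threading.Event` plays the role of the user-defined event.

```python
import threading
from typing import List

steps: List[str] = []
steps_lock = threading.Lock()
upper_done = threading.Event()  # stands in for the user-defined event


def record(step: str) -> None:
    with steps_lock:
        steps.append(step)


def upper_branch() -> None:
    record("upper: session 1")
    record("upper: event raise")
    upper_done.set()            # Event Raise: signal the waiting branch


def lower_branch() -> None:
    record("lower: session 1")
    upper_done.wait()           # Event Wait: block until the event is raised
    record("lower: session 2")


# Start the lower branch first to show that it really does wait.
threads = [threading.Thread(target=lower_branch),
           threading.Thread(target=upper_branch)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(steps)
```

The first sessions of the two branches may interleave in either order, but "lower: session 2" always appears after "upper: event raise".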


Event Raise

Usage

To configure the Event Raise task, use the drop-down box to select the appropriate user-defined Event tag. This will create an entry in the repository for a matching Event Wait to look for.


Event Wait

Usage

The Event Wait can be configured to wait for an Event Raise (user-defined event) or to check for the existence of an indicator file.

Screenshot callouts: User-Defined Event; Indicator File


Event Wait

Usage

The Properties section of the Event Wait task allows for further definition of behavior. If your workflow has failed or been suspended after the Event Raise but before the Event Wait has resolved, the Enable Past Events option can recognize that the event has already happened. If working with indicator files, you can either delete the file or allow it to stay in case some downstream Event Waits are also looking for that file.

Screenshot callouts: Resume/Restart Support; Flat-file Cleanup


Decision Task

Usage

The Decision task allows for True/False-based branching of process ordering. The Decision task can hold multiple conditions, so downstream links can be evaluated simply on the Decision being True or False.

**Note: it is possible to base the decision on the SUCCEEDED or FAILED status of a previous task. However, if the workflow is set to suspend on error, then that branch is suspended and the decision won’t trigger on a FAILED condition.


Control Task

Usage

The Control task is used in a branching manner to provide a level of stoppage during the workflow. Consider the case where too many sessions have too many failed rows. The options allow for different levels, from failing at the object level to aborting the whole workflow.


Timer Task

Usage

The Timer task has two main modes. The first is absolute time: a time evaluated against server time, or a user-defined variable that contains the date/time stamp at which to start.


Timer Task

Usage

The second mode is relative time, which offers options for time calculated from when the process reached this (Timer) task, from the start of the container of this task, or from the start of the absolute top-level workflow.
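The absolute and relative modes can be compared with a small Python sketch. The function names, the `starts` dictionary, and its keys are hypothetical; the sketch only shows how each option picks a different reference point for the same offset.

```python
from datetime import datetime, timedelta
from typing import Dict


def absolute_fire_time(target: datetime) -> datetime:
    """Absolute mode: fire at a fixed date/time (literal or from a variable)."""
    return target


def relative_fire_time(offset: timedelta, relative_to: str,
                       starts: Dict[str, datetime]) -> datetime:
    """Relative mode: fire `offset` after the chosen reference start time."""
    return starts[relative_to] + offset


def seconds_until(fire_time: datetime, now: datetime) -> float:
    """How long the timer still has to wait (never negative)."""
    return max(0.0, (fire_time - now).total_seconds())


# Hypothetical start times for the three relative reference points.
starts = {
    "timer_task": datetime(2004, 10, 19, 22, 5),
    "parent": datetime(2004, 10, 19, 22, 0),
    "top_level_workflow": datetime(2004, 10, 19, 21, 30),
}

fire = relative_fire_time(timedelta(minutes=30), "parent", starts)
print(fire)
print(seconds_until(fire, now=datetime(2004, 10, 19, 22, 20)))
```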


Practical

Business Case

Three sessions each need to wait for indicator file(s) before starting. The window of opportunity is only between 10PM and 2AM (next morning). A cutoff time is needed to stop the process (the polling, not existing runs) so that new activity does not continue between 2AM and 10PM. The workflow is scheduled to run every day at 10PM.

Objects Used:
• Assignment Task – Assigns the appropriate cutoff time for logic
• File Wait Tasks – Poll for the appropriate indicator files
• Timer Task – Assigned to start based on the variable assigned by the Assignment task
• Command Tasks – After the cutoff time, the commands will put an indicator file to release the polling

Link Logic – The remainder of the logic is contained within the links themselves. The main sessions compare the end time of the file wait tasks to the cutoff time. If within the cutoff, the sessions will run. If over the cutoff, the sessions will not run. The cutoff branch also checks whether the file wait tasks are running over. If they are still running, the command tasks will fire.
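The cutoff logic of this business case can be sketched in Python. The function names are hypothetical, and the sketch assumes the workflow starts before midnight (as with the 10PM schedule), so the cutoff is 2AM on the following calendar day.

```python
from datetime import datetime, time, timedelta


def cutoff_for(workflow_start: datetime) -> datetime:
    """Assignment task: 2AM on the morning after the workflow starts."""
    next_day = workflow_start.date() + timedelta(days=1)
    return datetime.combine(next_day, time(2, 0))


def session_may_run(file_wait_end: datetime, cutoff: datetime) -> bool:
    """Link to a main session: run only if the file arrived before the cutoff."""
    return file_wait_end < cutoff


def release_needed(file_wait_running: bool, now: datetime,
                   cutoff: datetime) -> bool:
    """Cutoff branch: drop an indicator file only if a wait is still polling."""
    return file_wait_running and now >= cutoff


start = datetime(2004, 10, 19, 22, 0)
cutoff = cutoff_for(start)
print(cutoff)
print(session_may_run(datetime(2004, 10, 19, 23, 30), cutoff))    # file arrived in time
print(session_may_run(datetime(2004, 10, 20, 2, 0, 5), cutoff))   # released by the cutoff branch
```

A session whose file wait was released by the cutoff command ends after the cutoff time, so its link condition is false and the session is skipped.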


Practical-Descriptive

Products.sql


Labs


Error Logging


Error Types

Transformation Error
• Data row has only passed partway through the mapping transformation logic
• An error occurs within a transformation

Data Reject
• Data row is fully transformed according to the mapping logic
• Due to a data issue, it cannot be written to the target
• A data reject can be forced by an Update Strategy


Error Types

Error Log Options are set in the Session task (via Workflow Manager)

Error Type | Logging OFF (default) | Logging ON
Transformation errors | Written to session log, then discarded | Appended to flat file or relational tables. Only fatal errors written to session log
Data rejects | Appended to reject file (one .bad file per target) | Written to row error tables or file


Error Logging Off

Transformation Errors:
• Details and data written to the session log
• Data row is discarded
• If data flows are concatenated, corresponding rows in the parallel flow are also discarded

Data Rejects:

Conditions causing data to be rejected include:
• Target database constraint violations, out-of-space errors, log space errors, null values not accepted
• Data-driven records containing the value ‘3’ or DD_REJECT (the reject has been forced by an Update Strategy)
• Target table properties ‘reject truncated/overflowed rows’


Error Logging to a Relational Database

Option set in Session Configuration

Results written to several tables:

PMERR_SESS: Stores metadata about the session run, such as workflow name, session name, repository name, etc.

PMERR_MSG: Error messages for a row of data are logged in this table

PMERR_TRANS: Metadata about the transformation such as transformation group name, source name, port names with datatypes are logged in this table

PMERR_DATA: The row data of the error row as well as the source row data is logged here. The row data is in a string format such as [indicator1:data1 | indicator2 : data2]


Error Logging to a Flat File

Option set in Session Configuration

Format: Session metadata followed by de-normalized error information

Sample session metadata:

Repository GID: 510u6f02-8733-11d7-9db7-00e01823c14d
Repository: RowErrorLogging
Workflow: w_unitTests
Session: s_customers
Mapping: m_customers
Workflow Run ID: 6079
Worklet Run ID: 0
Session Instance ID: 806
Session Start Time: 10/19/2004 11:24:15
Session Start Time (UTC): 1066587856

Row data format:

Transformation || Transformation Mapplet Name || Transformation Group || Partition Index || Transformation Row ID || Error Sequence || Error Timestamp || Error UTC Time || Error Code || Error Message || Error Type || Transformation Data || Source Mapplet Name || Source Name || Source Row ID || Source Row Type || Source Data
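Given the row data format above, a row from the error log file could be split into named fields with a few lines of Python. This is a naive sketch: it assumes no field value itself contains `||`, the `parse_error_row` helper is hypothetical, and the sample row is invented purely for illustration (it is not real session output).

```python
from typing import Dict

# Field names in the order given by the row data format.
FIELDS = [
    "Transformation", "Transformation Mapplet Name", "Transformation Group",
    "Partition Index", "Transformation Row ID", "Error Sequence",
    "Error Timestamp", "Error UTC Time", "Error Code", "Error Message",
    "Error Type", "Transformation Data", "Source Mapplet Name", "Source Name",
    "Source Row ID", "Source Row Type", "Source Data",
]


def parse_error_row(line: str) -> Dict[str, str]:
    """Split one '||'-delimited error row into a field-name -> value dict."""
    values = [value.strip() for value in line.split("||")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Invented sample row; every value here is a placeholder.
sample = (
    "exp_clean || N/A || Input || 1 || 42 || 1 || 10/19/2004 11:24:20 || "
    "1066587860 || 11019 || Hypothetical error message || 3 || D:1 || "
    "N/A || SQ_customers || 42 || 0 || D:1"
)

row = parse_error_row(sample)
print(row["Transformation"], row["Error Code"], row["Source Name"])
```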


Log Source Row Data

Separate checkbox in the Session task

Logs the source row associated with the error row

Logs metadata about source, e.g. Source Qualifier, source row ID, and source row type

NOTE: Source row logging is not available downstream of an Aggregator, Joiner, Sorter, or other transformation (where output rows are not uniquely correlated with input rows).


Labs