______________________________________________

EMPTORIS – ETL ERROR and AUDIT LOG PROCESS USING INFORMATICA

______________________________________________

Table of Contents

1 DOCUMENT PURPOSE

2 ERROR LOG TABLE DETAILS
    2.1.1 PMERR_DATA
    2.1.2 PMERR_MSG
    2.1.3 PMERR_SESS
    2.1.4 PMERR_TRANS

3 DATA ERROR LOG TABLE DETAILS
    3.1.1 ETL Data Error Table

4 AUDITING DETAILS
    4.1.1 Auditing Log Table

1 Document Purpose

The purpose of this document is to describe the error handling and auditing process for EMPTORIS using Informatica.

There are four types of errors:

a) Source Error (Reader Error)
b) Transformation Error
c) Target Error (Load Error / Writer Error)
d) Data Error

Source Error (Reader Error)

Examples:

1) Source database is not available
2) Table or view does not exist
3) SQL override query is not correct

Transformation Error

Examples:

1) Lookup error: the lookup returns multiple values
2) Records are rejected by a transformation
3) Datatype mismatch between two transformations

Load / Writer Error

Examples:

1) Target table does not exist
2) Datatype differs between Informatica and the target

Data Error

Examples:

1) Invalid data
2) Erroneous data
3) Null data

2 Error Log Table Details

Understanding the Error Log Tables

PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source row.

PMERR_MSG. Stores metadata about an error and the error message.

PMERR_SESS. Stores metadata about the session.

PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

2.1.1 PMERR_DATA

When the Integration Service encounters a row error, it inserts an entry into the PMERR_DATA table. This table stores data and metadata about a transformation row error and its corresponding source row.

The following table describes the structure of the PMERR_DATA table:

Column Name Description

REPOSITORY_GID Unique identifier for the repository.

WORKFLOW_RUN_ID Unique identifier for the workflow.

WORKLET_RUN_ID Unique identifier for the worklet. If a session is not part of a worklet, this value is “0”.

SESS_INST_ID Unique identifier for the session.

TRANS_MAPPLET_INST Name of the mapplet where an error occurred.

TRANS_NAME Name of the transformation where an error occurred.

TRANS_GROUP Name of the input group or output group where an error occurred. Defaults to either “input” or “output” if the transformation does not have a group.

TRANS_PART_INDEX Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID Specifies the row ID generated by the last active source.

TRANS_ROW_DATA Delimited string containing all column data, including the column indicator. Column indicators are:

D - valid

N - null

T - truncated

B - binary

U - data unavailable

The fixed delimiter between column data and column indicator is colon ( : ). The delimiter between the columns is pipe ( | ). You can override the column delimiter in the error handling settings.

The Integration Service converts all column data to a text string in the error table. For binary data, the Integration Service writes only the column indicator.

This value can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row. The line number for each row error entry is stored in the LINE_NO column.

SOURCE_ROW_ID Value that the source qualifier assigns to each row it reads. If the Integration Service cannot identify the row, the value is -1.

SOURCE_ROW_TYPE Row indicator that tells whether the row was marked for insert, update, delete, or reject.

0 - Insert

1 - Update

2 - Delete

3 - Reject

SOURCE_ROW_DATA Delimited string containing all column data, including the column indicator. Column indicators are:

D - valid

O - overflow

N - null

T - truncated

B - binary

U - data unavailable

The fixed delimiter between column data and column indicator is colon ( : ). The delimiter between the columns is pipe ( | ). You can override the column delimiter in the error handling settings.

The Integration Service converts all column data to a text string in the error table or error file. For binary data, the Integration Service writes only the column indicator.

This value can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row. The line number for each row error entry is stored in the LINE_NO column.

LINE_NO Specifies the line number for each row error entry in SOURCE_ROW_DATA and TRANS_ROW_DATA that spans multiple rows.
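The row-data layout described above can be decoded with a few lines of Python. The following is a minimal sketch, not part of the original process. It assumes that each field is written as indicator, colon, value (for example `D:1001`), that the default pipe delimiter is in use, and that column values contain neither delimiter. The text above does not state the field order, so check it against real log rows first. The names `parse_row_data`, `join_lines` and `Column` are hypothetical.

```python
from typing import NamedTuple

# Column indicators as listed for SOURCE_ROW_DATA
# (the TRANS_ROW_DATA list is the same, minus "O").
INDICATORS = {
    "D": "valid",
    "O": "overflow",
    "N": "null",
    "T": "truncated",
    "B": "binary",
    "U": "data unavailable",
}

# SOURCE_ROW_TYPE codes from the table above.
ROW_TYPES = {0: "insert", 1: "update", 2: "delete", 3: "reject"}


class Column(NamedTuple):
    value: str
    status: str


def join_lines(rows):
    """Rebuild one logical value from (LINE_NO, chunk) pairs.

    Values longer than 2000 bytes are stored across several table rows,
    so the chunks are concatenated in LINE_NO order.
    """
    return "".join(chunk for _, chunk in sorted(rows))


def parse_row_data(text, column_delimiter="|"):
    """Split a row-data string into (value, status) columns.

    ASSUMPTION: each field is laid out as "<indicator>:<value>".
    """
    columns = []
    for field in text.split(column_delimiter):
        # Split on the first colon only, so a value may itself contain colons.
        flag, _, value = field.partition(":")
        columns.append(Column(value, INDICATORS.get(flag, "unknown")))
    return columns
```

A typical use would be to fetch all rows for one error ordered by LINE_NO, pass them to `join_lines`, and then call `parse_row_data` on the result. If the column delimiter was overridden in the error handling settings, pass it as `column_delimiter`.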

2.1.2 PMERR_MSG

When the Integration Service encounters a row error, it inserts an entry into the PMERR_MSG table. This table stores metadata about the error and the error message.

The following table describes the structure of the PMERR_MSG table:

Column Name Description

REPOSITORY_GID Unique identifier for the repository.

WORKFLOW_RUN_ID Unique identifier for the workflow.

WORKLET_RUN_ID Unique identifier for the worklet. If a session is not part of a worklet, this value is “0”.

SESS_INST_ID Unique identifier for the session.

MAPPLET_INST_NAME Mapplet to which the transformation belongs. If the transformation is not part of a mapplet, this value is n/a.

TRANS_NAME Name of the transformation where an error occurred.

TRANS_GROUP Name of the input group or output group where an error occurred. Defaults to either “input” or “output” if the transformation does not have a group.

TRANS_PART_INDEX Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID Specifies the row ID generated by the last active source.

ERROR_SEQ_NUM Counter for the number of errors per row in each transformation group. If a session has multiple partitions, the Integration Service maintains this counter for each partition.

For example, if a transformation generates three errors in partition 1 and two errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1, and values 1 and 2 for partition 2.

ERROR_TIMESTAMP Timestamp of the Integration Service when the error occurred.

ERROR_UTC_TIME Coordinated Universal Time (UTC), also called Greenwich Mean Time, when the error occurred.

ERROR_CODE Error code that the error generates.

ERROR_MSG Error message, which can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row. The line number for each row error entry is stored in the LINE_NO column.

ERROR_TYPE Type of error that occurred. The Integration Service uses the following values:

1 - Reader error

2 - Writer error

3 - Transformation error

LINE_NO Specifies the line number for each row error entry in ERROR_MSG that spans multiple rows.
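The per-partition behaviour of ERROR_SEQ_NUM can be illustrated with a short Python sketch. This is a simplified model for illustration only, not how the Integration Service is implemented: it keys the counter on the partition alone and ignores the row and transformation group. The function name `number_errors` is hypothetical.

```python
from collections import defaultdict
from itertools import count

# ERROR_TYPE codes from the table above.
ERROR_TYPES = {1: "reader", 2: "writer", 3: "transformation"}


def number_errors(partitions):
    """Assign a sequence number to each error, counted separately per partition.

    `partitions` lists the partition number of each error in arrival order.
    Returns (partition, sequence_number) pairs in the same order.
    """
    counters = defaultdict(lambda: count(1))  # one counter per partition
    return [(p, next(counters[p])) for p in partitions]
```

With three errors in partition 1 and two in partition 2, as in the example above, the function yields the sequence numbers 1, 2, 3 for partition 1 and 1, 2 for partition 2, regardless of how the errors interleave.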

2.1.3 PMERR_SESS

When you choose relational database error logging, the Integration Service inserts entries into the PMERR_SESS table. This table stores metadata about the session where an error occurred.

The following table describes the structure of the PMERR_SESS table:

Column Name Description

REPOSITORY_GID Unique identifier for the repository.

WORKFLOW_RUN_ID Unique identifier for the workflow.

WORKLET_RUN_ID Unique identifier for the worklet. If a session is not part of a worklet, this value is “0”.

SESS_INST_ID Unique identifier for the session.

SESS_START_TIME Timestamp of the Integration Service when a session starts.

SESS_START_UTC_TIME Coordinated Universal Time (UTC), also called Greenwich Mean Time, when the session starts.

REPOSITORY_NAME Repository name where sessions are stored.

FOLDER_NAME Specifies the folder where the mapping and session are located.

WORKFLOW_NAME Specifies the workflow that runs the session being logged.

TASK_INST_PATH Fully qualified session name that can span multiple rows. The Integration Service creates a new line for the session name. The Integration Service also creates a new line for each worklet in the qualified session name. For example, you have a session named WL1.WL2.S1. Each component of the name appears on a new line:

WL1

WL2

S1

The Integration Service writes the line number in the LINE_NO column.

MAPPING_NAME Specifies the mapping that the session uses.

LINE_NO Specifies the line number for each row error entry in TASK_INST_PATH that spans multiple rows.
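The way TASK_INST_PATH is spread over several rows can be sketched in Python as follows. This is an illustration under two assumptions that the text above does not state: that LINE_NO starts at 1, and that the components are joined back with a period, as in the WL1.WL2.S1 example. The function names are hypothetical.

```python
def split_task_path(path):
    """Break a qualified session name into (LINE_NO, component) rows."""
    # ASSUMPTION: line numbering starts at 1.
    return list(enumerate(path.split("."), start=1))


def join_task_path(rows):
    """Rebuild the qualified session name from (LINE_NO, component) rows."""
    return ".".join(name for _, name in sorted(rows))
```

When reading PMERR_SESS, `join_task_path` would be applied to all rows that share the same run and session identifiers.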

2.1.4 PMERR_TRANS

When the Integration Service encounters a transformation error, it inserts an entry into the PMERR_TRANS table. This table stores metadata, such as the name and datatype of the source and transformation ports.

The following table describes the structure of the PMERR_TRANS table:

Column Name Description

REPOSITORY_GID Unique identifier for the repository.

WORKFLOW_RUN_ID Unique identifier for the workflow.

WORKLET_RUN_ID Unique identifier for the worklet. If a session is not part of a worklet, this value is “0”.

SESS_INST_ID Unique identifier for the session.

TRANS_MAPPLET_INST Specifies the instance of a mapplet.

TRANS_NAME Name of the transformation where an error occurred.

TRANS_GROUP Name of the input group or output group where an error occurred. Defaults to either “input” or “output” if the transformation does not have a group.

TRANS_ATTR Lists the port names and datatypes of the input or output group where the error occurred. Port name and datatype pairs are separated by commas, for example: portname1:datatype, portname2:datatype.

This value can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row for the transformation attributes and writes the line number in the LINE_NO column.

SOURCE_MAPPLET_INST Name of the mapplet in which the source resides.

SOURCE_NAME Name of the source qualifier. The value is n/a when a row error occurs downstream of an active source that is not a source qualifier, or downstream of a non-pass-through partition point with more than one partition.

SOURCE_ATTR Lists the connected field(s) in the source qualifier where an error occurred. When an error occurs in multiple fields, each field name is entered on a new line. The Integration Service writes the line number in the LINE_NO column.

LINE_NO Specifies the line number for each row error entry in TRANS_ATTR and SOURCE_ATTR that spans multiple rows.
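The TRANS_ATTR value can be turned into a port-to-datatype mapping with a small Python helper. This is a minimal sketch that follows the `portname1:datatype, portname2:datatype` layout shown above. It assumes that neither port names nor datatypes contain a comma or colon, which may not hold for every datatype. The name `parse_trans_attr` is hypothetical.

```python
def parse_trans_attr(text):
    """Parse "port:datatype, port:datatype" into a {port: datatype} dict.

    ASSUMPTION: port names and datatypes contain no comma or colon.
    """
    ports = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or an empty string
        name, _, datatype = pair.partition(":")
        ports[name.strip()] = datatype.strip()
    return ports
```

If the value spans multiple rows, the chunks need to be concatenated in LINE_NO order before parsing.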

3 Data Error Log Table Details

3.1.1 ETL Data Error Table

Table Description

Error details at the attribute level captured during Informatica ETL execution.

Column Name Column Description

ERROR_TABLE_PK Primary key of the table.

SESSION_NAME Session name.

SOURCE_PRIMARY_KEY Primary key of the actual source data.

SOURCE_FIELD_VALUE Source field value.

SOURCE_TABLE_NAME Source table name.

SOURCE_FIELD_NAME Source field name.

TARGET_TABLE_NAME Target table name.

TARGET_FIELD_NAME Target field name.

MAPPING_NAME Mapping name.

TARGET_FIELD_VALUE Target field value.

LOOKUP_TABLE_NAME Lookup table name.

ERROR_TYPE Type of error.

ERROR_DETAILS Details of the error that occurred.

ETL_LOAD_DATE Date of the error.

4 Auditing Details

4.1.1 Auditing Log Table

Table Description

Status information captured for a particular Informatica ETL execution.

Column Name Column Description

Audit_Id Unique sequential number for the table.

WF_name Workflow name.

Entity_Name Table name.

Session_Name Name of the session being audited.

Mapping_Name Mapping name.

Start_Dtm Start date and time.

End_Dtm End date and time.

Status Status of the execution (Success, Failure, etc.).

Created_By System user ID of the user who created the record.

Created_Dtm Date and time the record was created.

Updated_By System user ID of the user who updated the record.

Updated_Dtm System date and time when the record was updated.

The above table is populated during session execution.

Procedure 1: Inserts a new record when the session starts.

Procedure 2: Updates the inserted record for the session.

For the sequence:

    create sequence audit_id increment by 1 start with 1;
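The insert-then-update pattern of the two procedures can be sketched in Python with the standard `sqlite3` module. This is an illustration under stated assumptions, not the actual stored procedures: the table name `etl_audit_log`, the function names, and the initial status value "Running" are hypothetical, and SQLite's AUTOINCREMENT key stands in for the `audit_id` sequence because SQLite has no sequence objects.

```python
import sqlite3
from datetime import datetime, timezone


def _now():
    """Current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def create_audit_table(conn):
    # Hypothetical table name; columns follow the audit log table above.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS etl_audit_log (
            audit_id     INTEGER PRIMARY KEY AUTOINCREMENT,
            wf_name      TEXT,
            entity_name  TEXT,
            session_name TEXT,
            mapping_name TEXT,
            start_dtm    TEXT,
            end_dtm      TEXT,
            status       TEXT,
            created_by   TEXT,
            created_dtm  TEXT,
            updated_by   TEXT,
            updated_dtm  TEXT
        )
        """
    )


def audit_start(conn, wf_name, entity_name, session_name, mapping_name, user):
    """Procedure 1: insert a new record when the session starts."""
    ts = _now()
    cur = conn.execute(
        "INSERT INTO etl_audit_log (wf_name, entity_name, session_name, "
        "mapping_name, start_dtm, status, created_by, created_dtm) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        # "Running" is an assumed placeholder status.
        (wf_name, entity_name, session_name, mapping_name, ts, "Running", user, ts),
    )
    return cur.lastrowid  # the generated audit_id


def audit_end(conn, audit_id, status, user):
    """Procedure 2: update the record inserted for the session."""
    ts = _now()
    conn.execute(
        "UPDATE etl_audit_log SET end_dtm = ?, status = ?, "
        "updated_by = ?, updated_dtm = ? WHERE audit_id = ?",
        (ts, status, user, ts, audit_id),
    )
```

The value returned by `audit_start` has to be carried through the session so that `audit_end` updates the same record. In an Oracle implementation the key would instead come from `audit_id.NEXTVAL`.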