2
PowerCenter Developer: Tips & Tricks for Mapping Designer
Lingaraju Ramasamy (Raju),
Technical Architecture Manager
Informatica Professional Services
3
Agenda
• Introduction
• Architecture Best Practices
• Mapping Tips & Tricks
• Transformation Techniques
• Use of Metadata
• Repository maintenance
• Q&A
4
Introduction
5
Presenter Contact
• Lingaraju Ramasamy (Raju)
• lramasamy@informatica.com
• 408-368-2475 (Mobile)
• Technical Architecture Manager, Informatica Professional Services
6
Architecture Best Practices
7
Architecture Best Practices
• Consistency: applying consistent standards reduces long-term complications
• Naming Conventions (Velocity)
• Descriptions
• Environments
• Documentation (Hyperlink to SharePoint)
• Modularity: develop according to a modular design
• Common Error Handling
• Reprocessing
• Mapping Assistants
• Reusability: focus on reuse to make quick and universal modifications
• Mapplets, Worklets, Transformations, reusable functions
8
Architecture Best Practices
• Scalability: keep volumes in mind in order to create efficient mappings
• Caching
• Queries
• Partitioning
• Initial vs. Incremental Loads
• Simplicity: multiple simple processes are often better than a few complex processes
• Multiple mappings
• Simple Queries
• Staging Tables
• Advantages
• Easy to develop, debug, and maintain
9
Sample Complex Mapping
[Figure: a single mapping that reads multiple Oracle sources and a flat file and writes to a web service through many Source Qualifier, Joiner, Expression, Sequence Generator, Router, Union, Sorter, Java, and Transaction Control transformations]
Multiple Sources -> WebServices
10
Simplified Complex Mapping
[Figure: the same load split into four simpler mappings]
Staging 1
Staging 2
Staging 3
Staging -> WebServices
11
Mapping Tips & Tricks
12
Mapping Tips
• Sources and Targets
• Use shortcuts from shared folders
• Extract only what is necessary
• Limit reads on source
• Distinguish between similar sources and targets
• Example
• DIM_CUSTOMER1 = DIM_CUSTOMER_insert
• DIM_CUSTOMER2 = DIM_CUSTOMER_update
13
Mapping Tricks
Parameters & Variables
• Reduce overhead of creating multiple mappings
• Replace hard coded values
• Use to incrementally extract data
Example
UpdateDateTime >= TO_DATE('$$PREV_RUN_TS')
SETVARIABLE($$CURR_RUN_TS, SESSSTARTTIME)
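The incremental-extract idea on this slide can be sketched outside PowerCenter. This is a minimal Python illustration, not PowerCenter code: the row layout, the `extract_incremental` and `finish_run` helpers, and the `state` dictionary standing in for the persisted mapping variables are all assumptions made for the example.

```python
from datetime import datetime


def extract_incremental(rows, state, session_start):
    """Return rows changed since the previous run and record this run's start time.

    `state` is a hypothetical stand-in for persisted mapping variables.
    """
    prev_run_ts = state["PREV_RUN_TS"]
    # Counterpart of the filter: UpdateDateTime >= $$PREV_RUN_TS
    changed = [r for r in rows if r["UpdateDateTime"] >= prev_run_ts]
    # Counterpart of SETVARIABLE($$CURR_RUN_TS, SESSSTARTTIME)
    state["CURR_RUN_TS"] = session_start
    return changed


def finish_run(state):
    # Assumption: after a successful run, this run's start time becomes
    # the lower bound for the next run.
    state["PREV_RUN_TS"] = state["CURR_RUN_TS"]


rows = [
    {"id": 1, "UpdateDateTime": datetime(2024, 1, 1, 8, 0)},
    {"id": 2, "UpdateDateTime": datetime(2024, 1, 2, 9, 30)},
]
state = {"PREV_RUN_TS": datetime(2024, 1, 2, 0, 0), "CURR_RUN_TS": None}

changed = extract_incremental(rows, state, session_start=datetime(2024, 1, 3, 0, 0))
finish_run(state)
print([r["id"] for r in changed])
```

Using the session start time rather than the end time as the next lower bound means rows updated while the session is running are picked up again next time instead of being missed.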
14
Mapping Tricks
Parameters & Variables
• Assign Parameter/Variable values in a Session
• Pass values from one session to a subsequent session in same workflow/worklet
• On Components Tab in Session Properties
• Use user-defined workflow/worklet variables
• Non-reusable Sessions only
15
Mapping Tricks
Built-in Mapping Variables
• Mapping Name
• Workflow Name
• Session Name
• Integration Service Name
• Repository Service Name
• Repository User Name
• Folder Name
• Session Run Mode
• Source Table Names
• Target Table Names
16
Mapping Tricks
Group Expression (Anchor transformation)
• Add an Expression transformation after a source qualifier and before a target
• If the source or target definition changes, reconnecting ports is much easier
17
Mass Update
• pmrep massupdate
• Session properties
• Session config attributes
• Transformation instance attributes
• Session instance run time options
18
Mapping Assistants
Preview Data
• View Data
• Accommodate anomalies early
• Verification of extraction/loading strategies
• Type of Data
• Source/Targets
• Relational, Flat file
• XML Files
• For further analysis, use Informatica Analyst
• Analyze the content, quality and structure of source data
• Involves separate Profiling warehouse, client and reports
19
Mapping Assistants
Mapping Wizard
• Pass-Through
• Slowly Changing Dimension
• Type 1 Dimension (No History)
• Type 2 Dimension (All History)
• Version Data
• Flag Current
• Effective Date Range
• Type 3 Dimension (Previous Versions)
Slowly Changing Dimension Template
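To make the Type 2 "Effective Date Range" option concrete, here is a minimal Python sketch of the logic such a mapping implements. It is not the mapping the wizard generates; the column names (`customer_id`, `city`, `begin_date`, `end_date`) and the `apply_type2` helper are assumptions for illustration.

```python
from datetime import date


def apply_type2(dimension, incoming, load_date):
    """Apply one incoming row to an in-memory Type 2 dimension.

    The current version of a customer is the row whose end_date is None.
    """
    current = next(
        (row for row in dimension
         if row["customer_id"] == incoming["customer_id"]
         and row["end_date"] is None),
        None,
    )
    if current is not None:
        if current["city"] == incoming["city"]:
            return  # nothing changed, keep the current version
        current["end_date"] = load_date  # expire the old version
    # Insert the new current version (first load or changed attribute).
    dimension.append({
        "customer_id": incoming["customer_id"],
        "city": incoming["city"],
        "begin_date": load_date,
        "end_date": None,
    })


dim = []
apply_type2(dim, {"customer_id": 7, "city": "Austin"}, date(2024, 1, 1))
apply_type2(dim, {"customer_id": 7, "city": "Boston"}, date(2024, 6, 1))
apply_type2(dim, {"customer_id": 7, "city": "Boston"}, date(2024, 7, 1))
print(len(dim))
```

All history is kept: the Austin row is closed with an end date, and the Boston row stays open until the next change.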
20
Mapping Assistants
Mapping Analyst for Excel (MAE)
• Standardize specifications
• Enhance collaboration between analyst and developer
• Improve documentation and auditability of business logic
• Data Analyst: defines business terms, specifies transformation rules, standardizes the Excel format
• DI Developer: augments and tunes mappings generated from specifications
• Round trip: Generate Mapping, Generate Specification
21
Mapping Assistants
Mapping Architect for Visio (MAV)
• Define consistent methodology & structure for data integration projects
• Build custom wizard based on pattern without coding
• Generate multiple mappings at one time
• Document data flow
• DI Architect: creates and publishes the mapping template
• DI Developer: augments and tunes the generated mappings
[Figure labels: Informatica Toolbar, Informatica Stencil, Drawing Window, Template File, Parameter File, Publish Template, Generate Mappings]
22
Mapping Assistants
Mapping Architect for Visio (MAV)
Case Study #1
• 7 templates were used across 2 projects to generate 600 mappings
• 97% of mappings were automatically generated and required no additional changes
• 3% needed to be manually modified or custom developed
Case Study #2
• 1 template was used to create 150 mappings for a data migration project along with PowerCenter sessions and workflows
• Total effort was less than one day
• Equivalent effort to create 150 mappings manually would have been 2 weeks (10x effort)
23
Transformation Techniques
24
Transformation Tips
Source Qualifier
• Apply the default query when possible
• Utilize SQ attributes (e.g., User Defined Join, Source Filter)
• Understand advantages and limitations of the SQL override
PROS
• Utilize database optimizers (e.g., indexes, hints)
• Can accommodate complex queries
CONS
• Processing impacts database resources
• Transformation logic is lost from metadata searches
• Unable to utilize Partitioning or Pushdown Optimization options
• Minimize complicated queries
• Add the SQL Override Query to the Description
25
Transformation Tips
Expressions
• Understand port process order: INs or IN/OUTs first, then VARIABLEs, then OUTPUTs
• Reduce code complexity
• Use local variables to avoid redundant calculations and to check the previous record
• Provide comments (-- or //) in expressions
• Optimize expressions
• Numeric operations are faster than string operations
• Operators are faster than functions
• Un-nest complicated logic (use IIF or DECODE)
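The port process order is what makes the "check previous record" trick work: a variable port keeps its value from the prior row until it is reassigned. The Python sketch below imitates that evaluation order for one Expression transformation; the port names (`v_prev_customer`, `v_is_new`) and the row layout are hypothetical.

```python
def flag_new_groups(rows):
    """Mark the first row of each customer, given rows sorted by customer_id."""
    v_prev_customer = None  # variable port: retains its value between rows
    out = []
    for row in rows:
        # 1. Input ports are populated first.
        in_customer = row["customer_id"]
        # 2. Variable ports are evaluated top to bottom. v_is_new is computed
        #    while v_prev_customer still holds the previous row's value.
        v_is_new = in_customer != v_prev_customer
        v_prev_customer = in_customer
        # 3. Output ports are evaluated last and can use the variables.
        out.append({**row, "is_first_in_group": v_is_new})
    return out


rows = [{"customer_id": 1}, {"customer_id": 1}, {"customer_id": 2}]
flags = [r["is_first_in_group"] for r in flag_new_groups(rows)]
print(flags)
```

Swapping the two variable assignments would compare each row with itself, which is why the order of variable ports matters.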
26
Transformation Tips
User-Defined Functions
• Build complex expressions and reuse them within the repository
• Two types
• Public: callable from any transformation expression
• Private: callable only from another user-defined function
• Can include any valid function except aggregate functions
• Can be exported to XML files
27
Transformation Tips
Filters/Routers
• Consider a Source Qualifier with a filter to limit rows from relational sources
• Filter as close to the source as possible
• Replace multiple filters with a router
• In a Router, a row goes to every group whose condition evaluates to TRUE
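The Router behavior is easy to misread as "first match wins", so a small Python sketch may help. It is an illustration only: the `route` helper, the group names, and the conditions are hypothetical, and the `DEFAULT` output models the Router's default group for rows that satisfy no condition.

```python
def route(rows, groups):
    """Copy each row to every group whose condition is true.

    `groups` maps a group name to a predicate. Rows that match no group
    go to the DEFAULT group.
    """
    outputs = {name: [] for name in groups}
    outputs["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                outputs[name].append(row)  # no early exit: keep testing groups
                matched = True
        if not matched:
            outputs["DEFAULT"].append(row)
    return outputs


rows = [
    {"order_id": 1, "amount": 50},
    {"order_id": 2, "amount": 500},
    {"order_id": 3, "amount": 5000},
]
groups = {
    "LARGE": lambda r: r["amount"] > 100,
    "AUDIT": lambda r: r["amount"] > 1000,
}
routed = route(rows, groups)
print({name: [r["order_id"] for r in out] for name, out in routed.items()})
```

Order 3 lands in both LARGE and AUDIT, so conditions must be written as mutually exclusive when each row should reach only one target.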
28
Transformation Tips
Aggregators
• Use sorted input to decrease use of aggregate caches
• Limit connected input/output or output ports
• Filter data before aggregating
• Use as early as possible
Joiners
• Perform joins in Source Qualifier when possible
• Limit use to heterogeneous and flat file sources
• Perform normal joins when possible
• Join sorted input when possible
• Designate the master source as the source with fewer rows
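Why sorted input reduces cache use can be shown with a short Python comparison. This is a sketch of the general idea, not of the Integration Service internals; the `aggregate_sorted` and `aggregate_unsorted` helpers and the row layout are assumptions.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Sum amounts per customer, assuming rows arrive sorted by customer_id.

    Each total is emitted as soon as its group ends, so only one group
    is held in memory at a time.
    """
    for customer_id, group in groupby(rows, key=itemgetter("customer_id")):
        yield customer_id, sum(r["amount"] for r in group)


def aggregate_unsorted(rows):
    """Sum amounts per customer without any ordering assumption.

    Every group must stay in the `totals` cache until the last row is read.
    """
    totals = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0) + r["amount"]
    return totals


sorted_rows = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 1, "amount": 20},
    {"customer_id": 2, "amount": 5},
]
print(list(aggregate_sorted(sorted_rows)))
print(aggregate_unsorted(sorted_rows))
```

The sorted version gives wrong results if the input is not actually sorted on the group key, which mirrors the requirement that sorted input really be sorted upstream.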
29
Transformation Tips
Lookups
• Using SQL Override in Lookup
• Similar to Source Qualifier, avoid when possible
• Can apply Parameters and Variables
• Can query against multiple tables in same database
• Suppress ORDER BY statement by appending two dashes (--)
• Add indexes to database columns
• Replace large lookup tables with joins in the Source Qualifier when possible
• Relational Lookups should only return ports that meet the condition
• Remove all ports not used downstream or in the SQL Override
30
Transformation Tips
Lookup Caches
• Lookup Cache Types
• Persistent Caches
• Save lookup cache files for reuse
• Dynamic Caches
• Retain the latest changes to data as rows pass through the mapping
• Updating a master table
• Real-time sessions
• Slowly changing dimension
• Cache Sizes
• Eliminate paging
• Condition values are stored in index (.idx) files
• Output values are stored in data (.dat) files
• Apply the Cache Calculator in Session
31
Transformation Tips
Lookups
• Cache Updates: update the dynamic lookup cache with the results of an expression
Use case: update QTY on hand for a new timestamp; add the condition incoming row timestamp > cached timestamp
• SQL Overrides for Uncached Lookups
• You must choose Use Any Value for Lookup Policy on Multiple Match
• Multiple Rows Return
Use case: aggregate customer orders and store the total value
• Database Deadlock Resilience
• NumOfDeadlockRetries
• DeadlockSleep
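The cache-update use case (refresh QTY on hand only when the incoming row is newer) can be sketched in Python as follows. A plain dictionary stands in for the dynamic lookup cache; the `upsert_cache` helper, the column names, and the returned labels are assumptions, not PowerCenter names.

```python
def upsert_cache(cache, row):
    """Apply one incoming row to the cache and report what happened.

    The cache changes only when the key is new or the incoming timestamp
    is later than the cached timestamp.
    """
    cached = cache.get(row["item_id"])
    if cached is None:
        cache[row["item_id"]] = {"qty": row["qty"], "ts": row["ts"]}
        return "insert"
    if row["ts"] > cached["ts"]:
        cache[row["item_id"]] = {"qty": row["qty"], "ts": row["ts"]}
        return "update"
    return "no change"  # stale row: keep the newer cached value


cache = {}
actions = [
    upsert_cache(cache, {"item_id": "A", "qty": 10, "ts": 1}),
    upsert_cache(cache, {"item_id": "A", "qty": 7, "ts": 3}),
    upsert_cache(cache, {"item_id": "A", "qty": 9, "ts": 2}),  # arrives late
]
print(actions, cache["A"])
```

Because the cache is updated as rows pass through, the late row with timestamp 2 is compared against the value written by an earlier row in the same run, not against the state of the target at session start.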
32
Transformation Tricks
Pipeline Lookup
• Perform a lookup on an application source that is not a relational table or file
• Partial pipeline contains Source & Source Qualifier but no target
• Integration Service reads source data and passes it to the Lookup transformation to create the cache
• Create partitions to improve performance
33
Transformation Tips
Transaction Control Transformation
• A transaction in PowerCenter is a set of rows bound by commit or rollback
• Control commit and rollback based on a row or set of rows that pass through the transformation
Use case: each invoice number is committed to the target database as a single transaction
• Change Tracing Level to 'Terse'
• At higher tracing levels, every flush of the write buffers is logged
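The invoice use case amounts to committing whenever the invoice number changes. The Python sketch below shows that boundary logic on rows assumed to be ordered by invoice number; the `commit_per_invoice` helper and row layout are hypothetical, and the commit point is comparable in spirit to returning TC_COMMIT_BEFORE on the first row of a new invoice.

```python
def commit_per_invoice(rows):
    """Split rows (ordered by invoice_no) into one transaction per invoice."""
    transactions = []
    current = []
    prev_invoice = None
    for row in rows:
        # A new invoice number closes the open transaction before this row.
        if current and row["invoice_no"] != prev_invoice:
            transactions.append(current)
            current = []
        current.append(row)
        prev_invoice = row["invoice_no"]
    if current:
        transactions.append(current)  # final commit at end of data
    return transactions


rows = [
    {"invoice_no": 100, "line": 1},
    {"invoice_no": 100, "line": 2},
    {"invoice_no": 101, "line": 1},
    {"invoice_no": 102, "line": 1},
    {"invoice_no": 102, "line": 2},
]
sizes = [len(t) for t in commit_per_invoice(rows)]
print(sizes)
```

If the rows are not ordered by invoice number, the same invoice is split across several transactions, so a sort upstream is part of the pattern.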
34
Transformation Tips
Associated Source Qualifier
• Use ASQ when MQ data is flat file or COBOL
• ASQ is specific to the format of the MQ data
35
Transformation Tips
Sequence
• For a non-reusable Sequence Generator, the number of cached values defaults to 0
• Performance suffers when the number of cached values is low
• Increasing the number of cached values improves performance
• Caching does not involve any database operation
• Caching reserves a block of sequence values in memory
36
File Source and Target Commands
37
File Source and Target Commands
Commands for File Sources
• Use a command to generate flat file source data (input rows) or a file list for a session
• Unix: any valid UNIX command or shell script
• Windows: any valid DOS command or batch file
• Service process variables ($PMSourceFileDir) can be used in the command
38
File Source Command
Generating a File List
• Input Type: Command (default: file)
• Command Type: Command Generating File List
• Command writes list of file names to stdout
• PowerCenter interprets this as a file list
39
File Source Command
Generating Source Data
• Input Type: Command (default: file)
• Command Type: Command Generating Data
• Command generates rows to stdout
• Flat file reader reads directly from stdout
• Removes need for staging data
• Example use: reading compressed files
• uncompress -c $PMSourceFileDir/myCompressedFile.Z
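Any program that writes rows to stdout can serve as the command. As a hedged illustration, the Python sketch below streams a gzip file line by line, which is the same idea as `uncompress -c` but for gzip rather than the `.Z` format on the slide (Python's standard library does not read `.Z` files). The `stream_gzip` helper and the sample file name are assumptions.

```python
import gzip
import io
import os
import sys
import tempfile


def stream_gzip(path, out=sys.stdout):
    """Write the decompressed lines of a gzip file to `out`; return the row count.

    Nothing is staged on disk: rows are decompressed and passed on as read.
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as src:
        for line in src:
            out.write(line)
            count += 1
    return count


# Demo with a temporary file; a StringIO buffer stands in for stdout.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "orders.csv.gz")
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write("1,widget\n2,gadget\n")
    buffer = io.StringIO()
    rows_written = stream_gzip(path, out=buffer)

print(rows_written)
```

Wrapped in a small script that takes the path as an argument, a function like this could be called as the source command in place of a shell one-liner.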
40
File Target Command
Processing Target Data
• Output Type: Command (default: file)
• Flat file writer writes to the command
• Writing compressed files
• compress -c - > $PMTargetFileDir/myCompressedFile.Z
• Sorting output data
41
Filename Port
Source Filename
• Input filename can be processed and passed to target
42
Filename Port
Target Filename
• Write records to a dynamically named flat file
43
Change data detection
44
Change Detection for Updates
MD5 or CRC32
• Challenge: a record with many columns needs to be checked for changes
• Solution: calculate an MD5 checksum over the columns and use a lookup to compare it with the value stored for any existing record
45
Sample Change data detection
• Calculate MD5 for all columns except key
• Create lookup for primary key and MD5 value
• Perform insert/update, store MD5 value in target
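The three steps above can be sketched in Python with `hashlib`. This is an illustration under assumptions, not the mapping itself: the dictionary `lookup` stands in for the lookup on primary key and stored MD5 value, and the helper names, column names, and `|` delimiter are hypothetical.

```python
import hashlib


def row_md5(row, key="customer_id"):
    """MD5 over all non-key columns, in a fixed column order."""
    # The delimiter keeps ("ab", "c") and ("a", "bc") from hashing the same.
    values = ["" if row[c] is None else str(row[c])
              for c in sorted(row) if c != key]
    return hashlib.md5("|".join(values).encode("utf-8")).hexdigest()


def detect_change(row, lookup, key="customer_id"):
    """Compare the row's MD5 with the stored one and return the action."""
    new_hash = row_md5(row, key)
    old_hash = lookup.get(row[key])
    if old_hash is None:
        action = "insert"
    elif old_hash != new_hash:
        action = "update"
    else:
        return "unchanged"
    lookup[row[key]] = new_hash  # store the MD5 alongside the target row
    return action


lookup = {}
r1 = {"customer_id": 1, "name": "Ann", "city": "Austin"}
results = [
    detect_change(r1, lookup),
    detect_change(r1, lookup),
    detect_change({**r1, "city": "Boston"}, lookup),
]
print(results)
```

Only one stored value per record is compared, regardless of how many columns the record has; the trade-off is that the checksum shows that something changed, not which column.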
46
Use of Metadata
47
Querying the PowerCenter repository
• Query in designer
• Limit querying on OPB tables
• Use the MX views instead
• Utilize Reporting Service
• Use Meta Query tool
• Use Batch Web Services
48
Reporting Service Dashboard
49
Repository Maintenance
50
Repository Maintenance
Purge repository versions
• Define version strategy for Dev, QA and Prod
• Archive if required for future analysis
• Purge unwanted versions
• Run the purge at a regular interval (daily, weekly, or monthly)
pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
pmrep purgeversion -n $VERSIONS_TO_KEEP -o $FILE_NAME
51
Repository Maintenance
Purge repository logs
• Define log strategy for Dev, QA and Prod
• Archive if required for future analysis
• Purge unwanted logs
• Run the purge at a regular interval (daily, weekly, or monthly)
Compute statistics on metadata tables
pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
pmrep truncatelog -t $DAYS_TO_KEEP
pmrep updatestatistics
52
Additional Informatica Resources
Refer to the following:
• http://mysupport.informatica.com
• http://velocity.informatica.com/
• http://marketplace.informatica.com
• Product manuals
• Informatica Professional Services
53
Questions?