Informatica PowerCenter 7.1

Agenda
- Overview & Components
- Informatica Server & Data Movement
- Repository Server & Repository Manager
- Designer
- Transformations used in Informatica
- Re-usable Transformations & Mapplets
- Workflow Manager & Workflow Monitor
- Performance Tuning & Troubleshooting
- Informatica PowerCenter 7.1 Enhancements

Overview & Components

Product Overview
The Informatica components are:
- Repository
- Repository Server
- Informatica Server
- Informatica Client: Repository Manager, Designer, Workflow Manager, Workflow Monitor, and Repository Administration Console

Overview: Informatica Repository
- Stores the metadata created using the Informatica Client tools.
- The Repository Manager creates the metadata tables in the database.
- Tasks in the Informatica Client applications, such as creating users, analyzing sources, developing mappings or mapplets, or creating sessions, create metadata.
- The Informatica Server reads the metadata created in the client applications when a session runs.
- Global and local repositories can be created to share metadata.

Added features in PowerCenter 7.1:
- Exchange of metadata with other BI tools: metadata can be exported to and imported from BusinessObjects, Cognos, and others, and the exported or imported objects can be compared in their XML form.
- MX views: let the user view information on server grids and object history in the Repository Server, using the REP_SERVER_NET and REP_SERVER_NET_REF tables, among others.

Overview: Informatica Client Tools
- Repository Manager: creates and administers the metadata repository using the OPB tables and the MX views; creates repository users and groups; assigns privileges and permissions; manages folders and locks; prints reports containing repository data.
- Designer: adds source and target definitions to the repository and creates mappings that contain data transformation instructions.
- Workflow Manager & Workflow Monitor: create, schedule, execute and monitor sessions.

Overview: Informatica Server
- Reads mapping and session information from the repository.
- Extracts data from the mapping sources and holds it in memory while applying the transformation rules in the mapping.
- Loads the transformed data into the mapping targets.
- Platforms: Windows NT/2000, UNIX/Linux, Solaris.

Overview: Sources
- Relational: Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata.
- File: fixed-width and delimited flat files, COBOL files, and XML.
- Extended: PowerConnect products for PeopleSoft, SAP R/3, Siebel, and IBM MQSeries.
- Mainframe: PowerConnect for IBM DB2 on MVS.
- Other: MS Excel and Access.

Overview: Targets
- Relational: Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata.
- File: fixed-width and delimited flat files, and XML.
- Extended: an integration server to load data into SAP BW; PowerConnect for IBM MQSeries to load data into IBM MQSeries message queues.
- Other: Microsoft Access.
- Connectivity through ODBC or native drivers, FTP, or external loaders.

Questions???

Informatica Server & Data Movement
- The Informatica Server moves data from sources to targets based on mapping and session metadata stored in a repository database.
- A session is a set of instructions that describes how and when to move data from sources to targets.
- The Workflow Manager creates, manages, and executes sessions, worklets and workflows.
- The Workflow Monitor is used to monitor sessions, for debugging in case of any error.

Informatica Server
- When a session starts, the Informatica Server retrieves mapping and session metadata from the repository database through the Repository Server, which initiates a Repository Agent.
- The Informatica Server runs as a daemon on UNIX and as a service on Windows NT/2000.
- The Informatica Server uses the following processes to run a session:
  - The Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes.
  - The DTM process: creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

Added features in PowerCenter 7.1:
- 64-bit support: 64-bit PowerCenter Servers on AIX and HP-UX (Itanium).
- Partitioning enhancements: with the Partitioning option, up to 64 partitions can be used at any partition point in a pipeline that supports multiple partitions.
- PowerCenter Server processing enhancements: the PowerCenter Server now reads a block of rows at a time, which improves processing performance for most sessions.
- CLOB/BLOB datatype support: CLOB/BLOB datatypes can now be read and written. A BLOB (binary large object) is a variable-length binary string that can be up to 2 gigabytes long. A CLOB (character large object) is a variable-length character string that can be up to 2 gigabytes long; if the length is zero, the value is called the empty string, which should not be confused with the null value. Mainly used with XML and messaging data.

The Load Manager Process
The Load Manager performs the following tasks:
- Manages session, worklet and workflow scheduling
- Locks the session and reads session properties
- Reads the parameter file
- Expands the server and session variables and parameters
- Verifies permissions and privileges
- Validates source and target code pages
- Creates the session log file
- Creates the Data Transformation Manager (DTM) process, which executes the session

The Load Manager and the repository communicate with each other using Unicode; to prevent loss of information during data transfer, the Informatica Server and the repository require compatible code pages. The Load Manager communicates with the repository in the following situations: when you start the Informatica Server, when you configure a session, and when a session starts.

Data Transformation Manager Process
- The DTM process is the second process associated with a session run. Its primary purpose is to create and manage the threads that carry out the session tasks.
- The DTM allocates process memory for the session and divides it into buffers; this is also known as buffer memory.
- It creates the main thread, called the master thread, which creates and manages all other threads.
- If you partition a session, the DTM creates a set of threads for each partition to allow concurrent processing.
- When the Informatica Server writes messages to the session log, it includes the thread type and thread ID.

DTM Threads
For example, suppose a pipeline contains one source and one target, and you configure two partitions in the session properties. The DTM creates the following threads to process the pipeline:
- Two reader threads, one for each partition.
- Two writer threads, one for each partition.
- When the pipeline contains an Aggregator or Rank transformation, the DTM creates one additional set of threads for each Aggregator or Rank transformation.

DTM Threads (contd.)
- When the Informatica Server processes a mapping with a Joiner transformation, it first reads the master source and builds caches based on the master rows.
- It then reads the detail source and processes the transformation based on the detail source data and the cache data.
- The pipeline for the master source ends at the Joiner transformation and may not have any targets.
- You cannot partition the master source for a Joiner transformation.

Questions???

Repository Server
- Informatica client applications and the Informatica Server access the repository database tables through the Repository Server.
- The Informatica client connects to the Repository Server through its host name/IP address and port number.
- The Repository Server can manage multiple repositories on different machines on the network. For each repository database registered with it, the Repository Server configures and manages a Repository Agent process.
- The Repository Agent is a multi-threaded process that performs the actions needed to retrieve, insert, and update metadata in the repository database tables.

Added features in PowerCenter 7.1:
- Updating repository statistics: PowerCenter identifies and updates statistics for all repository tables and indexes when you copy, upgrade, or restore repositories. This improves performance when PowerCenter accesses the repository.
- pmrep: can be used to back up, disable, or enable a repository; delete a relational connection from a repository; delete repository details; truncate log files; and run multiple pmrep commands sequentially. pmrep can also create, modify, and delete folders.

Repository Administration Console
- Use the Administration Console to connect to the Repository Server and perform repository administration tasks, such as creating, starting, and backing up repositories.
- The Administration Console uses Microsoft Management Console (MMC) technology. MMC is a console framework for server and network management applications called snap-ins; the Administration Console is a snap-in for MMC.
- The Administration Console lets you navigate multiple Repository Servers and repositories and perform repository management tasks.
- Repository name node details: when you select a repository in the Console Tree, the Main window displays details about the repository in HTML view, along with HTML links that let you perform some repository tasks.

Questions???
Repository Manager

(Slide: the Repository Manager window.)

Repository
- The Informatica repository tables have an open architecture.
- Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.
- The repository also stores administrative information such as usernames and passwords, and permissions and privileges.
- There are three different types of repositories: standalone, global, and local.
- You can create and store the following types of metadata in the repository: database connections, global objects, mappings, mapplets, multi-dimensional metadata, reusable transformations, sessions and batches, shortcuts, source definitions, target definitions, and transformations.
- Added features in PowerCenter 7.1: exchange of metadata with other BI tools, and MX views (see the Overview section above).

Repository Manager Tasks
- Perform repository functions: create, back up, copy, restore, upgrade, and delete repositories; promote a repository to a global repository; register and unregister local repositories with a global repository.
- Implement repository security: create, edit, and delete repository users and user groups; assign and revoke repository privileges and folder permissions; view locks and unlock objects.
- Perform folder functions: create, edit, and delete folders; copy a folder within the repository or to another repository.
- Compare folders within a repository or in different repositories.
- Add and remove repository reports.
- Import and export repository connection information in the registry.
- Analyze source/target, mapping, and shortcut dependencies.

Dependency Window
The Dependency window can display the following types of dependencies:
- Source-target dependencies: lists all sources or targets related to the selected object, with relevant information.
- Mapping dependencies: lists all mappings containing the selected object, with relevant information.
- Shortcut dependencies: lists all shortcuts to the selected object, with relevant details.

Copying and Backing Up a Repository
- You can copy a repository from one database to another. If the database into which the repository is copied contains an existing repository, the Repository Manager deletes the existing repository.
- Backing up a repository saves the entire repository in a file, which can be stored in any local directory. Data can be recovered from a repository backup file.

Crystal Reports
The Repository Manager includes four Crystal Reports that provide views of your metadata:
- Mapping report (map.rpt): lists source column and transformation details for each mapping.
- Source and target dependencies report (S2t_dep.rpt).
- Target table report (Trg_tbl.rpt): provides target field transformation expressions, descriptions, and comments for each target table.
- Executed session report (sessions.rpt): provides information about executed sessions.

Repository Security
Security can be planned and implemented using the following features: user groups, repository users, repository privileges, folder permissions, and locking.
- Users can be assigned to multiple groups.
- Privileges are assigned to groups; privileges can also be assigned to individual usernames, and each user must be assigned to at least one user group.
Viewing Locks
Existing locks in the repository can be viewed in the Repository Manager. There are five kinds of locks on repository objects:
- Read lock: created when you open a repository object in a folder for which you do not have write permission.
- Write lock: created when you create or edit a repository object.
- Execute lock: created when you start a session or batch.
- Fetch lock: created when the repository reads information about repository objects from the database.
- Save lock: created when the repository is being saved.

Folders
- Folders provide a way to organize and store all metadata in the repository, including mappings and sessions. They are used to store sources, transformations, cubes, dimensions, mapplets, business components, targets, mappings, sessions and batches.
- Objects can be copied from one folder to another, and across repositories.
- The Designer allows you to create multiple versions within a folder. When a new version is created, the Designer copies all existing mapping metadata in the folder into the new version.
- A session can be copied within a folder, but an individual session cannot be copied to a different folder; to copy all sessions within a folder to a different location, copy the entire folder.
- Any mapping in a folder can use only those source and target definitions or reusable transformations that are stored in the same folder, or in a shared folder and accessed through a shortcut.
- The configurable folder properties are: folder permissions, folder owner, owner's group, and shared or not shared.
- Folders have the following permission types: read, write, and execute.
- Shared folders allow users to create shortcuts to objects in the folder; shortcuts inherit changes to their shared object. Once you make a folder shared, you cannot reverse it.

Copying Folders
Each time you copy a folder, the Repository Manager copies the following:
- Sources, transformations, mapplets, targets, mappings, and business components
- Sessions and batches
- Folder versions
When you copy a folder, the Repository Manager allows you to: re-establish shortcuts, choose an Informatica Server, copy connections, copy persisted values, compare folders, and replace folders.

Comparing Folders
The Compare Folders Wizard allows the following comparisons:
- Objects between two folders in the same repository
- Objects between two folders in different repositories
- Objects between two folder versions in the same folder
Each comparison also allows you to specify the following criteria: versions to compare, object types to compare, and direction of comparison.
- Whether the Repository Manager notes a similarity or difference between two folders depends on the direction of the comparison: one-way comparisons check the selected objects of Folder1 against the objects in Folder2; two-way comparisons also check objects in Folder2 against those in Folder1.
- The wizard displays the following user-customized information: similarities between objects, differences between objects, and outdated objects. The result of the comparison can be edited and saved.
- The Repository Manager does not compare the field attributes of the objects in the folders when performing the comparison.
- A two-way comparison can sometimes reveal information that a one-way comparison cannot: a one-way comparison does not note a difference if an object is present in the target folder but not in the source folder.
Folder Versions
- Maintaining different versions lets you revert to earlier work when needed.
- When you save a version, you save all metadata at a particular point in development; later versions contain new or modified metadata, reflecting work completed since the last version.
- Added features in PowerCenter 7.1: you can run object queries that return shortcut objects, as well as object queries based on the latest status of an object. The query can return local objects that are checked out, the latest version of checked-in objects, or a collection of all older versions of objects.

Exporting and Importing Objects
- In the Designer and Workflow Manager, repository objects can be exported to an XML file and then imported from that XML file.
- The following repository objects can be exported: sources, targets, transformations, mapplets, mappings, and sessions.
- Objects can be shared by exporting and importing them between repositories with the same version.

Questions???

Designer

(Slide: screenshot of the Designer.)

- Informatica's Designer is the client application used to create and manage sources, targets, and the associated mappings between them. The Informatica Server uses the instructions configured in the mapping and its associated session to move data from sources to targets.
- The Designer allows you to work with multiple tools, in multiple folders, and in multiple repositories at a time.
- The Designer provides five tools with which to create mappings:
  - Source Analyzer: imports or creates source definitions for flat file, ERP, and relational sources.
  - Warehouse Designer: imports or creates target definitions.
  - Transformation Developer: creates reusable transformations that generate or modify data.
  - Mapplet Designer: creates reusable objects that represent sets of transformations.
  - Mapping Designer: creates mappings.
- The Designer consists of the following windows:
  - Navigator: used to connect to and work in multiple repositories and folders; also used to copy objects and create shortcuts.
  - Workspace: used to view or edit sources, targets, mapplets, transformations, and mappings; you can work with a single tool at a time in the workspace.
  - Output: provides details when you perform certain tasks, such as saving your work or validating a mapping. Right-click the Output window to access window options, such as printing output text, saving text to a file, and changing the font size.
  - Overview: an optional window to simplify viewing workbooks containing large mappings or a large number of objects.

Source Analyzer
The following types of source definitions can be imported, created, or modified in the Source Analyzer:
- Relational sources: tables, views, synonyms
- Files: fixed-width or delimited flat files, COBOL files
- Microsoft Excel sources
- XML sources: XML files, DTD files, XML schema files
- Data models, using MX Data Model PowerPlug
- SAP R/3, SAP BW, Siebel, IBM MQSeries, using PowerConnect

Importing Relational Source Definitions
- After importing a relational source definition, business names for the table and columns can be entered.
- The source definition appears in the Source Analyzer.
- In the Navigator, the new source definition appears in the Sources node of the active repository folder, under the source database name.

Flat File Sources
- Delimited and fixed-width files are supported.
- The Flat File Wizard prompts for the following file properties: file name and location, file code page, file type, column names and data types, number of header rows in the file, column size and null characters for fixed-width files, and delimiter type, quote character, and escape character for delimited files.
- Flat file properties in the Source Analyzer: table name, business purpose, owner, and description; file type; null characters for fixed-width files; delimiter type, quote character, and escape character for delimited files; column names and datatypes; comments; HTML links to business documentation.

Warehouse Designer
Used to create target definitions for file and relational targets:
- Import the definition from an existing relational target.
- Create a target definition based on a source definition (relational or flat file).
- Manually create a target definition, or design several related targets at the same time.
Tasks of the Warehouse Designer:
- Edit target definitions; a change to a target definition is propagated to the mappings that use that target.
- Create relational tables in the target database: if the target tables do not exist, generate and execute the necessary SQL code to create them.
- Preview relational target data.
Create/Edit Target Definitions:
- On the Table tab of the target definition you can edit business names, constraints, creation options, description, and keywords.
- On the Columns tab you can edit column name, datatype, precision and scale, not null, key type, and business name.

Mapping
- Mappings represent the data flow between sources and targets. When the Informatica Server runs a session, it uses the instructions configured in the mapping to read, transform, and write data.
- Every mapping must contain the following components: source and target definitions, one or more transformations, and connectors (links). A mapping can also contain one or more mapplets.

(Slide: sample mapping showing a source, source qualifier, transformation, and target connected by links.)

Mapping Invalidation
On editing a mapping, the Designer invalidates sessions under the following circumstances:
- Add or remove sources or targets
- Remove mapplets or transformations
- Replace a source, target, mapplet, or transformation while importing or copying objects
- Add or remove Source Qualifiers or COBOL Normalizers, or change the list of associated sources for these transformations
- Add or remove a Joiner or Update Strategy transformation
- Add or remove transformations from a mapplet in the mapping
- Change the database type for a source

Mapping Components
- Every mapping requires at least one transformation object that determines how the Informatica Server reads the source data: a Source Qualifier, Normalizer, ERP Source Qualifier, or XML Source Qualifier transformation.
- Transformations can be created for use once in a mapping, or as reusable transformations for use in multiple mappings.

Mapping Updates
- By default, the Informatica Server updates targets based on key values.
- The default UPDATE statement for each target in a mapping can be overridden.
- For a mapping without an Update Strategy transformation, configure the session to mark source records as update.

Mapping Validation
The Designer marks a mapping valid based on:
- Connection validation: required ports are connected and all connections are valid.
- Expression validation: all expressions are valid.
- Object validation: the independent object definition matches the instance in the mapping.
The Designer performs connection validation each time you connect ports in a mapping and each time you validate or save a mapping. You can validate an expression in a transformation while you are developing a mapping.

Questions???

Transformations used in Informatica

Transformations
- A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions.
- Transformations in a mapping represent the operations the Informatica Server performs on data. Data passes into and out of transformations through ports that you connect in a mapping or mapplet.
- Transformations can be active or passive: an active transformation can change the number of rows that pass through it, while a passive transformation does not.
- Transformations can be connected to the data flow, or they can be unconnected. An unconnected transformation is not connected to other transformations in the mapping; it is called within another transformation and returns a value to that transformation.

Transformation Types
- Source Qualifier: defines the reader process, or the selection, for relational sources. It can generate the SQL automatically, and the user can override it and write their own SQL. In the Source Qualifier you can also specify filters on the source data or a distinct selection.
- Update Strategy: finely controls, on a row-by-row basis, whether a row is inserted, updated, deleted or rejected, based on a logical condition derived in the mapping process. The Update Strategy transformation is used frequently for handling slowly changing dimensions.
- Expression: used for data cleansing and scrubbing. There are over 80 functions within PowerCenter, such as concatenate, instring, rpad and ltrim, and many of them are used in the Expression transformation. Derived columns and variables can also be created in the Expression transformation.
- Filter: used to filter data. We can filter at the source using the Source Qualifier, but we may need to filter the data somewhere within the pipeline, perhaps on an aggregate calculated mid-stream, and we use the Filter transformation for that.
  We also use the Filter to branch and load a portion of the data into one target, based on some condition, and the rest of the data into another target (see the condition sketch after this list). NOTE: the Router transformation is used for branching; it is much more efficient than the Filter transformation.
- Aggregator: used for functions that require sorting or a group by, such as SUM and AVG. Aggregates are done in memory, and pre-sorted input can be accepted to reduce the amount of memory necessary.
- Lookup: used to look up tables in the source database, the target database, or any other database, as long as there is connectivity. Lookups are cached in memory by default, but caching can be turned off with an option. The lookup condition can be any type of Boolean expression. Like the Update Strategy, the Lookup transformation is used frequently for handling slowly changing dimensions.
- Sequence Generator: provided for target databases that lack sequence capabilities (such as MS SQL Server 6.5); it works like Oracle's sequence generator. You can specify the starting value, the increment value, the upper limit, and whether or not to reinitialize. (Extended discussion: the Sequence Generator can also be used to stamp a batch load ID into every new row loaded in a given update/refresh load. This is very useful for backing out loads if that becomes necessary.)
- Stored Procedure: one of the main benefits of using PowerCenter is eliminating the need for coding; however, if you have already written a stored procedure that you cannot duplicate within PowerCenter, you can call it from within a mapping using the Stored Procedure transformation. (Stored procedure arguments are imported the same way the source catalog definition is imported.)
- External Procedure: the same holds true for external procedures written in C, C++, or Visual Basic; they can be called from within the mapping process using the External Procedure transformation. External Procedure transformations are reusable transformations, defined in the Transformation Developer (the fourth component tool in the Designer).
- Joiner: used for heterogeneous joins within a mapping; perhaps we need to join a flat file with an Oracle table, or an Oracle and a Sybase table, or two flat files together. (Joins are done in memory, and the memory profile is configurable!)
- Sorter: used to sort data in ascending or descending order, generally before aggregation.
- Normalizer: used when there are occurs or arrays in the source data and you want to flatten the data into a normalized table structure; PowerCenter does this automatically. The Normalizer is also used to pivot data.
- Rank: used for very specific data marts, such as Top Performers, where you want to load the top 5 products, the top 100 customers, or the bottom 5 products.
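Returning to the Filter transformation described above: the filter condition is simply a Boolean expression in the transformation language, evaluated for every row. A minimal sketch, assuming a hypothetical TOTAL_SALES port fed by an upstream Aggregator (both names are invented for illustration):

    NOT ISNULL( TOTAL_SALES ) AND TOTAL_SALES > 100000

Only rows for which the condition evaluates to TRUE pass through; everything else is dropped.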
Active Transformation Nodes
- Advanced External Procedure: calls a procedure in a shared library or in the COM layer of Windows NT
- Aggregator: performs aggregate calculations
- ERP Source Qualifier: represents the rows that the Informatica Server reads from an ERP source when it runs a session
- Filter: filters records
- Joiner: joins records from different databases or flat file systems
- Rank: limits records to a top or bottom range
- Router: routes data into multiple transformations based on a group expression
- Source Qualifier: represents the rows that the Informatica Server reads from a relational or flat file source when it runs a session
- Update Strategy: determines whether to insert, delete, update, or reject records

Passive Transformation Nodes
- Expression: calculates a value
- External Procedure: calls a procedure in a shared library or in the COM layer of Windows NT
- Input: defines mapplet input rows; available only in the Mapplet Designer
- Lookup: looks up values
- Output: defines mapplet output rows; available only in the Mapplet Designer
- Sequence Generator: generates primary keys
- Stored Procedure: calls a stored procedure
- XML Source Qualifier: represents the rows that the Informatica Server reads from an XML source when it runs a session

Transformation: added features in PowerCenter 7.1
- Flat file lookup: lookups can now be performed on flat files. To create a Lookup transformation using a flat file as a lookup source, the Designer invokes the Flat File Wizard. To change the name or location of a lookup between session runs, the lookup file parameter can be used.
- Dynamic lookup cache enhancements: when using a dynamic lookup cache, the PowerCenter Server can ignore some ports when it compares values in lookup and input ports before it updates a row in the cache. You can also choose whether the PowerCenter Server outputs old or new values from the lookup/output ports when it updates a row; outputting old values is useful when the Lookup transformation is used in a mapping that updates slowly changing dimension tables.
- Union transformation: merges multiple sources into a single pipeline. The Union transformation is similar to using the UNION ALL SQL statement to combine the results from two or more SQL statements.
- Custom transformation API enhancements: the Custom transformation API includes new array-based functions that allow procedure code to receive and output a block of rows at a time. Use these functions to take advantage of the PowerCenter Server processing enhancements.
- Midstream XML transformations: an XML Parser transformation or an XML Generator transformation can now parse or generate XML inside a pipeline. The XML transformations enable you to extract XML data stored in relational tables, such as data stored in a CLOB column, and to extract data from messaging systems, such as TIBCO or IBM MQSeries.

Transformation Properties
- Port name: copied ports inherit the name of the contributing port; copied ports with the same name are appended with a number.
- Types of ports:
  - Input: data input from the previous stage.
  - Output: data output to the next stage.
  - Lookup: port used to compare data.
  - Return: the port (value) returned from a lookup.
  - Variable: a port that stores a value temporarily (see the sketch below).
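The variable port type is worth a concrete illustration. In an Expression transformation, ports are evaluated in order and a variable port retains its value from row to row, so it can carry a running calculation. A minimal sketch with hypothetical port names:

    IN_SALES            (input port)
    V_RUNNING_TOTAL     (variable port)  = V_RUNNING_TOTAL + IN_SALES
    OUT_RUNNING_TOTAL   (output port)    = V_RUNNING_TOTAL

Each incoming row adds IN_SALES to the variable, and the output port exposes the accumulated value to downstream transformations.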
- Data types: transformations use internal data types; the data type of an input port must be compatible with the data type of the feeding output port.
- Port default values: can be set to handle nulls and errors.
- Description: port comments can be entered.

Aggregator Transformation
- Performs aggregate calculations.
- Components: aggregate expression, group by port, Sorted Input option, and aggregate cache.
- The Aggregator is an active, connected transformation.
- The following aggregate functions can be used within an Aggregator transformation: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.

Expression Transformation
- Used to perform any non-aggregate calculations: calculate values in a single row, or test conditional statements before outputting the results to target tables or other transformations.
- Ports that must be included in an Expression transformation: input or input/output ports for each value used in the calculation, and an output port for the expression. Variable ports are used to temporarily store values.
- Multiple expressions can be entered in a single Expression transformation, but only one expression per output port. Any number of output ports can be created, along with variable ports to store data temporarily.

Filter Transformation
- Provides the means for filtering rows in a mapping. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through it.
- Ports from more than one transformation cannot be concatenated into the Filter transformation.
- Setting output default values is not allowed.
- To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible, so that the load of data carried ahead is decreased at or near the source itself.

Joiner Transformation
- Joins two related heterogeneous sources residing in different locations or file systems. It can be used to join: two relational tables in separate databases, two flat files in potentially different file systems, two different ODBC sources, two instances of the same XML source, a relational table and a flat file source, or a relational table and an XML source.
- Use the Joiner transformation to join two sources with at least one matching port; it uses a condition that matches one or more pairs of ports between the two sources, and requires two input transformations from two separate data flows.
- Supported join types: Normal (default), Master Outer, Detail Outer, and Full Outer.
- Added feature in PowerCenter 7.1: the Joiner can join two data streams that originate from the same source. A joiner on the same source is used when a calculation has to be done on one part of the data and the transformed data joined back to the original data set; joiners on multiple sources are also supported.

Lookup Transformation
- Used to look up data in a relational table, view, synonym, or flat file.
- The Informatica Server queries the lookup table based on the lookup ports in the transformation, and compares Lookup transformation port values to lookup table column values based on the lookup condition.
- The Lookup transformation can be used to perform many tasks, including: getting a related value, performing a calculation, and updating slowly changing dimension tables.
- Types of lookups: connected and unconnected.

Connected Lookup
- Receives input values directly from another transformation in the pipeline.
- For each input row, the Informatica Server queries the lookup table or cache based on the lookup ports and the condition in the transformation, and passes return values from the query to the next transformation.

Unconnected Lookup
- Receives input values from an expression using the :LKP reference qualifier, in the form :LKP.lookup_transformation_name(argument, argument, ...), and returns one value (a sketch follows at the end of this lookup discussion).
- Common uses for unconnected lookups include: testing the results of a lookup in an expression, filtering records based on the lookup results, marking records for update based on the result of a lookup (for example, updating slowly changing dimension tables), and calling the same lookup multiple times in one mapping.
- With unconnected lookups, you can pass multiple input values into the transformation, but only one column of data comes out of it. Use the return port to specify the return value of an unconnected Lookup transformation.

Lookup Caching
- Session performance can be improved by caching the lookup table. Caching can be static or dynamic; by default, the lookup cache remains static and does not change during the session.
- Types of caching:
  - Persistent: cache used across sessions.
  - Recache from source: used to synchronize a persistent cache.
  - Static: read-only cache for a single lookup.
  - Dynamic: reflects changed data directly; used when the target table is the table being looked up.
  - Shared: cache shared among multiple transformations.
- Added feature in PowerCenter 7.1: dynamic lookup cache enhancements, as described in the transformation enhancements above.

Dynamic Lookup Transformation
- A Lookup transformation using a dynamic cache has properties that one using a static cache does not: NewLookupRow and Associated Port.
- Consider configuring the transformation to use a dynamic cache when the target table is also the lookup table. When a dynamic cache is used, the Informatica Server inserts rows into the cache as it passes rows to the target.
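To tie the lookup discussion together, here is a hedged sketch of the unconnected call syntax described above, with hypothetical names (lkp_dim_customer is an unconnected Lookup transformation whose return port holds the customer name; IN_CUST_ID is an input port of the calling Expression transformation):

    IIF( ISNULL( :LKP.lkp_dim_customer( IN_CUST_ID ) ),
         'UNKNOWN',
         :LKP.lkp_dim_customer( IN_CUST_ID ) )

The expression calls the same lookup twice, one of the common uses listed above, and substitutes a default value when no matching row is found.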
Router Transformation
- Tests data for one or more conditions and gives the option to route rows that do not meet any of the conditions to a default output group.
- Group types: input and output. There are two types of output groups: user-defined groups and the default group. Create one user-defined group for each condition you want to specify.

(Slide: comparison of the Router and Filter transformations.)

Sequence Generator Transformation
- Generates numeric values. It can be used to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.
- Provides two output ports, NEXTVAL and CURRVAL; these ports cannot be edited or deleted, and no ports can be added to the transformation.
- When NEXTVAL is connected to the input port of another transformation, the Informatica Server generates a sequence of numbers based on the Current Value and Increment By properties.
- The CURRVAL port is connected only when the NEXTVAL port is already connected to a downstream transformation.

Source Qualifier Transformation
- The Source Qualifier (SQ) represents the records that the Informatica Server reads when it runs a session.
- The SQ can be used to: join data originating from the same source database; filter records when the Informatica Server reads source data; specify an outer join rather than the default inner join; specify sorted ports; select only distinct values from the source; and create a custom query to issue a special SELECT statement for the Informatica Server to read source data.
- Parameter variables can be used in the SQ.
- For relational sources, the Informatica Server generates a query for each SQ when it runs a session. The default query is a SELECT statement of each source column used in the mapping; the Informatica Server reads only those SQ columns that are connected to another transformation.
- The target load order can be specified depending on the SQ.
- The SQ override can be used to select only the required ports from the source, and can contain parameter variables such as $$$SessStartTime and $$IPAddress. Clicking the Generate SQL tab generates the default SELECT query, which can then be altered by the user. For the SQ to generate a default query, it must have a linked output port.
- The SQ override can also be used to join two different sources on a key (in the WHERE clause) and select only the required ports from either source.
- The SQ can define a source filter, which filters off unwanted records at the source itself, reducing cache time. It can also define a user-defined join condition, which is added as a WHERE clause to the default query generated by Informatica.
- The SQ can sort ports if it is succeeded by an Aggregator. The Select Distinct check box, when checked, performs a distinct select on the input data.
- Pre-SQL can be used to run a command before caching, such as deleting from the target or joining two source systems; similarly, a post-SQL command can also be used.

Update Strategy Transformation
- Used to control how rows are flagged for insert, update, delete, or reject in a mapping.
- The flagging of rows is defined in the session: it can be Insert, Update, Delete, or Data Driven. With the Data Driven option, the session follows the flag specified in the mapping's Update Strategy transformation (a small expression sketch follows the Rank discussion below).
- The transformation can be configured either to drop rejected rows or to forward them to the next transformation; in the latter case a separate reject table can be used.
- The control can be applied separately for each of the targets in the mapping.
- Update Strategy options: Insert (inserts the row into the table), Delete (deletes the row from the table), Update (updates the row existing in the table).
- Target table update strategy options:
  - Update as Update: updates each row flagged for update if it exists in the table.
  - Update as Insert: inserts a new row for each update.
  - Update else Insert: updates if the row exists, else inserts.

Rank Transformation
- Allows selection of the top or bottom rank of data, not just one value: the largest or smallest numeric values in a port or group, or the strings at the top or bottom of the session sort order.
- During the session, the Informatica Server caches input data until it can perform the rank calculations.
- When you create a Rank transformation, you can configure the following properties: the cache directory; top or bottom rank; the input/output port containing the values used to determine the rank (only one port can be selected to define a rank); the number of rows falling within a rank; and groups for ranks.
- Ports: a variable port can store values or calculations for use in an expression; the rank port designates the column whose values are to be ranked.
- Properties: Top/Bottom specifies whether the top or the bottom rank is output; Number of Ranks specifies the number of rows to be ranked.
- Transformation scope: Transaction applies the transformation logic only to the rows of the current transaction; All Input applies it to all rows of input data.
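Returning to the data-driven flagging described above: the Update Strategy expression is written in the transformation language and evaluates to one of the built-in constants DD_INSERT, DD_UPDATE, DD_DELETE, or DD_REJECT. A minimal sketch for a slowly changing dimension load, with a hypothetical EXISTING_CUST_KEY port that would typically be populated by a lookup against the target table:

    IIF( ISNULL( EXISTING_CUST_KEY ), DD_INSERT, DD_UPDATE )

Rows with no matching target key are flagged for insert and all others for update; the session must be set to Data Driven for these flags to take effect.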
Stored Procedure Transformation
- A Stored Procedure transformation is an important tool for populating and maintaining databases. A stored procedure is a precompiled collection of Transact-SQL statements and optional flow-control statements, similar to an executable script; the transformation is used to call a stored procedure.
- The stored procedure must exist in the database before the Stored Procedure transformation is created.
- One of the most useful features of stored procedures is the ability to send data to, and receive data from, the stored procedure. Three types of data pass between the Informatica Server and the stored procedure:
  - Input/output parameters: for many stored procedures, you provide a value and receive a value in return.
  - Return values: most databases provide a return value after running a stored procedure.
  - Status codes: provide error handling for the Informatica Server during a session.
- Options for running a Stored Procedure transformation:
  - Normal: during a session, the stored procedure runs where the transformation exists in the mapping, on a row-by-row basis.
  - Pre-load of the source: runs before the session retrieves data from the source.
  - Post-load of the source: runs after the session retrieves data from the source.
  - Pre-load of the target: runs before the session sends data to the target.
  - Post-load of the target: runs after the session sends data to the target.
- The Stored Procedure transformation can be set up in one of two modes, connected or unconnected. The same instance cannot run in both connected and unconnected mode in a mapping; you must create different instances of the transformation.
- In connected mode, the flow of data through the mapping passes through the Stored Procedure transformation: the input port is linked from the preceding transformation and the output port can be linked onward.
- An unconnected Stored Procedure transformation is not connected directly to the flow of the mapping; it either runs before or after the session, or is called by an expression in another transformation, passing the input parameters in the form :SP.stored_procedure_transformation_name(input, output).

Custom Transformation
- Operates in conjunction with procedures created outside the Designer. The PowerCenter Server uses generated functions to interface with the procedure, and the procedure code should be developed using API functions.
- The PowerCenter Server can pass a single row of data or an array, depending on the custom function.
- Types of generated functions: initialization functions (initialize processes before data is passed to the custom function), notification functions (send notifications), and deinitialization functions (deinitialize processes after data is passed to the custom function).
- Types of API functions: set data access mode functions, navigation functions, property functions, and data handling functions. Most of these functions are associated with handles, internal Informatica structures that guide the process flow.

Sorter Transformation
- A Sorter transformation is used to sort data.
- Types of sorting: ascending/descending; case sensitive (a case-sensitive ascending or descending sort); and distinct (gives distinctly sorted output rows).
- A Sorter can sort data from a relational source or a flat file. The sort key, on which the sort is done, has to be specified.
- Fields can be sorted in either ascending or descending order by specifying the type of sort in the Direction field. Multiple fields can be marked as keys, with different sorts done on each.
- When the Case Sensitive check box is checked, sorting is done on a case-sensitive basis.
- The working directory is the directory in which Informatica caches the files for the sort. The Distinct check box is used to get distinct sorted output.

Transformation Language
- The Designer provides a transformation language to help write expressions that transform source data: a transformation expression takes data from a port and changes it.
- Expressions can be written in the following transformations: Aggregator, Expression, Filter, Rank, Router, and Update Strategy.
- Expressions can consist of any combination of the following components: ports (input, input/output, variable); string and numeric literals; constants; functions; local and system variables; mapping parameters and mapping variables; operators; and return values.
- The functions available in PowerCenter are: aggregate functions (e.g. AVG, MIN, MAX), character functions (e.g. CONCAT, LENGTH), conversion functions (e.g. TO_CHAR, TO_DATE), date functions (e.g. DATE_DIFF, LAST_DAY), numeric functions (e.g. ABS, CEIL, LOG), scientific functions (e.g. COS, SINH), special functions (e.g. DECODE, IIF, ABORT), test functions (e.g. ISNULL, IS_DATE), and variable functions (e.g. SETMAXVARIABLE).

Transformation Expressions
The pre-compiled and tested transformation expressions help you create simple or complex transformation expressions:
- Functions: over 60 SQL-like functions allow you to change data in a mapping.
- Aggregates: calculate a single value for all records in a group, returning a single value for each group in an Aggregator transformation. Filters can be applied to calculate values for specific records in the selected ports, operators can be used to perform arithmetic within the function, and two or more aggregate values derived from the same source columns can be calculated in a single pass. A filter condition can be applied to all aggregate functions; it must evaluate to TRUE, FALSE, or NULL, and if it evaluates to NULL or FALSE, the Informatica Server does not select the record. For example, the following expression calculates the median salary for all employees who make more than $50,000: MEDIAN( SALARY, SALARY > 50000 )
- Characters: character functions assist in the conversion, extraction and identification of substrings. For example, the following expression evaluates a string starting from the end, finds the first space, and deletes the characters from that space to the end of the line: SUBSTR( CUST_NAME, 1, INSTR( CUST_NAME, ' ', -1, 1 ))
- Conversions: conversion functions assist in the transformation of data from one type to another. For example, the following expression converts the dates in the DATE_PROMISED port to text in the format MON DD YYYY: TO_CHAR( DATE_PROMISED, 'MON DD YYYY' )
- Dates:
  Date functions help you round, truncate, or compare dates; extract one part of a date; or perform arithmetic on a date. For example, the following expression returns the month portion of a date: GET_DATE_PART( DATE_PROMISED, 'MM' )
- Numerical and scientific: numerical and scientific functions assist with mathematical operations needed in transformation processing. For example, the following expression returns the average order for a Stabilizing Vest, based on the first five records in the SALES port, and thereafter returns the average for the last five records read: MOVINGAVG( SALES, 5 )
- Miscellaneous: Informatica also provides functions to assist in aborting or erroring out records, developing if-then-else structures, looking up values from a specified external or static table, and testing values for validity (such as date or number format). For example, the following expression causes the server to skip a record when SALES is negative: IIF( SALES < 0, ERROR( 'Negative value found' ), EMP_SALARY )
- Operators: use transformation operators to create expressions that perform mathematical computations, combine data, or compare data.
- Constants: use built-in constants to reference values that remain constant, such as TRUE, FALSE, and NULL.
- Variables: use built-in variables to write expressions that reference values that vary, such as the system date. You can also create local variables within a transformation.
- Return values: you can also write expressions that include the return values from Lookup, Stored Procedure, and External Procedure transformations.

Questions???

Re-usable Transformations and Mapplets

Reusable Transformation
- A transformation is said to be reusable when multiple instances of the same transformation can be created. Reusable transformations can be used in multiple mappings.
- Creating reusable transformations: design the transformation in the Transformation Developer, or promote a standard transformation from the Mapping Designer.

Mapplet
- A mapplet is a reusable object that represents a set of transformations. It allows transformation logic to be reused and can contain as many transformations as needed.
- Mapplets can: include source definitions, accept data from sources in a mapping, include multiple transformations, pass data to multiple pipelines, and contain unused ports.

(Slides: a sample mapplet in a mapping, and the same mapplet expanded.)

Mapplet Components
- Each mapplet must include one Input transformation and/or Source Qualifier transformation, and at least one Output transformation.
- A mapplet should contain exactly one of the following: an Input transformation with at least one port connected to a transformation in the mapplet, or a Source Qualifier transformation with at least one port connected to a source definition.

Questions???

Workflow Manager & Workflow Monitor

(Slide: the Workflow Manager, which replaces the Server Manager, comprises the Task Developer, Workflow Designer, and Worklet Designer; the Workflow Monitor provides Gantt Chart and Task views.)

Workflow Manager
- The Workflow Manager replaces the Server Manager of version 5.0. Instead of running sessions, you now create a process called a workflow in the Workflow Manager.
- A workflow is a set of instructions on how to execute tasks such as sessions, emails, and shell commands. A session is now just one of the many tasks you can execute in the Workflow Manager.
- The Workflow Manager provides other tasks, such as Assignment, Decision, and Event tasks. You can also create branches with conditional links.
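As an illustration of a conditional link: the link condition uses the same expression language, referencing predefined task variables. A hedged sketch with a hypothetical session name (exact variable availability depends on the PowerCenter version):

    $s_m_load_customers.Status = SUCCEEDED

The downstream task runs only when the condition evaluates to TRUE, so a failure in s_m_load_customers stops that branch of the workflow.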
- In addition, you can batch workflows by creating worklets in the Workflow Manager.

(Slide: screenshot of the Workflow Manager.)

Workflow Manager Tools
- Task Developer: create the tasks you want to execute in the workflow.
- Workflow Designer: create a workflow by connecting tasks with links; you can also create tasks in the Workflow Designer as you develop the workflow.
- Worklet Designer: create worklets.

Workflow Tasks
- Command: specifies a shell command to run during the workflow.
- Control: stops or aborts the workflow.
- Decision: specifies a condition to evaluate.
- Email: sends email during the workflow.
- Event-Raise: notifies the Event-Wait task that an event has occurred.
- Event-Wait: waits for an event to occur before executing the next task.
- Session: runs a mapping created in the Designer.
- Assignment: assigns a value to a workflow variable.
- Timer: waits for a timed event to trigger.

(Slide: the Create Task dialog.)

Workflow Monitor
- PowerCenter 6.0 introduced a new tool, the Workflow Monitor, to monitor workflows, worklets, and tasks.
- The Workflow Monitor displays information about workflows in two views: Gantt Chart view and Task view. Workflows can be monitored in online and offline mode.

(Slides: Workflow Monitor Gantt Chart view and Task view.)

Added features in PowerCenter 7.1 — the Workflow Monitor includes the following performance and usability enhancements:
- When you connect to the PowerCenter Server, you no longer distinguish between online and offline mode.
- Multiple instances of the Workflow Monitor can be opened on one machine.
- Multiple PowerCenter Servers registered to the same repository can be monitored simultaneously.
- Improved options for filtering tasks by start and end time.
- Task view displays workflow runs chronologically, with the most recent run at the top, and displays folders alphabetically.
- The Navigator and Output windows can be removed.

Questions???

Performance Tuning
The first step in performance tuning is to identify the performance bottleneck, in the following order: target, source, mapping, session, system. The most common performance bottleneck occurs when the Informatica Server writes to a target database.

Target Bottlenecks
- Identifying: a target bottleneck can be identified by configuring the session to write to a flat file target.
- Optimizing: drop indexes and key constraints before loading; increase commit intervals; use bulk loading / external loading.

Source Bottlenecks
- Identifying: add a filter condition after the Source Qualifier, set to false, so that no data is processed past the Filter transformation. If the time it takes to run the new session remains about the same, there is a source bottleneck. Alternatively, remove all the transformations in a test mapping; if the performance is similar, there is a source bottleneck.
- Optimizing: optimize the query by using hints; use Informatica conditional filters if the source system lacks indexes.

Mapping Bottlenecks
- Identifying: if there is no source bottleneck, add a Filter transformation in the mapping before each target definition, with the filter condition set to false so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, there is a mapping bottleneck.
- Optimizing: configure single-pass reading; avoid unnecessary data type conversions; avoid database reject errors;
Session Bottlenecks Identifying: If there is no source, target, or mapping bottleneck, there may be a session bottleneck. Use Collect Performance Details. Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low (0-20%) BufferInput_efficiency and BufferOutput_efficiency counter values also indicate a session bottleneck. Optimizing: Increase the number of partitions. Tune session parameters: DTM Buffer Size (6 MB to 128 MB), Buffer Block Size (4 KB to 128 KB), Data Cache Size (2 MB to 24 MB), Index Cache Size (1 MB to 12 MB). Use incremental aggregation if possible. Session Bottlenecks - Memory Configure the index and data cache memory for the Aggregator, Rank, and Joiner transformations in the Configuration Parameters dialog box. The amount of memory you configure depends on partitioning, the transformation that requires the largest cache, and how much memory cache and disk cache you want to use. Incremental Aggregation The first run creates the .idx and .dat cache files. The second run performs the following actions: For each input record, the server checks the historical information in the index file for a corresponding group. If it finds a corresponding group, it performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, it creates a new group and saves the record data. When writing to the target, the Informatica Server updates modified aggregate groups in the target, inserts new aggregate data, deletes removed aggregate data, ignores unchanged aggregate data, and saves modified aggregate data in the index and data files. Incremental Aggregation You can find the options for incremental aggregation on the Transformations tab in the session properties. The Server Manager displays a warning indicating that the Informatica Server overwrites the existing cache, and a reminder to clear this option after running the session. System Bottlenecks Identifying: If there is no source, target, mapping, or session bottleneck, there may be a system bottleneck. Use system tools to monitor CPU usage, memory usage, and paging. On Windows: Task Manager. On UNIX systems: tools like sar and iostat; for example, sar -u reports user CPU usage, idle time, and I/O wait time (see the examples below). Optimizing: Improve network speed. Improve CPU performance. Check hard disks on related machines. Reduce paging.
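A short sketch of the UNIX monitoring commands mentioned above; exact flags vary across UNIX flavors, and vmstat is a commonly available choice for watching paging that the slides do not name explicitly:

    sar -u 5 10    # CPU usage (user, system, idle, I/O wait), sampled every 5 seconds, 10 times
    iostat 5       # disk I/O statistics, refreshed every 5 seconds
    vmstat 5       # memory and paging activity, refreshed every 5 seconds

Sustained high I/O wait or heavy paging during a session run points to a system bottleneck rather than a mapping or session problem.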
PMCMD Can use the command line program pmcmd to communicate with the Informatica Server. Can perform the following actions with pmcmd: Determine if the Informatica Server is running. Start sessions and batches. Stop sessions and batches. Recover sessions. Stop the Informatica Server. Can configure repository usernames and passwords as environment variables with pmcmd. Can also customize the way pmcmd displays the date and time on the machine running the Informatica Server. pmcmd returns zero on success and non-zero on failure. You can use pmcmd with operating system scheduling tools like cron to schedule sessions, and you can embed pmcmd into shell scripts or Perl programs to run or schedule sessions. PMCMD Need the following information to use pmcmd: Repository username. Repository password. Connection type - the type of connection from the client machine to the Informatica Server. Port or connection - the TCP/IP port number or IPX/SPX connection (Windows NT/2000 only) to the Informatica Server. Host name - the machine hosting the Informatica Server. Session or batch name - the names of any sessions or batches you want to start or stop. Folder name - the folder names for those sessions or batches. Parameter file.
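A hedged command line sketch; the host, port, credentials, folder, and workflow names are hypothetical, and the exact flags vary between PowerCenter releases (7.x uses workflow terminology rather than sessions and batches):

    pmcmd pingserver -s etlhost:4001
    pmcmd startworkflow -s etlhost:4001 -u admin -p secret -f SalesFolder wf_load_sales
    pmcmd startworkflow -s etlhost:4001 -uv PMUSER -pv PMPASS -f SalesFolder -wait wf_load_sales

The -uv and -pv options read the username and password from environment variables, and because pmcmd returns zero on success, a cron job or shell script can test the exit code before continuing.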
Commit Points A commit interval is the interval at which the Informatica Server commits data to relational targets during a session. The commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks. The commit interval is the number of rows you want to use as a basis for the commit point; the commit interval type is the type of rows you want to use as a basis for the commit point. Can choose between the following types of commit interval: Target-based commit. Source-based commit. Commit Points During a target-based commit session, the Informatica Server continues to fill the writer buffer after it reaches the commit interval. When the buffer block is filled, the Informatica Server issues a commit command. As a result, the amount of data committed at the commit point generally exceeds the commit interval. Commit Points During a source-based commit session, the Informatica Server commits data to the target based on the number of rows from an active source in a single pipeline. These rows are referred to as source rows. A pipeline consists of a source qualifier and all the transformations and targets that receive data from the source qualifier. An active source can be any of the following active transformations: Advanced External Procedure, Source Qualifier, Normalizer, Aggregator, Joiner, Rank, or a Mapplet if it contains one of the above transformations. Commit Points When the Informatica Server runs a source-based commit session, it identifies the active source for each pipeline in the mapping. The Informatica Server generates a commit row from the active source at every commit interval. When each target in the pipeline receives the commit row, the Informatica Server performs the commit. Multiple Servers You can register multiple PowerCenter Servers with a PowerCenter repository. Can run these servers at the same time. Can distribute the repository session load across available servers to improve overall performance. Can use the Server Manager to administer and monitor multiple servers. With multiple Informatica Servers, you need to decide which server you want to run each session and batch. You can register and run only one PowerMart Server in a local repository. Cannot start a PowerMart Server if it is registered in a local repository that has multiple servers registered to it. Multiple Servers When attached to multiple servers, you can only view, or monitor, one Informatica Server at a time, but you have access to all the servers in the repository. Questions??? Debugger Can debug a valid mapping to gain troubleshooting information about data and error conditions. To debug a mapping, you configure and run the Debugger from within the Mapping Designer. When you run the Debugger, it pauses at breakpoints and allows you to view and edit transformation output data. After you save a mapping, you can run some initial tests with a debug session before you configure and run a session in the Server Manager. Debugger Use the following process to debug a mapping: Create breakpoints. Configure the Debugger. Run the Debugger. Monitor the Debugger (debug log, session log, Target window, Instance window). Modify data and breakpoints. A breakpoint can consist of an instance name, a breakpoint type, and a condition. Debugger After you set the instance name, breakpoint type, and optional data condition, you can view each parameter in the Breakpoints section of the Breakpoint Editor. Questions??? Informatica PowerCenter 7.1 Enhancements This part of the presentation describes the new features and enhancements added to Informatica PowerCenter version 7.0, forming Informatica PowerCenter version 7.1: Configuration Management, Deployment Groups, Repository & Repository Server, PowerCenter Server, PowerCenter Web Services, XML Enhancements, Security, Client Usability Enhancements, Other Enhancements, Data Profiling Built into PowerCenter, Data Profiling, PowerCenter Metadata Reporter & Server, Transformations, Usability. Configuration Management Full object-level versioning. Run object queries that return shortcut objects as well as the latest or checked-out version of an object. Labels. Compare/difference of objects. Query/reporting. Deployment groups. Command line facilities. XML object export/import enhancements. Deployment Groups Define collections of objects to deploy (move from one repository to another). Enables automation of deployment. Can contain objects within any folder in the repository. Dynamic deployment groups execute a query to determine the objects to move. Repository & Repository Server Repository Exchange metadata with business intelligence tools. You can export metadata to and import metadata from other business intelligence tools, such as Cognos Report Net and Business Objects. Object import and export enhancements. You can compare objects in an XML file to objects in the target repository when you import objects. MX views. MX views have been added to help you analyze metadata stored in the repository. The REP_SERVER_NET and REP_SERVER_NET_REF views allow you to see information about server grids. REP_VERSION_PROPS allows you to see the version history of all objects in a PowerCenter repository. Repository Server Updating repository statistics. PowerCenter now identifies and updates statistics for all repository tables and indexes when you copy, upgrade, and restore repositories. This improves performance when PowerCenter accesses the repository. Repository & Repository Server Increased repository performance. You can increase repository performance by skipping information when you copy, back up, or restore a repository. You can choose to skip MX data, workflow and session log history, and deploy group history. pmrep. You can use pmrep to back up, disable, or enable a repository, delete a relational connection from a repository, delete repository details, truncate log files, and run multiple pmrep commands sequentially. You can also use pmrep to create, modify, and delete a folder.
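A hedged pmrep sketch of the operations listed above; the repository name, host, port, credentials, and backup file name are hypothetical, and option letters vary between releases:

    pmrep connect -r DEV_REPO -n Administrator -x secret -h repohost -o 5001
    pmrep backup -o /backups/DEV_REPO_backup.rep

Scripted this way, pmrep commands can run sequentially from a shell script, which is the usual route to automating the repository administration described above.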
PowerCenter Server DB2 bulk loading. You can enable bulk loading when you load to IBM DB2. Distributed processing. If you purchase the Server Grid option, you can group PowerCenter Servers registered to the same repository into a server grid. In a server grid, PowerCenter Servers balance the workload among all the servers in the grid. Row error logging. The session configuration object has new properties that allow you to define error logging. You can choose to log row errors in a central location to help understand the cause and source of errors. External loading enhancements. When using external loaders on Windows, you can now choose to load from a named pipe. When using external loaders on UNIX, you can now choose to load from staged files. External loading using Teradata Warehouse Builder. You can use Teradata Warehouse Builder to load to Teradata. You can choose to insert, update, upsert, or delete data. Additionally, Teradata Warehouse Builder can simultaneously read from multiple sources and load data into one or more tables. PowerCenter Server Mixed mode processing for Teradata external loaders. You can now use data driven load mode with Teradata external loaders. When you select data driven loading, the PowerCenter Server flags rows for insert, delete, or update (see the sketch after this section). It writes a column in the target file or named pipe to indicate the update strategy. The control file uses these values to determine how to load data to the target. Concurrent processing. The PowerCenter Server now reads data concurrently from sources within a target load order group. This enables more efficient joins with minimal usage of memory and disk cache. Real time processing enhancements. You can now use real time processing in sessions that also process active transformations, such as the Aggregator transformation. You can apply the transformation logic to rows defined by transaction boundaries. PowerCenter Webservices Web services client: pull/push data from a web service data source; call an external web service like any other transformation; access and manage PowerCenter metadata. Web services server: receive, process, and respond to web services requests from external applications; integrate seamlessly into enterprise web services processes; web services interface to read/write repository metadata. Web Services Provider Real-time Web Services. Real-time Web Services allows you to create services using the Workflow Manager and make them available to web service clients through the Web Services Hub. The PowerCenter Server can perform parallel processing of both request-response and one-way services. Web Services Hub. The Web Services Hub now hosts Real-time Web Services in addition to Metadata Web Services and Batch Web Services. You can install the Web Services Hub on a JBoss application server.
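As promised above, a sketch of how rows are flagged for data driven loading: an Update Strategy transformation expression marks each row with one of the built-in constants DD_INSERT, DD_UPDATE, DD_DELETE, or DD_REJECT. The lookup port name here is hypothetical:

    -- insert rows whose key was not found in the target, otherwise update
    IIF( ISNULL(TARGET_CUST_ID), DD_INSERT, DD_UPDATE )

With the session configured for data driven loading, these flags become the update-strategy column written to the Teradata target file or named pipe.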
XML Enhancements Enhanced support for XML processing: large XML files; XML transformations; schema, namespace, elements, and attributes. Full XML import/export of repository objects: workflows, worklets, sessions, mappings, transformations. Multiple objects in a single XML file. Automatic handling of dependent objects. Objects can span multiple folders across a repository. Security LDAP support: login security through LDAP. Audit trail using log file. SDK for custom authentication schemes. Support corporate security policies automatically and without manual intervention: changes in employee status, changes in project responsibility. Client Usability Enhancements Port change propagation. Mass validation of objects. Partitioning UI improvements. Session Editor enhancements. Copy objects in Repository Manager. Other Enhancements Session-level error logging. Custom Transformations (MGEP). PowerCenter Metadata Reporter. Connectivity enhancements: PowerChannel, PowerConnect for SAP. Sort/Merge Joiner enhancements. New metadata exchanges for data models - Embarcadero Studio and ERwin AllFusion Model Manager. Data Profiling Built into PowerCenter [Architecture diagram: applications, databases, data marts, legacy systems, and real-time sources are profiled via a Designer wizard; profile rules generate mappings that Informatica PowerCenter runs to populate a Profiling Warehouse; reports are viewed in PowerCenter, PowerAnalyzer, or a 3rd-party reporting tool.] Data Profiling Data Profiling for VSAM sources. You can now create a data profile for VSAM sources. Support for verbose mode for source-level functions. You can now create data profiles with source-level functions and write data to the Data Profiling warehouse in verbose mode. Aggregator function in auto profiles. Auto profiles now include the Aggregator function. Creating auto profile enhancements. You can now select the columns or groups you want to include in an auto profile and enable verbose mode for the Distinct Value Count function. Purging data from the Data Profiling warehouse. You can now purge data from the Data Profiling warehouse. Source View in the Profile Manager. You can now view data profiles by source definition in the Profile Manager. PowerCenter Data Profiling report enhancements. You can now view PowerCenter Data Profiling reports in a separate browser window, resize columns in a report, and view verbose data for Distinct Value Count functions. Prepackaged domains. Informatica provides a set of prepackaged domains that you can include in a Domain Validation function in a data profile. PowerCenter Server 64-bit support. You can now run 64-bit PowerCenter Servers on AIX and HP-UX (Itanium). Partitioning enhancements. If you have the Partitioning option, you can define up to 64 partitions at any partition point in a pipeline that supports multiple partitions. PowerCenter Server processing enhancements. The PowerCenter Server now reads a block of rows at a time. This improves processing performance for most sessions. CLOB/BLOB datatype support. You can now read and write CLOB/BLOB datatypes. PowerCenter Metadata Reporter PowerCenter Metadata Reporter modified some report names and uses the PowerCenter 7.1 MX views in its schema. PowerCenter Metadata Reporter & Server Flat file lookup. You can now perform lookups on flat files. When you create a Lookup transformation using a flat file as a lookup source, the Designer invokes the Flat File Wizard. You can also use a lookup file parameter if you want to change the name or location of a lookup between session runs (a sketch follows).
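A hedged sketch of a lookup file parameter in a session parameter file; the section heading format is approximate, and the folder, session, and parameter names ($LookupFile_rates, the paths) are hypothetical:

    [SalesFolder.s_m_load_orders]
    $LookupFile_rates=/data/lookups/rates_current.dat

Editing the parameter file between session runs repoints the flat file lookup without changing the session or the mapping.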
Dynamic lookup cache enhancements. When you use a dynamic lookup cache, the PowerCenter Server can ignore some ports when it compares values in lookup and input ports before it updates a row in the cache. Also, you can choose whether the PowerCenter Server outputs old or new values from the lookup/output ports when it updates a row. You might want to output old values from lookup/output ports when you use the Lookup transformation in a mapping that updates slowly changing dimension tables. Union transformation. You can use the Union transformation to merge multiple sources into a single pipeline. The Union transformation is similar to using the UNION ALL SQL statement to combine the results from two or more SQL statements (see the analogy at the end of this section). Transformations Custom transformation API enhancements. The Custom transformation API includes new array-based functions that allow you to create procedure code that receives and outputs a block of rows at a time. Use these functions to take advantage of the PowerCenter Server processing enhancements. Midstream XML transformations. You can now create an XML Parser transformation or an XML Generator transformation to parse or generate XML inside a pipeline. The XML transformations enable you to extract XML data stored in relational tables, such as data stored in a CLOB column. You can also extract data from messaging systems, such as TIBCO or IBM MQSeries. Transformations Usability Copying objects. You can now copy objects from all the PowerCenter Client tools using the copy wizard to resolve conflicts. You can copy objects within folders, to other folders, and to different repositories. Within the Designer, you can also copy segments of mappings to a workspace in a new folder or repository. Comparing objects. You can compare workflows and tasks from the Workflow Manager. You can also compare all objects from within the Repository Manager. Change propagation. When you edit a port in a mapping, you can choose to propagate changed attributes throughout the mapping. The Designer propagates ports, expressions, and conditions based on the direction that you propagate and the attributes you choose to propagate. Enhanced partitioning interface. The Session Wizard is enhanced to provide a graphical depiction of a mapping when you configure partitioning. Usability Revert to saved. You can now revert to the last saved version of an object in the Workflow Manager. When you do this, the Workflow Manager accesses the repository to retrieve the last-saved version of the object. Enhanced validation messages. The PowerCenter Client writes messages in the Output window that describe why it invalidated a mapping or workflow when you modify a dependent object. Validate multiple objects. You can validate multiple objects in the repository without fetching them into the workspace. You can save and optionally check in objects that change from invalid to valid status as a result of the validation. You can validate sessions, mappings, mapplets, workflows, and worklets. View dependencies. Before you edit or delete versioned objects, such as sources, targets, mappings, or workflows, you can view dependencies to see the impact on other objects. You can view parent and child dependencies, and global shortcuts across repositories. Viewing dependencies assists you in modifying objects and composite objects without breaking dependencies.
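As a rough SQL analogy for the Union transformation above (table and column names are hypothetical):

    SELECT CUST_ID, CUST_NAME FROM CUSTOMERS_EAST
    UNION ALL
    SELECT CUST_ID, CUST_NAME FROM CUSTOMERS_WEST

Like UNION ALL, the Union transformation combines all rows from its input groups without removing duplicates.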