ETL Standards Document


  • 8/15/2019 ETL Standards Document

    1/38

      ETL Standards Document

    Arbitron, Inc. Page 1 12/12/2013

    PowerCenter

    Lifecycle/Standards/Environment

    Document

    Draft version 1.3


    Revision History

    Date               Version  Description                                Author
    December 21, 2004  1.0      Initial Outline for Review                 Bhuwan Joshi
    January 6, 2005    1.1      Altered Format and Details                 Jack Vorsteg
    January 11, 2005   1.2      Revised Metadata Extension Names;          Mike Ficca
                                Added Backup and Down Times
    March 27, 2007     1.3      Added System Test environment information  Patricia Khan


    Table of Contents

    1.  Introduction
        1.0  Informatica PowerCenter
        1.1  Scope
        1.2  Audience
        1.3  Terms

    2.  ETL Life Cycle

    3.  The Informatica PowerCenter Environments
        3.0  Development
            3.0.1  The DWDEVL1 Repository
        3.1  Test / UAT
            3.1.1  The DWTEST1 Repository
        3.2  Production
            3.2.1  The DWPROD1 Repository
        3.3  General Environment Settings
            3.3.1  UNIX Environment Variables
            3.3.2  Server Variables / Directories

    4.  PowerCenter Repositories
        4.0  Privileges
            4.0.1  Groups
            4.0.2  Users
            4.0.3  Folders
        4.1  The DWDEVL1 Repository
            4.1.1  Developer Folders
            4.1.2  MASTER_SRC_TRGT_OBJS Shared Folder
            4.1.3  Project Folders
        4.2  The DWTEST1 Repository
            4.2.1  Developer Folders
            4.2.2  MASTER_SRC_TRGT_OBJS Shared Folder
            4.2.3  Project Folders
        4.3  The DWPROD1 Repository
            4.3.1  Developer Folders
            4.3.2  MASTER_SRC_TRGT_OBJS Shared Folder
            4.3.3  Project Folders

    5.  PowerCenter Standards
        5.0  PowerCenter Repository Manager Standards
            5.0.1  Repository Server
            5.0.2  Repository Name
            5.0.3  Repository Schema Owner
            5.0.4  Repository Folder
            5.0.5  Repository Group Name
            5.0.6  Repository User Name


        5.1  PowerCenter Designer Standards
            5.1.1  Mapping Standards
            5.1.2  Mapplet Standards
            5.1.3  Source Object Standards
            5.1.4  Target Object Standards
            5.1.5  Source Qualifier Standards
            5.1.6  Advanced External Procedure Standards
            5.1.7  Aggregator Standards
            5.1.8  Expression Standards
            5.1.9  Filter Standards
            5.1.10  Joiner Transformation Standards
            5.1.11  Normalizer Transformation Standards
            5.1.12  Ranker Transformation Standards
            5.1.13  Router Transformation Standards
            5.1.14  Sequence Generator Transformation Standards
            5.1.15  Stored Procedure Transformation Standards
            5.1.16  Update Strategy Transformation Standards
            5.1.17  Lookup Transformation Standards
            5.1.18  Other Transformation Standards
            5.1.19  Miscellaneous Standards
        5.2  PowerCenter Workflow Manager Standards
            5.2.1  Informatica Power Center Server
            5.2.2  Workflow Standards
            5.2.3  Worklet Standards
            5.2.4  Session Standards
            5.2.5  Miscellaneous Workflow Standards
        5.3  Power Centers Web Services Standards

    Appendix A: Code Review
        Mapping Checks
        Data and Unit Testing
        Reviewer's Additional Comments

    Appendix B: Mapping Specification Templates
        Mapping Specification Template
        High Level Process Overview
        Processing Description (Detail)
        Stored Procedures
        External Procedures
        Mapplets
        Aggregators
        Ranks
        Router
        Joiner
        Comments
        [Note: Please copy and paste your comments from your Informatica Object here]
        Normalizer
        Expressions


    ETL Estimations
        Source to Staging, ODS or 3NF Model
        Staging to ODS or 3NF Model
        PSA to MDA or Mart


    1. Introduction

    1.0 Informatica PowerCenter

    Current Version: 7.1.1

    Description: Informatica's ETL software product, used to address the complete lifecycle of all data integration and delivery needs.

    1.1 Scope

    This document provides the standards and describes the development lifecycle that properly trained individuals must follow to develop high-performance ETL Mappings and Workflows. Deviations from this document must be approved on a case-by-case basis by both an ETL Administrator and the Enterprise Informatica Owner.

    1.2 Audience

    This document is intended for:

      PowerCenter Administrator

      PowerCenter Development Team

      Others with a good understanding of Informatica PowerCenter

    1.3 Terms

    PowerCenter: The Data Integration Platform software produced by Informatica. It has many pieces that are referred to within this document: Data Integration Engine, Repository Server, a Repository Database, and various Client Tools. The client tools are: Designer, Workflow Manager, Workflow Monitor, Repository Manager, and Repository Server Administration Console.

    Shareable: Refers to objects that are found in a shareable folder. These objects can be used in any folder within a repository by simply dragging and dropping into the target folder. Shareable objects can be sources, targets, mapplets, mappings, and transformations in the Designer, as well as commands, emails, sessions, and worklets in the Workflow Manager.

    Note: When deciding on using shared objects, keep in mind that if a change is made to the object, everything using that object is affected. For example, if Projects A and B are using a shared object, and Project A must change that object for its new development effort, then Project B must incorporate the change as well, whether it is ready for it or not. If Project B refuses to implement the change, then Project A might be late on its deliverable.
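    The effect described in the note can be sketched as a simple lookup. This is an illustration only and not part of the PowerCenter tooling: the usage table, the object names, and the function `affected_projects` are all hypothetical.

```python
# Hypothetical usage table: shared object name -> projects that use it.
SHARED_OBJECT_USAGE = {
    "shared_customer_source": {"Project A", "Project B"},
    "shared_date_target": {"Project A"},
}


def affected_projects(shared_object, changing_project):
    """Return, sorted, the other projects that must absorb a change to shared_object."""
    users = SHARED_OBJECT_USAGE.get(shared_object, set())
    return sorted(users - {changing_project})


# Project A changes an object that Project B also uses.
print(affected_projects("shared_customer_source", "Project A"))
```

    An empty result suggests the change stays within the project making it; any other result names the projects that would need to be consulted first.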


    Reusable: Refers to transformations that are found in a regular folder. These transformations are restricted to the folder in which they were created. Other folders do not have access to them in the same manner as the shareable transformations, unless they are copied. Transformations can be made reusable and will appear under the Transformations list in the folder. Mapplets, sources, targets, and mappings are automatically created as reusable.

    ETL Administrator or ETL Admin: Refers to the PowerCenter Repository user id that has all the rights and privileges to work within the Informatica environment. It does not refer to any specific individual or individuals on the Data Management team.

    ETL Developer: Refers to the PowerCenter Repository user id that has all the rights and privileges to work within its own Informatica folder. These users will also have execute permission within the folders of projects they are assigned to.

    Data Architect or DA: Refers to the Arbitron employee assigned to the project who created the source data model, the target data model, or both. This user is responsible for the source-to-target mapping documentation as well as the schema implementation.

    Database Administrator or DBA: Refers to the Arbitron employee assigned as the database administrator of the project.

    2. ETL Life Cycle

    The following diagram documents the lifecycle of Mappings and Workflows through the Arbitron ETL working environments. It is not the intention of this lifecycle to burden developers with additional documentation, but rather to generalize tasks and create channels of communication between the ETL Developers and Administrators. It is assumed that system documentation will be created as part of the overall project document, so there is no need to create individual documents for single mappings and workflows. When the Standards set forth in this document are followed, the Informatica Repository becomes self-documenting and any necessary information can easily be reported from the repository metadata.


    STEP 1: SPRINT TASK/ERWIN MODEL/WORK REQUEST/SCR

    A work assignment is delivered to an ETL Developer. The ETL Developer will provide rough estimates of development times based on the ETL WORK LEVEL ESTIMATES (see Appendix A).

     

    STEP 2: IMPLEMENTATION/MODEL REVIEW

    The ETL Developer will meet with the appropriate DA or other persons to review the Data model and/or SRC and discuss the implementation strategy. This discussion shall include the methods to be developed in the event of a required historical backfill of data.

    NOTE: At this time there is no formal document required. The ETL Developer should be prepared to discuss the solution.

     

    STEP 3: ETL IMPLEMENTATION REVIEW

    The Implementation Strategy developed during Step 2 shall be discussed with at least one ETL Administrator.

    NOTE: At this time there is no formal document required. The ETL Developer is expected to provide enough detail so that the ETL Administrator is aware of the overall development effort.

     

    STEP 4: DEVELOPMENT/UNIT TESTING

    The ETL Developer shall develop and Unit Test all Mappings and Workflows in their own folder in the DWDEVL1 repository. All Sources, Targets and Reusable Objects used in mappings shall be shortcuts to the MASTER_SRC_TRGT_OBJS folder. The ETL Developers are dependent upon the DA to have the Schema in place in the correct development environment. The ETL Administrator is responsible for importing all the Sources and Targets that the ETL Developer needs to create their mappings. It is the ETL Administrator's responsibility to perform an impact analysis in the event that a Source or Target definition is being updated.

     

    Life Cycle Strategy Flow Diagram

     


    STEP 5: STANDARDS AND METHODS REVIEW

    The ETL Developer shall present the developed Mappings and Workflows to two or more fellow ETL Developers (one of whom should be an ETL Admin). The Mappings and Workflows will be reviewed for methods and standards compliance PRIOR to the mappings being migrated to the project's centralized folder in the DWDEVL1 repository. Items presented at this meeting should include:

    1. Repository Report for each Mapping and Workflow

    2. SQL EXPLAIN PLANS for all SQL Overrides

    3. TBD by ETL ADMINS

    NOTE: The Repository Report has yet to be created and will be part of the Informatica POWERANALYZER APP.

    Development will return to STEP 4 in the event of any issue.

     

    STEP 6: DEVELOPER-to-PROJECT FOLDER MIGRATION / DEVELOPMENT SYSTEM INTEGRATION TESTING

    The ETL Administrator shall be responsible for the migration of the Developer's mappings and workflows from the Developer's personal folder to the project folder. The Developer will then be responsible for a complete system integration test of the mappings and workflows in the overall environment. The Developer shall be responsible for scheduling the workflows in their correct order and verifying their execution.

    STEP 8: DA/ETL DEVELOPER REVIEW

    The ETL Developer and the DA assigned to the project shall conduct a brief data analysis to review the data against the source-to-target mapping requirements. This review is not meant to identify errors in the mappings, but rather to help verify that the data is loaded as intended.

    STEP 9: SYSTEM TEST MIGRATION/SYSTEM TEST

    The ETL Developer shall notify the ETL Admin when mappings and/or workflows are ready to be migrated to the DWSTEN1 repository. It shall be the responsibility of the DA to be certain that the Source and Target definitions are in place prior to the scheduling of the Mappings. The TESTER assigned to the project will be responsible for running the workflows and following all standards outlined by ARBITRON'S BEST PRACTICES for TESTING. It is the responsibility of the ETL Developer, DA, TESTER and User Community to decide on the best methods for System Testing.

    Issue: Go to Step 4
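    The flow through the steps above can be sketched as a small transition function. This is a reading of the text, not a reproduction of the original diagram: it assumes that every review or test step after development (Steps 5, 6, 8 and 9) returns to Step 4 when an issue is found, although the text states this explicitly only for Step 5. The names `LIFE_CYCLE_STEPS` and `next_step` are hypothetical.

```python
# Steps named in this section. The text contains no STEP 7.
LIFE_CYCLE_STEPS = [1, 2, 3, 4, 5, 6, 8, 9]

# Assumption: these steps fall back to STEP 4 (development) on any issue.
STEPS_THAT_RETURN_TO_DEVELOPMENT = {5, 6, 8, 9}


def next_step(current, issue_found=False):
    """Return the next step number, or None once STEP 9 completes without issues."""
    if current not in LIFE_CYCLE_STEPS:
        raise ValueError(f"unknown step: {current}")
    if issue_found and current in STEPS_THAT_RETURN_TO_DEVELOPMENT:
        return 4
    position = LIFE_CYCLE_STEPS.index(current)
    if position + 1 == len(LIFE_CYCLE_STEPS):
        return None
    return LIFE_CYCLE_STEPS[position + 1]


# An issue found during the DA/ETL Developer review sends work back to STEP 4.
print(next_step(8, issue_found=True))
```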

     


    3. The Informatica PowerCenter Environments

    3.0 Development

    UNIX Box: DIUETL01 Solaris 5.8 SPARC

    Repository Server Name: DWDEVL1
    Repository Server Port: 5004
    Repository Server Home: /app/informatica/powercenter/repositoryserver

    Server Name: diuetl01_DEVL1
    Server Port: 4004
    Server Home: /app/informatica/powercenter/server
    Server Scripts: /app/informatica/powercenter/server/scripts

    Web Services Server Name: diuetl01_ws_devl
    Web Services Server Port: 5555
    Web Services Server Home: /app/informatica/powercenter/webserviceshub

    3.0.1 The DWDEVL1 Repository

    Database: Oracle 9i
    Schema Owner: PM_REPO_DEVL1
    Instance: DWDEVL1
    Backup: Monday – Friday, 6AM and 12 noon
    Scheduled Downtime: Sunday 6PM – Sunday 8:30PM

    The dwdevl1 repository is where Development and Unit Testing will take place. Each developer will have their own folder for development, maintenance, and Unit Testing efforts. The developers will be responsible for preparing the workflows to execute in the dwdevl1 repository. This includes all components such as scripts, procedures, etc. When testing has been completed by the developer, and after a review, the Administrator will move the mappings and workflows to the project folder(s) in the dwdevl1 repository. Once the mappings and workflows are migrated to the project folder(s), the developer will be responsible for performing a complete system integration test in the overall development environment. Upon request, general developers may be granted select permission on Views to query the repository.

    3.1 System Test

    UNIX Box: DIUETL01 Solaris 5.8 SPARC

    Repository Server Name: DWSTEN1
    Repository Server Port: 5006
    Repository Server Home: /app/sten/informatica/powercenter/repositoryserver

    Server Name: diuetl01_STEN1
    Server Port: 4005
    Server Home: /app/sten/informatica/powercenter/server
    Server Scripts: /app/sten/informatica/powercenter/server/scripts

    Web Services Server Name: diuetl01_ws_sten
    Web Services Server Port: 5556
    Web Services Server Home: /app/sten/informatica/powercenter/webserviceshub

    3.1.1 The DWSTEN1 Repository

    Database: Oracle 9i
    Schema Owner: PM_REPO_STEN1
    Instance: DWSTEN1
    Backup: Monday – Friday, 6AM and 12 noon
    Scheduled Downtime: Sunday 6PM – Sunday 10PM

    The dwsten1 repository is where System Testing will take place. At this point all workflow problems should have been identified during the Unit Test phase. If a problem is discovered, developers will correct the problem in the dwdevl1 repository in their own folders. Once the complete Unit Testing cycle has been repeated, the Administrator will re-migrate the corrected object(s) into the dwsten1 repository. Upon request, general developers, testers and users may be granted select permission on Views to query the repository.

    3.2 User Acceptance Test

    UNIX Box: ARBETL1 Solaris 5.8 SPARC

    Repository Server Name: DWTEST1
    Repository Server Port: 5004
    Repository Server Home: /app/informatica/powercenter/repositoryserver

    Server Name: arbetl1_TEST1
    Server Port: 4004
    Server Home: /app/informatica/powercenter/server
    Server Scripts: /app/informatica/powercenter/server/scripts

    Web Services Server Name: arbetl1_ws_test
    Web Services Server Port: 5555
    Web Services Server Home: /app/informatica/powercenter/webserviceshub

    3.2.1 The DWTEST1 Repository

    Database: Oracle 9i
    Schema Owner: PM_REPO_TEST1
    Instance: DWTEST1
    Backup: Monday – Friday, 6AM and 12 noon
    Scheduled Downtime: Sunday 6PM – Monday Midnight

    The dwtest1 repository is where User Acceptance Testing will take place. At this point all workflow problems should have been identified during the System Test phase. If a problem is discovered, developers will correct the problem in the dwdevl1 repository in their own folders. Once the complete Unit Testing cycle has been repeated, the Administrator will re-migrate the corrected object(s) into the dwsten1 repository for System Testing. After System Testing is completed successfully, the Administrator will re-migrate the corrected object(s) into the dwtest1 repository. Upon request, general developers, testers and users may be granted select permission on Views to query the repository.
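    The re-migration rule described for the test repositories (fix in development, then promote again through each earlier environment) can be sketched as follows. The promotion order is taken from the sections of this chapter; applying the same rule to a problem found in DWPROD1 is an assumption, since the text describes it only for DWSTEN1 and DWTEST1. The name `remigration_path` is hypothetical.

```python
# Promotion order of the repositories described in this chapter.
PROMOTION_ORDER = ["DWDEVL1", "DWSTEN1", "DWTEST1", "DWPROD1"]


def remigration_path(found_in):
    """List the repositories a fix passes through, from development up to found_in."""
    name = found_in.upper()
    if name not in PROMOTION_ORDER:
        raise ValueError(f"unknown repository: {found_in}")
    return PROMOTION_ORDER[: PROMOTION_ORDER.index(name) + 1]


# A problem found during User Acceptance Testing is fixed in DWDEVL1,
# system tested again in DWSTEN1, and only then returned to DWTEST1.
print(remigration_path("dwtest1"))
```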

    3.3 Production

    UNIX Box: PIUETL01 Solaris 5.8 SPARC

    Repository Server Name: DWPROD1
    Repository Server Port: 5004
    Repository Server Home: /app/informatica/powercenter/repositoryserver

    Server Name: piuetl01_PROD1
    Server Port: 4004
    Server Home: /app/informatica/powercenter/server
    Server Scripts: /app/informatica/powercenter/server/scripts

    Web Services Server Name: piuetl01_ws_prod
    Web Services Server Port: 5555
    Web Services Server Home: /app/informatica/powercenter/webserviceshub

    3.3.1 The DWPROD1 Repository

    Database: Oracle 9i
    Schema Owner: PM_REPO_PROD1
    Instance: DWPROD1
    Backup: Monday – Saturday, 11:59PM
    Scheduled Downtime: Sunday 3AM – Sunday 3:05AM

    The dwprod1 repository is where the Production run takes place. Since all development and testing will be done in the development and test environments, no user other than the Administrator, who has full access, will have write permission in this repository. A user id will be created that has only enough permissions and privileges to execute the workflows via scripts, and its password will be stored encrypted. The Developers will have select permission to this environment, and will be responsible for confirming all scheduled workflows.
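    The scheduled downtime windows listed for each repository in this chapter could be checked before a workflow is scheduled. The sketch below is only an illustration. It assumes that "Monday Midnight" for DWTEST1 means the end of Sunday, and the names `SUNDAY_DOWNTIME` and `in_scheduled_downtime` are hypothetical.

```python
from datetime import datetime, time

# Scheduled downtime per repository, all on Sunday, as (start, end) times.
# Assumption: "Monday Midnight" for DWTEST1 means the end of Sunday, so the
# end is stored as None and treated as "until the day is over".
SUNDAY_DOWNTIME = {
    "DWDEVL1": (time(18, 0), time(20, 30)),
    "DWSTEN1": (time(18, 0), time(22, 0)),
    "DWTEST1": (time(18, 0), None),
    "DWPROD1": (time(3, 0), time(3, 5)),
}


def in_scheduled_downtime(repository, moment):
    """Return True if moment falls inside the repository's Sunday downtime."""
    start, end = SUNDAY_DOWNTIME[repository]
    if moment.weekday() != 6:  # datetime.weekday(): Monday is 0, Sunday is 6
        return False
    current = moment.time()
    if end is None:
        return current >= start
    return start <= current < end


# March 25, 2007 was a Sunday.
print(in_scheduled_downtime("DWDEVL1", datetime(2007, 3, 25, 19, 0)))
```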

    3.4 General Environment Settings

    The following information is consistent across all environments. Substitute the values above as necessary.


    3.4.1 UNIX Environment Variables

    repo1_name=DWTEST1 - The Current Environment Repository
    repo_host=arbetl1 - The Current UNIX Repository Host
    repo_port=5004 - The Current Repository Port
    runpm_password=ND87NKGOFKM:N - The runpm user password
    runpm_user=runpm - The general pmcmd user for scripts
    server_host=arbetl1 - The Current UNIX Server Host
    server_port=4004 - The Current Unix Server Port
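    Since the variables above differ between environments only in the values given in sections 3.0 to 3.3, the substitution can be sketched as a lookup table. This is an illustration only. The environment labels and the function name `unix_environment` are hypothetical, the lower-case host names are assumed from the `repo_host=arbetl1` example, and the password variable is deliberately left out.

```python
# Values copied from sections 3.0 to 3.3. The dictionary keys ("devl", "sten",
# "test", "prod") are labels chosen for this sketch only.
ENVIRONMENTS = {
    "devl": {"host": "diuetl01", "repo": "DWDEVL1", "repo_port": 5004, "server_port": 4004},
    "sten": {"host": "diuetl01", "repo": "DWSTEN1", "repo_port": 5006, "server_port": 4005},
    "test": {"host": "arbetl1", "repo": "DWTEST1", "repo_port": 5004, "server_port": 4004},
    "prod": {"host": "piuetl01", "repo": "DWPROD1", "repo_port": 5004, "server_port": 4004},
}


def unix_environment(env_label):
    """Build the non-secret variables of section 3.4.1 for one environment."""
    env = ENVIRONMENTS[env_label]
    return {
        "repo1_name": env["repo"],
        "repo_host": env["host"],
        "repo_port": str(env["repo_port"]),
        "runpm_user": "runpm",
        "server_host": env["host"],
        "server_port": str(env["server_port"]),
    }


# Print the assignments for the System Test environment.
for name, value in unix_environment("sten").items():
    print(f"{name}={value}")
```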

    3.4.2 Server Variables / Directories

    $PMWorkflowLogDir         /data/informatica/powercenter/WorkflowLogs
    $PMWorkflowLogCount       0
    $PMLookupFileDir          /data/informatica/powercenter/LkpFiles
    $PMRootDir                /app/informatica/powercenter/server
    $PMSessionLogDir          /data/informatica/powercenter/SessLogs
    $PMBadFileDir             /data/informatica/powercenter/BadFiles
    $PMCacheDir               /data/informatica/powercenter/Cache
    $PMTargetFileDir          /data/informatica/powercenter/TgtFiles
    $PMSourceFileDir          /data/informatica/powercenter/SrcFiles
    $PMExtProcDir             $PMRootDir/ExtProc
    $PMTempDir                $PMRootDir/Temp
    $PMSuccessEmailUser       [email protected]
    $PMFailureEmailUser       [email protected]
    $PMSessionLogCount        2
    $PMSessionErrorThreshold  1

    Where: $PMRootDir = /app/informatica/powercenter/server

    Because development and System Test share the same UNIX server, the System Testserver variables and directories are as follows:

    $PMWorkflowLogDir         /data/sten/informatica/powercenter/WorkflowLogs
    $PMWorkflowLogCount       0
    $PMLookupFileDir          /data/sten/informatica/powercenter/LkpFiles
    $PMRootDir                /app/sten/informatica/powercenter/server
    $PMSessionLogDir          /data/sten/informatica/powercenter/SessLogs
    $PMBadFileDir             /data/sten/informatica/powercenter/BadFiles
    $PMCacheDir               /data/sten/informatica/powercenter/Cache
    $PMTargetFileDir          /data/sten/informatica/powercenter/TgtFiles
    $PMSourceFileDir          /data/sten/informatica/powercenter/SrcFiles
    $PMExtProcDir             $PMRootDir/ExtProc
    $PMTempDir                $PMRootDir/Temp
    $PMSuccessEmailUser       [email protected]
    $PMFailureEmailUser       [email protected]
    $PMSessionLogCount        2
    $PMSessionErrorThreshold  1
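    Some of the values above are written relative to $PMRootDir. How such a reference resolves to a full path can be sketched as a plain string substitution. This is an illustration of the listed values only, not a description of how the PowerCenter Server itself expands variables, and the function name `resolve_server_variables` is hypothetical.

```python
def resolve_server_variables(variables):
    """Expand $PMRootDir references inside the other server variable values."""
    root = variables["$PMRootDir"]
    return {name: value.replace("$PMRootDir", root) for name, value in variables.items()}


# A subset of the System Test values listed above.
system_test = {
    "$PMRootDir": "/app/sten/informatica/powercenter/server",
    "$PMExtProcDir": "$PMRootDir/ExtProc",
    "$PMTempDir": "$PMRootDir/Temp",
    "$PMCacheDir": "/data/sten/informatica/powercenter/Cache",
}

resolved = resolve_server_variables(system_test)
print(resolved["$PMExtProcDir"])
```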


    Administer Repository   Manage Users, Groups, and Privileges. Upgrade, Restore, and
                            Backup Repository. Manipulate Folders where appropriate
                            permissions apply.
    Administer Server       Start / Stop the Informatica Server Engine
    Super User              The sky is the limit.

    Table 1 Background of Privileges

    4.0.1 Groups

    Groups are created and maintained in the Repository Manager in the Security menu item. The Administrator or members of the Administrators group can create or maintain groups as they have the Administer Repository privilege.

    It is advisable that each repository contain only the users and groups that it needs. If all the repositories contain the same users and groups, a security breach could ensue.

    4.0.2 Users

    Users are created and maintained in the Repository Manager in the Security menu item. The Administrator or members of the Administrators group can create or maintain groups and users as they have the Administer Repository privilege.

    4.0.3 Folders

    Folders are created and maintained in the Repository Manager in the Folder menu option. The Administrator or members of the Administrators group can create or maintain folders as they have the Administer Repository privilege.

    Since a Folder can have only one Owner and one Group assigned to it, these must be assigned carefully to ensure proper security. Please also note that there are no Sub-Folders.

    4.1 The DWDEVL1 Repository

    1.  All Developers will be assigned to Project Groups
    2.  All Testers will only be assigned to the PUBLIC Group

    4.1.1 Developer Folders

    Each developer will have his/her own folder with FULL read/write and execute permission. All development and Unit Testing should happen within this folder. The developer shall be assigned to the DEVELOPER group, and this group will have read-only permission to the folder. The repository public users will also have read permission to individual developers' folders. The following image is an example of a general developer's folder permissions.
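    The developer folder permissions just described can be sketched as a small data structure. This is a simplified model for illustration, not the repository's internal representation: the class `FolderPermissions`, the permission letters, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FolderPermissions:
    """Permission letters ("r", "w", "x") for the three classes named in 4.1.1."""
    owner: str
    owner_group: str
    repository: str  # all other repository (public) users


def developer_folder():
    """Permissions of a personal developer folder as described in 4.1.1."""
    return FolderPermissions(owner="rwx", owner_group="r", repository="r")


def is_allowed(permissions, user_class, action):
    """Check whether user_class ("owner", "owner_group", "repository") may do action."""
    return action in getattr(permissions, user_class)


folder = developer_folder()
print(is_allowed(folder, "owner", "w"))        # the owning developer may write
print(is_allowed(folder, "owner_group", "w"))  # the DEVELOPER group may only read
```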

    4.1.2 MASTER_SRC_TRGT_OBJS Shared Folder

    The MASTER_SRC_TRGT_OBJS folder is a centralized shared folder owned by the user REPO_ADMIN that contains all the Sources, Targets and Reusable Transformations used by every Project and/or Developer folder. Every Source and Target in other folders MUST be a shortcut to this folder. This will aid the repository administrators when determining the impact of changes as well as during migrations. Reusable transformations will be considered on a case-by-case basis, but it is not recommended at this time to reuse these transformations across folders. Administrators and Senior Developers assigned to the group MASTER_SRC_TRGT_OBJS will have permission to write to this folder. In the event of an update, it will be their responsibility to take care of the other projects affected. (Notice that Allow Shortcut is selected).
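    The rule that every Source and Target must be a shortcut to MASTER_SRC_TRGT_OBJS lends itself to an automated check during review. The sketch below assumes the objects of a mapping have already been exported into simple dictionaries. That record layout, its keys, and the function name `non_compliant_objects` are hypothetical and do not correspond to any PowerCenter export format.

```python
SHARED_FOLDER = "MASTER_SRC_TRGT_OBJS"


def non_compliant_objects(mapping_objects):
    """Return names of sources/targets that are not shortcuts to the shared folder.

    mapping_objects is a list of dicts with hypothetical keys:
    "name", "kind" ("source", "target", or other), "is_shortcut", "shortcut_folder".
    """
    violations = []
    for obj in mapping_objects:
        if obj["kind"] not in ("source", "target"):
            continue  # transformations are reviewed case by case, not checked here
        if not obj.get("is_shortcut") or obj.get("shortcut_folder") != SHARED_FOLDER:
            violations.append(obj["name"])
    return violations


# Hypothetical objects of one mapping.
example = [
    {"name": "SRC_ORDERS", "kind": "source", "is_shortcut": True,
     "shortcut_folder": "MASTER_SRC_TRGT_OBJS"},
    {"name": "TGT_ORDERS", "kind": "target", "is_shortcut": False},
    {"name": "EXP_CLEANUP", "kind": "expression", "is_shortcut": False},
]
print(non_compliant_objects(example))
```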


    4.1.3 Project Folders

    The individual project folders are owned by the REPO_ADMIN user and contain all mappings and workflows associated with a project. A group will be created that contains all the individual developers associated with the project. All sources and targets defined in these folders must be shortcuts to the centralized shared folder MASTER_SRC_TRGT_OBJS. No transformations may be shared from these project folders to other folders for any reason. Administrators will have permission to write to this folder after the mappings and workflows have been Unit Tested and reviewed for standards compliance. Once mappings and workflows are migrated to the Project Folders, all Developers assigned to the folder's group will have both read and execute permission in order to perform system integration testing.


    4.2 The DWSTEN1 Repository

    1.  The Developers will only be assigned to the Public Group
    2.  Testers will be assigned to Project Groups

    4.2.1 Developer Folders

    Developer folders will not exist in the DWSTEN1 repository. Developers will have read-only permission in this repository.

    4.2.2 MASTER_SRC_TRGT_OBJS Shared Folder

    The MASTER_SRC_TRGT_OBJS folder is owned by the user REPO_ADMIN in the DWSTEN1 repository. Only ETL Administrators will have permission to write to this folder. All other Users, Developers and Testers will have read-only permissions.


    4.2.3 Project Folders

    All Project Folders are owned by the user REPO_ADMIN in the DWSTEN1 repository. Only ETL Administrators will have permission to write to these folders. Testers assigned to the Project Folder will be granted Execute permission to perform System Testing.

    4.3 The DWTEST1 Repository

    1.  The Developers will only be assigned to the Public Group
    2.  Testers will be assigned to Project Groups

    4.3.1 Developer Folders

    Developer folders will not exist in the DWTEST1 repository. Developers will have read-only permission in this repository.

    4.3.2 MASTER_SRC_TRGT_OBJS Shared Folder

    The MASTER_SRC_TRGT_OBJS folder is owned by the user REPO_ADMIN in the DWTEST1 repository. Only ETL Administrators will have permission to write to this folder. All other Users, Developers and Testers will have read-only permissions.

    4.3.3 Project Folders

    All Project Folders are owned by the user REPO_ADMIN in the DWTEST1 repository. Only ETL Administrators will have permission to write to these folders. Testers assigned to the Project Folder will be granted Execute permission to assist in UAT testing. There is no plan in place to allow users to directly execute any PowerCenter workflow from the Informatica Client applications.

    4.4 The DWPROD1 Repository

    1. The Developers will only be assigned to the Public Group.
    2. Testers will only be assigned to the Public Group.

    4.4.1 Developer Folders

    Developer folders will not exist in the DWPROD1 repository. Developers will have read-only permission in this repository.

    4.4.2 MASTER_SRC_TRGT_OBJS Shared Folder

    The MASTER_SRC_TRGT_OBJS folder is owned by the user REPO_ADMIN in the DWPROD1 repository. Only ETL Administrators will have permission to write to this folder. All other Users, Developers and Testers will have read-only permissions.

    4.4.3 Project Folders

    All Project Folders are owned by the user REPO_ADMIN in the DWPROD1 repository. Only ETL Administrators will have permission to write, schedule and execute within these folders on a normal basis. Execute permission will be granted on a case-by-case basis when deemed necessary. The general runpm user will be used for normal script and web services execution as necessary.
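
    The Project Folder permissions described in sections 4.1.3, 4.2.3, 4.3.3 and 4.4.3 can be summarized as a lookup table. The following is a minimal Python sketch for illustration only; it is not part of PowerCenter. The role names (etl_admin, developer, tester), the r/w/x shorthand and the can helper are hypothetical, and only permissions stated explicitly in the text are recorded, so a missing entry means "not stated" rather than "denied by policy".

```python
# Project Folder permissions per repository, transcribed from section 4.
# "r", "w", "x" stand for read, write and execute (illustrative shorthand).
# Role names are assumptions made for this sketch, not repository group names.
PROJECT_FOLDER_PERMISSIONS = {
    "DWDEVL1": {"etl_admin": {"w"}, "developer": {"r", "x"}},
    "DWSTEN1": {"etl_admin": {"w"}, "developer": {"r"}, "tester": {"x"}},
    "DWTEST1": {"etl_admin": {"w"}, "developer": {"r"}, "tester": {"x"}},
    "DWPROD1": {"etl_admin": {"w", "x"}, "developer": {"r"}},
}


def can(repository: str, role: str, permission: str) -> bool:
    """Return True if the table above records `permission` for `role`."""
    roles = PROJECT_FOLDER_PERMISSIONS.get(repository, {})
    return permission in roles.get(role, set())
```

    For example, can("DWSTEN1", "tester", "x") is true because testers assigned to a Project Folder are granted Execute permission for System Testing, while can("DWPROD1", "developer", "w") is false.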

    5. PowerCenter Standards

    5.0 PowerCenter Repository Manager Standards

    When the ETL Administrator is creating new objects associated with a repository, the following standards and conventions should be adhered to.

    5.0.1 Repository Server

      The name is the same as the UNIX host name.

      This name is determined by System Services.

      Ex. DIUETL01

    5.0.2 Repository Name

      The Repository name is the same as the Oracle Instance in which it resides.

      Ex. DWDEVL1

    5.0.3 Repository Schema Owner

      PM_REPO_(ENVIRONMENT)#

      The name is based upon the environment as well as the number of repositories for that environment.

      Ex. PM_REPO_DEVL1

    5.0.4 Repository Folder

      Any abbreviated Arbitron system name or a unique value agreed upon by the ETL Administrator.

      Project Folder names shall be in upper case.

      Developer Folder names shall be in lower case.

      Ex. EDD, EDW, ODS

    5.0.5 Repository Group Name

      Each Repository Folder will have an associated Group with the exact same name.


    5.0.6 Repository User Name

      Each User name will consist of the user's first initial and complete last name.

      Ex. jvorsteg

      ETL Administrators may create system users that do not follow the developer naming convention.
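
    The Repository Manager naming conventions in sections 5.0.3 through 5.0.6 lend themselves to simple automated checks. The sketch below is one possible way to express them in Python; the function names are hypothetical, and the regular expression assumes that the environment part of the schema owner consists of upper-case letters followed by the repository number, as in the example PM_REPO_DEVL1.

```python
import re

# PM_REPO_(ENVIRONMENT)#, e.g. PM_REPO_DEVL1.
# Assumption: the environment is upper-case letters, followed by digits.
SCHEMA_OWNER = re.compile(r"PM_REPO_[A-Z]+\d+")


def is_schema_owner(name: str) -> bool:
    """Check a name against the PM_REPO_(ENVIRONMENT)# pattern."""
    return SCHEMA_OWNER.fullmatch(name) is not None


def is_project_folder(name: str) -> bool:
    """Project Folder names shall be in upper case."""
    return name.isupper()


def is_developer_folder(name: str) -> bool:
    """Developer Folder names shall be in lower case."""
    return name.islower()


def repo_user_name(first_name: str, last_name: str) -> str:
    """Build a user name from the first initial and the complete last name."""
    return (first_name[0] + last_name).lower()
```

    For instance, repo_user_name("Jack", "Vorsteg") yields jvorsteg, which matches the example given for section 5.0.6.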

    5.1 PowerCenter Designer Standards

    When the ETL Developer is creating new mappings or editing a mapping, the following standards and conventions will be adhered to.

    5.1.1 Mapping Standards

      Ex. M_GDR001_LOAD_ODS_SRVY_FROM_GDR_PROD

      All Upper Case

      The name should start with M_, followed by the abbreviated source database name, followed by a numeric representation (3 digits, e.g. 001, 002).

      The source database representation should be followed by the operation, e.g. Insert/Update/Delete/Truncate or Load (generic).

      The operation should be followed by the target table name. In case of multiple targets, use the major target name.

      The target table name may be followed by an optional description of the mapping.

      All the above components of the mapping name should be separated by “_”.

      The mapping name will not exceed 80 characters.

      Mappings will cover only one data flow. Mappings can have multiple sources and/or multiple targets, but they cannot have two entirely separate data streams in the same mapping.

      All mappings will have comments starting with the date and initials of the person writing the comments. These comments should be added in the mapping itself. Ex. 01/01/2002 JBV COMMENT...

      If a mapping is modified, comments with the names of the transformations being modified must be appended to the original comments. The actual transformations modified shall contain the date, initials and a brief description in the transformation's AUDIT metadata extension.

    5.1.2 Mapplet Standards

      The Mapplet will reside either in the respective project folder as reusable or in the Common Transformations folder.

      Mapplets will follow the same standards as mappings where applicable.

      Mapplet names will begin with MPL_.


      Mapplets will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.
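
    The mapping naming rules in section 5.1.1 can be checked mechanically. The following Python sketch parses a mapping name into its components; it is an illustration under stated assumptions, not an official validator. It assumes that the source abbreviation consists of letters only and is immediately followed by the 3-digit number (as in the example M_GDR001_LOAD_ODS_SRVY_FROM_GDR_PROD), and it does not try to separate the target table name from the optional description, since both may contain underscores. The names MAPPING_NAME and parse_mapping_name are hypothetical.

```python
import re

# Operations listed in section 5.1.1.
OPERATIONS = ("INSERT", "UPDATE", "DELETE", "TRUNCATE", "LOAD")

# M_<SOURCE><NNN>_<OPERATION>_<TARGET and optional description>
MAPPING_NAME = re.compile(
    r"M_(?P<source>[A-Z]+)(?P<sequence>\d{3})_"
    r"(?P<operation>" + "|".join(OPERATIONS) + r")_"
    r"(?P<target_and_description>[A-Z0-9_]+)"
)


def parse_mapping_name(name: str):
    """Return the name components as a dict, or None if the name is invalid."""
    if len(name) > 80:  # mapping names will not exceed 80 characters
        return None
    match = MAPPING_NAME.fullmatch(name)
    return match.groupdict() if match else None
```

    A mapplet check could follow the same pattern with the MPL_ prefix in place of M_.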

    5.1.3 Source Object Standards

      Database Source names will remain the same as the value imported from the database.

      Flat file Source names will have the same name as the flat file Source.

      Add three or four characters for the abbreviated source system name if needed for clarification.

      When creating a mapping that uses new Sources, you will first need to have the ETL ADMIN import these into the MASTER_SRC_TRGT_OBJS Shared Folder. You then create a Shortcut to that Object in your own development folder in order to use it in the Mapping.

      Make sure to first create a shortcut for the sources in your Source Analyzer work space before they are used in your mapping. This will allow you to edit the Name. You shall remove the 'Shortcut To_' from the name and save.

      Source Transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      The $Source server variable should be used (wherever applicable) for transformations that connect to the database.

    5.1.4 Target Object Standards

      Database Target names will remain the same as the value imported from the database.

      Flat file Target names will have the same name as the flat file Target.

      Add three or four characters for the abbreviated source system name if needed for clarification.

      When creating a mapping that uses new Targets, you will first need to have the ETL ADMIN import these into the MASTER_SRC_TRGT_OBJS Shared Folder. You then create a Shortcut to that Object in your own development folder in order to use it in the Mapping.

      Make sure to first create a shortcut for the targets in your Warehouse Designer work space before they are used in your mapping. This will allow you to edit the Name. You shall remove the 'Shortcut To_' from the name and save.

      Target Transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      Target Names should be unique in the mapping. When multiple instances of the same target table exist, they will be distinguished by operation.

      The $Target server variable should be used (wherever applicable) for transformations that connect to the database.

      If the Target Update Override is used, then the target name must begin with OVR_.

      Target Transformations will have a UPD_OVR metadata extension with a default value of 'N'. If an Update override is created, then the update SQL shall be copied to the metadata extension, along with all future changes.

    5.1.5 Source Qualifier Standards

      Source Qualifier names will begin with SQ_ followed by a unique description.

      Source Qualifier names will begin with SQ_OVR_ if the SQL is overridden.

      Import all the tables that are used in the source qualifier. Even if no column has been selected from a table, still import the table and drag any column to the SQ. This will avoid hiding any table in a query override.

      Source Qualifiers will have a SQL_OVR metadata extension with a default value of 'N'. If a SQL override is created, then the SQL shall be copied to the metadata extension, along with all future changes.

    5.1.6 Advanced External Procedure Standards

      Advanced External Procedure names will begin with AEP_ followed by a unique description.

      Advanced External Procedure Transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.7 Aggregator Standards

      Aggregator names will begin with AGG_ followed by a unique description.

      Names of Aggregators expecting sorted input data will begin with AGG_SI_.

      Aggregators will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.8 Expression Standards

      Expression names will begin with EXP_ followed by a unique description.

      Expression transformations will have an UNCONNECTED_LKP metadata extension to hold the name of the Lookup. The extension will have a default value of 'N'.

      Expression transformations will have an UNCONNECTED_SP metadata extension to hold the name of the Stored Procedure. The extension will have a default value of 'N'.

      Expression transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.9 Filter Standards

      Filter names will begin with FIL_ followed by a unique description.

      Filter transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.10 Joiner Transformation Standards

      Joiner names will begin with JNR_ followed by the name of the type of join, then the Master Table, then 2 underscores “__”, and then the detail table.

      Ex. JNR_OUT_MDA_MARKET_DIM__MDA_SOSO_FACT

      If the Joiner expects filtered inputs, then the name shall begin with JNR_SI_.

      Joiner transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      Join Type abbreviations are IN, OUT.

    5.1.11 Normalizer Transformation Standards

      Normalizer names will begin with NMR_ followed by a unique description.

      Normalizer transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.12 Ranker Transformation Standards

      Rank names will begin with RNK_ followed by a unique description.

      Rank transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

    5.1.13 Router Transformation Standards

      Router names will begin with RTR_ followed by a unique description.

      Router transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      Router groups must have a valid condition name.

    5.1.14 Sequence Generator Transformation Standards

      Sequence Generator names will begin with SG_ followed by a unique description that preferably matches the table the sequence is used for.

      Sequence Generator transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      Sequence Generators will reset to 0 upon session initialization to avoid sequence contention between environments.

    5.1.15 Stored Procedure Transformation Standards

      Stored Procedure names will begin with SP_ followed by the Schema (without the environment) and the actual Procedure.

      Ex. SP_GDR_PPMRE1001P

      Stored Procedure transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description. Changes captured here must include a reference to the procedure in the event that only the stored procedure has changed.

      Stored Procedure transformations will have an RDBMS_TYPE metadata extension with a default value of Oracle.

    5.1.16 Update Strategy Transformation Standards

      Update Strategy names will begin with UPD_ followed by the operation and finally the target name.

      Ex. UPD_INSERT_GDR_MKT_AREA

      Operations shall include INSERT, UPDATE, DELETE and REJECT.

      The UPDATE strategy data driven command shall be based on the PowerCenter variables dd_insert, dd_update, dd_delete and dd_reject. The data driven command shall not be performed based upon the numeric value.

      Update strategy will not be used for mappings only performing inserts.
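
    The reason for requiring the named constants rather than their numeric values is readability: an expression that returns 1 gives no hint that the row is flagged for update. The Python sketch below mirrors the idea with an enumeration. The numeric values shown (0 to 3) are the values PowerCenter associates with DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT; the decision rule in choose_strategy and the helper update_strategy_name are hypothetical examples, not rules taken from this document.

```python
from enum import IntEnum


class UpdateStrategy(IntEnum):
    """Named equivalents of the data driven constants."""
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3


def choose_strategy(row_exists: bool, is_valid: bool) -> UpdateStrategy:
    """Hypothetical rule: reject invalid rows, update known rows, insert new ones."""
    if not is_valid:
        return UpdateStrategy.DD_REJECT
    return UpdateStrategy.DD_UPDATE if row_exists else UpdateStrategy.DD_INSERT


def update_strategy_name(operation: UpdateStrategy, target: str) -> str:
    """Build UPD_<OPERATION>_<TARGET>, e.g. UPD_INSERT_GDR_MKT_AREA."""
    return f"UPD_{operation.name[len('DD_'):]}_{target}"
```

    Comparing against UpdateStrategy.DD_UPDATE instead of the literal 1 keeps the intent visible, which is the same motivation behind the standard above.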

    5.1.17 Lookup Transformation Standards

      Lookup Transformation names will begin with LKP_ followed by the table name.

      Lookup Transformations that contain a SQL Override shall be named starting with LKP_OVR_ followed by the Primary Lookup table.

      Names of unconnected Lookup Transformations will begin with UN_LKP_.

      Lookup transformations will only have the ports to and from the transformation necessary to complete the lookup.

      Default Lookup policy is “Report Error”.

      Do not make dynamic Lookups reusable unless explicitly directed to by the ETL Administrator. Dynamic lookup names will begin with DLKP_.

      Lookup Transformations will have a SQL_OVR metadata extension with a default value of 'N'. If a SQL override is created, then the SQL shall be copied to the metadata extension, along with all future changes.

      Lookup Transformations will have a SQL_OVR_TABLES metadata extension to store the tables used in the current SQL Override statement. The tables shall be listed as a continuous string with the different tables delimited by double pipes “||”. This will have a default value of 'N'.

      Lookup Transformations will have an AUDIT metadata extension. Audits will include the Developer's Name, Date and change description.

      Lookup Transformations will have a CONNECTED_LKP metadata extension with a default value of 'Y'.

      Unconnected lookups shall be used with caution. As a rule of thumb, if more than 10% of rows use the lookup, the lookup should be connected.

    5.1.18 Other Transformation Standards

      As other transformations are introduced by Informatica and used by developers, standards will be determined during Step 3 of the PowerCenter lifecycle and added to the document as necessary.
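
    The transformation prefixes defined in sections 5.1.5 through 5.1.17 can be collected into a single table, which also gives a natural place to register prefixes for transformations added later. The following Python sketch is illustrative only; the dictionary keys and the function name has_standard_prefix are hypothetical, and variants such as SQ_OVR_, AGG_SI_, JNR_SI_ and LKP_OVR_ are covered implicitly because they start with the base prefix.

```python
# Base prefixes per transformation type, as listed in section 5.1.
TRANSFORMATION_PREFIXES = {
    "Source Qualifier": ("SQ_",),
    "Advanced External Procedure": ("AEP_",),
    "Aggregator": ("AGG_",),
    "Expression": ("EXP_",),
    "Filter": ("FIL_",),
    "Joiner": ("JNR_",),
    "Normalizer": ("NMR_",),
    "Rank": ("RNK_",),
    "Router": ("RTR_",),
    "Sequence Generator": ("SG_",),
    "Stored Procedure": ("SP_",),
    "Update Strategy": ("UPD_",),
    # Unconnected and dynamic lookups use their own prefixes.
    "Lookup": ("LKP_", "UN_LKP_", "DLKP_"),
}


def has_standard_prefix(transformation_type: str, name: str) -> bool:
    """Return True if `name` starts with a prefix allowed for its type."""
    prefixes = TRANSFORMATION_PREFIXES[transformation_type]  # KeyError if unknown
    return name.startswith(prefixes)
```

    A check such as has_standard_prefix("Filter", "FILTER1") returns False, which flags a default-style name that does not follow the FIL_ convention.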

    5.1.19 Miscellaneous Standards

      Input Ports shall be named based on the connected source when possible. A reasonable name is acceptable. At no time is any default name acceptable.

      Output Ports shall be named based on the connected source when possible. A reasonable name is acceptable. At no time is any default name acceptable.

      Variable ports shall be named beginning with v_ followed by a reasonable name.

      Ports based upon an unconnected Lookup shall begin with lkp_.

      Mapping Parameter names shall begin with $$.

      Mapping Variable names shall begin with $$V_.

    5.2 PowerCenter Workflow Manager Standards

    When the ETL Developer is creating a new workflow or editing a workflow, the following standards and conventions will be adhered to.

    5.2.1 Informatica Power Center Server

      The Informatica Server name is based on the UNIX machine where it is located, followed by a '_', followed by the environment of the server.

      Ex. piuetl01_prod1

    5.2.2 Workflow Standards

      If a workflow is scheduled, its name shall start with SCHED_.

      If a workflow is web service enabled, its name shall start with WS_.

      The Workflow name shall include the Project name and any other reasonable details.

      Default Error Handling shall be “Suspend on Error”.

      If any database connections are required other than the standard ones, create a temporary one prefixed with your initials. This will help in removing them later. E.g. if a database connection is required by Naresh for testing from production, then the database connection name should start with Naresh_. ETL Administrators will create and destroy all database connections as necessary.


    5.2.3 Worklet Standards

      Worklet names shall start with WL followed by a numeric representation (2 characters), e.g. WL01, followed by a reasonable name associating it to a project.

      For worklets within a worklet, the numeric representation should be followed by a letter. E.g. if a worklet is within a worklet starting with WL01, then the worklet under this should start with WL01A.

      If more than two levels of hierarchical arrangement are required for worklets, then alternate the letter representation with the numeric representation. E.g. WL01 then WL01A then WL01A01 then WL01A01A and so on.
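
    The alternating letter and number scheme for nested worklets can be expressed as a small function that derives a child prefix from its parent prefix. The Python sketch below is an illustration under assumptions: it handles only the prefix (the descriptive name that follows is left to the developer), it assumes letters run from A to Z and nested numbers from 01 to 99, and the names WORKLET_PREFIX and child_prefix are hypothetical.

```python
import re

# WL + 2 digits, then alternating letter / 2-digit groups, e.g. WL01A01A.
WORKLET_PREFIX = re.compile(r"WL\d{2}(?:[A-Z]\d{2})*[A-Z]?")


def child_prefix(parent: str, index: int) -> str:
    """Return the prefix of the `index`-th (0-based) worklet nested in `parent`."""
    if WORKLET_PREFIX.fullmatch(parent) is None:
        raise ValueError(f"not a valid worklet prefix: {parent!r}")
    if parent[-1].isdigit():
        # Parent ends with a number, so the child appends a letter.
        if not 0 <= index < 26:
            raise ValueError("only 26 lettered children are available")
        return parent + chr(ord("A") + index)
    # Parent ends with a letter, so the child appends a 2-digit number.
    if not 0 <= index < 99:
        raise ValueError("only 99 numbered children are available")
    return f"{parent}{index + 1:02d}"
```

    Applying child_prefix repeatedly to WL01 with index 0 reproduces the sequence given above: WL01A, WL01A01, WL01A01A.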

    5.2.4 Session Standards

      Session names shall begin with S_ followed by a name that clearly represents the mapping associated with it.

      Sessions should have a session log named exactly the same as the session name.

      Bad file names should be unique.

      Historical loads (or other ad-hoc loads) may use multiple sessions per workflow. If multiple sessions for the same mapping are needed, use a reusable session. Session instance names should be unique and descriptive, with matching session log names.

      By default, sessions should use standard database connections previously created. Developers should submit a request for a new connection (if needed) to an administrator.

      By default, sessions should have the target load type defined as “normal”.

      Target properties should only have the database operations checked that are needed, i.e. do not check the update box if the job is insert-only.

      Whenever possible, server variables such as $Source, $Target, $PMRoot should be used.

      The SQL_OVR metadata extension shall be used if the SQL is modified at session level. The default value is 'N'.

      Default error handling shall be set as “Fail Parent If Task Fails”.

      The DTM buffer size shall be set to 24M.

      The block buffer size shall be set to 128K.

      If a session calls a UNIX script, the metadata extension UNIX_SCRIPT shall include the name and full path of the UNIX script with a brief description. The default value is 'N'.

    5.2.5 Miscellaneous Workflow Standards

      Email Tasks shall be named EMAIL_ followed by the name of the associated distribution list.

      Command Tasks shall be named CMD_ followed by the name of the UNIX script or any other reasonable name.

      Relational Connections shall be named at the discretion of the ETL Administrators.

      Queue Connection TBD.

      FTP Connections shall be named FTP_ followed by the name of the targeted server.

      Application Connection TBD.

      Loader Connections shall be named at the discretion of the ETL Administrators.

      Workflow Parameters shall be named $$W_ followed by any reasonable name.

      Workflow Variables shall be named starting with $$WV_ followed by any reasonable name.

      Workflow Parameter Files shall be named starting with W_PARAM followed by the workflow they belong to.

      Session Parameter Files shall be named starting with S_PARAM followed by the session they belong to.
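
    The Workflow Manager naming rules in sections 5.2.2 through 5.2.5 can be captured as a few helper functions. The following Python sketch is illustrative only. The function names are hypothetical, and two details are assumptions not stated in the text: that the parts of a workflow name are joined with an underscore, and that an underscore separates W_PARAM or S_PARAM from the workflow or session name.

```python
from typing import Optional

# Prefixes from section 5.2.2; a plain workflow has no prefix.
WORKFLOW_PREFIXES = {None: "", "scheduled": "SCHED_", "web_service": "WS_"}


def workflow_name(project: str, details: str, kind: Optional[str] = None) -> str:
    """Build a workflow name from the project name and further details."""
    return f"{WORKFLOW_PREFIXES[kind]}{project}_{details}"  # "_" is assumed


def session_name(mapping_name: str) -> str:
    """Sessions begin with S_ followed by a name representing the mapping."""
    return f"S_{mapping_name}"


def session_log_name(session: str) -> str:
    """The session log is named exactly the same as the session."""
    return session


def workflow_param_file(workflow: str) -> str:
    """W_PARAM followed by the workflow name (separator assumed)."""
    return f"W_PARAM_{workflow}"


def session_param_file(session: str) -> str:
    """S_PARAM followed by the session name (separator assumed)."""
    return f"S_PARAM_{session}"
```

    Deriving every name from the mapping or workflow it belongs to keeps session logs and parameter files easy to trace back to their source objects.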

    5.3 PowerCenter Web Services Standards

      TBD. There is a meeting scheduled 1/12/2004.


    Appendix A: Code Review

    Upon completion of Step 5 of the PowerCenter Lifecycle, a mapping and workflow review shall take place to confirm development was completed following Enterprise Standards.

    The code review shall be conducted by at least one ETL Administrator and another ETL Developer.

    The developer shall bring the following items for every mapping/workflow to be reviewed:

    o  Code Checklist
    o  Power Analyzer Mapping Standards Report
    o  Power Analyzer Workflow Standards Report
    o  Erwin Model of the Source and Targets or File Definitions
    o  Source to Target Mapping Requirements
    o  Screen shots of Mappings to be reviewed

    Code Checklist

    The following is a checklist to be completed prior to the Code Review. Please review each mapping to ensure that you have completed the following steps and place an 'X' in the box when you have confirmed that the step is completed. This will be the basis for Code Review.

    Source file name (s)

    Target table name(s)

    Mapping name

    Workflow name

    Session name(s)

    Folder mapping is located in

    Developer name 

    Date mapping completed

    Code Reviewed by

    Date Review Completed

    Checks and Balances    Completed    Comments

    Mapping Checks

    Did you follow all naming standards?    YES

    Did you add mapping comments including your initials and changes to the change log?    YES

    Did you put comments into key transformations, and are the transformations named in a descriptive and identifying way? Please pay special attention to any SQL Overrides, as the WHERE and JOIN clauses should be copied into the comments section of the Source Qualifier.    YES

    Did you remove the Shortcut_to for all Sources and Targets?    YES

    Did you connect only the necessary ports from the lookup?    YES

    Did you use $Source and $Target for Location Information Property for Lookup?

    Did you use an Update Strategy only for updated targets, not for inserts?    YES

    Did you use DD_UPDATE in the Update Strategy Expression?    YES

    Did you only connect the ports being updated?    YES

    Does your mapping have a SQL override in the source qualifier when it is not needed (i.e. you could just use the source filter or user defined join)?    YES

    Does your mapping use unique target names distinguished by operation?    YES

    Workflow / Session Checks

    Did you assign the Workflow to the correct Server?    YES

    Is the name of the Session Log the same as the name of the Session?    YES

    Are all flat file options selected correctly? Example: Fixed Width files that use Spaces to signify NULL should have the following options checked: 1) Change Null Character to Space, 2) Repeat Null Character, 3) Strip Trailing Blanks, 4) Line Sequential Format    YES

    Are you only using the Standard Database Connection names?    YES

    Is the session using 'Normal' or 'Bulk' loading appropriately?    YES

    For a given Target Instance, did you only check the operations that will be done to that particular Target Instance?    YES

    Does the bad file name make sense? Is there more than 1 mapping loading a target table at the same time? If yes, have you made the bad files unique?    YES

    If your session uses a parameter file, did you test reading from that parameter file successfully?    YES

    Data and Unit Testing

    When you viewed the target data, did it look clean? Examples: 1) Are you consistently getting NO data values in a particular column in the target table? 2) Did any string data get truncated? 3) If a Number passed into a String column, are there any decimals in the value, such as 1.000000? (Hint: there should not be.)    YES

    When the session completed, did you open the session log and look for any records that did not insert due to transformation errors?    YES

    Did the session log show that any fields are being truncated coming from the source?    YES

    Reviewer’s Additional Comments 


    Appendix B: Mapping Specification Templates

    This section provides templates for mapping specifications and transformations created in the repository. It is recommended to use these templates to document enterprise level reusable objects and to document mapping details where repository reports are not sufficient.

    Mapping Specification Template

    Mapping Name

    Source System Target System

    Initial Rows Rows/Load

    Short Description

    Load Frequency

    Preprocessing

    Post Processing

    Error Strategy

    Reload Strategy

    Unique Source Fields

    Sources

    Tables

    Table Name Schema/Owner Selection/Filter

    Files

    File Name File Location Fixed/Delimited Additional File Info

    Targets

    Tables    Schema Owner

    Table Name Update Delete Insert Unique Key

    Files

    File Name File Location Fixed/Delimited Additional File Info

    Lookups

    Lookup Name

    Table Location

    Match Condition(s)


    Filter/SQL Override

    High Level Process Overview

    Processing Description (Detail)

    Templates For Reusable Transformations

    Lookups

    Folder Name

    Lookup Name

    Lookup Type

    Table Location

    Match Condition(s)

    Outputs

    Filter/SQL Override

    Comments Created by :

    Modified by :

    Stored Procedures

    Folder Name

    Stored Procedure

    Trans. Name

    Call Text

    Stored Procedure

    Source Target


    Type

    Comments Created by :

    Modified by :

    External Procedures

    Folder Name

    External Procedure

    Trans. Name

    Inputs

    Return Value

    Module / Programmatic Identifier

    Procedure Name

    Runtime Location

    Is_Partitionable

    Comments

    Mapplets

    Folder Name

    Mapplet Name

    Inputs

    Outputs

    Comments

    Aggregators

    Folder Name

    Aggregator Name

    Groups Group By Size


    Sorted Ports Used

    Number of Output Ports

    Comments [Note: Please copy and paste your comments from your Informatica Object here]

    Ranks

    Folder Name

    Rank Name

    Number of Ranks    Number of Groups

    Total Output

    Column Size

    Comments [Note: Please copy and paste your comments from your Informatica Object here]

    Router

    Folder Name

    Router Name

    Group Names    Number of Groups Including Default

    Group 1 Condition

    Group 2 Condition

    Group 3 Condition

    Default Group Utilized (Yes, No)

    Output

    Comments [Note: Please copy and paste your comments from your Informatica Object here]

    Joiner

    Folder Name

    Joiner Name

    Master Source

    Detail Source

    Join Type

    Join Condition


    Comments [Note: Please copy and paste your comments from your Informatica Object here]

    Normalizer

    Folder Name

    Normalizer Name

    Occurs Field    Number of Occurs

    Occurs Field    Number of Occurs

    Occurs Field    Number of Occurs

    Comments [Note: Please copy and paste your comments from your Informatica Object here]

    Expressions

    Folder Name

    Expression Name

    Inputs

    Outputs

    Comments


    ETL Estimations

    This section provides the typical work estimates (in man hours) for average ETL work. These estimates may vary significantly depending on the nature of the work.

    Source to Staging, ODS or 3NF Model

    Standard Mappings (One source to One target)          4 hours
    Additional sources or targets (same mapping)          2 hours per additional table
    Special Case (One source to One Target)               8 hours
      Non-database to database

    Processing Flat Files
    Standard mapping UNIX                                 4 hours additional
    Standard mapping LAN                                  8 hours additional
    Building mechanism for retrieving files               Project Dependent (complexity)

    Staging to ODS or 3NF Model

    Standard mapping (One source to One target)           4 hours
    Additional sources or targets                         2 hours per additional table

    PSA to MDA or Mart

    Mapping (assumes many sources to a single target)     8 hours
    Performance tuning                                    8 hours

    These estimations include defining the table in Informatica and some query tuning. All special cases outside of these guidelines need written requirements and analysis by an ETL developer.

    It is assumed that all DBMS objects will be created by the DA and/or DBA assigned to the task with their appropriate sizing where applicable, and that all mapping documentation will be entered in Designer prior to the start of development efforts.
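
    As a worked example of how the Source to Staging figures combine, the following Python sketch adds up the hours for a single mapping. It is a rough illustration, not a planning tool: the function and parameter names are hypothetical, it assumes that each source or target beyond the first counts as one additional table, and it assumes that the flat file figures are added on top of the base mapping estimate.

```python
STANDARD_MAPPING_HOURS = 4      # one source to one target
SPECIAL_CASE_HOURS = 8          # special case, one source to one target
ADDITIONAL_TABLE_HOURS = 2      # per additional source or target
FLAT_FILE_EXTRA_HOURS = {"UNIX": 4, "LAN": 8}


def estimate_source_to_staging(sources=1, targets=1, special_case=False,
                               flat_file_location=None):
    """Estimate man hours for one Source to Staging, ODS or 3NF mapping."""
    if sources < 1 or targets < 1:
        raise ValueError("a mapping needs at least one source and one target")
    hours = SPECIAL_CASE_HOURS if special_case else STANDARD_MAPPING_HOURS
    # Every table beyond the first source and first target adds 2 hours.
    hours += ADDITIONAL_TABLE_HOURS * ((sources - 1) + (targets - 1))
    if flat_file_location is not None:
        hours += FLAT_FILE_EXTRA_HOURS[flat_file_location]
    return hours
```

    Under these assumptions, a standard mapping with three sources and one target comes to 4 + 2 × 2 = 8 hours, and a standard one-to-one mapping that reads a flat file on UNIX comes to 4 + 4 = 8 hours.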