8/18/2019 ETL Example
http://slidepdf.com/reader/full/etl-example 1/17
4/9/2007
Control Data Flow Without Mapping Changes
Greg Whitaker
ETL Lead Developer
Choicepoint
Environment:
The Boca Raton, FL location utilizes Informatica in a 32-bit RedHat Linux/Intel environment. We have version 7.1.1 installed on 4 boxes: 3 production and 1 development.
Corporate headquarters in Alpharetta, GA has an IBM P690 running AIX 5.3. This is a 64-bit environment and is running version 8.1.1 SP2.
Major Points:
• How to reference column names as port values.
• How to make data “generic” using a simple mapplet, creating a write-once, use-everywhere data flow that can be used on any source data.
• Route data to transformations at runtime using Lookups and Routers.
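The first two points can be illustrated outside Informatica. The following is a minimal Python sketch, not the mapplet itself: it turns each source row into (column name, value) pairs so that the column name travels as data and one downstream flow can handle any source. The helper `to_generic` and the sample rows are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def to_generic(row: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield one (column_name, value) pair per field of a source row."""
    for column_name, value in row.items():
        yield column_name, value


# Hypothetical source rows; any column layout works the same way.
rows: List[Dict[str, str]] = [
    {"ASSN_CITY": "ZOLFO", "ASSN_ZIP": "01202"},
    {"ASSN_CITY": "AA", "ASSN_ZIP": "N7V2G"},
]

# Every field becomes its own generic record.
generic = [pair for row in rows for pair in to_generic(row)]

for column_name, value in generic:
    print(column_name, value)
```

Because the generic records carry only a name and a value, nothing downstream depends on the layout of a particular source file.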
Brief History
• DataQA mappings were needed for the hundreds of files we process.
• A dynamic, data-driven quality grading solution was developed, but each column needed to be mapped specifically for that dataset.
• We wanted to create a generic solution that would work for any source data.
Demo Only
• This is only a Demo.
• This is not a robust, out-of-the-box solution.
• We’re hoping that you can use ideas from this Demo in your own mappings.
Demo Mapping:
Mapping Source:
Mapping Target:
Mapplet:
Mapplet Input:
Mapplet Union Input:
Sample Lookup Data:
•"COLUMN_NAME", "QA_FUNCTION"
•"ASSN_NAME_1", "|IS_ZERO1|IS_NUMERIC|NOT_NUMERIC|"
•"ASSN_NAME_2", "|IS_ZERO1|"
•"ASSN_STREET", "|IS_ZERO|"
•"ASSN_CITY", "|IS_ZERO|NOT_NUMERIC|"
•"ASSN_STATE", "|IS_ZERO|NOT_NUMERIC|"
•"ASSN_ZIP", "|IS_ZERO|IS_NUMERIC|"
•"ASSN_ZIP4", "|IS_ZERO|IS_NUMERIC|"
•"ASSN_BOX_NUM", "|IS_ZERO|IS_NUMERIC|"
•"ASSN_NUMBER", "|IS_ZERO|"
•"ORIG_MAN_DT", "|IS_ZERO|VALID_DATE_YYYYMMDD|"
•"PROJ_NAME_1", "|IS_ZERO|"
•"PROJ_NAME_2", "|IS_ZERO|"
•"PROJ_STREET", "|IS_ZERO|"
•"PROJ_CITY", "|IS_ZERO|NOT_NUMERIC|"
•"PROJ_STATE", "|IS_ZERO|NOT_NUMERIC|"
•"PROJ_ZIP", "|IS_ZERO|NOT_NUMERIC|"
•"DECLARE_DATE", "|VALID_DATE_YYYYMMDD|"
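To show how a lookup like this could drive routing at runtime, here is a rough Python sketch rather than the Informatica Lookup and Router themselves. It assumes each function is selected by a substring test on the pipe-delimited list, which the leading and trailing pipes appear designed to support. The function bodies are guesses at what the names mean; `IS_ZERO` is left out because the slides do not show its definition, and names without an implementation are simply skipped.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple


def valid_date_yyyymmdd(value: str) -> bool:
    """True if value is exactly eight digits forming a real calendar date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False


# Assumed implementations of the QA functions named in the lookup data.
QA_FUNCTIONS: Dict[str, Callable[[str], bool]] = {
    "IS_NUMERIC": lambda v: v.isdigit(),
    "NOT_NUMERIC": lambda v: not v.isdigit(),
    "VALID_DATE_YYYYMMDD": valid_date_yyyymmdd,
}

# A few rows copied from the sample lookup data.
LOOKUP: Dict[str, str] = {
    "ASSN_ZIP": "|IS_ZERO|IS_NUMERIC|",
    "ASSN_CITY": "|IS_ZERO|NOT_NUMERIC|",
    "DECLARE_DATE": "|VALID_DATE_YYYYMMDD|",
}


def route(column_name: str, value: str) -> List[Tuple[str, bool]]:
    """Run every implemented QA function assigned to this column."""
    assigned = LOOKUP.get(column_name, "")
    results = []
    for name, func in QA_FUNCTIONS.items():
        # The surrounding pipes stop a name matching part of a longer name.
        if f"|{name}|" in assigned:
            results.append((name, func(value)))
    return results


print(route("ASSN_ZIP", "N7V2G"))
print(route("DECLARE_DATE", "20021231"))
```

With this arrangement, changing which checks run for a column means editing a lookup row, not the data flow.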
Mapplet Data Router:
Mapplet Function Call:
Mapplet Union & Output:
Sample Output Data:
•Column Name Max Min Function T/F Count
•ASSN_NAME_1 ZURICH 100 IS_NUMERIC FALSE 22069
•ASSN_ZIP N7V2G IS_NUMERIC FALSE 13
•ASSN_ZIP4 R5 IS_NUMERIC FALSE 12502
•ASSN_BOX_NUM 999 014059 IS_NUMERIC TRUE 2642
•ASSN_ZIP 99999 01202 IS_NUMERIC TRUE 22056
•ASSN_BOX_NUM 999 IS_ZERO FALSE 22069
•ASSN_CITY ZOLFO AA IS_ZERO FALSE 22069
•ASSN_STATE WI AL IS_ZERO FALSE 22069
•PROJ_CITY ZOLFO AA IS_ZERO FALSE 22069
•PROJ_NAME_1 ZURICH 100 IS_ZERO FALSE 22069
•PROJ_STATE FL FL IS_ZERO FALSE 22069
•PROJ_STREET ZARRAGO #1 IS_ZERO FALSE 22069
•PROJ_ZIP 99999 22308 NOT_NUMERIC FALSE 22033
•ASSN_CITY ZOLFO AA NOT_NUMERIC TRUE 22069
•ASSN_NAME_1 ZURICH 100 NOT_NUMERIC TRUE 22069
•ASSN_STATE WI AL NOT_NUMERIC TRUE 22069
•PROJ_CITY ZOLFO AA NOT_NUMERIC TRUE 22069
•PROJ_STATE FL FL NOT_NUMERIC TRUE 22069
•PROJ_ZIP 3302 NOT_NUMERIC TRUE 36
•DECLARE_DATE 20021231 19051217 VALID_DATE_YYYYMMDD TRUE 22060
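The shape of this output suggests one summary row per column name, function, and true/false result, carrying the maximum value, minimum value, and row count for that group. The Python sketch below reproduces that shape under two assumptions the slides do not confirm: that this is the grouping key, and that Max and Min come from plain string comparison. The helper `summarize` and the three input records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Result = Tuple[str, str, str, bool]  # (column_name, value, function, passed)
GroupKey = Tuple[str, str, bool]     # (column_name, function, passed)


def summarize(results: Iterable[Result]) -> Dict[GroupKey, Dict[str, object]]:
    """Group QA results and report max, min, and count of values per group."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for column_name, value, function, passed in results:
        groups[(column_name, function, passed)].append(value)
    return {
        key: {"max": max(values), "min": min(values), "count": len(values)}
        for key, values in groups.items()
    }


# Hypothetical per-value results for a single column.
results = [
    ("ASSN_ZIP", "99999", "IS_NUMERIC", True),
    ("ASSN_ZIP", "01202", "IS_NUMERIC", True),
    ("ASSN_ZIP", "N7V2G", "IS_NUMERIC", False),
]

summary = summarize(results)
for (column_name, function, passed), stats in summary.items():
    print(column_name, stats["max"], stats["min"], function, passed, stats["count"])
```

Keeping the extreme values beside each count gives a quick sample of what passed or failed without listing every row.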