ETL Example

8/18/2019 ETL Example

http://slidepdf.com/reader/full/etl-example 1/17

4/9/2007

Control Data Flow Without Mapping Changes

Greg Whitaker 

ETL Lead Developer 

Choicepoint

Environment:

The Boca Raton, FL location runs Informatica in a 32-bit Red Hat Linux/Intel environment. We have version 7.1.1 installed on 4 boxes: 3 production and 1 development.

Corporate headquarters in Alpharetta, GA has an IBM P690 running AIX 5.3. This is a 64-bit environment running version 8.1.1 SP2.

Major Points:

• How to reference column names as port values.

• How to make data “generic” using a simple mapplet, creating a write-once, use-everywhere data flow that can be used on any source data.

• How to route data to transformations at runtime using Lookups and Routers.
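The first two points can be pictured outside Informatica. The following is a minimal Python sketch, not the mapplet itself: it assumes each source row is available as a dictionary and turns every column into a (column name, value) record, so that later steps never need to know the source layout. The function name `to_generic` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def to_generic(rows: Iterable[Dict[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield one (column_name, value) record per column of every source row."""
    for row in rows:
        for column_name, value in row.items():
            # The column name travels with the data as an ordinary value.
            yield column_name, value


# Hypothetical sample rows; any source with named columns would work the same way.
rows = [
    {"ASSN_CITY": "ZOLFO", "ASSN_ZIP": "01202"},
    {"ASSN_CITY": "AA", "ASSN_ZIP": "N7V2G"},
]

generic = list(to_generic(rows))
for column_name, value in generic:
    print(column_name, value)
```

Once every record has the same two-field shape, a single downstream flow can serve any dataset, which is the sense in which the data becomes "generic".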

Brief History

• DataQA mappings were needed for the hundreds of files we process.

• A dynamic, data-driven quality grading solution was developed, but each column had to be mapped specifically for each dataset.

• We wanted to create a generic solution that would work for any source data.

Demo Only

• This is only a demo.

• This is not a robust, out-of-the-box solution.

• We hope you can use ideas from this demo in your own mappings.

Demo Mapping:

Mapping Source:

Mapping Target:

Mapplet:

Mapplet Input:

Mapplet Union Input:

Sample Lookup Data:

"COLUMN_NAME", "QA_FUNCTION"
"ASSN_NAME_1", "|IS_ZERO1|IS_NUMERIC|NOT_NUMERIC|"
"ASSN_NAME_2", "|IS_ZERO1|"
"ASSN_STREET", "|IS_ZERO|"
"ASSN_CITY", "|IS_ZERO|NOT_NUMERIC|"
"ASSN_STATE", "|IS_ZERO|NOT_NUMERIC|"
"ASSN_ZIP", "|IS_ZERO|IS_NUMERIC|"
"ASSN_ZIP4", "|IS_ZERO|IS_NUMERIC|"
"ASSN_BOX_NUM", "|IS_ZERO|IS_NUMERIC|"
"ASSN_NUMBER", "|IS_ZERO|"
"ORIG_MAN_DT", "|IS_ZERO|VALID_DATE_YYYYMMDD|"
"PROJ_NAME_1", "|IS_ZERO|"
"PROJ_NAME_2", "|IS_ZERO|"
"PROJ_STREET", "|IS_ZERO|"
"PROJ_CITY", "|IS_ZERO|NOT_NUMERIC|"
"PROJ_STATE", "|IS_ZERO|NOT_NUMERIC|"
"PROJ_ZIP", "|IS_ZERO|NOT_NUMERIC|"
"DECLARE_DATE", "|VALID_DATE_YYYYMMDD|"
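To illustrate how a lookup table like this can drive routing at runtime, here is a hedged Python sketch. It is not the Informatica Lookup and Router configuration from the demo. The function bodies are guesses: the slides name the QA functions but do not define them, so the meaning of `IS_ZERO` in particular is an assumption. `load_lookup`, `route`, and `QA_FUNCTIONS` are hypothetical names.

```python
import csv
import io
from datetime import datetime

# A few rows copied from the sample lookup data.
LOOKUP_CSV = '''"COLUMN_NAME", "QA_FUNCTION"
"ASSN_NAME_2", "|IS_ZERO1|"
"ASSN_ZIP", "|IS_ZERO|IS_NUMERIC|"
"DECLARE_DATE", "|VALID_DATE_YYYYMMDD|"
'''


def is_valid_yyyymmdd(value: str) -> bool:
    """True when value is exactly 8 digits forming a real calendar date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False


# ASSUMPTION: these definitions are illustrative; the slides do not give them.
QA_FUNCTIONS = {
    "IS_ZERO": lambda v: v.strip() == "0",
    "IS_NUMERIC": lambda v: v.strip().isdigit(),
    "NOT_NUMERIC": lambda v: not v.strip().isdigit(),
    "VALID_DATE_YYYYMMDD": is_valid_yyyymmdd,
}


def load_lookup(text: str) -> dict:
    """Map each column name to the list of QA function names between the pipes."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return {
        row["COLUMN_NAME"]: [name for name in row["QA_FUNCTION"].split("|") if name]
        for row in reader
    }


def route(column_name: str, value: str, lookup: dict) -> list:
    """Apply every known QA function configured for this column to the value."""
    results = []
    for name in lookup.get(column_name, []):
        func = QA_FUNCTIONS.get(name)
        if func is None:
            continue  # names without a matching function are skipped in this sketch
        results.append((column_name, value, name, func(value)))
    return results


lookup = load_lookup(LOOKUP_CSV)
print(route("ASSN_ZIP", "N7V2G", lookup))
print(route("DECLARE_DATE", "20021231", lookup))
```

Changing which checks run for a column then means editing a lookup row rather than the flow. In this sketch a name such as `IS_ZERO1` matches no function and is simply skipped; whether the demo mapping behaves the same way is not stated in the slides.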

Mapplet Data Router:

Mapplet Function Call:

Mapplet Union & Output:

Sample Output Data:

Column Name   Max       Min       Function             T/F    Count
ASSN_NAME_1   ZURICH    100       IS_NUMERIC           FALSE  22069
ASSN_ZIP      N7V2G               IS_NUMERIC           FALSE  13
ASSN_ZIP4     R5                  IS_NUMERIC           FALSE  12502
ASSN_BOX_NUM  999       014059    IS_NUMERIC           TRUE   2642
ASSN_ZIP      99999     01202     IS_NUMERIC           TRUE   22056
ASSN_BOX_NUM  999                 IS_ZERO              FALSE  22069
ASSN_CITY     ZOLFO     AA        IS_ZERO              FALSE  22069
ASSN_STATE    WI        AL        IS_ZERO              FALSE  22069
PROJ_CITY     ZOLFO     AA        IS_ZERO              FALSE  22069
PROJ_NAME_1   ZURICH    100       IS_ZERO              FALSE  22069
PROJ_STATE    FL        FL        IS_ZERO              FALSE  22069
PROJ_STREET   ZARRAGO   #1        IS_ZERO              FALSE  22069
PROJ_ZIP      99999     22308     NOT_NUMERIC          FALSE  22033
ASSN_CITY     ZOLFO     AA        NOT_NUMERIC          TRUE   22069
ASSN_NAME_1   ZURICH    100       NOT_NUMERIC          TRUE   22069
ASSN_STATE    WI        AL        NOT_NUMERIC          TRUE   22069
PROJ_CITY     ZOLFO     AA        NOT_NUMERIC          TRUE   22069
PROJ_STATE    FL        FL        NOT_NUMERIC          TRUE   22069
PROJ_ZIP      3302                NOT_NUMERIC          TRUE   36
DECLARE_DATE  20021231  19051217  VALID_DATE_YYYYMMDD  TRUE   22060
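The shape of this output suggests a grouping by column name, function, and result, with a maximum, minimum, and count per group. The Python sketch below shows one way such a summary could be computed. It assumes per-record results arrive as (column name, value, function, result) tuples and that Max and Min are plain string comparisons, which is consistent with rows such as ZARRAGO / #1 but is not confirmed by the slides. The name `summarize` and the input rows are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Result = Tuple[str, str, str, bool]  # (column_name, value, function, passed)


def summarize(results: Iterable[Result]) -> Dict[Tuple[str, str, bool], dict]:
    """Group by (column, function, result) and track max, min, and count."""
    groups: Dict[Tuple[str, str, bool], dict] = {}
    for column_name, value, function, passed in results:
        key = (column_name, function, passed)
        if key not in groups:
            groups[key] = {"max": value, "min": value, "count": 0}
        group = groups[key]
        # String comparison: "99999" > "01202", and "" sorts before any text.
        group["max"] = max(group["max"], value)
        group["min"] = min(group["min"], value)
        group["count"] += 1
    return groups


# Hypothetical per-record results for one column and one function.
results = [
    ("ASSN_ZIP", "01202", "IS_NUMERIC", True),
    ("ASSN_ZIP", "99999", "IS_NUMERIC", True),
    ("ASSN_ZIP", "N7V2G", "IS_NUMERIC", False),
]

summary = summarize(results)
for (column_name, function, passed), stats in summary.items():
    print(column_name, stats["max"], stats["min"], function, passed, stats["count"])
```

Under string comparison an empty value sorts lowest, which may explain the blank Min entries in the sample output, though that reading is an inference.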
