Open Source ETL using Talend Open Studio

Luís Santos luis@luissantos.pt

February 14, 2013

Luís Santos luis@luissantos.pt Open Source ETL February 14, 2013 1

Overview

1 Who am I?

2 What is ETL?

3 ETL Software Suites

4 Talend Open Studio for Data Integration

5 Hands on

6 Conclusion

Warning!!!

This presentation was created using LaTeX.

Why?

Because I can!

Who am I?

Software Engineer and Mathematics Student

Open Source addict

PHP and Java Developer

What is ETL?

In computing, Extract, Transform and Load (ETL) refers to a process in database usage and especially in data warehousing that involves:

Extracting data from outside sources

Transforming it to fit operational needs (which can include quality levels)

Loading it into the end target (database, more specifically, operational data store, data mart or data warehouse)

(2013, http://en.wikipedia.org/wiki/Extract,_transform,_load)
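The three steps can be sketched in a few lines of plain Java. This is only an illustration of the idea, not what Talend generates: the class name, the "name;amount" input format and the in-memory map standing in for the warehouse are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical in-memory ETL pipeline; every name here is illustrative.
final class EtlSketch {

    // Extract: split raw "name;amount" lines coming from an outside source.
    static List<String[]> extract(List<String> rawLines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : rawLines) {
            rows.add(line.split(";"));
        }
        return rows;
    }

    // Transform: normalize names and drop rows that fail a simple quality rule.
    static List<String[]> transform(List<String[]> rows) {
        List<String[]> clean = new ArrayList<>();
        for (String[] row : rows) {
            if (row.length != 2) {
                continue; // malformed row
            }
            String name = row[0].trim().toLowerCase(Locale.ROOT);
            String amount = row[1].trim();
            if (name.isEmpty() || !amount.matches("\\d{1,9}")) {
                continue; // missing name or non-numeric amount
            }
            clean.add(new String[] {name, amount});
        }
        return clean;
    }

    // Load: accumulate amounts per name into the target (a map standing in for a table).
    static void load(List<String[]> rows, Map<String, Integer> target) {
        for (String[] row : rows) {
            target.merge(row[0], Integer.parseInt(row[1]), Integer::sum);
        }
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(" Alice ;10", "bob;5", "ALICE;7", "carol;n/a");
        Map<String, Integer> warehouse = new LinkedHashMap<>();
        load(transform(extract(source)), warehouse);
        System.out.println(warehouse); // prints {alice=17, bob=5}
    }
}
```

The row for "carol" is rejected in the transform step because its amount is not numeric, which is the kind of quality rule the definition refers to.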

ETL Software Suites

Pentaho Data Integration (Kettle)

SQL Server Integration Services

Talend Open Studio for Data Integration

etc...

Talend Open Studio for Data Integration

Talend Open Studio is a set of tools for developing, testing and deploying application integration projects.

Talend Open Studio for Big Data

Bonita Open Solution (BPM)

Talend Open Studio for Data Integration

Talend Open Studio for Data Quality

Talend ESB

Talend Open Studio for MDM

Datasources (Extract and Load)

MySQL, MSSQL, Oracle, SQLite, FirebirdSQL, XLS, CSV, XML, SOAP, REST, HTTP, FTP, SSH, IMAP

Transformers (Transform)

Sort data

Convert data

Join data across datasources

Filter data

Fuzzy search

Normalize and Denormalize data
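In Talend these operations are configured graphically. As a rough plain-Java picture of what a convert, filter and sort chain does to a column of values, here is a minimal sketch. The class and method names are hypothetical and the threshold is an arbitrary example value.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of three transformations applied in sequence.
final class TransformSketch {

    static List<Integer> convertFilterSort(List<String> raw, int minimum) {
        return raw.stream()
                .map(String::trim)
                .map(Integer::valueOf)              // convert: text to number
                .filter(v -> v >= minimum)          // filter: keep values at or above the minimum
                .sorted(Comparator.reverseOrder())  // sort: largest first
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("12", " 3", "40", "7");
        System.out.println(convertFilterSort(column, 5)); // prints [40, 12, 7]
    }
}
```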

Where and how?

Where?

Multi-platform (Linux, Mac OS, BSD-*, even on Windows)

You just need a JVM (Java Virtual Machine)

How?

Execute it from your favorite programming language using system calls

Command line

From your JVM-based application (Java, Groovy, JRuby)

Web services running on top of a Java application server (Tomcat, GlassFish)
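A minimal sketch of the first two options, launching an exported job as an external process from Java. The script path and the parameter name are hypothetical. The `--context_param name=value` argument is how exported Talend jobs usually receive context values, but check it against the launcher script your Talend version generates.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical launcher for an exported job; paths and parameter names are assumptions.
final class JobLauncherSketch {

    // Builds the command line: the launcher script followed by one pair of
    // arguments per context parameter.
    static List<String> buildCommand(String launcherScript, Map<String, String> contextParams) {
        List<String> command = new ArrayList<>();
        command.add(launcherScript);
        for (Map.Entry<String, String> param : contextParams.entrySet()) {
            command.add("--context_param");
            command.add(param.getKey() + "=" + param.getValue());
        }
        return command;
    }

    // Runs the command, forwarding its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("outputDir", "/tmp/export"); // hypothetical context variable

        // Pass the real script path as the first argument to actually run the job.
        String script = args.length > 0 ? args[0] : "./export_customers/export_customers_run.sh";
        List<String> command = buildCommand(script, params);
        System.out.println(String.join(" ", command));

        if (args.length > 0) {
            System.exit(run(command));
        }
    }
}
```

The same pattern applies from any language that can spawn a process, which is what makes the job callable from a non-JVM application.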

Hands on

Querying data

Joining data from multiple datasources

Filtering and sorting data

Exporting data

Deploying your job

Calling it from PHP
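The join and export steps of this walkthrough are built graphically in the live demo. As a plain-Java approximation, the sketch below performs an inner join between two in-memory datasets and writes the result as CSV. The data, column names and class name are invented for the example, and values are assumed to contain no commas or quotes, so no CSV escaping is done.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical join-and-export sketch; data and names are illustrative only.
final class JoinExportSketch {

    // Inner join: each order is {customerId, amount}; orders without a
    // matching customer are dropped.
    static List<String> joinToCsv(Map<Integer, String> customersById, List<int[]> orders) {
        List<String> lines = new ArrayList<>();
        lines.add("customer,amount"); // header row
        for (int[] order : orders) {
            String name = customersById.get(order[0]);
            if (name != null) {
                lines.add(name + "," + order[1]);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, String> customers = new HashMap<>();
        customers.put(1, "Alice");
        customers.put(2, "Bob");
        List<int[]> orders = Arrays.asList(new int[] {1, 30}, new int[] {3, 99}, new int[] {2, 15});

        List<String> csv = joinToCsv(customers, orders);

        // Export: write the joined rows to a temporary CSV file.
        Path target = Files.createTempFile("orders", ".csv");
        Files.write(target, csv);
        System.out.println("Wrote " + csv.size() + " lines to " + target);
    }
}
```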

Database Schema

Example

"With great power comes great responsibility." (Voltaire)

The End

email: luis@luissantos.pt

twitter: @santosluis87

linkedin: https://www.linkedin.com/in/luissantos87
