Kettle Cookbook Pentaho Community Gathering 2010

Embed Size (px)

Citation preview

  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    1/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/1

    Auto-documentationforKettle jobs and transformations

    Kettle-Cookbook

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    2/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    2

    Thanks for attending!

    Roland Bouman; Leiden, Netherlands

    Ex MySQL AB, Sun Microsystems

    Web and BI Developer

    Co-author of Pentaho Solutions

    ...and Pentaho Kettle Solutions

    Blog: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    3/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    3

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    4/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    4

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    5/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    5

    Documentation

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    6/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    6

    Documentation

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    7/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    7

    Documentation

    j

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    8/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    8

    Documentation

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    9/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    9

    Documentation

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    10/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    10

    Documentation

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    11/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandboumankettle-cookbook: http://code.google.com/p/kettle-cookbook/

    11

    Documentation Benefits

    Allows an ETL solution to be verified againstdesign documents

    If done right, can help to train developers

    Can be used to understand data lineage

    Facilitate auditing processes

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    12/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    12

    Documentation?Whaddya mean, documentation?

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    13/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    13

    WTH isn't there anydocumentation?

    Benefits are not immediate

    Not popular w/ developers

    Documentation Myths My software is self-explanatory

    Documentation is always outdated

    Who reads documentation anyway?

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    14/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    14

    Documentation Myths:My Software is self-explanatory

    I already explained, it's self-explanatory.

    Software is only self-describing in the sensethat it may be clear *what* it does.

    By itself, software cannot explain *why* itwas built this way.

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    15/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    15

    Documentation Myths:Docs are always outdated

    Yeah, documentation is always outdated.Let's blame documentation

    Documenting should be part of the

    development process You can test documentation like you can test

    software

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    16/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    16

    Documentation Myths:Who reads docs anyway?

    Is theredocumentation?

    Waddaya mean, yes?Well, *I* am not

    going to read that

    Yes

    Find an excuseto not write any

    Of course not!

    No docs to read.self-fulfilling prophecy proved true

    Well done! :)

    Start

    Who reads thatstuff anyway?

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    17/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    17

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    18/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    18

    Kettle-cookbook:What is it?

    A documentation generator for Kettle ETLsolutions

    Built in Kettle

    Inspired by Benjamin Kallman's Kettledocumentation generator (Mainz, 2008)

    Open Source (LGPL)

    Available on google code

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    19/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    19

    Kettle-cookbook:How to use

    While creating/designing, enter descriptions:

    Job and Transformation Settings

    Description

    Extended Description

    Job entry, Transformation Step:

    Description

    Run kettle-cookbook. Parameters: INPUT_DIR

    OUTPUT_DIR

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    20/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    20

    Kettle-cookbook:How it works

    Kettle job scans a directory for .ktr and .kjbfiles creating an XML index

    XSLT is applied to XML, outputs HTML

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    21/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    21

    Kettle-cookbook:Features

    Table of contents to navigate docs

    Exposes value of description fields

    Data flow Diagram Crosslinks

    Overviews: Variables, Connections, Fields

    Syntax highlighting (SQL, Javascript)

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    22/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    22

    Kettle-cookbook:Hacking and Extending

    It's built on Kettle. Change jobs andtransformations in the pdi directory to addcustom processing

    Documentation generated with XSLT. Editthe kettle-report.xslt file to add customoverviews / HTML rendering

    HTML uses externalized CSS and Javacript.Hint: you'll find it in the css and js directories

    Icons in the images directory

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    23/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    23

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration

    Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    24/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    24

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration

    Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    25/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    25

    Roadmap

    High level data flow diagrams Overviews (variables, connections) across

    ETL solution

    Replace Kettle Job with Kettle API (BenjaminKallman)

    Dependencies / where-used list

    Not just ETL, entire Pentaho Solution (Actionsequences, Mondrian Cubes, Reports)

    Data lineage

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    26/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    26

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration

    Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    27/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    27

    Agenda

    Documentation

    Introducing Kettle-Cookbook

    Demonstration

    Roadmap

    Questions and answers

    Links and resources

    http://rpbouman.blogspot.com/http://rpbouman.blogspot.com/
  • 8/2/2019 Kettle Cookbook Pentaho Community Gathering 2010

    28/28

    Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman

    kettle-cookbook: http://code.google.com/p/kettle-cookbook/

    28

    Links and resources Project: http://code.google.com/p/kettle-cookbook/

    Getting Started: see the project wiki

    Issues: http://code.google.com/p/kettle-cookbook/issues/list

    Downloads: https://code.google.com/p/kettle-cookbook/downloads/list

    Source: http://code.google.com/p/kettle-cookbook/source/checkout

    http://rpbouman.blogspot.com/http://code.google.com/p/kettle-cookbook/http://code.google.com/p/kettle-cookbook/issues/listhttps://code.google.com/p/kettle-cookbook/downloads/listhttps://code.google.com/p/kettle-cookbook/downloads/listhttp://code.google.com/p/kettle-cookbook/issues/listhttp://code.google.com/p/kettle-cookbook/http://rpbouman.blogspot.com/