Kettle-Cookbook: Auto-documentation for Kettle jobs and transformations

Roland Bouman: http://rpbouman.blogspot.com/ Twitter: @rolandbouman kettle-cookbook: http://code.google.com/p/kettle-cookbook/



  • Thanks for attending!

    ● Roland Bouman; Leiden, Netherlands
    ● Ex MySQL AB, Sun Microsystems
    ● Web and BI Developer
    ● Co-author of “Pentaho Solutions”
    ● ...and “Pentaho Kettle Solutions”
    ● Blog: http://rpbouman.blogspot.com/
    ● Twitter: @rolandbouman

  • Agenda

    ● Documentation
    ● Introducing Kettle-Cookbook
    ● Demonstration
    ● Roadmap
    ● Questions and answers
    ● Links and resources


  • Documentation


  • Documentation Benefits

    ● Allows an ETL solution to be verified against design documents
    ● If done right, can help to train developers
    ● Can be used to understand data lineage
    ● Facilitates auditing processes

  • Documentation? Whaddya mean, documentation?

  • WTH isn't there any documentation?

    ● Benefits are not immediate
    ● Not popular with developers
    ● Documentation Myths
      – My software is self-explanatory
      – Documentation is always outdated
      – Who reads documentation anyway?

  • Documentation Myths: My software is self-explanatory

    ● I already explained, it's self-explanatory.
    ● Software is only self-describing in the sense that it may be clear *what* it does.
    ● By itself, software cannot explain *why* it was built this way.

  • Documentation Myths: Docs are always outdated

    ● Yeah, documentation is always outdated. Let's blame documentation.
    ● Documenting should be part of the development process.
    ● You can test documentation like you can test software.

  • Documentation Myths: Who reads docs anyway?

    [Flowchart; node labels:]

    ● Start
    ● Who reads that stuff anyway?
    ● Is there documentation?
    ● Yes
    ● Waddaya mean, “yes”? Well, *I* am not going to read that
    ● Of course not!
    ● Find an excuse to not write any
    ● No docs to read. Self-fulfilling prophecy proved true
    ● Well done! :)


  • Kettle-cookbook: What is it?

    ● A documentation generator for Kettle ETL solutions
    ● Built in Kettle
    ● Inspired by Benjamin Kallman's Kettle documentation generator (Mainz, 2008)
    ● Open Source (LGPL)
    ● Available on Google Code

  • Kettle-cookbook: How to use

    ● While creating/designing, enter descriptions:
      – Job and Transformation Settings
        ● Description
        ● Extended Description
      – Job entry, Transformation Step:
        ● Description
    ● Run kettle-cookbook. Parameters:
      – INPUT_DIR
      – OUTPUT_DIR
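A minimal sketch of the "run kettle-cookbook" step, assuming the generator job is started with Kettle's Kitchen command-line runner and that the two parameters are passed as named parameters. The helper name `kitchen_command` and the job file path used in the usage note are illustrative assumptions; the project wiki has the actual invocation.

```python
import shlex


def kitchen_command(kitchen, job_file, input_dir, output_dir):
    """Build the argument list for running a documentation job with Kitchen.

    kitchen  -- path to the Kitchen launcher script (assumed: kitchen.sh)
    job_file -- path to the kettle-cookbook job (hypothetical, see the wiki)
    """
    return [
        kitchen,
        f"-file={job_file}",
        # INPUT_DIR: directory holding the jobs and transformations to document.
        f"-param:INPUT_DIR={input_dir}",
        # OUTPUT_DIR: directory that receives the generated documentation.
        f"-param:OUTPUT_DIR={output_dir}",
    ]


def as_shell_line(args):
    """Render the argument list as one safely quoted shell command line."""
    return shlex.join(args)
```

For example, `as_shell_line(kitchen_command("kitchen.sh", "kettle-cookbook/pdi/document-all.kjb", "/etl/src", "/etl/docs"))` yields a line that can be pasted into a shell; the job file name here is an assumption, not something stated on the slide.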

  • Kettle-cookbook: How it works

    ● Kettle job scans a directory for .ktr and .kjb files, creating an XML index
    ● XSLT is applied to XML, outputs HTML
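The scanning step can be illustrated outside Kettle. The sketch below is not kettle-cookbook's own job (that is built from Kettle steps); it only mimics the idea in Python. The element and attribute names (`index`, `file`, `type`, `path`) are made up for illustration, and recursing into subdirectories is an assumption. The XSLT step is not shown, since it needs an XSLT processor that is not part of the Python standard library.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Kettle file extensions: .ktr = transformation, .kjb = job.
KETTLE_SUFFIXES = {".ktr": "transformation", ".kjb": "job"}


def build_index(input_dir):
    """Return an XML element listing every .ktr and .kjb file under input_dir."""
    root_dir = Path(input_dir)
    index = ET.Element("index", {"dir": str(root_dir)})
    for path in sorted(root_dir.rglob("*")):
        kind = KETTLE_SUFFIXES.get(path.suffix.lower())
        if kind is None or not path.is_file():
            continue  # skip directories and non-Kettle files
        ET.SubElement(
            index,
            "file",
            {"type": kind, "path": path.relative_to(root_dir).as_posix()},
        )
    return index
```

`ET.tostring(build_index("/etl/src"), encoding="unicode")` gives the index as a string that a stylesheet could then transform.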

  • Kettle-cookbook: Features

    ● Table of contents to navigate docs
    ● Exposes value of description fields
    ● Data flow diagram
    ● Crosslinks
    ● Overviews: Variables, Connections, Fields
    ● Syntax highlighting (SQL, JavaScript)
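To show what "exposing description fields" can look like, here is a small sketch that reads the descriptions out of a transformation file with Python's standard XML parser. It assumes the usual .ktr layout (`info/name`, `info/description`, `info/extended_description`, and one `step` element per step with `name` and `description`); the sample document and the function name are invented for illustration, and this is not how kettle-cookbook itself does it (it uses Kettle plus XSLT).

```python
import xml.etree.ElementTree as ET

# Invented sample in the assumed .ktr layout.
SAMPLE_KTR = """<transformation>
  <info>
    <name>load_customers</name>
    <description>Loads the customer dimension</description>
    <extended_description>Runs nightly after the staging job.</extended_description>
  </info>
  <step>
    <name>Read staging</name>
    <type>TableInput</type>
    <description>Reads new rows from staging</description>
  </step>
  <step>
    <name>Write dimension</name>
    <type>TableOutput</type>
    <description/>
  </step>
</transformation>"""


def collect_descriptions(ktr_xml):
    """Collect transformation-level and step-level descriptions from .ktr XML."""
    root = ET.fromstring(ktr_xml)
    info = root.find("info")
    if info is None:
        info = ET.Element("info")  # tolerate files without an info block
    steps = {}
    for step in root.findall("step"):
        steps[step.findtext("name", default="")] = step.findtext("description", default="")
    return {
        "name": info.findtext("name", default=""),
        "description": info.findtext("description", default=""),
        "extended_description": info.findtext("extended_description", default=""),
        "steps": steps,
    }


def undocumented_steps(ktr_xml):
    """Return the names of steps whose description is empty or missing."""
    steps = collect_descriptions(ktr_xml)["steps"]
    return [name for name, text in steps.items() if not text.strip()]
```

A check such as `undocumented_steps(SAMPLE_KTR)` is also one way to test documentation like software: a build could fail when the list is not empty.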

  • Kettle-cookbook: Hacking and Extending

    ● It's built on Kettle. Change jobs and transformations in the pdi directory to add custom processing
    ● Documentation generated with XSLT. Edit the kettle-report.xslt file to add custom overviews / HTML rendering
    ● HTML uses externalized CSS and JavaScript. Hint: you'll find it in the css and js directories
    ● Icons in the images directory


  • Roadmap

    ● High level data flow diagrams
    ● Overviews (variables, connections) across ETL solution
    ● Replace Kettle Job with Kettle API (Benjamin Kallman)
    ● Dependencies / where-used list
    ● Not just ETL, entire Pentaho Solution (Action sequences, Mondrian Cubes, Reports)
    ● Data lineage


  • Links and resources

    ● Project: http://code.google.com/p/kettle-cookbook/
    ● Getting Started: see the project wiki
    ● Issues: http://code.google.com/p/kettle-cookbook/issues/list
    ● Downloads: https://code.google.com/p/kettle-cookbook/downloads/list
    ● Source: http://code.google.com/p/kettle-cookbook/source/checkout
