Open Source meets Business Intelligence
An Introduction to Pentaho
Seminar “Business Intelligence”, 06.02.07, Konstanz
Monika Podolecheva
Agenda
BI concept and goals
BI Market Overview
Proprietary BI Vendors
Open Source and BI Open Source Market
Open Source BI Vendors
OS SW: pros and cons
From Data gathering, Processing and Analysis to Report
JasperSoft Inc.
Pentaho Products
Pentaho Examples and Demo
BI: Concept and Objectives
BI Concept: Techniques such as data warehousing, data mining, and reporting, covering the gathering, storage, preprocessing, analysis, and reporting of data
Objectives: help companies gain a more comprehensive knowledge of the factors affecting their business and make better business decisions.
Strategic planning, deriving trends, objectives definition
Huge amount of unstructured data
BI Solutions
BI Market Segmentation and Overview
• Query, reporting, and analysis SW includes ad hoc query and multidimensional analysis tools as well as dashboards and production reporting tools. Query and reporting tools are designed specifically to support ad hoc data access and report building by either IT or business users.
• Advanced analytics software includes data mining and statistical software and uses technologies such as neural networks, rule induction, and clustering to discover relationships in data and make predictions.
Source: IDC, July 2006
BI Market Facts and Trends
• The BI market grows because applying BI tools leads to
– better market analysis
– better budget controlling
– better strategy planning
and therefore brings a higher rate of return.
• Broader adoption of BI software is expected to continue as more end users gain access to query and reporting tools and as organizations embed BI software into operational applications supporting all business processes.
• IDC shows an optimistic trend.
• The BI market is dominated by larger, full-service companies, such as IBM and Oracle, and specialized vendors, such as SAS, Cognos, Business Objects, and Hyperion.
Source: IDC, August 2006
Proprietary BI Vendors
• Arcplan
• Actuate
• Business Objects
• Cognos
• Hyperion Solutions
• Information Builders
• Microsoft
• MicroStrategy
• Oracle
• Panorama Software
• SAP
• SAS Institute, etc.
Source: Gartner (January 2007)
Open Source and BI Open Source Market
First signs that OSS is coming into the BI tools market: vendors such as Pentaho, JasperSoft, and Actuate clearly display the first signs of a potential market niche.
The impact of open source BI tools will be very limited over the next five years. During the latter part of the current 15-year cycle of the BI market, OSS may develop into a stronger competitive force (IDC, 2006), especially because of the cost of commercial tools.
Trends: OS databases are widely used
BI OS Strategy:
JasperSoft idea: BI becomes embedded in individual applications and much more transparent. By making a complex function affordable, that function becomes universal.
Open Source BI Vendors
• Pentaho – Complete solution with the Pentaho BI Suite
• Palo – MOLAP server (German vendor Jedox)
• JasperSoft – Specialist for reporting tools such as JasperReports
• BIRT – Reporting solution of the Eclipse Foundation (taken over by Actuate)
• Weka – Algorithm collection for data mining from the University of Waikato
From Data gathering, Processing and Analysis to Report (1)
• Data Gathering and Storage: the open source databases MySQL and PostgreSQL, and the Jedox database Palo
• Preprocessing: Extraction, Transformation & Loading (ETL): the tool Kettle, the CloverETL framework, and Enhydra Octopus
• Analysis:
– OLAP (Online Analytical Processing): the most popular free OLAP server is the Java-based Mondrian project
– Data mining: an algorithm collection for data mining is Weka
From Data gathering, Processing and Analysis to Report (2)
• Reporting engines such as the Java library JasperReports (JasperSoft)
• Visualizing the reports: iReport Designer (JasperSoft)
Further: combining the OLAP server JasperAnalysis and the ETL tool JasperETL results in the BI framework JasperIntelligence Suite.
• Pentaho offers ETL, analysis, reporting, and workflow solutions that can be combined in the Pentaho BI Suite. These solutions can be integrated into standalone applications under the Mozilla Public License, Version 1.1.
• Pentaho's German partner has been Ancud IT since May 2006.
OS SW: pros and cons
Advantages OSS
– no license costs
– reduced dependence on software vendors
– flexible and easier to customize
Disadvantages OSS
– lack of long-term support
– lack of long-term maintenance
JasperSoft Inc.
JasperSoft was founded in 2004 in San Francisco, CA, U.S.A., and has fewer than 50 employees. JasperSoft delivers commercial open source software in the area of Business Intelligence.
Specialist in reporting, analysis, and integration
• Open Source Products:
– JasperReports - delivers reports to the screen, printer, or into PDF, HTML, XLS, CSV, and XML files; can stand alone or be embedded directly into a user's application to give it advanced reporting capabilities
– JasperServer - interactive and managed reporting for JasperReports
– JasperAnalysis - interactive data analysis / OLAP server
– JasperETL - high-performance data integration
– iReport - powerful graphical report designer
• Commercial line: JasperDecisions
Pentaho
Pentaho was founded in 2004 in Orlando, U.S.A.
Pentaho manages, facilitates, supports, and takes the lead development role in the Pentaho BI Project - a pioneering initiative by the Open Source development community to provide organizations with a comprehensive set of Business Intelligence (BI) capabilities that enable them to radically improve business performance, efficiency, and effectiveness.
Experienced team:
Founded by industry veterans with a track record of delivering successful BI products for leading commercial vendors including Business Objects, Cognos, Hyperion, IBM, Oracle, and SAS.
Pentaho Products: Reporting
Pentaho Reporting (formerly known as JFreeReport) allows organizations to easily access, format, and distribute information to employees, customers, and partners.
Pentaho Reporting has the following features:
– Full on-screen print preview
– Output to the screen, printer, or various export formats: PDF, HTML, CSV, Excel
– Support for servlets (uses the JFreeReport extensions)
– Complete source code included (subject to the GNU LGPL)
– Extensive source code documentation
– Unmatched flexibility through a heavily modularized architecture, etc.
Pentaho Products: Analysis
Pentaho Analysis (the Mondrian project) helps organizations operate with maximum effectiveness by gaining the insights and understanding needed to make optimal decisions. Mondrian is an OLAP server that enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
Pentaho Analysis has the following features:
– Integrates directly with Microsoft Excel PivotTable services, supporting data refresh, drill-down, data pivoting and more
– Provides an easy, interactive way for business users to analyze critical business information, by exploring the data to quickly uncover trends or anomalies. For example, a user looking at sales information for last year could easily “drill down” from the yearly summary to break out sales by quarter, compare sales across product lines, or analyze specific sales performance in different geographic regions
– Allows users to enhance the data with Excel formatting, Excel charts, etc.
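The drill-down described above can be illustrated with a minimal sketch in plain Python. This is not Mondrian's API; the `SALES` rows, the `FIELDS` names, and the `roll_up` helper are hypothetical and only show the idea of summarizing at one level and then breaking the same figures out at a finer level.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, product_line, amount).
SALES = [
    (2006, "Q1", "Hardware", 120.0),
    (2006, "Q1", "Software", 80.0),
    (2006, "Q2", "Hardware", 150.0),
    (2006, "Q3", "Software", 90.0),
    (2006, "Q4", "Hardware", 60.0),
    (2005, "Q4", "Hardware", 200.0),
]

# Names of the dimension columns in each fact row (assumed layout).
FIELDS = ("year", "quarter", "product_line")


def roll_up(rows, levels, **filters):
    """Sum amounts grouped by `levels`, keeping only rows matching `filters`."""
    totals = defaultdict(float)
    for row in rows:
        record = dict(zip(FIELDS, row[:3]))
        if all(record[name] == value for name, value in filters.items()):
            key = tuple(record[level] for level in levels)
            totals[key] += row[3]
    return dict(totals)


# Yearly summary, then drill down into 2006 by quarter and by product line.
yearly = roll_up(SALES, ["year"])
by_quarter = roll_up(SALES, ["year", "quarter"], year=2006)
by_line = roll_up(SALES, ["product_line"], year=2006)
```

An OLAP server answers the same kind of question against a SQL database, so the business user never writes the grouping logic by hand.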
Pentaho Products: Data Integration
Pentaho Data Integration (the Kettle project) delivers powerful Extraction, Transformation and Loading (ETL) capabilities.
Pentaho Data Integration is used for:
– Data warehouse population with built-in support for slowly changing dimensions, junk dimensions and much, much more.
– Export of database(s) to text file(s) or other databases
– Import of data into databases, ranging from text files to Excel sheets
– Data migration between database applications
– Exploration of data in existing databases (tables, views, synonyms, ...)
– Information enrichment by looking up data in various information stores (databases, text files, Excel sheets, ...)
– Data cleaning by applying complex conditions in data transformations
– Application integration
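The extract, transform, and load steps listed above can be sketched with the Python standard library alone. This is an illustrative assumption, not how Kettle is configured or called: the `RAW_CSV` sample, the `sales` table, and the three helper functions are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a text file or spreadsheet export.
RAW_CSV = """customer,country,revenue
 Alice ,de,100.50
Bob,DE,n/a
Carol,us,75
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim names, normalize country codes, drop rows with bad revenue."""
    clean = []
    for row in rows:
        try:
            revenue = float(row["revenue"])
        except ValueError:
            continue  # data cleaning: skip rows whose revenue is not numeric
        clean.append((row["customer"].strip(), row["country"].strip().upper(), revenue))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, country TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, country, revenue FROM sales ORDER BY customer"
).fetchall()
```

A dedicated ETL tool provides the same pipeline shape through graphical transformations, with built-in connectors in place of the hand-written helpers.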
Pentaho Products: Dashboards
Pentaho Dashboards provide immediate insight into individual, departmental, or enterprise performance and give business users the critical information for understanding and improving organizational performance.
Pentaho Products: Data Mining
Pentaho Data Mining
WEKA is integrated in Pentaho Data Mining and provides the BI suite with the following set of machine learning algorithms:
– Clustering
– Neural networks
– Decision trees, etc.
– Graphical data mining design and administration tools are integrated
– Graphical user interfaces are provided for data pre-processing, classification, regression, clustering, association rules, and visualization.
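To give a feel for what a clustering algorithm does, here is a minimal one-dimensional k-means sketch in plain Python. It does not use WEKA and is not WEKA's implementation; the `ORDER_VALUES` data, the starting centroids, and the `k_means_1d` function are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical order values that fall into a "small" and a "large" group.
ORDER_VALUES = [10, 12, 11, 95, 100, 105]
centroids, clusters = k_means_1d(ORDER_VALUES, centroids=[0, 50])
```

With this input the algorithm separates the small orders from the large ones without being told where the boundary lies, which is the kind of relationship discovery the slide refers to.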
References
Worldwide Business Intelligence Tools 2005 Vendor Shares, July 2006. Source: http://www.sas.com/news/analysts/idc_bi_0706.pdf
Worldwide Business Intelligence Tools 2005 Vendor Shares, October 2006. Source: http://www.sas.com/news/analysts/idc_analytics2_1006.pdf
Alexandra Kleijn: “Business Intelligence mit Open Source”, Heise Open, 2006. Source: http://www.heise.de/open/artikel/73725
Martin LaMonica: “Open source meets business intelligence”, CNET News.com, April 23, 2006. Source: http://news.com.com/2100-7344_3-6064045.html
http://www.JasperSoft.org
http://www.pentaho.com