7/29/2019 Open Source Business Intelligence Tools
1/33
Open Source Business Intelligence Tools
Alex Meadows
TriLUG, January 2012
Agenda
Business Intelligence Overview
Review of OSBI Tools
Data Warehousing
Data Integration
Reporting/OLAP
Visualization
Statistical Analysis/Predictive Analytics
What Is Business Intelligence?
Utilizing technology to identify and analyze trends in data to make
better business decisions.
Source: Back In Business, Klimberg, Miori (www.informs.org)
Overlapping Fields
Source: Competing on Analytics; Thomas Davenport, Jeanne Harris
Competing On Analytics
Phases of Growth
The Three Types of Questions
What happened?
How was performance last week?
What is currently happening?
How is performance right now?
What will happen?
What can I do to reach our goals?
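The three question types can be illustrated with a toy series. This is a minimal sketch, not part of the original talk: the weekly figures, the function names, and the straight-line forecast are all assumptions chosen only to show the difference between looking back, looking at the present, and projecting forward.

```python
# Hypothetical data: sales totals for completed weeks, oldest first.
weekly_sales = [100.0, 110.0, 120.0, 130.0]
# Hypothetical running total for the week still in progress.
current_week_so_far = 45.0


def what_happened(history):
    """Historical question: how was performance last week?"""
    return history[-1]


def what_is_happening(so_far):
    """Present question: how is performance right now?"""
    return so_far


def what_will_happen(history):
    """Predictive question: fit a least-squares line and extend it one week."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # The next week sits at x = n on the fitted line.
    return intercept + slope * n


print(what_happened(weekly_sales))
print(what_is_happening(current_week_so_far))
print(what_will_happen(weekly_sales))
```

A straight line is the simplest possible predictor; real predictive work would use the statistical tools covered later in the deck.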
Data Warehousing
Store data outside of the application/normal business environment (e.g. ERP systems)
Specific for reporting/analytics
Modeling Styles
3NF (normal database modeling)
Data Marts (aka star schemas)
Data Vault (hybrid 3NF/Data Mart)
Anchor Modeling (6NF)
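To make the data mart (star schema) style concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical and not from the talk; the point is only the shape: one central fact table of measures, joined to small dimension tables that describe them.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, day TEXT, month TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER,
            amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, "2012-01-09", "2012-01"), (2, "2012-02-13", "2012-02")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 3, 30.0), (1, 2, 1, 12.5), (2, 1, 2, 20.0)],
    )
    return conn


def sales_by_category(conn):
    """Typical reporting query: aggregate the facts, grouped by a dimension."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
    return dict(rows)


print(sales_by_category(build_star_schema()))
```

A 3NF model would split these tables further to remove redundancy; the star layout trades that for simpler, faster reporting queries.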
Data Warehousing
Databases
MySQL, Postgres, etc.
Columnar Data Stores
Infobright*, LucidDB, InfiniDB*, etc.
Hybrid Data Warehouse Databases
Greenplum* (both RDBMS and Columnar)
NoSQL
Hadoop, CouchDB, MongoDB, etc.
*Hardware and/or Software limitations in community editions
RDBMS vs Columnar
Source: http://www.calpont.com/column-oriented-database-bi
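The difference between the two layouts can be sketched in plain Python. This is an illustration under stated assumptions, not how any of the listed databases is implemented: the sample records and function names are hypothetical. A row store keeps each record together, while a column store keeps each field together, so an aggregate over one field only has to read that field's values.

```python
# Row-oriented layout: one complete record per entry (hypothetical data).
rows = [
    {"order_id": 1, "region": "East", "amount": 30.0},
    {"order_id": 2, "region": "West", "amount": 12.5},
    {"order_id": 3, "region": "East", "amount": 20.0},
]


def to_columns(records):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(records, field):
    """Every whole record is visited just to read a single field."""
    return sum(record[field] for record in records)


def total_column_store(columns, field):
    """Only the one column needed for the aggregate is read."""
    return sum(columns[field])


columns = to_columns(rows)
print(total_row_store(rows, "amount"))
print(total_column_store(columns, "amount"))
```

Both functions return the same total; the column layout simply touches less data for analytic queries, which is why it suits reporting workloads.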
NoSQL?
Not Only SQL
Unstructured/semi-structured data
Huge data sets (multi-terabyte to petabyte+)
Source: http://www.information-management.com/specialreports/20040622/1005301-1.html
Data Integration
Syncing data across systems
Includes:
ETL (Extract, Transform, Load)
MDM (Master Data Management)
EAI (Enterprise Application Integration)
EII (Enterprise Information Integration)
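Of the four integration styles listed, ETL is the easiest to show in a few lines. The following is a minimal sketch using only the Python standard library; the CSV content, the `payments` table, and the cleaning rules are hypothetical, and tools such as Talend or Kettle do far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with messy values and one incomplete row.
SOURCE_CSV = """customer,amount,currency
 alice ,10.50,usd
BOB,4.00,USD
carol,,usd
"""


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop rows without an amount and normalize the rest."""
    cleaned = []
    for record in records:
        if not record["amount"]:
            continue  # incomplete row, skip it
        cleaned.append(
            (
                record["customer"].strip().lower(),
                float(record["amount"]),
                record["currency"].upper(),
            )
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(customer TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    """Run the three steps against an in-memory target database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


conn = run_etl(SOURCE_CSV)
print(conn.execute("SELECT * FROM payments ORDER BY customer").fetchall())
```

Keeping the three steps as separate functions mirrors how ETL tools let each stage be swapped or tested on its own.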
Talend
Data Management Tool Suite
ETL
MDM
Data Profiling
Data Quality
Code generator
Eclipse based
Extensible plugin architecture
Pentaho K.E.T.T.L.E.
Kettle Extraction, Transport, Transformation, and Loading Environment
Focus on ETL
Extensible plugin architecture
Engine based
Reporting
Focus: Historical Analysis
Reporting Options
Features compared: MDX, PivotTable, Charting, SQL, Other Sources*, DrillThrough, Parameterized
Tools compared: BIRT, Pentaho, JasperReports, SQL Power Wabit, Saiku
*Flat Files, NoSQL, etc.
BIRT Example
Visualization
Focus: Trending and Present
Pentaho CDE/CDF
Dashboard framework and editor built into Pentaho BI Server
Community developed; uses open web languages (JavaScript, HTML, etc.)
Statistics/Predictive Analytics
Focus: All relevant data used to predict outcomes
Statistics/Predictive Analytics
R: stats oriented
Weka: machine learning oriented
RapidMiner: mixed
Originally YALE
Weka and R Plugins
Like SAS Enterprise Miner
BI From Reporting to Statistical Analysis
Capabilities compared: ETL, Metadata, Reporting, Dashboards, OLAP***, Statistics, Automated Decisions
Suites compared (footnote markers as extracted): Jaspersoft *, Pentaho **, SpagoBI * * **
* Utilizes Talend ETL
** Utilizes Weka Data Mining
*** All use Mondrian for OLAP, with different front ends
Shameless Plug
RTP Pentaho User Group On LinkedIn (soon to be also on Meetup)
Meets quarterly