8/6/2019 Pentaho_Pentaho Agile BI
http://slidepdf.com/reader/full/pentahopentaho-agile-bi 1/12
Copyright © 2007-2010 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the
latest information, please visit our web site at www.pentaho.com.
Pentaho Agile BI™: An iterative methodology
for flexible, fast and cost-effective BI projects
James Dixon
Chief Geek, Pentaho
November 2010
Contents
Introduction
The Challenges of Traditional Business Intelligence
    Moore’s Law
    Cloud Computing
    Fuzzy Return
    Lack of Shared Vision
    Development Latency
    Top-Down Deficiencies
    Bottom-Up Deficiencies
    No ‘Small’ BI Projects
    The Prototyping Costs
    Abandonment
    Summary of the Problems
The Agile Approach to Business Intelligence
    What Do We Mean by Agile BI?
    Agile and Lean Principles
    Lean Delivery
    Agile Teams
    Agile Hardware
    Agile Software
Pentaho’s Agile BI Initiative
    Tools
    Deployment Options
    Agile Behavior
    Agile BI Use Cases
The Boundaries of Agile BI
Summary
    Download and Contact Information
    References
Introduction
At Pentaho we believe that the old technologies, the old pricing, and the old approaches used for Business
Intelligence products are not well suited to today’s environment.
This white paper introduces Pentaho’s Agile BI initiative, which encompasses:
• Technology: Provides integrated design, modeling, and visualization tools.
• Participants: Expands the BI developer base.
• Processes: Enables new behaviors and new BI use cases.
• Deployment: Enables migration between desktop, public/private clouds, and on-premise
environments.
• Economics: Reduces the overall costs and allows incremental spending as value is realized.
Pentaho’s Agile BI, by changing the technical, operational, and economic factors of BI, enables new
behaviors by all participants in BI projects, thereby increasing the number of successful BI projects, and
reducing the proliferation of spreadmarts.
The Challenges of Traditional Business Intelligence
The Business Intelligence (BI) market is faced with many factors that are bound to change it.
Spreadsheets are widely recognized as the most commonly used BI tool. These BI spreadsheets are known
as ‘spreadmarts’. However, these spreadmart solutions have many issues of their own, including security,
data quality, consistency, scalability, maintenance costs and lack of many important BI features. Despite
these downsides they exist because many of the tools and techniques that are designed specifically for BI
do not provide a better alternative: each comes with its own problems. The result is that BI projects often
fail. They are abandoned before they are started, abandoned during development, or never used because
they do not deliver the features or value that users expect.
The sections below describe the problems faced by BI projects and tools.
Moore’s Law
Moore’s Law observes that the number of transistors on a computer chip doubles roughly every two years. Data warehouses were
first invented in the mid-1980s, and only the biggest companies could afford them. During that era the
commodity chip, the Intel 386, had 275k transistors. Today the equivalent commodity chip, the Core
2 Duo, has 291 million transistors. In the roughly 25 years since the invention of data warehouses, computing
power has increased by a factor of about 1000. It is only natural that, as computing power increases, systems
that were previously expensive become cheaper, and eventually a commodity. The Business Intelligence
market will, naturally, be affected by this trend: companies can create BI solutions that they could not
afford before, and individual users have equipment capable of running basic BI solutions.
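The factor of roughly 1000 can be checked from the two transistor counts quoted above. The following is a minimal sketch in Python; the function names are illustrative only, and transistor count is used here as a rough stand-in for computing power.

```python
import math

# Transistor counts quoted in the text above.
I386_TRANSISTORS = 275_000
CORE2_DUO_TRANSISTORS = 291_000_000


def growth_factor(old: int, new: int) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


def doublings(factor: float) -> float:
    """Return the number of doublings needed to grow by `factor`."""
    return math.log2(factor)


factor = growth_factor(I386_TRANSISTORS, CORE2_DUO_TRANSISTORS)
print(f"growth factor: {factor:.0f}")        # about 1058
print(f"doublings: {doublings(factor):.2f}")  # just over 10
```

Ten doublings give 2^10 = 1024, which is where the "factor of 1000" comes from.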
Cloud Computing
The emergence of public computing clouds, such as Amazon EC2, and on-premise clouds, such as
Eucalyptus, has the potential to affect the BI industry. Utility pricing and the ability to create an instance
of a BI server quickly and cheaply are very powerful.
Fuzzy Return
While the benefit (or return) of a completed BI project is often difficult or impossible to quantify, it is
relatively easy to make qualitative statements such as: “We will be able to make quicker decisions that
will help reduce project costs,” or: “We will be able to make better marketing decisions that will increase
sales.” Making a quantitative statement like: “We will cut project costs by 15% by making quicker
decisions” is much more difficult because the return on investment (ROI) is dependent on the as-yet
unknown return on the BI project. With an uncertain ROI, an appropriate level of investment also
becomes difficult to estimate. With BI tools that have large up-front costs, this problem becomes even
worse because there will be no significant investment unless there is some expectation of a large return.
As a consequence, many BI projects are never started.
Lack of Shared Vision
Many users cannot completely envision the end result that is being developed. Frequently, when BI users
first get access to a new system, they will immediately perceive a whole new set of requirements that
they had not realized before. Unfortunately they often cannot provide feedback on the requirements,
design, or value of a BI system until they see actual results with real data. As a result BI developers often
work with initial requirements that are either inaccurate or incomplete.
In addition, most users don’t understand the terminology used to describe the planned BI system. This
makes it even harder to create a shared vision.
Development Latency
During the execution of a BI project, there should be checkpoints to gather user feedback. This feedback
should be used to validate that the system being developed meets the business expectations. In many BI
projects, the time between checkpoints is too long, increasing the risk and likelihood of failure. The
problem deepens when time or resource constraints prevent the feedback from being incorporated into
the solution.
Top-Down Deficiencies
In a top-down approach to a BI project, you start by gathering requirements, then you design a system
implied by those requirements, and then you implement that system. The problem with this approach is
that, due to communication and vision gaps, it is likely that your initial requirements are incomplete,
resulting in a project that falls short of expectations. Additionally, you’ll spend considerable time and
money before this fact becomes evident. If you can’t begin with a set of requirements that is clear,
accurate, complete, relevant, timely, understood and trusted, a top-down approach is very risky.
Bottom-Up Deficiencies
In a bottom-up approach to a BI project, you start by providing a BI solution for a source system (ERP,
CRM, etc.) or a data source without much regard for user requirements. By providing reports, dashboards,
trending, summaries, and slice-and-dice functionality for a source system, you are likely to meet at least
some of your users’ requirements. The problem with this approach is that the answers to users’ biggest
problems might be outside of the available data. A little enrichment of the data might add significant
value. As with the top-down approach, you will spend considerable time and money before you discover
this shortcoming.
No ‘Small’ BI Projects
Many BI experts recommend that BI teams “Start small, but think big.” They recommend starting with a
small project to get some success and momentum, and then continue to bigger and bigger things.
However, even starting small can be hard when, to make any progress, you need the time and skills of
sponsors, end users, IT developers, consultants, business analysts, and DBAs. In many cases it takes a
strategic initiative or a mandate from management to get a cross-functional group like this to work on a
project together. Under typical workloads and business pressures it is hard to get participation from all
the necessary groups. They may also believe that the duration of the potential BI project is too long, thereby
reducing the benefit of completing it.
In addition to the requisite human capital, the hardware and software costs increase the size of the BI
project’s initial investment. Ideally you should collect feedback from a large user population during a BI
project, but software that is licensed on a per-user basis may prohibit this.
The Prototyping Costs
Given the problems above, it seems sensible to perform a prototype, pilot, or feasibility study before
starting a BI project. This way, users will have the opportunity to provide concrete feedback about the
solution and its benefits. Indeed, many BI experts recommend using 5-10% of the project’s budget to
create a prototype. Prototyping is valuable because it provides the opportunity to perform a second
iteration of the requirements and design of the system before building it for production.
Prototyping works well when you have a large budget, but when the budget for a BI project is small, there
is a problem – 5% of a small budget is a very small budget.
Unfortunately, many of the BI tools available today are expensive and licensed conservatively, making
them too costly to use for prototyping without violating their license agreements. To help alleviate this
problem, some BI vendors provide pre-sales support to jumpstart the project. However, involving a
vendor at this phase often interferes with the flexibility and scheduling of the project.
For many BI projects any significant spending at this early stage is enough to put the project on hold.
Aside from licensing and pricing issues, many of the required BI tools are not designed to be used for
lightweight and quick prototyping.
Abandonment
In reality some BI solutions fall into disuse over time. Sometimes this happens quickly, other times slowly.
There are numerous reasons for this, not all of which are necessarily bad, and include:
• The expected benefit wasn’t delivered.
• The insights provided by the BI solution shift focus from discovering issues to solving them.
• It quickly becomes apparent that operational changes are needed to fix data quality issues (e.g.
incomplete data in critical elements such as ‘Reason Account Closed’).
• Corporate priorities or departmental goals change.
Investing time and money building BI solutions that have an uncertain longevity is obviously risky.
Summary of the Problems
There are multiple problems encountered by the traditional approach to BI projects. These problems can
be grouped into categories:
• People and skills required: Many projects never start because the number and diversity of the people
required is too great.
• Lack of iterations: Many projects fail because the initial prototype, if done at all, is the only iteration of
requirements and design.
• Suitability of the tools: The usability and productivity of the existing BI tools are impediments for many
BI projects, as are the hardware requirements for the combined toolset.
• Costs: The pricing and licensing of BI software and the cost of the necessary hardware increase the risk
of undertaking a BI project.
The Agile Approach to Business Intelligence
As discussed above there are problems in BI projects related to people, processes, software/hardware,
and costs. Any solution to these problems should address all of these areas. We at Pentaho believe that
Agile BI achieves this.
What Do We Mean by Agile BI?
The word “agile” is used as a buzzword in many contexts and in different ways. We are using the word in
its traditional definition: the ability to move quickly and easily, in a nimble and well-coordinated way.
So, by Agile BI, we mean ‘the ability to create BI solutions quickly and easily, in a nimble and
well-coordinated way’.
Using an agile approach improves the success of BI projects, and enables you to start more projects. It
does this by changing the economics, the technical solution, and the execution of the projects.
Agile and Lean Principles
In recent years organizations have been increasingly using agile and lean software development
methodologies and tools. This rise in popularity is spurring the adoption of agile philosophies in other
domains.
Adapting the principles of the Agile Manifesto to work with BI leads to these:
• Satisfy the customer through early and continuous delivery of valuable data and features.
• Welcome changing requirements, even late in development.
• Deliver a working solution frequently and measure progress by this.
• Foster a closer working relationship between businesspeople and developers throughout the project.
• Build projects around motivated and knowledgeable individuals.
• Decide late, deliver fast.
The frequent delivery of a working solution will obviously solve some of the problems BI projects face:
• Communication and vision gaps will be reduced in each iteration as end users see the working results.
• Development latency will be significantly reduced.
• Shortcomings of the top-down or bottom-up approach will be alleviated as rapid iterations allow a
hybrid approach that combines or alternates them.
Lean Delivery
You can reduce development tasks and costs by using the “decide late” principle. By treating the first
delivery of a BI solution as temporary until proven otherwise, you avoid extra work and cost. Some
examples of savings are:
• Use manual flat-file extracts from source systems instead of fully-automated data flows.
• Extract a partial (but still useful) set of data. The data can be limited by a time range or can be
restricted to a subset of a geographical, organizational or other dimension. Make sure the extracted
data is fully useful to a subset of users, not partially useful to all of them.
• Transform the data into simple fact tables instead of star, snowflake or other complex data schemas.
• Install the solution on existing hardware, or cloud-based hardware.
• Use open source databases, middleware, and front-end software instead of proprietary software.
• Don’t bother with automation, auditing, production controls, etc.
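The first few savings in the list above (a manual flat-file extract, a partial data set, and a simple fact table) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data, the column names (order_date, region, product, amount), and the function name are hypothetical, and a real extract would be read from a file rather than an inline string.

```python
import csv
import io
from datetime import date

# Hypothetical flat-file extract from a source system (inline for illustration).
SAMPLE_EXTRACT = """order_date,region,product,amount
2010-01-15,EMEA,Widget,120.50
2010-02-03,APAC,Widget,80.00
2010-02-20,EMEA,Gadget,45.25
2009-12-30,EMEA,Widget,60.00
"""


def partial_extract(rows, start, end, region):
    """Keep only rows dated within [start, end] for a single region.

    The result is a flat list of typed rows: a simple fact table that is
    fully useful to one subset of users (here, one region in one year).
    """
    kept = []
    for row in rows:
        order_date = date.fromisoformat(row["order_date"])
        if start <= order_date <= end and row["region"] == region:
            kept.append(
                {
                    "order_date": order_date,
                    "region": row["region"],
                    "product": row["product"],
                    "amount": float(row["amount"]),
                }
            )
    return kept


rows = csv.DictReader(io.StringIO(SAMPLE_EXTRACT))
fact_sales = partial_extract(rows, date(2010, 1, 1), date(2010, 12, 31), "EMEA")
total = sum(r["amount"] for r in fact_sales)
print(len(fact_sales), "rows, total amount", total)
```

Restricting by both time range and region keeps the extract small while still giving the EMEA users a complete view of their own data.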
Monitor the usage of the system for a month or two. Only if the system is still being used frequently after
this period of time should you automate the data transformations, increase the scale of the data, optimize
the performance, provision hardware, switch software and/or implement production controls and
automation. Some organizations invest in these ‘institutionalization’ levels in phases that can span a year.
This is not a case of trying to ignore, or hide, the long term costs of successful BI projects. It is a way to
invest in BI projects incrementally as their value becomes proven.
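The "monitor first, invest later" rule described above can be expressed as a simple check over an access log. The sketch below is an assumption-laden illustration: the function name, the 60-day window, and the threshold of 20 active days are hypothetical values chosen for the example, not figures from this paper.

```python
from datetime import date, timedelta


def should_institutionalize(access_dates, today, window_days=60, min_active_days=20):
    """Return True if the solution was used on enough distinct days recently.

    window_days and min_active_days are hypothetical thresholds; each
    organization would choose its own definition of "used frequently".
    """
    window_start = today - timedelta(days=window_days)
    active_days = {d for d in access_dates if window_start < d <= today}
    return len(active_days) >= min_active_days


today = date(2010, 11, 30)
daily_use = [today - timedelta(days=i) for i in range(30)]
sparse_use = [today - timedelta(days=i) for i in (1, 20, 45)]

print(should_institutionalize(daily_use, today))   # used on 30 recent days
print(should_institutionalize(sparse_use, today))  # used on only 3 recent days
```

Counting distinct active days, rather than raw hits, avoids one heavy user on one day making a solution look widely adopted.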
The advantages of an agile approach can be applied to different aspects of a BI project:
• Agile BI can be used to develop a straightforward BI solution in its entirety
• Agile BI can be used to develop the requirements for a large scale project
• Agile BI can be used to investigate data quality or data integration issues
Agile Teams
An agile BI team is typically made up of 4-5 people, each filling one of these roles: IT Developer,
Project Manager, BI Consultant, End User, Business Analyst, or Database Administrator. Any of
these people is capable of starting a project on their own.
Many spreadmarts in existence today are complicated and intricate. Most have been constructed by
end-users because an officially sanctioned BI solution is neither available nor planned. This shows that there is
a population of technically-oriented end-users who are willing and able to create BI solutions. Having
these individuals on the team and giving them tools that enable them to experiment will help BI projects
significantly.
Ideally the team should be based in the same location, and if they can work in the same room most of the
time, that’s even better. Regardless of location, the team should be provided with tools to help them
collaborate, such as forums, mailing lists, wikis, and a document/content management system.
Agile Hardware
If you need to acquire computing hardware before a BI project can begin, you can run into trouble. In
some cases it delays the start of a project, in other cases it is a contributing factor in a project’s
cancellation.
To get a project going quickly, or enable a prototype to be conducted cheaply, you’ll find it advantageous
to use one of the following:
• User hardware: Using existing desktops, workstations, or laptops means no procurement delays or
budget spending. A desktop environment is great for a business analyst or technically-oriented
end-user to get started on a project.
• Cloud computing: Cloud computing quickly and cheaply makes a BI solution available to a distributed
group of people. This includes both public clouds like Amazon EC2 and private clouds like Eucalyptus.
In some cases users’ hardware is locked down and only certain applications are available to them, such as
office productivity, email, web, and corporate applications. In these cases cloud computing gives
technically-oriented end-users a new option.
In most cases a BI solution will go into production on dedicated, on-premise hardware. But prototyping
and development can be done on desktop machines and cloud environments. The ability to migrate easily
from user hardware to cloud environments, and cloud environments to static deployments further
increases the productivity of the team and the flexibility of the project.
For these hardware options to be viable, the BI software must be suitable (in terms of licensing and
hardware requirements) for all those environments. The software must scale up to meet the demands of
the production deployment, but it must also scale down onto laptops and utility hardware.
Agile Software
An agile approach works best when iterations of the BI solution are frequently delivered to a group of
end-users, who provide valuable feedback and changing requirements based on the progress so far.
This implies some requirements on the software used. The BI software used should:
• Support quick iterations: Iterations will take longer if the tools are cumbersome, hard to use, or do not
work well together.
• Offer full BI capabilities: Even the quickest prototype or iteration is likely to involve data
transformation, data quality, modeling, visualization, and content creation.
• Make basic features easy to use: The software should enable technically-oriented end-users to
participate in or initiate development of a BI solution.
• Allow delivery to a large audience: Valuable feedback will be lost if the licensing of the software
restricts the potential pool of end-users providing feedback. For this reason you should avoid software
that is licensed per-user.
• Allow prototyping: The ability to perform prototypes or pilot projects at will, without the hindrance of
software licensing issues, enables many more BI projects to be considered for development.
Pentaho’s Agile BI Initiative
In 2009 Pentaho started an Agile BI initiative: http://www.pentaho.com/agile_bi/
Along with the release of this paper Pentaho is launching the first toolset with all the deployment and
pricing options needed for Agile BI. This is the first version of these tools, complete with integrated design
tools and utility pricing.
Tools
• An integrated ETL, modeling and design environment, and a BI server:
• Unlimited, perpetual, free-use options: Free desktop design environment. Open source BI server.
• Enterprise options: Enterprise repository for security, collaboration, and versioning. Enterprise ETL
server. Enterprise BI server with enhanced end-user web-based design tools. Support and
training.
The features in these tools include:
• Data Integration: extract, transform, and load (ETL) capabilities to integrate data from disparate
sources into data marts and data warehouses
• Reporting: pixel perfect or ad hoc reporting either directly against source systems or using a centralized
BI metadata layer
• Analysis: interactive data analysis using a relational OLAP (ROLAP) architecture that delivers high
performance for business users even in large-data environments
• Dashboards: integrated views of key business metrics using reports, charts, dials, maps, or other visual
display techniques
• Predictive Analytics: advanced data analysis designed to uncover hidden patterns in data and to
support predictive analytics
• BI Server: the supporting infrastructure for Pentaho’s end user BI capabilities which includes services
for scheduling, distribution, metadata, security, portal integration, and more
Deployment Options
Design tools and servers are cross-platform: Windows, Linux, OS X, Solaris. All tools and servers can be
run on commodity laptops.
Enterprise options are available on-premise, hosted, or cloud-based with utility pricing.
Agile Behavior
Specifically, these tools, deployment options, and pricing options allow BI practitioners to behave in new
ways:
• A BI project can be started by a single end-user, business analyst, IT developer, database administrator,
or consultant.
• Different participants can be engaged sequentially, not simultaneously. An end-user, business analyst,
or consultant can create a BI project, then the IT group can institutionalize it over time, as its usage
dictates.
• A BI project can be developed on a laptop, on a hosted service, in the cloud, or in a data center. The
project can be easily moved among these environments.
• A prototype can be completed for less than a few hundred dollars cash outlay, or no cash outlay.
• Spreadmart developers become BI developers, and have the advantages of both: control, flexibility,
self-sufficiency, scalability, security, and reliability.
Agile BI Use Cases
Agile BI can be used in different scenarios. These are some examples of using Agile BI for projects that are
driven by IT.
• Fast Track: Take your most important BI project, and be agile with it. Create a prototype using existing
or cloud hardware. Iterate quickly: weekly, daily, or even hourly. Provide access to it for a large user
community. Enable the users to communicate and collaborate together. Iterate until they are happy
with the data and the content. Only at this point should you decide whether to bring the project
on-premise or not. Don’t fully institutionalize the project until after 6 months of consistent usage have
passed.
• Backlog Shotgun: Perform a quick bottom-up iteration of all your backlogged BI projects. Use cloud
computing for the hardware. Always use real data: end-users cannot get excited by fake data, nor can
they find data quality issues that exist with the real data. Let your users explore the solutions for a few
weeks, then let them decide which ones to develop further. In each iteration take their top requests
and implement them in no more than 4 weeks. See which projects get traction and which ones fade
away: institutionalize the successful ones.
• Data Quality Hunt: Provide bottom-up solutions of your operational systems to let the users determine
where interesting data fields are not consistently populated. Alter the application logic or operational
procedures so that those fields become suitable for future analysis.
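For the Data Quality Hunt, a quick way to see which fields are not consistently populated is to compute a fill rate per field. The sketch below assumes records are available as dictionaries; the sample accounts, the field names (echoing the ‘Reason Account Closed’ example earlier in this paper), and the 90% threshold are all hypothetical.

```python
def is_populated(value):
    """Treat None and blank strings as missing; anything else counts as populated."""
    return value is not None and str(value).strip() != ""


def fill_rates(records, fields):
    """Return the fraction of records in which each field is populated."""
    total = len(records)
    return {
        field: (sum(is_populated(rec.get(field)) for rec in records) / total
                if total else 0.0)
        for field in fields
    }


# Hypothetical records from an operational system.
ACCOUNTS = [
    {"account_id": "A1", "status": "closed", "reason_account_closed": "Moved"},
    {"account_id": "A2", "status": "closed", "reason_account_closed": ""},
    {"account_id": "A3", "status": "closed", "reason_account_closed": None},
    {"account_id": "A4", "status": "closed"},
]

rates = fill_rates(ACCOUNTS, ["account_id", "status", "reason_account_closed"])
THRESHOLD = 0.9  # hypothetical cut-off for "consistently populated"
flagged = [field for field, rate in rates.items() if rate < THRESHOLD]
print(rates)
print("fields needing operational fixes:", flagged)
```

Fields that fall below the threshold are candidates for the application-logic or procedural changes described above.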
These are some examples of using Agile BI for projects that are driven by end users.
• Spreadmart Conversion: Find your spreadmart authors, provide them the tools to turn their
spreadmarts into scalable, secure, centralized solutions, and give them the ability to enhance and
develop those solutions further. The central IT group can provide access to a ‘dimension store’ which
contains standard hierarchies for the organization’s main dimensions (products, geography, business
units etc). Providing ways for developers to check the consistency of their data with these standard
dimensions will improve quality and consistency, and lower integration costs.
• Scratch Space: Provide some on-premise or cloud-based hardware to your technical end-users and let
them create their own prototypes and solutions. Monitor them to see which are used frequently. Turn
these into supported solutions.
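The consistency check against a ‘dimension store’ mentioned under Spreadmart Conversion could look something like the following. This is a sketch only: the geography hierarchy, the row layout, and the function name are hypothetical, and a real dimension store would be served from a shared database rather than a Python dictionary.

```python
# Hypothetical standard geography hierarchy published by central IT:
# region -> set of countries that belong to it.
STANDARD_GEOGRAPHY = {
    "EMEA": {"France", "Germany"},
    "APAC": {"Japan", "Australia"},
}


def check_against_dimension(rows, dimension):
    """Return the rows whose (region, country) pair is not in the standard hierarchy."""
    mismatches = []
    for row in rows:
        countries = dimension.get(row["region"])
        if countries is None or row["country"] not in countries:
            mismatches.append(row)
    return mismatches


# Hypothetical rows taken from a spreadmart.
spreadmart_rows = [
    {"region": "EMEA", "country": "France", "sales": 100},
    {"region": "EMEA", "country": "Japan", "sales": 50},    # wrong region
    {"region": "LATAM", "country": "Brazil", "sales": 75},  # unknown region
]

mismatches = check_against_dimension(spreadmart_rows, STANDARD_GEOGRAPHY)
for row in mismatches:
    print("inconsistent with standard dimension:", row)
```

Running a check like this before a spreadmart is centralized surfaces hierarchy differences early, when they are cheapest to reconcile.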
The Boundaries of Agile BI
So where are the boundaries of Agile BI? What is not ‘Agile BI’?
Agile BI is not a product; it is a combination of technology, economics, and execution that enables new
behaviors.
Agile BI is not an alternative to the Kimball Data Warehouse methodology. Agile BI provides new ways to
approach BI projects. You can use Agile BI to create data marts one at a time or in parallel, and then use
the Kimball DW methodology to approach the creation of a data warehouse.
Because of its iterative nature, Agile BI is not ideal for fixed-price, waterfall-style projects. As an
alternative approach, some consulting companies offer their technical expertise on a ‘pay-per-iteration’
basis specifically to support agile projects.
Agile BI is not the same as BI delivered using a Software as a Service (SaaS) model. SaaS BI offerings are
hosted, are typically focused on a specific domain, can be hard to customize, and are not easy to move
out of their hosted environment.
Agile BI is not a way to falsely underestimate the long-term costs of BI projects. It is a way to incrementally
invest as the value is proven, and a way to make use of utility pricing if suitable.
Summary
Agile BI changes our perception of BI projects by dramatically changing their economics and execution.
Instead of regarding them as something that ‘the organization might start next quarter if they can line up
the resources’, they can be viewed as something that ‘I can start this afternoon’.
The traditional BI vendors have talked about ‘BI for the masses’, ‘BI everywhere’, and ‘BI for everyone’ for
years. What none of them have done is deliver a toolset that enables this to actually happen. Pentaho’s
Agile BI, by changing the technical, operational, and economic factors of BI, enables new behaviors by all
participants in BI projects. These new behaviors enable BI to cross the chasm from being
management-mandated to being user-driven.
Download and Contact Information
Pentaho Agile BI: http://www.pentaho.com/agile_bi
References
Pentaho: http://www.pentaho.com
Agile Manifesto: http://www.agilemanifesto.org
Agile Software Development: http://en.wikipedia.org/wiki/Agile_software_development
Lean Software Development: http://en.wikipedia.org/wiki/Lean_software_development
Lean Delivery: http://blogs.forrester.com/boris_evelson/10-03-03-333_rule_keep_your_bi_apps_check
Moore’s Law: http://en.wikipedia.org/wiki/Moore%27s_law
Transistor Counts: http://en.wikipedia.org/wiki/Transistor_count