| Rethinking Data Warehousing 2
Good decisions start with organized data
It’s a fact: better data analysis results in better decisions, and good
decisions have a direct impact on business results. According to
the Atre Group, 80% of the cost associated with data analytics is spent on
organizing the data before any meaningful analysis can be done.
The high cost of organizing data can seem
excessive. If your organization has captured
and stored the data, how can it cost so much
to organize it? The short answer is: garbage in,
garbage out. There are many essential tasks
involved in organizing and preparing data for
analysis that companies often overlook. The
higher the potential value derived from the analysis, the more
sophisticated the work required to structure and organize the data.
Complex analytics, such as customer profitability, require more insight
into current and historic data, and data from multiple functional areas
within the company. The more global the analysis, such as consolidated
general ledgers, the greater the need for access to homogenized data
from a variety of different systems.
80% of the cost of analytics is spent on organizing data
5 tasks for organizing your data
Research shows that the effort and costs associated with organizing the
data for analysis do not always align with the potential analytical value. In
other words, you can spend a lot of time and effort organizing data
without achieving better results from your analysis.
What matters more are the capabilities designed into the data
warehouse that constructs and organizes the data. Simply put, the more
intelligent the data warehouse, the less effort and cost required to do
high-value analytics. To better understand the relationship between cost and
the design of a data warehouse solution, it’s important to examine the
necessary tasks in more detail.
Task #1: Identifying the source data
The source data comes in many different forms and is often stored in
cryptic fields and tables deep within ERP systems. For example, the JD
Edwards Business Unit table is named F0006 with the Business Unit Field
named MCMCU—a less than intuitive abbreviation. Also problematic are
seemingly easy queries for things like “customer ship to” or “customer
sold to.” These commonly used queries need to deal with data
fragmentation and can require 50 or so joins of various data elements.
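One way to take the sting out of cryptic names is a small catalog that translates raw ERP identifiers into readable ones. The sketch below is a minimal illustration, not a description of any product: only the table F0006 and the field MCMCU come from the text above, while the other field names, the friendly names, and the function `rename_row` are assumptions made for the example.

```python
# Hypothetical catalog that maps cryptic ERP identifiers to readable names.
# Only F0006 / MCMCU appear in the text; every other entry is illustrative.
TABLE_CATALOG = {"F0006": "business_unit_master"}

FIELD_CATALOG = {
    ("F0006", "MCMCU"): "business_unit",
    ("F0006", "MCDL01"): "business_unit_description",  # assumed field name
}


def rename_row(table, row):
    """Return (friendly_table, friendly_row).

    Fields that are missing from the catalog keep their raw names, so
    nothing is silently dropped.
    """
    friendly_table = TABLE_CATALOG.get(table, table)
    friendly_row = {
        FIELD_CATALOG.get((table, column), column): value
        for column, value in row.items()
    }
    return friendly_table, friendly_row


raw_row = {"MCMCU": "1000", "MCDL01": "Head Office", "MCXYZ": 7}
table_name, readable_row = rename_row("F0006", raw_row)
print(table_name, readable_row)
```

Keeping unknown fields under their raw names makes gaps in the catalog visible instead of hiding them.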
Task #2: Exporting data into a common model
Data may exist on one system or multiple disparate systems across the
company. The systems may all be the same ERP (type and version) but
more likely the data exists on a myriad of different ERP or other systems.
Exporting source data from an ERP system to a data warehouse is
conceptually simple. But someone needs to first find the data, put the
data into a form that is more easily understood, then consolidate the data
from multiple sources into a common data model. This can be left to the
analysts, or done by building deep knowledge and experience into the
data warehouse. This knowledge will be unique for each source system.
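The idea of source-specific knowledge feeding one common model can be sketched as a per-source mapping table. Everything named here is hypothetical: the two source systems (`erp_a`, `erp_b`), their field names, and the `Customer` record are assumptions chosen only to show the shape of the approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Common-model record shared by all source systems (illustrative)."""
    source: str
    customer_id: str
    name: str


# Per-source knowledge: which raw field feeds which common-model attribute.
# System names and field names are hypothetical.
SOURCE_MAPPINGS = {
    "erp_a": {"customer_id": "CUSTNO", "name": "CUSTNM"},
    "erp_b": {"customer_id": "customer_number", "name": "customer_name"},
}


def to_common(source, raw):
    """Translate one raw record into the common model."""
    mapping = SOURCE_MAPPINGS[source]
    return Customer(
        source=source,
        customer_id=str(raw[mapping["customer_id"]]).strip(),
        name=str(raw[mapping["name"]]).strip(),
    )


def consolidate(batches):
    """Flatten {source: [raw records]} into one list of common records."""
    return [
        to_common(source, raw)
        for source, rows in batches.items()
        for raw in rows
    ]


customers = consolidate({
    "erp_a": [{"CUSTNO": 42, "CUSTNM": " Acme "}],
    "erp_b": [{"customer_number": "B-7", "customer_name": "Globex"}],
})
print(customers)
```

Adding a new source system then means adding one mapping entry rather than changing every downstream report.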
Task #3: Keeping data current and updated
Moving the data once is not sufficient, since the data is only current for an
instant. Periodic updates must be scheduled, or a technique must be
developed to continuously update the data.
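One common way to schedule periodic updates without recopying everything is a high-water-mark load: remember the latest change timestamp already copied and, on each run, pick up only rows changed after it. The sketch below assumes each source row carries an `id` and an `updated_at` timestamp; those field names, the in-memory `warehouse` dictionary, and the function `incremental_load` are illustrative assumptions, not any vendor's technique.

```python
from datetime import datetime


def incremental_load(source_rows, warehouse, watermark):
    """Upsert rows changed after `watermark` and return the new watermark.

    Rows whose timestamp equals the watermark are treated as already loaded.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            warehouse[row["id"]] = row  # insert new row or overwrite old copy
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


source = [
    {"id": 1, "amount": 100, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "amount": 250, "updated_at": datetime(2024, 1, 1, 9, 5)},
]
warehouse = {}

# First run: nothing has been loaded yet, so every row is copied.
watermark = incremental_load(source, warehouse, datetime.min)

# Row 1 changes in the source; the second run copies only that row.
source[0] = {"id": 1, "amount": 120, "updated_at": datetime(2024, 1, 1, 9, 30)}
watermark = incremental_load(source, warehouse, watermark)
```

A real system would also have to handle deletions and rows without a reliable change timestamp, which this sketch leaves out.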
Task #4: Identifying how data sets are related
Data is often structured by functional area, such as sales data, inventory
data, or financial data. However, high-value analysis, like customer or
product profitability, requires analysis across functional data sets. Proper
customer analysis requires both detailed current and historic information.
An analyst can take these data sets into consideration, but this can create
bottlenecks and introduce the potential for error. Alternatively, this
capability to understand data relationships and associations can be
integrated into a data warehouse design.
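As a small illustration of analysis across functional data sets, the sketch below combines revenue records from a sales data set with cost records from a separate data set to get profit per customer. The record layout, the field names, and the function `customer_profitability` are assumptions made for the example; a real profitability model would allocate many more cost types.

```python
from collections import defaultdict


def customer_profitability(sales, costs):
    """Return {customer: revenue minus cost} across two functional data sets."""
    profit = defaultdict(float)
    for sale in sales:
        profit[sale["customer"]] += sale["revenue"]
    for cost in costs:
        profit[cost["customer"]] -= cost["cost"]
    return dict(profit)


# Hypothetical records from two functional areas.
sales = [
    {"customer": "C1", "revenue": 100.0},
    {"customer": "C2", "revenue": 250.0},
    {"customer": "C1", "revenue": 50.0},
]
costs = [
    {"customer": "C1", "cost": 90.0},
    {"customer": "C2", "cost": 300.0},
]

profit_by_customer = customer_profitability(sales, costs)
print(profit_by_customer)
```

The shared customer key is the data relationship the analyst would otherwise have to work out by hand for every analysis.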
Task #5: Standardizing reporting
Finally, a single analyst may believe the analytics they are performing are
new and novel when, in fact, their analysis has been done many times
before. A well-designed system should pre-calculate a host of values and
produce a standard set of reports and dashboards to make incremental
analysis more efficient across the enterprise.
What level of analytics do you need?
Different analytics systems take different approaches to addressing
the needs above. Some BI tools, e.g. Tableau or QlikView, are designed for
agility and flexibility and allow for quick analysis. These solutions are most
appropriate in departmental settings where data relationships,
comprehensive data sets, data currency, and data governance are not the
highest priority. However, these tools quickly run out of steam when
dealing with broader company analytics.
Most ERP vendors offer tools, e.g. OBIA, that focus on providing a subset
of the data housed on their system. These offerings do not address the
need for higher-value analytics because of the limited data and data
sources. In addition, it is not easy to provide real-time updates or
cross-dimensional capabilities when using these tools.
Trade-offs between BI Toolsets

                             BI Tool [1]   OBIA [2]   RD [3]
Multi-source data            No            No         Yes
Cross-dimensional Analysis   No            No         Yes
Reports & Dashboards         Yes           Yes        Yes
Continuous Data Update       No            No         Yes
Transform Data               Yes           No         Yes
Complete Data Set            No            No         Yes

BI Toolset examples:
1. BI tools, e.g. QlikView
2. Vendor-specific tools, e.g. OBIA
3. Purpose-built data warehouse, e.g. RapidDecision EDW
Finally, there are custom data warehouses that ensure a holistic approach
to organizing the data and to data governance. In principle, any of the above
tasks can be dealt with, since it is only a matter of software. But the world
is littered with companies that took on this seemingly simple
task and only later understood the experience and sophistication required to
successfully build a holistic data warehouse.
RapidDecision: A Holistic Approach to Data Organization & Governance
RapidDecision’s innovative design addresses the need for agility, flexibility
and the requirement of integrating data from multiple, disparate sources.
RapidDecision represents a fundamental breakthrough that addresses the
underlying challenges of previous generations of data warehouses.
At the core is a unified data model designed with a deep understanding of
the data structures used by Oracle in its JD Edwards, PeopleSoft and
EBS ERP systems. RapidDecision extracts data from obscure locations
within the Oracle ERP, transforms the data into more easily understood
formats and populates the proprietary data model. While RapidDecision
was purpose-built for Oracle ERP systems, it can also support data from
other ERP vendors and non-ERP sources.
RapidDecision ensures data is continuously updated with a patent-pending
technique that mitigates impacts to system performance. The result is a
data warehouse that has the most comprehensive data set, 100% of all
historic and operational transaction data, and is continuously updated.
The system pre-calculates a vast number of items to reduce the time and
effort for the analyst, and creates a portfolio of reports by subject area,
such as sales or general ledger, or by cross-functional department. The
results also include metadata.
In the past, it was believed that systems must either optimize flexibility
and agility by providing limited subsets of data or maximize data
consistency with large structured data warehouses. An innovative
alternative exists that provides the best of both worlds and offers new
levels of intelligence and analytic capabilities.