Data Warehouse 2.0: Master techniques for EPM guys (powered by ODI)
Ricardo Giampaoli
Rodrigo Radtke
DevEpm.com
@RZGiampaoli
@RodrigoRadtke
@DEVEPM

About the Speakers
Giampaoli, Ricardo
• Oracle Ace
• Master in Business Administration and IT management
• EPM Consultant @ Dell
• Essbase/Planning/OBIEE/ODI Certified Specialist
• Blogger @ devepm.com
Radtke, Rodrigo
• Oracle Ace
• Graduated in Computer Engineering
• Software Developer Sr. Advisor at Dell
• ODI, Oracle and Java Certified
• Blogger @ devepm.com
What we'll learn
• EPM Application Processes
• Traditional Data Warehouse
• DW for EPM Applications
• Metadata Process
• Data Load Process
• Data Extract Process
• Oracle Partitioning
EPM Tools
• The architectures of EPM applications are very similar, so for simplicity this presentation uses Planning/Essbase as the example
• Three main processes that an EPM application can have:
• Metadata process: syncs the metadata between the source systems and the EPM applications
• Data Load process: loads data into the EPM applications
• Data Extract process: extracts data from the EPM applications
• Normally these processes are done manually, or with a script that loads a text file or SQL with all the data/metadata into the EPM application
Why is this not so good?
• Manual processes are always error prone
• Tons of files to load/manage
• Not centralized
• Not scalable for big environments
• Not change friendly
• Data quality issues
• Harder to achieve audit standards
• Not feasible for huge volumes of data
All of this can be fixed by creating a supporting Data Warehouse (DW)
Traditional Data Warehouse
• The DW should be implemented in a relational database (RDBMS), since relational databases are more suitable for the central Data Warehouse role than multidimensional databases (OLAP servers)
• The data model for the DW should be based on a dimensional design (Star Schema, Snowflake or Hybrid) to facilitate integration and scalability and to provide greater performance for analytical processing
• Whether Star Schema, Snowflake or Hybrid, all of these models are based on dimension tables joined to a fact table through PKs and FKs
• The DW can then provide data directly to other systems like EPM Applications
DW for EPM Applications
Traditional DW
• The data is spread over numerous tables
• The data is related between tables by PKs and FKs
• We can have different data in different tables that have no direct relationship
• We can query any table to get any data
• The metadata inside the tables has no meaning for the database (it's just data)
EPM Applications
• The data is confined inside a cube
• The data is directly related to the members of the dimensions
• It's impossible to have data that is not related to all dimensions
• To query, we must specify at least one member of each dimension
• The metadata has a parent/child relationship and a specific order, and each member behaves according to its dimension type
DW for EPM Applications
• The problem is: a DW for EPM applications should be totally different from a traditional DW
• An EPM application is already a "DW", since it has all dimensions in it and stores all data inside the cubes
• We don’t need a Star Schema, Snow Flake or Hybrid model to manage dimensions inside EPM
• We can manage dimensions more efficiently using a “metadata repository”
• The relationship between the EPM apps and the outside systems is the members' POV, and this information is already inside our data table
• We don't need any PKs or FKs in our "metadata repository"
• We need to model our DW thinking about EPM concepts/needs
DW for EPM Applications
[Diagram: a traditional star schema (a central Fact table joined to Dim 1 through Dim 6) contrasted with an EPM cube, where each Cell sits at the intersection of one member from every dimension]
Metadata Process
• The first process needs to be the Metadata Process, since without members in the EPM application we cannot load data into the cubes
• Depending on the EPM application and the dimension we want to load, we will have different properties and values
• But across the entire EPM suite we always have the basic member information, like its parent, storage type, consolidation sign and more
• To create a good Metadata Process we need to design our table in the most efficient way, and for that we need to know what each EPM application requires
Metadata Process: Dimensions
• Planning/Essbase has 4 different types of dimensions:
• Account
• Entity
• User Defined Dimension
• Attribute Dimension
• Each Dimension has its own properties but most of them are the same
• All four dimension types (Account, Entity, User Defined, Attribute) have: Member, Parent, Alias: Default, Operation
• The Account, Entity and User Defined dimensions also have: Valid For Consolidations, Data Storage, Two Pass Calculation, Description, Formula, UDA, Smart List, Data Type, Aggregation, Plan Type
• The Account dimension alone also has: Account Type, Time Balance, Skip Value, Exchange Rate Type, Variance Reporting, Source Plan Type, Base Currency
Metadata Process: Generic Table
• One table to "rule" them all:
• Instead of having one table per dimension, a generic table has one unique column for each source property (white)
• One extra set of columns identifies where each member belongs (yellow)
• Any other useful information (orange)
• Mapping of dimension properties to the Metadata Table columns:
• Account / Entity / Products / Prod_Attrib → MEMBER
• Parent → PARENT
• Alias: Default → ALIAS
• Operation → OPERATION
• Valid For Consolidations → VALID_FOR_CONSOL
• Data Storage → DATASTORAGE
• Two Pass Calculation → TWOPASS_CALC
• Description → DESCRIPTION
• Formula → FORMULA
• UDA → UDA
• Smart List → SMARTLIST
• Data Type → DATA_TYPE
• Aggregation → CONS_PLAN_TYPE1
• Plan Type → PLAN_TYPE1
• Account Type → ACCOUNT_TYPE (Account only)
• Time Balance → TIME_BALANCE (Account only)
• Skip Value → SKIP_VALUE (Account only)
• Exchange Rate Type → EXC_RATE (Account only)
• Variance Reporting → VARIANCE_REP (Account only)
• Source Plan Type → SRC_PLAN_TYPE (Account only)
• Base Currency → CURRENCY (Account only)
• Extra identification columns: APP_NAME, DIM_TYPE, HIER_NAME, GENERATION, HAS_CHILDREN, POSITION
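A minimal DDL sketch of such a generic table, using a subset of the columns above. Table name, data types and sizes are assumptions for illustration; as noted in the slides, no PKs or FKs are declared.

```sql
-- Sketch of the generic metadata table (names/types are assumptions).
CREATE TABLE metadata_tbl (
  app_name     VARCHAR2(80) NOT NULL,  -- which EPM application
  hier_name    VARCHAR2(80) NOT NULL,  -- which dimension/hierarchy
  dim_type     VARCHAR2(30),           -- Account, Entity, UserDefined...
  member       VARCHAR2(80) NOT NULL,
  parent       VARCHAR2(80),
  alias        VARCHAR2(80),
  operation    VARCHAR2(30),
  datastorage  VARCHAR2(30),
  formula      CLOB,
  account_type VARCHAR2(30),           -- Account dimension only
  generation   NUMBER,
  has_children CHAR(1),
  position     NUMBER                  -- order among siblings
);
```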
Metadata Process: Connect By
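Oracle's CONNECT BY clause can walk one hierarchy of the generic metadata table. A sketch, assuming the table and column names from the previous slides:

```sql
-- Walk one application's Account hierarchy top-down, siblings in the
-- order stored in POSITION (names are assumptions from the slides).
SELECT LEVEL AS generation,
       LPAD(' ', 2 * (LEVEL - 1)) || member AS indented_member,
       parent,
       CONNECT_BY_ISLEAF AS is_leaf
  FROM metadata_tbl
 START WITH parent IS NULL
        AND app_name = 'PLANAPP1' AND hier_name = 'Account'
CONNECT BY PRIOR member = parent
        AND app_name = 'PLANAPP1' AND hier_name = 'Account'
 ORDER SIBLINGS BY position;
```

Repeating the application/hierarchy filter in both START WITH and CONNECT BY keeps members of other hierarchies from being joined in mid-walk.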
Metadata Process: Generic Table Benefits
• Centralized: a metadata repository that contains all metadata for all EPM applications
• Scalable: an architecture that can hold any number of metadata sources without changes
• Dynamic: generic objects can load any number of EPM applications
• Accessible: all metadata from all EPM applications is easily available if needed (data quality, queries, as metadata for other systems…)
• Performance: the table can be partitioned by application and/or hierarchy
Metadata Process: Overview
[Diagram: metadata process overview. Sources (Oracle, SQL Server, Teradata, Excel, XML) flow through stage area tables (Table 1..N) into the Metadata Table; generic Send and Error Handling components then load it into the EPM applications (App 1..N)]
DW Powered by ODI: Metadata Process
• ODI can read the EPM application repositories to understand the structure and configuration of each application
• Based on the repository, ODI can create dynamic code
• ODI can tie out metadata from the source based on the application repository
• Metadata load becomes more efficient and powerful, allowing better management of moved members, attribute member movement, sibling member reordering, and deleted or moved shared members
• No extra code to add new applications/dimensions
• Complete details at https://devepm.com/2014/12/18/otnarchbeat-publication/
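For classic on-premise Planning, reading the repository might look like the sketch below. HSP_OBJECT and its OBJECT_ID/OBJECT_NAME/PARENT_ID columns are assumptions that vary by Planning version and are not part of the slides; verify against your own repository.

```sql
-- Hedged sketch: rebuild the Account hierarchy straight from the
-- Planning repository (HSP_OBJECT holds every object and its parent).
SELECT LEVEL         AS generation,
       o.object_name AS member,
       p.object_name AS parent
  FROM hsp_object o
  JOIN hsp_object p ON p.object_id = o.parent_id
 START WITH o.object_name = 'Account'
CONNECT BY PRIOR o.object_id = o.parent_id;
```

Comparing this result against the generic metadata table is what lets ODI detect moved, reordered or deleted members before loading.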
Data Load Processes
• To load data into any EPM application we must specify one member for each dimension, plus the value we want to load
• Depending on the application we can have more or fewer dimensions, but by default there are some standard dimensions that exist in all apps:
• Accounts
• Entity
• Years
• Periods
• Scenario
• Version
• Currency
• We can create a single generic inbound table (fact table) that contains one column for each Planning dimension (the distinct set of dimensions across all applications), building a centralized structure to hold all data
Data Load Processes: Inbound Tables
• We can go further in the inbound design and create one column for each period:
• Smaller table (fewer rows) and faster to query
• Load performance greatly improved (one row holds the entire year's information)
• In either case we have:
• Centralized repository of data (easy to add new applications)
• Scalable to all EPM Applications
• Data is reusable (No data replication)
• Generic objects (to load, error handling, email sending…)
Data Load Processes: Multi-Periods
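A sketch of the multi-period inbound table: one column per period instead of one row per period. Table and column names are assumptions based on the slides.

```sql
-- Multi-period inbound fact: one row carries a full year of data.
CREATE TABLE inbound_fact_mp (
  app_name VARCHAR2(80),
  account  VARCHAR2(80),
  entity   VARCHAR2(80),
  scenario VARCHAR2(30),
  version  VARCHAR2(30),
  currency VARCHAR2(10),
  year     VARCHAR2(10),
  p_jan NUMBER, p_feb NUMBER, p_mar NUMBER,
  p_apr NUMBER, p_may NUMBER, p_jun NUMBER,
  p_jul NUMBER, p_aug NUMBER, p_sep NUMBER,
  p_oct NUMBER, p_nov NUMBER, p_dec NUMBER
);
```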
Data Load Processes: Pivot/Unpivot
• To use the multi-period architecture we need the ability to pivot and unpivot data
• Most source systems cannot provide data in multi-period format, nor receive it in that format
• For this we can use the PIVOT/UNPIVOT commands of the Oracle database:
• The PIVOT operator takes data in separate rows, aggregates it and converts it into columns
• The UNPIVOT operator converts column-based data into separate rows
Data Load Processes: Pivot
1. Define the columns to be pivoted
2. Use an aggregation function on the data column
1. SUM, AVG, MIN, MAX, COUNT…
2. Specify the values to be pivoted
3. The values MUST be constants in the "IN" clause
3. Data is pivoted
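The steps above can be sketched as the following Oracle PIVOT query (table and column names are assumptions from the slides; only three months shown for brevity):

```sql
-- Pivot a one-row-per-period fact into one column per period.
SELECT *
  FROM (SELECT account, entity, period, data_value
          FROM inbound_fact)
 PIVOT (SUM(data_value)                          -- aggregation function
        FOR period IN ('Jan' AS p_jan,           -- constants in IN clause
                       'Feb' AS p_feb,
                       'Mar' AS p_mar));
```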
Data Load Processes: Unpivot
1. Define the columns to be unpivoted
2. Select a name for the data column and the member column
1. Specify the columns to be unpivoted
2. The column list MUST be constant in the "IN" clause
3. Data is unpivoted
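The reverse operation, again as a sketch against the assumed multi-period table (only three months shown):

```sql
-- Unpivot one-column-per-period back into one row per period.
-- Note: UNPIVOT skips NULL cells by default; add INCLUDE NULLS to keep them.
SELECT account, entity, period, data_value
  FROM inbound_fact_mp
 UNPIVOT (data_value                             -- name of the data column
          FOR period IN (p_jan AS 'Jan',         -- name of the member column
                         p_feb AS 'Feb',
                         p_mar AS 'Mar'));
```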
Data Load Processes: Data Quality
• EPM applications do not like bad data
• For example, if we try to load an invalid member into Essbase using ODI, it switches to cell-by-cell mode, greatly impacting load performance
• Having just one metadata table and one inbound table makes the data quality process way simpler:
• All metadata is stored in a single place
• All data is stored in a single place
• Data quality check can be done for all applications in a single process
• Error handling and email notification processes are easy to create, since everything is gathered in the same place
• With only one generic inbound table, we will have only one generic E$ table:
• It stores the full POV and the data that fails validation
• ODI_Cons_Name, Interface_Name, App_Name, Cube and ODI_Sess_No identify what the error was, which package it came from and which application the data should have been loaded to
DW Powered by ODI: Data Quality
Data Load Processes: Overview
[Diagram: data load overview. Sources (Oracle, SQL Server, Teradata, Excel, XML) flow through stage area tables (Table 1..N) into the generic Inbound Table; generic Send and Error Handling components load valid data into the EPM applications (App 1..N), while rows failing validation go to the generic E$ table]
Data Extract Processes
• The structure of the outbound table is the same as the inbound table, and the benefits are almost the same:
• Faster to export (mainly for a one-year export from a BSO cube)
• Centralized repository of data (easy to add new applications)
• Scalable to all EPM Applications
• Data is reusable (No data replication)
• Create views for the target systems to access the data
• In the same way that we have multi-periods in the inbound table we can have it in the outbound table
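The view layer can be as simple as one filtered view per application or target. A sketch, with table, view and column names assumed:

```sql
-- Expose one application's extract to a target system without
-- replicating the data: the view filters the shared outbound table.
CREATE OR REPLACE VIEW v_extract_planapp1 AS
SELECT account, entity, scenario, version,
       currency, year, period, data_value
  FROM outbound_fact
 WHERE app_name = 'PLANAPP1';
```

Because targets read from views rather than the table itself, adding a new consumer is just one more CREATE VIEW, with no change to the extract process.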
Data Extract Processes: Overview
[Diagram: data extract overview. EPM applications (App 1..N) are extracted by generic components, with Send and Error Handling, into the Outbound Table; a View Layer (View 1..N) exposes the data to the target systems (Oracle, SQL Server, Teradata)]
Oracle Partitioning
• Partitioning enhances the performance, manageability, and availability of a wide variety of applications and helps reduce the total cost of ownership for storing large amounts of data
• Partitioning allows tables, indexes, and index-organized tables to be subdivided into smaller pieces, enabling these database objects to be managed and accessed at a finer level of granularity
• Oracle provides a rich variety of partitioning strategies and extensions to address every business requirement
• Since it is entirely transparent, partitioning can be applied to almost any application without the need for potentially expensive and time consuming application changes.
Oracle Partitioning: Types
• Hash partitioning: rows distributed across hash partitions (H1, H2, H3, H4)
• List partitioning, e.g. on Scenario: Actual, Forecast, Budget
• Range partitioning, e.g. on Period: Jan to Mar, Apr to Jun, Jul to Sep, Oct to Dec
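The list-partitioning example above can be sketched in DDL, with table, partition and column names assumed:

```sql
-- Inbound fact list-partitioned by scenario: truncating a partition
-- wipes one scenario without touching the others.
CREATE TABLE inbound_fact (
  scenario   VARCHAR2(30),
  account    VARCHAR2(80),
  entity     VARCHAR2(80),
  period     VARCHAR2(3),
  data_value NUMBER
)
PARTITION BY LIST (scenario) (
  PARTITION p_actual   VALUES ('Actual'),
  PARTITION p_forecast VALUES ('Forecast'),
  PARTITION p_budget   VALUES ('Budget')
);
```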
Oracle Subpartitioning: Types
• Composite Partitioning:
• Range-Range
• Range-Hash
• Range-List
• List-Range
• List-Hash
• List-List
• Example (List-Range): partition by Scenario (Actual, Forecast, Budget), subpartition each by Period (Jan to Mar, Apr to Jun, Jul to Sep, Oct to Dec)
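The List-Range example can be sketched in DDL. Names are assumptions, and the period is modeled as a numeric month so RANGE bounds apply:

```sql
-- Composite List-Range: partition by scenario, subpartition by quarter.
CREATE TABLE inbound_fact_cr (
  scenario   VARCHAR2(30),
  period_num NUMBER(2),    -- assumed 1..12 numeric month
  account    VARCHAR2(80),
  data_value NUMBER
)
PARTITION BY LIST (scenario)
SUBPARTITION BY RANGE (period_num)
  SUBPARTITION TEMPLATE (
    SUBPARTITION q1 VALUES LESS THAN (4),
    SUBPARTITION q2 VALUES LESS THAN (7),
    SUBPARTITION q3 VALUES LESS THAN (10),
    SUBPARTITION q4 VALUES LESS THAN (13)
  )
(
  PARTITION p_actual   VALUES ('Actual'),
  PARTITION p_forecast VALUES ('Forecast'),
  PARTITION p_budget   VALUES ('Budget')
);
```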
DW Powered by ODI: Partitioning
• ODI can be used to manage table partitions:
• Using a command on source to query ALL_TAB_PARTITIONS and verify whether a partition exists
• Using a command on target to truncate/drop/create partitions
• ODI can also manage subpartitions:
• Harder to maintain
• Better to use a composite key
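The "command on source / command on target" pair can be sketched as two SQL statements (owner, table and partition names are assumptions):

```sql
-- Command on source: does the partition exist?  The count (0 or 1) can
-- drive the ODI step that follows.
SELECT COUNT(*) AS part_exists
  FROM all_tab_partitions
 WHERE table_owner    = 'DW'
   AND table_name     = 'INBOUND_FACT'
   AND partition_name = 'P_ACTUAL';

-- Command on target: if it exists, truncate it before the new load.
ALTER TABLE dw.inbound_fact TRUNCATE PARTITION p_actual;
```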
Overview of our environment
• 10000+ users around the world
• 24x7 operation
• 10+ source systems:
• 18 billion+ inserts/month
• 50 million+ updates/month
• 60 million+ deletes/month
• 14,000+ ODI sessions/month
Ricardo Giampaoli – TeraCorp
Rodrigo Radtke de Souza - Dell
Thank you!