62
© Copyright 2013 Wellesley Information Services, Inc. All rights reserved. Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse Dr. Bjarne Berg COMERIT

Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

  • Upload
    lucas

  • View
    120

  • Download
    4

Embed Size (px)

DESCRIPTION

Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse. Dr. Bjarne Berg COMERIT. What We’ll Cover …. Introductions EDW Data Design and Data Modeling Data Loading and Fast Activations - PowerPoint PPT Presentation

Citation preview

Page 1: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

© Copyright 2013Wellesley Information Services, Inc.

All rights reserved.

Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

Dr. Bjarne BergCOMERIT

Page 2: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

2

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 3: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

3

In This Session …

• This is a advanced technical presentation intended for developers with significant experience with SAP BW in a hands-on role

• We will look at many EDW technical design options and the pros and cons of some of the new design features in HANA and BW 7.3

• During the presentation we will look at many real examples from 5 real implementations and explore what can be learned from these

Page 4: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

4

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 5: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

5

Data Design The Use of Layered Scalable Architecture (LSA)

The LSA consists logically of: Acquisition layer Harmonization/quality layer Propagation layer Business transformation layer Reporting layer Virtualization layer

SAP BW 7.3 SP-3 has a set of 10 templates to help build a layered data architecture for large-scale data warehousing

Page 6: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

6

EDW Design Vs. Evolution

An organization has two fundamental choices:

1. Build a new well architected EDW

2. Evolve the old EDW or reporting system

Both solutions are feasible, but organizations that selects an evolutionary approach should be self-aware and monitor undesirable add-ons and ‘workarounds”.

Failure to break with the past can be detrimental to an EDW’s long-term success…

Page 7: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

7

Data Design - Real Example of LSA Implementation

This company implemented a full LSA Architecture and also partitioned the Infoproviders for faster data loads and faster query performance.

While this architecture has benefits, there are

significant issues around data volumes and the

Total cost of Ownership when changes are made

to the data model(s)

Page 8: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

8

Data Design Example of LSA Simplification In HANA

Since many of the benefits sought by the LSA architecture are inherent in HANA, significant simplifications can be made to the data design and data flows

This design has a dramatically smaller cost of ownership (fewer objects to maintain,

design and change) than the traditional LSAs

Page 9: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

9

ConformedReportable DSO

Write OptimizedDSO

EDW - Complex Layered Architectures• This BPC on BW system was experiencing

substantial load performance issues• Some of this was due to underlying SAP BW

configuration, while some was due to the technical configuration of the data store architecture and data flow inside SAP BW

Production Issues included:1) Dependent jobs not running

sequentially, i.e., load from Summary cube to Staging cube is sometimes executed before the summary cube data is loaded and activated, resulting in zero records in the staging cube.

2) Long latency with 6 layers of PSA, DSOs, and InfoCubes before consolidation processes can be executed.

FIGL_D15S FIGL_D10S FIGL_D08FIGL_D13S FIGL_D11S

FIGL_D21 FIGL_D17 FIGL_D14FIGL_D20 FIGL_D18

GL Summary Cube

(FIGL_C03)

BPC Staging Cube

(BPC_C01)

Consolidation Cube

(OC_CON)

ECC 6.0Asia-

Pacific

ECC 6.0North-

America

ECC 4.7Latin-

America

R/3 3.1iEU

ECC 4.7ASIA

Persistent Staging Area (PSA)

Consolidation Processes:1) Clearing2) Load3) Foreign Exchange4) Eliminations5) Optimizations

Real Example

Page 10: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

10

Write OptimizedDSO

Fixes to Complex EDW Architecture

• The fix to this system included removing the conformed DSO layer, with BEx flags for data stores that are never reported on.

• Also, the BPC staging cube served little practical purpose since the data is already staged in the GL Summary cube and the logic can be maintained in the load from this cube directly to the consolidation cube.

FIGL_D15S FIGL_D10S FIGL_D08FIGL_D13S FIGL_D11S

GL Summary Cube

(FIGL_C03)

Consolidation Cube

(OC_CON)

ECC 6.0Asia-

Pacific

ECC 6.0North-

America

ECC 4.7Latin-

America

R/3 3.1iEU

ECC 4.7ASIA

Persistent Staging Area (PSA)

Consolidation Processes:1) Clearing2) Load3) Foreign Exchange4) Eliminations5) Optimizations

Long-term benefits included reduced data latency, faster data activation, less data replication, smaller system backups as well as simplified system maintenance.

Real Example

Page 11: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

11

EDW Data Design - Use of MultiProvider Hints in BW

• If a query has restrictions on this characteristic, the OLAP processor is already checked to see which part of the cubes can return data for the query. The data manager can then completely ignore the remaining cubes.

Problem: To reduce data volume in each InfoCube, data is partitioned by Time period.

A query must now search in all InfoProviders to find the data. This is very slow.

Solution: We can add “hints” to guide the query execution. In the RRKMULTIPROVHINT table, you can specify one or several characteristics for each MultiProvider, which are then used to partition the MultiProvider into BasicCubes.

An entry in RRKMULTIPROVHINT only makes sense if a few attributes of this characteristic (that is, only a few data slices) are affected in the majority of, or

the most important, queries (SAP Notes: 911939. See also: 954889 and 1156681).

2002 2003 2004 2005 2006 2007 2008

Page 12: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

12

EDW Data Design - Semantic Partitioned Objects (SPO)• When data stores and InfoCubes are allowed to grow over time, the data load

and query performance suffers• Normally objects should be physically partitioned when the numbers of

records exceed 100 – 200 million However, this may be different depending on the size of your hardware and

the type of database you use• In SAP NetWeaver BW 7.3 we get an option to create a Semantic Partitioned

Object (SPO) through wizards You can partition based on fields such as calendar year, region, country, etc.

Page 13: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

13

Data Design - Semantic Partitioned Objects (cont.)

• When an SPO is created, a reference structure keeps track of the partitions. The structure is placed in the MultiProvider for querying.

SPO Wizards create all Data Transfer Processes (DTP), transformations, filters for each data store, and a process chain

Page 14: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

14

InfoCubes should be performance tuned if the number of records exceeds 100 million and partitioned before they are approaching 200+ million records. This

creates faster loads, better query performance, and easier management.

-25 50 75

100 125 150 175 200 225 250 275 300 325 350 375 400 425

Mill

ions

InfoCubes Number of RecordsReal Example

InfoCube Design - Size

• In this example, many of the InfoCubes are very large and not partitioned

• Several have over 100 million records and one is approaching 500 million

• In this system SPOs in BW 7.3 can be very helpful. For BW 7.0 many of these cubes can be physically partitioned with hints on the MultiProviders

Page 15: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

15

Explore the use of line item dimensions for fields that are frequently conditioned in queries. This model

change can yield faster queries.

InfoCube Design - Use of Line Item Dimensions

• Line item dimensions are basically fields that are transaction oriented

• Once flagged as a line item dimension, the field is actually stored in the fact table and has no table joins

This may result in improvements to query speeds for cubes not in BWA or HANA

Page 16: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

16

InfoCube Design — High Cardinality Flags

• High-Cardinality flag for large InfoCubes with more than 10 million rows

• There are currently 11 InfoCubes with a ratio of more than 30% of the records in the dimensions vs. fact table

• SAP recommends for Indexing and performance reasons to flag these as “high-cardinality” dimensions. However, it has minor impact to smaller cubes.

• In this example, there were four medium and large InfoCubes that are not following the basic design guidelines, and subsequently had slow performance

Many companies should redesign large InfoCubes with high-cardinality to take advantage of the standard performance enhancements available.

InfoCube Number of rowsEntries in dimension compared for F table

FIUC_C03 12,859,780 37%ZGAT_C01 20,793,573 46%FIIF_C02 68,090,967 102%FIGL_C01 156,738,973 88%

Real Example

Page 17: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

17

DSO Design and Locks on Large Oracle Tables

In this example, many of the very large DSOs are not partitioned, and several objects have over 250 million records

Additionally, 101 DSO objects were flagged as being reportable. This resulted in System IDs (SIDs) being created during activation.

Combined, these resulted in frequent locks on the Oracle database and failed parallel activation jobs

-25 50 75

100 125 150 175 200 225 250 275 300 325 350 375 400 425

Mill

ions

DSO Number of Records

Partition DSOs. The lock on very large DSOs during parallel loads are well known and SAP has issued several notes on the topic: 634458 'ODS object:

Activation fails - DEADLOCK' and 84348 'Oracle deadlocks, ORA-00060.'

Real Example

Page 18: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

18

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 19: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

19

An Example of a - BW System Review

Database: Oracle version 11.2gBW system: BW version 7.3 Operation systems: HP-UNIX; Linux for BWA; AIX 6.4 for three app servers

Area Value TrendActive Users 106 steadyAvg. Availability per Week 100 % upDB Size 3514.24 GB steadyLast Month DB Growth 131.88 GB up

In this section, we take a look at a real example of a BW Implementation and explore what we can learn from it.

Real Example

Page 20: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

20

System Background

• There were over 600 InfoProviders in the system

• The system had been in production for 6 years and has was upgraded in 2012

• Most InfoCubes followed standard development guidelines, but some had abnormalities such as InfoCubes feeding DSOs.

Structured design review sessions should be undertaken as part of every project to assure that this design did not continue.

0

50

100

150

200

250

300

350

400

11395

11 0 939

101

353

10

InfoProvidersReal Example

Page 21: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

21

Faster Data Load and Design Options — Activation

• During activation, SAP NetWeaver BW 7.0 has to lookup in the NIRV table to see if the object already exists This can be a slow process

• In SAP NetWeaver BW 7.0 we may buffer the number ranges to compare the data load with records in-memory This speeds up data activation

• However, in SAP NetWeaver BW 7.3, the data activation is changed from single lookups to package fetch of the active table, resulting in faster activation and less locks on the lookup tables This activation method results in 15-30% faster data activation

Page 22: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

22

Example: Data Activation and Options

This resulted in many reads on the NRIV table that slowed down data activation and process chains (see SAP notes: 857998, 141497, 179224 and 504875)

InfoCube Dimension Rows (thousands)

Buffer range level (thousands)

FIGL_C012 138,286 163,271 FIGL_C016 47,005 53,133 FICP_C053 10,172 29,655 FICP_C059 6,268 18,808

FICP_C02 FICP_C021 3,281 15,740 FICP_C013 9,149 46,308 FIAR_C014 3,804 64,388 FIAP_C017 19,139 25,844

FIIF_C02 FIIF_C022 69,264 79,758

FIGL_C01

FICP_C05

FICP_C01

Start buffering of number ranges of dimensions and InfoObjects or use 7.3 data activation instead (‘packet fetch’)

Table InfoObject Rows (thousands)

Buffer range level (thousands)

/BIC SGOBJKEY 12,792 18,699SCO_ITEM_NO 87,423 93,099SDOC_NUMBER 65,184 71,713SAC_DOC_NO 16,814 19,693SREF_DOC_NO 15,321 18,663SBILL_NUM 10,389 14,704SBA_BELNR 9,433 13,421SCO_DOC_NO 11,468 12,282SMAT_DOC 10,427 12,040STCTSTEPUID 9,951 11,793

/BI0

Real Example• During activation, there was limited use of buffering of number ranges for dimensions and InfoObjects, even when the number of entries were large.

Page 23: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

23

More Data Load Ideas

• In BW 7.3 for data transformations, the option “Read from DataStore” for a faster data lookup is also available

• Additionally, the use of navigational attributes as sources in Masterdata transformations reduce overhead for lookups Combined, this may lead to an additional 10-20%

improvement

The 7.3 initial load runtime option “Insert only” and the “Unique data records only” prevents all lookups during activation and

can dramatically improve data loads when used correctly

Page 24: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

24

More 7.3 Performance and Cockpit Capabilities

• BW 7.3 monitors and cockpit capabilities also include: Monitor of database usage and object sizes (i.e., InfoCubes, DSOs) Query usage statistics are more visible (similar to RSRT, RSRV, RSTT) We can see more of the use of SAP NetWeaver BW Accelerator and sizes Monitor for the actual use of OLAP/MDX Cache and hit ratios You can now selectively delete internal statistics in RSDDSTATWHM by

date through the updated RSDDSTAT_DATA_DELETE ABAP program There is also a MDX Editor for coding and syntax assistance

Solution Manager has been updated to take advantage of

these new monitors.

Page 25: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

25

BW 7.3 Performance and Monitoring

• Additionally BW 7.3 monitors include: DEAMON update information (i.e., RDA capacity status, usage) A performance monitoring workbench for performance trends Process chain monitoring (transaction: RSPCM) with error and

active chain monitoring, user specific displays, and performance threshold monitoring (i.e., for SLAs)

In SAP NetWeaver BW 7.3, the Near Line Storage (NLS) has been enhanced to include archiving, support for write optimized DSOs

Page 26: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

26

ETL Options for EDWs

• In SAP NetWeaver BW 7.3 you can create generic delta extraction for the Universal Data (UD) and Database Connect (DB) options, as well as for flat files

• Additionally, you can use the new DataSource adapter “Web Service Pull” to load data from external Web services

You can even create generic Web services delta loads and load the new data straight into the staging area of SAP NetWeaver BW 7.3

• While Web services does not fully support hierarchies yet, there is integration of hierarchies into the standard process flow such as transformation and DTPs, as well as being able to load hierarchies from flat files using a new DataSource

Page 27: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

27

• SAP NetWeaver BW 7.3 has a new, step-by-step wizard that allows you to generate data flows from flat files or existing data sources

• A great benefit is that the wizards work against any InfoProvider; i.e., you can use the wizards to create loads from DSOs to DSOs or InfoCubes

This wizard reduces the number or manual steps needed to load data. It also simplifies the development process and makes ETL work much easier.

The 7.3 DataFlow Generation Wizard

Page 28: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

28

Example of Poor EDW Data Load Design

• On an average daily dataextraction, transformation, and load process takes 44.8 hours if run sequentially

• A substantial amount of the time is spent on data transformation (51%) and lookups are often done on large DSOs without secondary indexes.

For this System, Developers should revisit extractor design for lookups on source system instead of inside BW

Total Load time (sequential) - hrs 44.8 100% - Time spent on source extraction - hrs 11.1 24.7% - Time spent on error filtering - hrs 0.5 1.1% - Time spent on transformations - hrs 22.9 51.1% - Time spent on target - hrs 10.4 23.1%Number of records extracted from source 371,407,881Number of records written to target 125,102,791 33.7%

ETL Statistics (average per day)Real Example

• Of the 371 million records extracted from the source, only 33.7% are written to disk

• This is due to lack of ability to do delta processing for some files and also a substantial amount of transform and lookup logic in some of the ABAP rules

Page 29: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

29

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 30: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

30

Database Performance (non-HANA systems)

• Database statistics are used by the database optimizer to route queries.Outdated statistics leads to performance degradation.

• Outdated indexes can lead to very poor search performance in all queries where conditioning is used (i.e., mandatory prompts)

• The current sampling rates for this example were too low, and statistics should only be run after major data loads, and could be scheduled weekly

For many systems, database statistics are outdated and may cause database performance to perform significantly poorer than otherwise would be the case. Sampling should often be changed and process chains may be re-scheduled.

Real Example

Page 31: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

31

Database Design for EDW: B-Tree Indexes on Large Objects

• InfoCubes that are not flagged as high-cardinality use a Bitmap index instead of a classical b-tree index for the joins.

• This type of index does not get “unbalanced” since it uses pointers instead of “buckets”

• When updating these large InfoCubes, dropping and recreating Indexes in the process chain can be very time consuming and actually take longer than the inserts

• It can also result in locks when the objects are very large (100 million+ records) and when attempting to do this in parallel (see ORA-0060)

Rebuilding bitmap indexes in load processing for large objects should not be a default answer for all designs. Any process

chains that do that, may need to be revisited.

Real Example

Page 32: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

32

Aggregates Are Not Needed with BW and HANA (or BWA)

• At this company, there are 11 aggregates in the system

• Four are related to the cube ZIR_C01 and seldom used and (two has never been used by any query)

• The 7 other aggregates are used only by the statistical cubes.

• Every day, 1.9 million records are inserted into the aggregates and take 35.6 minutes of processing time

InfocubeNumber of records in

Aggregate0TCT_C01 895,0700TCT_C21 420,3140TCT_C01 313,5400TCT_C01 188,8750TCT_C21 87,7710TCT_C22 82,0510TCT_C23 69,725

Delete unused aggregates. By reducing the data volume in the underlying statistical cubes (cleanup), the remaining aggregates

will reduce in size and processing time.

Real Example

Page 33: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

33

The OLAP Memory Cache Size Utilization

• The OLAP Cache is by default 100 MB for local and 200 MB for global use

• The system at this company was consuming no more than 80MB on average

• This means that most queries were re-executing the same data (good hit ratio of over 90%)

Real Example

Page 34: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

34

For most companies queries are using CKF and sums and sorts extensively, the cache read mode

for most queries should be turned on

Real ExampleOLAP Cache — Turned Off

• At one client, the OLAP cache was turned off for 131 out of 690 queries (excluding 4 planning queries in BW-IP)

• The cache was also turned off for 24 out of 256 InfoCubes• The OLAP Cache mode for many of their queries could also

have been stored as “Binary Large Objects (BLOB),” that could speed up caching and very large reads

Page 35: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

35

Broadcast to Pre-Fill the Cache

Set up Java connectivity ASAP and use the Broadcasting feature to prefill the MDX cache (OLAP Universes) for BI analytical processing intensive

functions such as CKF, Sorts, Exceptions, Conditions

Real Example

• This company’s Java Stack did not communicate properly with SAP NetWeaver BW, and multiple logons were required

• As a result, broadcasting could not be used until the connectivity was set up correctly

Page 36: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

36

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 37: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

37

HybridProvider

Query (real time)

RDAReal-Time

Data AcquisitionECC

and ExternalSystems

BWA based InfoCube

Virtual InfoCube

Transaction Data

Direct reads

Real-Time Data Acquisition

Indexing

Direct Access

HybridProvider and Real-Time Data

• The “HybridProvider” (HP) is new in SAP NetWeaver BW 7.3. The core idea is to link the historical data inside BW with real-time data.

• There are two ways of implementing an HP: HP based on a DSO HP based on a Virtual InfoCube

Page 38: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

38

HybridProvider

Query (real time)

RDAReal-Time

Data AcquisitionECC

and ExternalSystems

BWA based InfoCube

Virtual InfoCube

Transaction Data

Direct reads

Real-Time Data Acquisition

Indexing

Direct Access

This solution provides for really fast queries, but delta logic has to be custom designed

Option 1: The DSO-Based HybridProvider for EDWs

Real-time data is in the DSO and historical data in the SAP NetWeaver BW Accelerator-based InfoCube

The DSO use real-time data acquisition (RDA) to load data SAP NetWeaver BW automatically creates a process chain for the

HybridProvder’s data flow The process chain is executed for every closed request

Page 39: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

39

This is a good option if you have a low volume of new records and a high

number of queries or operational dashboards

The DSO-Based HybridProvider for EDW (cont.)

• This solution provides for really fast queries, but delta logic has to be custom designed and may be complex. However, the solution allows for high-frequency updates and very rapid query response.

Page 40: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

40

HybridProvider

Query (real time)

RDAReal-Time

Data AcquisitionECC

and ExternalSystems

BWA based InfoCube

Virtual InfoCube

Transaction Data

Direct reads

Real-Time Data Acquisition

Indexing

Direct Access

Warning: Virtual cubes with many users may place high-stress on the ERP system

Option 2: The Virtual Cube-Based HybridProvider for EDW

Data is read in real-time from SAP ECC, while historical data is read from SAP NetWeaver BW Accelerator

The difference depends on how often SAP NetWeaver BW Accelerator is loaded Non-complex data logic can be applied DTP is permitted if you do not filter the data set

Page 41: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

41

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 42: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

42

2500

2600

2700

2800

2900

3000

3100

3200

3300

3400

3500

3600

Oracle DB (Mb)

-300

-200

-100

0

100

200

300

400

Feb-2010 Mar-2010 Apr-2010 May-2010 Jun-2010 Jul-2010 Aug-2010 Sep-2010 Oct-2010 Nov-2010 Dec-2010 Jan-2011 Feb-2011

Oracle DB (Mb) Growth

Database Growth and Cleanup - Example• The database has grown between

12 and 344 GB each month for the last year

• Three months of the year saw data, logs, and PSA cleaned. Data volume declined between 63 and 275 Gb those months

The database has grown by 732Gb (26%) in the last year, and the growth is uneven.

Schedule “housekeeping” jobs. Better management of cleanup would result in more predictable patterns.(i.e. we found PSA data that had 10 months of load history).

Real Example

Page 43: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

43

BW Data Reduction – Statistical Cubes

• In our example, there were many statistical cubes with significant volume and no real benefits. During the BW upgrade, most of these were not cleared and are now creating poor system performance. For example:

0TCT_C02 has 408 million rows; others also had millions of rows Stats are collected for over 1,900 objects, queries, InfoProviders,

templates, and workbooks (could be reduced significantly) There are 8 aggregates with over 12 million rows on the stats cubes Creating aggregates on stats cubes inserts 1.9 million rows and took 35.6

minutes for refresh each night High-cardinality flags are set for small cube with only one million rows

(0TCT_C21)

Use RSDDSTAT and select “Delete Data” for old stats and also schedule periodic jobs using standard process chains.

Real Example

Page 44: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

44

EDW on SAP HANA: 12 Cleaning Pre-Steps

1. Clean the Persistent Staging Area (PSA) for data already loaded to DSOs2. Delete the Aggregates (summary tables). They will not be needed again.3. Compress the E and F tables in all InfoCubes. This will make InfoCubes

much smaller.4. Remove data from the statistical cubes (they starts with the technical

name of 0CTC_xxx). These contain performance information for the BW system running on the relational database. You can do this using the transaction RSDDSTAT or the program RSDDSTAT_DATA_DELETE to help you.

5. Look at log files, bookmarks and unused BEx queries and templates (transaction RSZDELETE).

6. Remove as much as possible of the DTP temporary storage, DTP error logs, and temporary database objects. Help and programs to do this is found in SAP Notes 1139396 and 1106393.

Page 45: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

45

EDW on SAP HANA: 12 Cleaning Pre-Steps

7) For write-optimized DSOs that push data to reportable DSOs (LSA approach), remove data in the write- optimized DSOs. It is already available in higher level objects.

8) Migrate old data to Near-Line Storage (NLS) on a small server. This will still provide access to the data for the few users who

infrequently need to see this old data. You will also be able to query it when BW is on HANA, but it does not need to be in-memory.

9) Remove data in unused DSOs, InfoCubes and files used for staging in the BW system. This include possible reorganization of masterdata text and attributes using process type in RSPC

Page 46: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

46

EDW on SAP HANA: 12 Cleaning Pre-Steps

10) You may also want to clean up background information stored in the table RSBATCHDATA. This table can get very big if not managed. You should also consider archiving any IDocs and clean the tRFC queues. All of this will reduce size of the HANA system and help you fit the system tables on the master node.

11) In SAP Note 706478, SAP provides some ideas on how to keep the basis tables from growing too fast too fast in the future, and if you are on Service Pack 23 on BW 7.0, or higher, you can also delete unwanted masterdata directly (see SAP Note: 1370848).

12) Finally, you can use the program RSDDCVER_DIM_UNUSED to delete any unused dimension entries in your InfoCubes to reduce the overall system size.

Page 47: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

47

PSA Cleanup

• For this company there were over 1.6 billion rows in the PSA

• An estimated 29Gb could be freed up if data older than one month is removed

• A formal retention policy that is communicated and enforced should be implemented

Create standard practices for PSA cleanup and schedule regular jobs that take care of this in the future

AGE of PSA Number of Records PercentEstimated

Gb <1 MONTH 591,774,063 35% 14.9 1-3 MONTHS 738,204,015 44% 18.6 3-6 MONTHS 144,476,834 9% 3.6 6-12 MONTH 173,651,469 10% 4.4 1-2 YEARS 35,421,883 2% 0.9

Total 1,683,528,264 100% 42.3

Real Example

Page 48: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

48

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 49: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

49

Persistence Layer

Looking Inside SAP HANA — In-Memory Computing Engine

Disk Storage

Data

VolumesPage Mgmt.

BusinessObjects Data Services

Log

Volumes

Logger

Metadata

Manager

Authorization

Manager

Transaction

Manager

Relational Engine

-Row Store-Column Store

Load

Controller

SQL Script

Calculation

Engine

Replication Server

SQL Parser

MDX

Session Manager

Inside the Computing Engine of SAP HANA, we have many different components that manage the access and storage of the data. This includes MDX

and SQL access, as well as Load Controller (LC) and the Replication Server.

Page 50: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

50

SAP has a checklist tool for SAP NetWeaver BW powered by HANA (thanks Marc Bernard).

In this tool, SAP provided automatic check programs for both the 3.5 version and the 7.x version of BW. These are found in SAP Note: 1729988.

In version 2.x of this tool, hundreds of checks are done automatically in the BW system. This includes platform checks on database and application and system information.

Moving EDW to SAP HANA – Automated tool 1

There are even basis checks for support packs, ABAP/JAVA stacks, Unicode, BW releases, and add-ons to your system.

Page 51: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

51

The idea of the checklist tool from SAP is that you run it several times throughout the project.

Once before you start, then periodically as you resolve issues and upgrade requirements, and then finally when the system has been migrated to HANA.

The checklist tool also has specific checks for the HANA system that can help you identify any issues before turning over the system to end users..

Moving EDW to SAP HANA – Automated tool 1

Page 52: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

52

Moving EDW to SAP HANA – Automated tool 2

SAP has released a new ABAP based tool that generates a report significantly better sizing fro SAP BW than using just the QuickSizer above. This program takes into consideration existing database compression, different table types and also include the effects of non-active data on the HANA system.

The higher precision you run the estimate at the longer the program is going to run.

To increase speed, you can also suppress analysis tables with less than 1 MB size.

With 14 parallel processors and 8Tb data warehouse, it is not unusual to see 45-75 minutes run time.

Page 53: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

53

Since timeouts are common when running the sizing program, you can temporarily change the parameter in rdisp/max_wprun_time to 0 in BW transaction RZ11. Finally, you estimate the growth for the system as a percentage, or as absolute growth.

This program is attached to SAP Note: 1736976 on SAP Marketplace

After you have downloaded and installed the program, and-selected the parameters above, you can go to SE38 and run SDF/HANA_BW_SIZING as a background job.

The output is stored in the file you specified and the file can now be email emailed to hardware vendors for sizing input and hardware selection.

Moving EDW to SAP HANA – Automated tool 2

Page 54: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

54

Many experienced developers are not aware that moving BW to HANA can in some cases result in slower transformations during data loads.

a. Select for all entries (SFAE) statements without HANA DB hints --> add hints

b. Select * --> specify fields to selectc. Database access in the field routines

--> move to start routined. Loops which do not assign field

symbols --> use field symbolse. Selects from master data tables -->

Use the read master data rulef. Selects from DSOs --> Use the read

DSO ruleg. Direct updates to BW object tables --

> Do not update tables directly

Moving EDW to SAP HANA – Automated tool 3

Page 55: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

55

Hardware Options for EDW on SAP HANA

There are currently 7 different certified HANA hardware vendors with 13 different products.

Some boxes can be used as single nodes with others are intended for scale-out solutions for large multi-node systems

Hardware Memory

128GB 256GB 512GB 1024GB

Cisco C260 X X

Cisco C460 X X

Cisco B440 X X+

Dell R910 X X X X

Hitachi CB 2000 X X X

NEC Express 5800 X X X+

Fujitsu RX 600 S5 X X X

Fujitsu RX 900 S2 X X+

HP DL 580 G7 X X X

HP DL 980 G7 X X

HP BL 680 X X X X+

IBM x3690 X5 X X X

IBM x3950 X5 X X X+

Page 56: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

56

A HANA Hardware Example

In this box, we are see the inside of an IBM x3950 HANA system.

The system basically consists of memory, disk, processors and network cards

The hardware vendor will install, connect and to a health check on your system before handing it over to you. A 3-year service plan is also normally required.

Page 57: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

57

What We’ll Cover …

• Introductions• EDW Data Design and Data Modeling• Data Loading and Fast Activations• Tips and Tricks for Faster Query Times• Real-Time and Near-Time reporting for the EDW • EDW Reduction and Cleanup• In Memory Options with HANA• Wrap-Up

Page 58: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

58

SAP BW Roadmap

BW 7.4 SP05 Should be released this fall. BW 7.3 SP09 is already available

Source: SAP May 2013, https://scn.sap.com/docs/DOC-36273

Page 59: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

59

Additional Resources

• An Introduction to SAP HANA Dr. Berg and Penny Silvia, SAP Press 2013, 2nd Edition

• SAP NetWeaver 7.4 on SAP Community Network http://scn.sap.com/docs/DOC-35096

• LSA Templates and Architecture in SAP NetWeaver BW 7.3 www.sdn.sap.com/irj/scn/weblogs?blog=/pub/wlg/24800

• Roadmap – SAP NetWeaver BW 7.3 www.sdn.sap.com/irj/sdn/bw-73?rid=/library/uuid/300347b5-

9bcf-2d10-efa9-8cc8d89ee72c

Page 60: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

60

7 Key Points to Take Home

• SAP BW has come a long way since version 7.0 and there are significant new capabilities for EDW in later releases

• Do not rely on old ‘knowledge’ – Big Data is changing everything!• Near-time and Real-time EDWs are gradually becoming the norm• Start a long-term plan for moving to HANA (you will do it eventually)• Costs of EDWs are too high due to complex data logic, design and

end of useful life of relational databases for high data volumes• All queries in your future EDW should run under 20 seconds, and

50% of these should be under 10 seconds.• Design is not evolutionary, it is deliberately created.

Page 61: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

61

Your Turn!

How to contact me:Dr. Bjarne Berg

[email protected]

Please remember to complete your session evaluation

Page 62: Best practices to ensure efficient data models, fast data activation, and performance of your SAP NetWeaver BW 7.3 data warehouse

62

Disclaimer

SAP, R/3, mySAP, mySAP.com, SAP NetWeaver®, Duet®, PartnerEdge, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Wellesley Information Services is neither owned nor controlled by SAP.