HANA SLT Multi-Thread Access Plan and Load
Load by Row ID

Author: Greg Monaco, Guenter Weber
Version: 1.2
Date: August 3, 2012
Contact: g.monaco@sap.com

Contents
Introduction
What to know and do before you start
ECC Source System Preparation
Netweaver-Based SAP System Landscape Transformation (SLT) Systems
Load Monitoring
Process Summary Checklist
Post Load Considerations

0403 - Data Provisioning - SLT Initial Data Load by Row ID v1.2



Introduction

We had a unique customer database challenge that severely impacted SLT’s ability to access the data efficiently. The challenges included:

Oracle RAC

Complex table partitioning models for the larger tables. For example, partitioning based on a substring of the last character of a column value

Oracle database statistics that are ‘frozen’ to remove Oracle CBO operations from the data access model.

This database architecture was recommended by SAP and has proven to be very effective for the customer. Challenges from this architectural approach to SLT included:

Table load rates of, in some cases, only 1 million records per day when we would expect 10 million records per hour per work process.

The source ECC system was also consuming system-crashing volumes of PSAPTEMP.

The statistics freeze prevented us from creating indexes for optimized parallel SLT loads.

The solution to this issue is a new approach to configuring and executing SLT loads which is based on accessing data by row ID (Oracle/DB2) or primary key (MaxDB, SQL Server). This new approach provides a viable alternative to existing multi-threaded SLT load procedures with the distinct advantage of avoiding the request to the customer for a single-field index on the source ECC system. There may be some performance advantages to each approach that may only be identified through trial and error. Configuration complexity of each model is about equal as is execution. It’s nice to have a choice, right?


What to know and do before you start

Note 1745666 must be applied to both the source system and the SLT system. The SLT batch jobs do not need to be stopped or restarted to apply this note.

In case of cluster tables, use Type 4 INDX CLUSTER (IMPORT FROM DB) instead. For cluster tables, we should not enforce a full table scan. Also, the parallelization for cluster tables will not make use of Oracle ROWIDs but will refer to the primary key fields, as always for the other DB systems. There is a separate white paper to be reviewed which details the configuration considerations for a TYPE 4 load.

You may need to request storage and tablespace from the customer.

Oracle and ASSM issues: Oracle 10.2.0.4 has issues with LOB tables that cause inserts into table DMC_INDXCL to hang. See note 1166242 for the workaround:

alter table SAPR3.DMC_INDXCL modify lob (CLUSTD) (pctversion 10);


ECC Source System Preparation

Minimum DMIS SP: DMIS 2010 SP07 or DMIS 2011 SP02.

Ensure that note 1745666 is applied via SNOTE

Note: The corrections of this note will be included in DMIS 2010 SP08 and DMIS 2011 SP03.

Create a database container (i.e., a tablespace) to contain table DMC_INDXCL (optional; see next item)

Create table DMC_INDXCL using program DMC_CREATE_CLUSTER

Always check to see if table DMC_INDXCL already exists on the ECC system. This may indicate that there is an ongoing TDMS project. If so, check with the TDMS project team to make sure that you do not impact each other’s project.

Table DMC_INDXCL is loaded with the source system table data. DMC_INDXCL then serves as the source table for the data load via SLT to HANA.

Table DMC_INDXCL can be created in its own specific tablespace, or you can allow the DMC_CREATE_CLUSTER program to determine the default tablespace for this table. A separate tablespace that does not contain productive tables seems like a safer approach. Often, I use the same tablespace as was created for SLT trigger logs. Note that tablespace assignment for table DMC_INDXCL needs to be considered specifically if Oracle or DB6 is used as the database for the source system. For other database vendors, you need to ensure that enough space is available in the source system.

Sizing the DMC_INDXCL table and associated tablespace is, like any sizing exercise, based on a good guess. A good starting point is to assume a compression rate of 10:1, with the caveat that for some tables, such as cluster tables, there may be hardly any compression.
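The sizing guess above can be written down as a small calculation. This is only a sketch under stated assumptions: the function name and the 800 GB table size are hypothetical, and the only figures taken from the text are the 10:1 starting ratio and the observation that cluster tables may hardly compress (modelled here as 1:1).

```python
def estimate_indxcl_size_gb(source_table_gb: float, compression_ratio: float = 10.0) -> float:
    """Rough size estimate for DMC_INDXCL (hypothetical helper, not an SAP tool).

    compression_ratio=10.0 reflects the 10:1 starting assumption;
    use a value close to 1.0 for tables that hardly compress.
    """
    if source_table_gb < 0:
        raise ValueError("source_table_gb must not be negative")
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be at least 1.0")
    return source_table_gb / compression_ratio


if __name__ == "__main__":
    table_gb = 800.0  # hypothetical source table size
    print("optimistic (10:1):", estimate_indxcl_size_gb(table_gb))
    print("pessimistic (1:1):", estimate_indxcl_size_gb(table_gb, 1.0))
```

Asking the customer for space somewhere between the two figures, depending on the table type, is one way to turn the guess into a request.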


Note that after the successful initial load, it is perfectly acceptable to remove this cluster and this tablespace to reclaim space. It is not mandatory that you do so but if the space is needed, be confident that there will be no adverse impact on the ongoing operation of SLT. Of course you can only do this if no parallel TDMS operations make use of the same table.


Netweaver-Based SAP System Landscape Transformation (SLT) Systems

SLT: Minimum DMIS SP: DMIS 2010 SP07 or DMIS 2011 SP02.

SLT: Ensure that note 1745666 is applied via SNOTE.

Note: The corrections of this note will be included in DMIS 2010 SP08 and DMIS 2011 SP03.

SLT: Add entry to table IUUC_PRECALC_OBJ

NUMREC: The number of records to be processed per access plan work process. Let’s say that we have 106,000,000 records and we want to use 5 access plan (ACC*) jobs on SLT, which corresponds to 5 access plan (MWB*) jobs on ECC. Entering 20,000,000 in NUMREC will result in 6 jobs, 5 of them handling 20M records each and the last one dealing with the remaining 6M records. To get an even distribution among only 5 jobs, specify a value of at least 21,200,000 (106,000,000 / 5); note that 21,000,000 would still leave 1M records for a sixth job.
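The NUMREC arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes each job receives exactly NUMREC records and the last job takes the remainder, as the example describes; the function names are hypothetical and nothing here calls SLT.

```python
def split_into_jobs(total_records: int, numrec: int) -> list[int]:
    """Return the record count each access plan job would handle."""
    if total_records < 0 or numrec <= 0:
        raise ValueError("total_records must be >= 0 and numrec must be > 0")
    full_jobs, remainder = divmod(total_records, numrec)
    return [numrec] * full_jobs + ([remainder] if remainder else [])


def numrec_for_job_count(total_records: int, jobs: int) -> int:
    """Smallest NUMREC that fits all records into at most `jobs` jobs."""
    if jobs <= 0:
        raise ValueError("jobs must be > 0")
    return -(-total_records // jobs)  # integer ceiling division


if __name__ == "__main__":
    total = 106_000_000
    print(split_into_jobs(total, 20_000_000))   # five full jobs plus a remainder job
    numrec = numrec_for_job_count(total, 5)
    print(numrec, split_into_jobs(total, numrec))
```

Running it for 106,000,000 records shows six jobs for a NUMREC of 20,000,000 and exactly five evenly sized jobs for the computed value.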

This table also includes 3 KEYFIELD columns. These can be ignored.


SLT: Add entry to table IUUC_PERF_OPTION

parall jobs: Number of load jobs. Should equal the number of access plan jobs.

Sequence number: Up to you. I like 20.

Reading Type: “INDX CLUSTER with FULL TABLE SCAN” -> Type 5

HANA: Select table for REPLICATION in Data Provisioning

ECC: Review Transaction SM37

Job DMC_CALC_ROWID_DELIMINATION should be running (Oracle).

Job name for other DB systems would start with /1CADMC/CALC_, followed by the respective table name.

Once this job is finished, one or (ideally) multiple jobs MWBACPLCALC_Z_<table name>_<mt id> should be running.

SLT: Review Transaction SM37


Job ACC_PLAN_CALC_001_01 is running

Ideally, more than one ACC_PLAN_CALC_001_0 job should be run as soon as the first job preparing the precalculation has finished. Only one job is started automatically. Starting more, up to the intended parallelization value, has to be done manually. Assuming that one ACC job is already running, the screen below shows how you would run a total of 5 jobs concurrently.

Currently, we normally provide only one job for the access plan calculation step. In a future SP, a more flexible control of the calculation jobs will become available. For now, you can use transaction MWBMON - Processing Step Calculate Access Plan to schedule more jobs. In the ADRC example, with five parallel jobs, the screen would look as below.

However, you need to make sure that the value in field TABLE_MAX_PARALL in table DMC_MT_TABLES is set accordingly, to allow this degree of parallelization. You can make sure of this by providing this value in field parall_jobs when maintaining IUUC_PERF_OPTION, as shown above.

You can check which access plans are being processed in table DMC_ACS_PLAN_HDR (the OWNER guid field of the corresponding records points to field GUID of DMC_PREC_HDR, the OWNER guid of which in turn points to DMC_MT_TABLES-COBJ_GUID). Did you follow that?
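The GUID chain may be easier to follow as a lookup over in-memory rows. The sketch below is illustrative only: the dictionary keys OWNER_GUID, GUID and COBJ_GUID stand in for the fields named in the text, while TABNAME, PLAN_ID and all sample values are made up for the example. On a real system you would read the three tables (for instance via SE16) rather than use Python.

```python
# Hypothetical sample rows; real GUIDs and column sets will differ.
DMC_MT_TABLES = [{"COBJ_GUID": "C1", "TABNAME": "ADRC"}]
DMC_PREC_HDR = [{"GUID": "P1", "OWNER_GUID": "C1"}]
DMC_ACS_PLAN_HDR = [
    {"PLAN_ID": 1, "OWNER_GUID": "P1"},
    {"PLAN_ID": 2, "OWNER_GUID": "P1"},
    {"PLAN_ID": 3, "OWNER_GUID": "P9"},  # belongs to some other precalculation
]


def mt_table_for_plan(plan, prec_hdr_rows, mt_table_rows):
    """Follow access plan -> DMC_PREC_HDR -> DMC_MT_TABLES; None if the chain breaks."""
    prec = next((p for p in prec_hdr_rows if p["GUID"] == plan["OWNER_GUID"]), None)
    if prec is None:
        return None
    return next((t for t in mt_table_rows if t["COBJ_GUID"] == prec["OWNER_GUID"]), None)


def plans_for_table(tabname, plan_rows, prec_hdr_rows, mt_table_rows):
    """All access plan rows whose chain ends at the given table name."""
    result = []
    for plan in plan_rows:
        mt_row = mt_table_for_plan(plan, prec_hdr_rows, mt_table_rows)
        if mt_row is not None and mt_row["TABNAME"] == tabname:
            result.append(plan)
    return result


if __name__ == "__main__":
    for plan in plans_for_table("ADRC", DMC_ACS_PLAN_HDR, DMC_PREC_HDR, DMC_MT_TABLES):
        print(plan)
```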


Load Monitoring

SLT: Standard MWBMON processes

ECC: SM21/ST22/SM66

ECC: Review Transaction SM37

Review the job log for MWBACPLCALC_Z_<table name>_<mt id> to monitor record count and progress.

Note: Unfortunately, there is a ridiculous bug in SP7 / SP2, fixed with SP8 / SP3: the job log under-reports the record count, so you need to multiply the number of records shown by a factor of 5. Otherwise you might be scared into assuming that processing is progressing very slowly.
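When tracking progress from the job log on an affected support package, the correction is easy to forget. A minimal sketch of the adjustment follows; the function names and the sample numbers are hypothetical, and the factor of 5 is taken from the note above and applies only to the affected SP levels.

```python
LOG_UNDERCOUNT_FACTOR = 5  # from the note above; only for the affected SP levels


def corrected_record_count(logged_records: int, affected_by_bug: bool = True) -> int:
    """Translate the record count shown in the job log into the real count."""
    if logged_records < 0:
        raise ValueError("logged_records must not be negative")
    return logged_records * LOG_UNDERCOUNT_FACTOR if affected_by_bug else logged_records


def progress_percent(logged_records: int, total_records: int, affected_by_bug: bool = True) -> float:
    """Share of the table processed so far, in percent."""
    if total_records <= 0:
        raise ValueError("total_records must be > 0")
    return 100.0 * corrected_record_count(logged_records, affected_by_bug) / total_records


if __name__ == "__main__":
    logged = 4_000_000      # hypothetical value read from the job log
    total = 106_000_000     # hypothetical table size
    print(corrected_record_count(logged))
    print(round(progress_percent(logged, total), 1), "%")
```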


Process Summary Checklist

ECC: Create tablespace for table DMC_INDXCL

ECC: DMC_CREATE_CLUSTER

SLT: Created entry in IUUC_PRECALC_OBJ

SLT: Created entry in IUUC_PERF_OPTION

SLT: SLT batch jobs are running.

HANA: Table submitted from HANA Data Provisioning

ECC: Job DMC_CALC_ROWID_DELIM_VBUK is running

SLT: ACC* batch job is running

ECC: Job DMC* started/finished in ~10 minutes and job MWBACPLCALC_Z_<table_name>_<mt_id> started

SLT: All ACC* batch jobs completed

ECC: All MWB* batch jobs completed

SLT: Table loading (MWBMON)

HANA: Table loaded and in replication

ECC: DMC_DELETE_CLUSTER_POINTER


Post Load Considerations

After the load has completed, it is good manners to clean up DMC_INDXCL on the ECC system and remove the table data:

SE38 -> DMC_DELETE_CLUSTER_POINTER

The conversion object is the same as is listed in DMC_COBJ->IDENT on the SLT server. Access plan ID is always 1. Just do it. Select 'delete cluster data'. There is no output. If it does not fail, then it worked.