
Document Version 3.0 – March 2014


SAP NetWeaver Process Integration (PI)

Performance Check Guide: Analyzing Performance Problems and Possible Solution Strategies

SAP NetWeaver PI 7.1x

SAP NetWeaver PI 7.3x

SAP NetWeaver PI 7.4


DOCUMENT HISTORY

Before you start planning, make sure you have the latest version of this document. You can find the link to the latest version via SAP Note 894509 – PI Performance Check.

The following table provides an overview of the most important changes to this document.

Version 3.0 – March 2014
Reason for new version: Include latest improvements in PI 7.31 and PI 7.4 and enhanced tuning options for the Java-only runtime. Many sections were updated. Major changes are:
o Additional symptom (blacklisting) for qRFC queues in READY status
o Long processing time for Lean Message Search
o IDoc posting configuration on receiver side
o Tuning of the SOAP sender adapter
o More information about IDOC_AAE adapter tuning
o Packaging for Java IDoc and Java Proxy adapter
o New Adapter Framework Scheduler for polling adapters
o New performance monitor for the Adapter Engine
o Enhancements of the maxReceiver parameter for individual interfaces
o Latest information for staging and logging (restructured to be part of the Java-only chapter)
o FCA thread tuning for incoming HTTP requests
o Enhancements in the Proxy framework on sender/receiver ERPs
o Large message queues on the PI Adapter Engine
o Generic J2EE database monitoring in NWA
o Appendix section XPI_Inspector

Version 2.0 – October 2011
Reason for new version: Adaptation to PI 7.3.
o Added document history
o Added options and description about AEX
o Added new adapters, updated existing adapter parallelization
o Description of additional monitors (Java performance monitor and new ccBPM monitor)
o General review of entire document with corrections
o More details and performance measurements for Java-only (ICO) scenarios
o Additional new chapters:
  - Prevent blocking of EO queues
  - Avoid uneven backlogs with queue balancing
  - Reduce the number of EOIO queues
  - Adapter Framework Scheduler
  - Avoid blocking of Java-only scenarios
  - J2EE HTTP load balancing
  - Persistence of Audit Log information in PI 7.10 and higher
  - Logging/Staging on the AAE (PI 7.3 and higher)

Version 1.1 – December 2009
Reason for new version: Adaptation to PI 7.1.
o General review of entire document with corrections
o Additional new chapters:
  - Tuning the IDoc adapter
  - Message Prioritization on the ABAP stack
  - Prioritization in the Messaging System
  - Avoid blocking caused by a single slow/hanging receiver interface
  - Performance of Module Processing
  - Advanced Adapter Engine Integrated Configuration
  - ABAP Proxy system tuning
  - Wily Transaction Trace

Version 1.0 – October 2007
Initial document version, released for XI 3.0 and PI 7.0.

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

4

TABLE OF CONTENTS

1 INTRODUCTION

2 WORKING WITH THIS DOCUMENT

3 DETERMINING THE BOTTLENECK
3.1 Integration Engine Processing Time
3.2 Adapter Engine Processing Time
3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher
3.3 Processing Time in the Business Process Engine

4 ANALYZING THE INTEGRATION ENGINE
4.1 Work Process Overview (SM50/SM66)
4.2 qRFC Resources (SARFC)
4.3 Parallelization of PI qRFC Queues
4.4 Analyzing the runtime of PI pipeline steps
4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"
4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"
4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"
4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"
4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"
4.4.6 Long Processing Times for "LMS_EXTRACTION"
4.4.7 Other steps performed in the ABAP pipeline
4.5 PI Message Packaging for Integration Engine
4.6 Prevent blocking of EO queues
4.7 Avoid uneven backlogs with queue balancing
4.8 Reduce the number of parallel EOIO queues
4.9 Tuning the ABAP IDoc Adapter
4.9.1 ABAP basis tuning
4.9.2 Packaging on sender and receiver side
4.9.3 Configuration of IDoc posting on receiver side
4.10 Message Prioritization on the ABAP Stack

5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)
5.1 Work Process Overview
5.2 Duration of Integration Process Steps
5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)
5.4 Database Reorganization
5.5 Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)
5.6 Message Packaging in BPE

6 ANALYZING THE (ADVANCED) ADAPTER ENGINE
6.1 Adapter Performance Problem
6.1.1 Adapter Parallelism
6.1.2 Sender Adapter
6.1.3 Receiver Adapter
6.1.4 IDoc_AAE adapter tuning
6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter
6.1.6 Adapter Framework Scheduler
6.2 Messaging System Bottleneck
6.2.1 Messaging System in between AFW Sender Adapter and Integration Server (Outbound)
6.2.2 Messaging System in between Integration Server and AFW Receiver Adapter (Inbound)
6.2.3 Interface Prioritization in the Messaging System
6.2.4 Avoid Blocking Caused by Single Slow/Hanging Receiver Interface
6.2.5 Overhead based on interface pattern being used
6.3 Performance of Module Processing
6.4 Java-only scenarios: Integrated Configuration objects
6.4.1 General performance gain when using Java-only scenarios
6.4.2 Message Flow of Java-only scenarios
6.4.3 Avoid blocking of Java-only scenarios
6.4.4 Logging/Staging on the AAE (PI 7.3 and higher)
6.5 J2EE HTTP load balancing
6.6 J2EE Engine Bottleneck
6.6.1 Java Memory
6.6.2 Java System and Application Threads
6.6.3 FCA Server Threads
6.6.4 Switch Off VMC

7 ABAP PROXY SYSTEM TUNING
7.1 New enhancements in Proxy queuing

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS
8.1 Large message queues on PI ABAP
8.2 Large message queues on PI Adapter Engine

9 GENERAL HARDWARE BOTTLENECK
9.1 Monitoring CPU Capacity
9.2 Monitoring Memory and Paging Activity
9.3 Monitoring the Database
9.3.1 Generic J2EE database monitoring in NWA
9.3.2 Monitoring Database (Oracle)
9.3.3 Monitoring Database (MS SQL)
9.3.4 Monitoring Database (DB2)
9.3.5 Monitoring Database (MaxDB / SAP DB)
9.4 Monitoring Database Tables

10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE
10.1 Integration Engine
10.2 Business Process Engine
10.3 Adapter Framework
10.3.1 Persistence of Audit Log information in PI 7.10 and higher

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

APPENDIX A
A.1 Wily Introscope Transaction Trace
A.2 XPI Inspector for troubleshooting and performance analysis

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

6

1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components: Integration Builder (including Enterprise Service Repository (ESR), Service Registry (SR) and Integration Directory), Integration Server (including Integration Engine, Business Process Engine and Adapter Engine), Runtime Workbench (RWB) and System Landscape Directory (SLD). The SLD, in contrast to the other components, may not necessarily be part of the PI system in your system landscape, because it is used by several SAP NetWeaver products (it will, however, be accessed by the PI system regularly). Additional components in your PI landscape might be the Partner Connectivity Kit (PCK), a J2SE Adapter Engine, one or several non-central Advanced Adapter Engines, or an Advanced Adapter Engine Extended. An overview is given in the graphic below. The communication and accessibility of these components can be checked using the PI Readiness Check (SAP Note 817920).

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD. The IS in this graphic stands for the Web Application Server 7.0 or higher. In the classic environment, PI is installed as a double stack: an ABAP stack and a Java stack (J2EE Engine). The ABAP stack is responsible for pipeline processing in the Integration Engine (Receiver Determination, Interface Determination, and so on) and for the processing of integration processes in the Business Process Engine. Every message has to pass through the ABAP stack. The Java stack executes the mapping of messages (with the exception of ABAP mappings) and hosts the Adapter Framework (AFW). The Adapter Framework contains all XI adapters except the plain HTTP, WS-RM and IDoc adapters.


With PI 7.1 the so-called Advanced Adapter Engine (AAE) was introduced. The Adapter Engine was enhanced to provide additional routing and mapping call functionality, so that it is possible to process messages locally. This means that a message that is handled by a sender and receiver adapter based on J2EE does not need to be forwarded to the ABAP Integration Engine. This saves a lot of internal communication, reduces the response time and significantly increases the overall throughput. The deployment options and the message flow for 7.1-based systems and higher are shown below. Currently not all the functionalities available in the PI ABAP stack are available on the AAE, but each new PI release closes the gap further.

In SAP PI 7.3 and higher the Adapter Engine Extended (AEX) was introduced. In addition to the AAE, the Adapter Engine Extended also allows local configuration of the PI objects. The AEX can therefore be seen as a complete PI installation running on Java only. From the runtime perspective no major differences can be seen compared to the AAE, and therefore no differentiation is made in this guide.


Looking at the different components involved, we get a first impression of where a performance problem might be located: in the Integration Engine itself, in the Business Process Engine, or in one of the Advanced Adapter Engines (central, non-central, plain, PCK). Of course a performance problem could also occur anywhere in between, for example in the network or around a firewall. Note that there is a separation between a performance issue "in PI" and a performance issue in any attached systems. If viewed "through the eyes" of a message (that is, from the point of view of the message flow), the PI system technically starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the message enters the pipeline of the Integration Engine. Any delay prior to this must be handled by the sending system. The PI system technically ends as soon as the message reaches the target system, for example when a given receiver adapter (or, in case of HTTP communication, the pipeline of the Integration Server) has received the success message of the receiver. Any delay after this point in time must be analyzed in the receiving system.

There are generally two types of performance problems. The first is a more general statement that the PI system is slow, has a low performance level, or does not reach the expected throughput. The second is typically connected to a specific interface failing to meet the business expectation with regard to the processing time. The layout of this check is based on the latter: first you should try to determine the component that is responsible for the long processing time, or the component that needs the highest absolute time for processing. Once this has been clarified, there is a set of transactions that will help you to analyze the origin of the performance problem.

If the recommendations given in this guide are not sufficient, SAP can help you to optimize the service via SAP Consulting or SAP MaxAttention services. SAP might also handle smaller problems restricted to a specific interface if you describe your problem in an SAP customer incident. Support offerings like "SAP Going Live Analysis (GA)" and "SAP Going Live Verification (GV)" can be used to ensure that the main system parameters are configured according to SAP best practice. You can order any type of service using the SAP Service Marketplace or your local SAP contacts.

If you have already worked with the PI Admin Check (SAP Note 884865), you will recognize some of the transactions. This is because SAP considers the regular checking of the performance to be an important administrative task. This check, however, tries to show its reader a methodology for approaching performance problems. It also covers the most common reasons for performance problems and links to possible follow-up actions, but does not refer to any regular administrative transactions.


Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, the messaging system or the module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3 Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine and Adapter Engine, or a connected Proxy system. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter. This is not always easy to do, because a long processing time is not always an indication of a bottleneck. A long processing time can be justified if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason, you need to compare the value that you retrieve from Chapter 3 with values that you have received previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6 and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This will lead you to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases, based on where the problem occurs:

A) with regard to the interface itself, that is, it occurs even if a single message of this interface is processed

B) or with regard to the volume of this interface, that is, many messages of this interface are processed at the same time

C) or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical amount of messages of this interface, not simply one message as before. If the values of measurements A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements – A) processing time of a single message, B) processing time of a typical amount of messages of a single interface, and C) processing time of a typical amount of messages of all interfaces – should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.


Chapters 8, 9, 10 and 11 deal with more general reasons for bad performance, such as heavy tracing/logging, error situations and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), as well as the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine, as shown below (details about the activation can be found in SAP Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure) and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out in which part of the PI system the most time is lost. Chapter 4.4 will work with the individual steps.


If you do not yet know which interface is affected, you first have to get an overview. Instead of navigating to "Detailed Data Aggregated" in the Runtime Workbench, choose "Overview Aggregated". Use the button "Display Options" and check the options for sender component, receiver component, sender interface and receiver interface. Check the processing times as described above.

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Message Monitoring", choose "Adapter Engine <host>" and "from Database" from the drop-down lists and press "Display". Enter your interface details and "Start" the selection. Select one message using the radio button and press "Details".

Calculate the difference between the start and end timestamp indicated on the screen.

Do the above calculation for the outbound (AFW to IS) as well as the inbound (IS to AFW) messages.

By default, the audit log of successful messages is no longer persisted in SAP NetWeaver PI 7.1, in order to minimize the load on the database. The audit log has a high impact on the database load since it usually consists of many rows that are inserted individually into the database. In newer releases, therefore, a cache was implemented that keeps the audit log information only for a period of time. Thus no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted in the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter "messaging.auditLog.memoryCache" from "true" to "false" for the service "XPI Service: Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps, like queuing, module processing and average response time, in historical dashboards (aggregated data per interval, for all server nodes of a system or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java profiler, as outlined in the appendix section A.2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher

Starting from PI 7.31 SP4 there is a new performance monitor available for the Adapter Engine. More information on the activation of the performance monitor can be found in SAP Note 1636215 – Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home". Go to "Adapter Engine" and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like the time spent in the Messaging System queues or in adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 7.31 SP10 (7.40 SP05), this data can be exported to Excel in two ways: Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection in a way that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound, if your integration process is the sender) and click on "PE". Alternatively, you can select the radio button for "Process View" on the initial selection screen of SXMB_MONI.

To judge the performance of an integration process, it is essential to understand the steps that are executed. A long-running integration process is not critical in itself, because it can wait for a trigger event or can include a synchronous call to a backend. Therefore it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore it is essential to monitor a queue backlog in ccBPM queues, as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. On the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. They show you the processing time based on barriers: the 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time to, for example, the day before. By adjusting the selection criteria of the transaction you can therefore get a good overview of the normal processing time of the integration process and can judge whether there is a general performance problem or just a temporary one.

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge whether your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3, a new monitor for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE → Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view of the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. You can therefore immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait step.


4 ANALYZING THE INTEGRATION ENGINE

If the chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central pipeline steps and their description:

PLSRV_XML_VALIDATION_RQ_INB – XML Validation Inbound Channel Request
PLSRV_RECEIVER_DETERMINATION – Receiver Determination
PLSRV_INTERFACE_DETERMINATION – Interface Determination
PLSRV_RECEIVER_MESSAGE_SPLIT – Branching of Messages
PLSRV_MAPPING_REQUEST – Mapping
PLSRV_OUTBOUND_BINDING – Technical Routing
PLSRV_XML_VALIDATION_RQ_OUT – XML Validation Outbound Channel Request
PLSRV_CALL_ADAPTER – Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)
PLSRV_XML_VALIDATION_RS_INB – XML Validation Inbound Channel Response
PLSRV_MAPPING_RESPONSE – Response Message Mapping
PLSRV_XML_VALIDATION_RS_OUT – XML Validation Outbound Channel Response

The last three steps (PLSRV_XML_VALIDATION_RS_INB, PLSRV_MAPPING_RESPONSE and PLSRV_XML_VALIDATION_RS_OUT) are executed for synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The XML validation steps are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31, an additional option of running an external virus scan during PI message processing can be activated, as described in the SAP Online Help. The virus scan can be configured at multiple steps in the pipeline – for example after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed based on qRFC Logical Units of Work (LUWs) using dialog work processes.


With PI 7.11 and higher versions you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction to get an overview of the processes currently running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WPs). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs show a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? This depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which starts the message processing in DIA WPs. The user shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user of a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (by default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds, and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (the prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past, which allows analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

If you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max. Resources" close to your overall number of dialog work processes? Max. Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows:

Max. no. of logons = 90
Max. disp. of own logons = 90
Max. no. of WPs used = 90
Max. wait time = 5
Min. no. of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 – SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck or blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available in the system and the above requirements.


Note: You have to set the parameters using the SAP instance profile. Otherwise the changes are lost after the server is restarted.
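The quota fields shown in SARFC correspond to RFC profile parameters in the instance profile. The fragment below is only an illustrative sketch, assuming the standard rdisp/rfc_* parameter names and the example values from above; always verify the exact parameter names and recommended values against SAP Note 1375656 for your release.

    # RFC resource quotas - illustrative values only, verify against SAP Note 1375656
    rdisp/rfc_max_login        = 90
    rdisp/rfc_max_own_login    = 90
    rdisp/rfc_max_own_used_wp  = 90
    rdisp/rfc_max_wait_time    = 5
    rdisp/rfc_min_wait_dia_wp  = 5

After a change to the instance profile, the new values become effective with the next restart of the application server and can then be verified again in SARFC.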

The field "Resources" shows the number of DIA WPs currently available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages, because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation, as follows:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary at this time to increase the number of dialog work processes in your system. If the CPU usage, however, is already very high, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing – PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (for example the 0___0004 in XBTO0___0004) specifies the receiver business system. This way, PI uses dedicated outbound queues for each receiver system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization or for the separation of large messages. To get an overview of the available queues, use SXMB_ADM → Manage Queues.

PI inbound and outbound queues execute different pipeline steps.

PI inbound queues:

PLSRV_XML_VALIDATION_RQ_INB – XML Validation Inbound Channel Request
PLSRV_RECEIVER_DETERMINATION – Receiver Determination
PLSRV_INTERFACE_DETERMINATION – Interface Determination
PLSRV_RECEIVER_MESSAGE_SPLIT – Branching of Messages

PI outbound queues:

PLSRV_MAPPING_REQUEST – Mapping
PLSRV_OUTBOUND_BINDING – Technical Routing
PLSRV_XML_VALIDATION_RQ_OUT – XML Validation Outbound Channel Request
PLSRV_CALL_ADAPTER – Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait times in the queue are recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see long processing times there, as discussed in the chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues will have a direct impact on the connected PI components and also on the backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in case of a Proxy call. Thus, when tuning PI queues you must always keep the implications for the connected systems in mind.

The tuning of the PI ABAP queue parameters is done in transaction SXMB_ADM → Integration Engine Configuration, by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across all interfaces. Hence, a problem with one single backend system would affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory this should result in a lower latency, if enough DIA WPs are available. In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO, as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used. An illustrative configuration sketch is shown below.
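As an illustration only, the entries below sketch how such parameters could look in SXMB_ADM → Integration Engine Configuration (category TUNING). The subparameter and the values are placeholders, not recommendations; derive suitable values from your own volume tests and the considerations above.

    Category   Parameter                 Subparameter          Current Value
    TUNING     EO_INBOUND_TO_OUTBOUND                          1
    TUNING     EO_INBOUND_PARALLEL                             5
    TUNING     EO_OUTBOUND_PARALLEL                            10
    TUNING     EO_OUTBOUND_PARALLEL      <receiver service>    20

The last row shows how a subparameter can be used to raise the parallelism for one specific receiver only, as discussed in the backlog section below.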

Below you can see a screenshot of SMQ2, showing PI inbound queues and outbound queues. ccBPM queues (XBQO$PE) are also displayed; they will be discussed in chapter 5.

Procedure

Log on to your Integration Server, call transaction SMQ2 and execute. If you are running ABAP proxies in a separate client on the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system, see transaction ST06).

If many of the queues are EOIO queues (e.g. because the serialization is done on the material number), try to reduce the number of queues by following chapter Reduce the Number of Parallel EOIO Queues.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check whether the numbers increase, decrease or remain the same. An increase of entries for all queues or for a specific queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple; possible reasons have been found to include:

1) A slow step in a specific interface

A bad processing time of a single message or of a whole interface can be caused by expensive processing steps, such as the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step, as shown in "Analyzing the runtime of PI pipeline steps".

2) Backlog in queues

Check whether inbound or outbound queues face a backlog. A backlog in a queue is generally nothing critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface is having a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a subparameter specifying your interface to increase the parallelism for this interface (see the configuration sketch above).

In general, however, a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components, like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if Content-Based Routing or an Extended Receiver Determination is used. To understand which step is taking long, follow once more the chapter "Analyzing the runtime of PI pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing, while the number of inbound queues does not increase at the same time. Wily also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see the second bullet point below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm this by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is not a user-driven system, we recommend setting rfc/inb_sched_resource_threshold to 3, as described in SAP Note 1375656 – SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If you are using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. By default, a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message, the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting the parameter CACHE_ENQUEUE to 0, as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking, the queue is processed but then stops again after some time. A potential reason is described in SAP Note 1500048 - SMQ2 inbound queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in step 1) or a problem in the infrastructure (e.g. memory problems on Java, or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from the scheduling, and it therefore remains in READY. To solve such cases, the root cause of the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence, the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "monitoring threshold" configured for this queue. SAP Note 1500048 - queues stay in ready (schedule_monitor) and SAP Note 1745298 - SMQR Inbound queue scheduler: identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check whether a queue stays in READY status for a long time while others are processing without any issue. Ensure that SAP Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor value in exceptional cases only (if, for example, the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the alarm bell (Change View) push button once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you will see EO queues in SYSFAIL or RETRY status due to problems during the processing of an individual message. This can be prevented by following the description in chapter Prevent Blocking of EO Queues.


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB, and this is described below. Advanced users may check the performance header of the SOAP message using transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as year, month, day, hours, minutes, seconds and the sub-second fraction; that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the performance header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them.

If PI message packaging is configured, the performance header always reflects the processing time per package. Hence, a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds. More details about this can be found in section PI Message Packaging.
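To make the performance header timestamps easier to work with, the following minimal sketch (plain Java, no PI libraries) parses two such UTC timestamps, computes the duration of one step and divides it by an assumed package size. The timestamp values and the package size are made up for illustration; the trailing digits are interpreted as milliseconds here and may need adjusting if your system writes a different sub-second precision.

    import java.time.Duration;
    import java.time.LocalDateTime;
    import java.time.format.DateTimeFormatter;

    public class PerfHeaderDuration {

        private static final DateTimeFormatter SECONDS_PART =
                DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

        // Parse a performance header timestamp such as 20110409092656165.
        // The first 14 digits are year..seconds (UTC); the trailing digits are
        // the sub-second fraction (interpreted as milliseconds in this sketch).
        static LocalDateTime parse(String ts) {
            LocalDateTime base = LocalDateTime.parse(ts.substring(0, 14), SECONDS_PART);
            long millis = Long.parseLong(ts.substring(14));
            return base.plusNanos(millis * 1_000_000);
        }

        public static void main(String[] args) {
            // Illustrative begin/end timestamps of one pipeline step
            Duration step = Duration.between(parse("20110409092656165"),
                                             parse("20110409092656376"));
            System.out.println("Step duration: " + step.toMillis() + " ms");

            // With PI message packaging the performance header shows the time per
            // package, so divide by the package size for a per-message average.
            int messagesInPackage = 100;
            System.out.println("Per message: "
                    + step.toMillis() / (double) messagesInPackage + " ms");
        }
    }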

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated" and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see, in the lower part of the screen, a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times of the single steps for different measurements, as outlined in chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long when many messages are processed, or also when a single message is processed? This helps you to decide whether the problem is a general design problem (a single message has a long processing step) or whether it is related to the message volume (this process step shows large values only for a high number of messages).

Each step has different follow-up actions, which are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated at runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request directly correlates with the number and complexity of the conditions defined; a sketch illustrating such conditions follows after this list.

No tuning options exist in the system with regard to CBR. The performance of this step can only be improved by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.

o Mapping to determine receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.
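To illustrate why many or complex routing conditions increase the runtime, the sketch below evaluates two XPath conditions against a payload with the standard JDK XPath API, which is similar in spirit to what the routing step has to do for every message. The payload structure, field names and values are invented for this example and do not correspond to any shipped content.

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class CbrConditionSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical order payload - structure invented for illustration
            String payload =
                  "<Order><Header><Country>DE</Country>"
                + "<Priority>HIGH</Priority></Header></Order>";

            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(payload)));

            XPath xpath = XPathFactory.newInstance().newXPath();

            // Two conditions combined with AND, as they could be modeled in a
            // receiver rule; every additional condition is one more evaluation
            // per message, which is why many rules slow this step down.
            boolean isGermany = (Boolean) xpath.evaluate(
                    "/Order/Header/Country = 'DE'", doc, XPathConstants.BOOLEAN);
            boolean isHighPrio = (Boolean) xpath.evaluate(
                    "/Order/Header/Priority = 'HIGH'", doc, XPathConstants.BOOLEAN);

            System.out.println("Route to receiver A: " + (isGermany && isHighPrio));
        }
    }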

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping, you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings that are executed sequentially. In such a case the analysis is more difficult, because it is not obvious which mapping in the sequence is taking a long time.

The normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing the sender/receiver party, service, interface and namespace as well as the source message payload, it is possible to check the target message (after the mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible to test, trace and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with the values reported several days earlier, to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via the RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check if only one interface is affected or if you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps required around 500 seconds for processing. Comparing the data during the incident with the data from the day before allows you to judge whether this might be a problem of the underlying J2EE Engine, as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times in a different time period and verify if it is only a "temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem, but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify if the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application you then have to check if the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace as explained in the appendix section Wily Transaction Trace. A sketch of such a lookup-heavy mapping follows below.
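The following sketch shows what a lookup-heavy Java mapping typically looks like and where the time is spent. The class and method names are those commonly used in the PI 7.1x Java mapping API (com.sap.aii.mapping.api) and should be verified against your release; the JDBC lookup itself (URL, credentials, query) is purely hypothetical. The point is that the remote round trip inside transform() is executed once per message and therefore dominates the mapping runtime.

```java
import com.sap.aii.mapping.api.AbstractTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
import com.sap.aii.mapping.api.TransformationInput;
import com.sap.aii.mapping.api.TransformationOutput;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LookupMapping extends AbstractTransformation {
    @Override
    public void transform(TransformationInput in, TransformationOutput out)
            throws StreamTransformationException {
        try {
            long start = System.currentTimeMillis();
            // Hypothetical lookup: one remote JDBC round trip per message.
            try (Connection con = DriverManager.getConnection("jdbc:sap://backend:30015", "user", "pwd");
                 PreparedStatement stmt = con.prepareStatement("SELECT price FROM prices WHERE material = ?")) {
                stmt.setString(1, "MAT-4711");
                try (ResultSet rs = stmt.executeQuery()) {
                    // ... use the result to enrich the target message ...
                }
            }
            // Writing the lookup duration to the trace makes slow backends visible in the audit log.
            getTrace().addInfo("Lookup took " + (System.currentTimeMillis() - start) + " ms");

            // Copy the payload unchanged in this sketch; a real mapping would transform it here.
            java.io.InputStream is = in.getInputPayload().getInputStream();
            java.io.OutputStream os = out.getOutputPayload().getOutputStream();
            byte[] buf = new byte[8192];
            for (int len; (len = is.read(buf)) != -1; ) {
                os.write(buf, 0, len);
            }
        } catch (Exception e) {
            throw new StreamTransformationException("Mapping failed: " + e.getMessage(), e);
        }
    }
}
```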

If not one but several interfaces are affected, a potential system bottleneck exists; the possible causes are described in the following.

o Not enough resources (registered server programs) available. That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check if there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting the queues with the names XBTO and XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; instead, the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW → Goto → Logged On Clients and filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general, we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).

To check if the J2EE Engine is not available or if the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW → Goto → Trace → Gateway → Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

The increase of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests: each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option is to reduce the number of outbound queues that are concurrently active to solve the bottleneck. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case, you can lower the number of parallel queues for this specific interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so, you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

Call adapter is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so network time can have an influence here.

Looking at the processing time, we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces, the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency: For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). Network latency has to be checked in case of a long call adapter step. In case HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side: Enough resources must be available at the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore, the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages therefore includes the time for the transfer of the request message, the calculation of the corresponding response message at the receiver side, and the transfer back to PI. Consequently, the processing time of a request at the receiving target system must always be analyzed for synchronous messages to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI, XBT1 to XBT9, XBTL for EO messages and XBQI/XB2I, XBQ1/XB21 to XBQ9/XB29 for EOIO messages) process the pipeline steps for the Receiver Determination, the Interface Determination, and the Message Split (and optionally XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in the chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on Business System level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended Receiver Determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM → Integration Engine Configuration → Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, the time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO, XBTA, XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A, XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, the outbound binding, and the call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in the chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2) or described in the section above for PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes is in turn limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently from the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus, it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so a high value for DB_SPLITTER_QUEUEING does not necessarily indicate a problem on PI itself, for example in the case of an ABAP proxy receiver system.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this can be ignored for the moment), the third about 2 seconds, and so on. The 100th message has to wait roughly 100 seconds until it is processed, that is, its value for DB_SPLITTER_QUEUEING is as high as 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUEING value over time (see the short sketch after this list). If you experience this situation, proceed with chapter 4.4.2.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).
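The following minimal Java sketch models the backlog arithmetic from the example above: with a fixed per-message processing time and sequential processing within one queue, the expected DB_SPLITTER_QUEUEING value grows linearly with the queue position. The numbers (1 second per message, 100 messages) are the illustrative values used above, not measurements.

```java
public class QueueBacklogModel {
    public static void main(String[] args) {
        double secondsPerMessage = 1.0; // assumed average time for mapping + call adapter
        int messagesInQueue = 100;      // current backlog of one outbound queue

        // The message at position n waits for the (n - 1) messages ahead of it.
        for (int position = 1; position <= messagesInQueue; position += 33) {
            double waitSeconds = (position - 1) * secondsPerMessage;
            System.out.printf("Message #%d waits ~%.0f s before processing starts%n",
                    position, waitSeconds);
        }
    }
}
```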

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a certain amount of messages per time unit.

In section Adapter Parallelism, the default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). For such adapters it does not make sense to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS; the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus, you have to ensure that the resources available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional entry for LMS is written to the performance header.

The header could look like the one below, indicating that around 25 seconds were spent in the LMS analysis.

When using a higher trace level, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES
This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES
This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept to a minimum, and very deep and complex XPath expressions should be avoided.

If you want to minimize the impact of LMS on the PI message processing time, you can define the extraction method to use an external job. In that case the messages are only indexed after processing, so there is no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable to those responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other Steps Performed in the ABAP Pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. By default, these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation

- Virus Scan

In case one of these steps takes long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated by default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping or routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime of individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable to asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very strict runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used solely in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable to application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package is disassembled and all messages are processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this reduces the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they still send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: Maximum number of messages in a package (default 100)

2) Maximum package size: Sum of the sizes of all messages in kilobytes (default 1 MB)

3) Delay time: Time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could, for example, define a specific packaging configuration for interfaces with very small messages to allow up to 1000 messages per package. Another option is to increase the waiting time (only if latency is not critical) to create bigger packages. A conceptual sketch of how the three thresholds interact follows below.
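The following sketch is only a conceptual model of how the three packaging thresholds interact; it is not SAP's implementation, and the constants are the default values named above (100 messages, 1 MB, no delay).

```java
public class PackagingDecision {
    static final int MAX_COUNT = 100;      // message count threshold (default 100)
    static final long MAX_SIZE_KB = 1024;  // package size threshold (default 1 MB)
    static final long MAX_DELAY_MS = 0;    // delay time (default 0 = no waiting)

    /** Returns true when the collected messages should be dispatched as one package. */
    static boolean flushPackage(int collectedMessages, long collectedSizeKb, long oldestWaitMs) {
        return collectedMessages >= MAX_COUNT
                || collectedSizeKb >= MAX_SIZE_KB
                || oldestWaitMs >= MAX_DELAY_MS;
    }

    public static void main(String[] args) {
        // With the default delay of 0 the package is sent as soon as the queue is processed,
        // regardless of how few messages have been collected so far.
        System.out.println(flushPackage(12, 300, 0)); // true: no waiting time configured
        // With a delay of e.g. 5000 ms configured, small packages would wait for more messages.
    }
}
```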

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and the configuration of message packaging. More information is also available at http://help.sap.com → SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using quality of service Exactly Once (EO) are independent of each other. In case of an error in one message, there is no business reason to stop the processing of other messages. But exactly this happens when an EO queue goes into error due to a processing error of a single message. The queue is then retried automatically in configurable intervals. This retry delays all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues no longer go into SYSFAIL status. These Notes are recommended for all PI systems, and the behavior is active by default on PI 7.3 systems. By specifying a receiver ID as a sub-parameter, this behavior can also be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend), the queue then goes into SYSFAIL so that not too many messages go into error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages across queues. By default, in all PI versions, messages are assigned to the different queues randomly. In case of different runtimes of LUWs, caused e.g. by different message sizes or different runtimes in mappings, this can lead to an uneven distribution of messages across the queues. This can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. Hence, it tries to achieve an equal balancing during inbound processing. A queue which has a higher backlog will get fewer new messages assigned; queues with fewer entries will get more messages assigned. It is therefore different from the old BALANCING parameter (of category TUNING), which was used to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (the default value), then the messages are distributed randomly across the available queues. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the most filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements for an even distribution. For a higher-volume system, a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution is checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue is then written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on the criticality of a delay caused by a backlog. The sketch below illustrates the selection logic.
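A minimal Java sketch of the selection logic described above, under the assumption that the fill levels have already been read from the database; the queue names and counts are the example values from the text, and this is only an illustration of the documented behavior, not SAP's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueueBalancingDemo {
    public static void main(String[] args) {
        int balancingSelectPercent = 20; // EO_QUEUE_BALANCING_SELECT

        // Fill levels as read after every n-th message (EO_QUEUE_BALANCING_READ).
        Map<String, Integer> fillLevels = new LinkedHashMap<>();
        fillLevels.put("XBTO__A", 500);
        fillLevels.put("XBTO__B", 150);
        fillLevels.put("XBTO__C", 50);

        int max = fillLevels.values().stream().max(Integer::compare).orElse(0);

        // A queue is eligible if its fill level relative to the fullest queue is below the threshold.
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Integer> e : fillLevels.entrySet()) {
            double relativePercent = 100.0 * e.getValue() / max;
            if (relativePercent < balancingSelectPercent) {
                eligible.add(e.getKey());
            }
        }
        // If no queue is below the threshold, all queues are considered.
        if (eligible.isEmpty()) {
            eligible.addAll(fillLevels.keySet());
        }
        System.out.println("Queues considered for new messages: " + eligible);
        // Prints: Queues considered for new messages: [XBTO__C]
    }
}
```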


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have many queues containing only one message, since this causes a high overhead on the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data, this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and causes significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM → Configure Filter for Queue Prioritization → Number of EOIO queues.

During runtime a new set of queues with the name XB2* will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus, more messages use the same EOIO queue, so PI message packaging works better and the reloading of the queues by the QIN scheduler also shows much better performance.

In case of errors, the affected messages are removed from the XB2* queues and moved to the standard XBQ* queues. All other messages for the same serialization context are moved to the XBQ* queue directly to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues are not blocked and messages for other serialization contexts are not delayed.


4.9 Tuning the ABAP IDoc Adapter

The IDoc adapter very often deals with very high message volume, so tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historical backlogs on the tRFC layer.

In order to control the resources used when sending IDocs from the sender system to PI, or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max Conn or Max Runtime values in SMQS you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This mitigates the risk of system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had the option of packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus, only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring message packaging, since this helps transferring data for the IDoc adapter as well as for the ABAP proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in the report RSEOUT00, multiple IDocs are bundled in one tRFC call. On the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the SDN blog IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this, a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously via report RBDAPP01 in a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process rolls out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, which can lead to backlogs on the IDOC_AAE adapter.

The code for posting the application data can be very complex, and this operation can therefore take a long time, consuming unnecessary resources on the sender side as well. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, such as high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible based on the available resources on the receiver system, without the need to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps you analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than might initially be expected, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user WF-BATCH and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other. Rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate, for example, to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and can be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well. Check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes. Use chapter 5.4 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log in to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example, an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds, the WF-BATCH user used up 36 seconds.

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparison with the other users (as well as the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PE_WS for each workflow). If there was a high message throughput for this workflow, then a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configurable transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

In many scenarios within BPE, inbound processing takes up the largest share of the processing time.

Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can also increase the runtime of individual messages (latency) due to the delay in the packaging process. The sending of packages can be triggered when a package exceeds a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvement will be.

Tests have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example, a File adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example, a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example, JMS_http://sap.com/xi/XI/SystemReceive for the JMS asynchronous receive queue).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly First-In-First-Out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system through a J2EE Engine adapter (for example, File) and leaves the PI system through a J2EE Engine adapter (for example, JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is, by default, not persisted in the database for successful messages, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem or bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, for such adapters, increasing the number of threads working on a queue in the messaging system will not solve a performance problem or bottleneck.

There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve a better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side these adapters use the Adapter Framework scheduler, which assigns a server node that does the polling at the specified interval. Thus, only one server node in the J2EE cluster does the polling, and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Even if it were possible, since the channels would be executing the same SELECT statement on the database or picking up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side the adapters work sequentially on each server node by default. For example, only one UPDATE statement can be executed for JDBC per communication channel (independent of the number of consumer threads configured in the Messaging System), and all other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow a better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter by default uses a push mechanism on the PI sender side. This means the data is pushed by the sending MQ provider. By default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI communication channel, and PI will be the initiator of the communication. Here, too, the Adapter Framework scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore, scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts of the message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore, no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitation in the number of requests it can execute in parallel; the limiting factor is the number of FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning SOAP sender adapter. On the receiver side the parallelism depends on the number of threads defined in the Messaging System and on the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated from the application thread pool directly and are therefore not available for any other tasks; the number of initial threads should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

Increasing the maximum connections should also be done carefully, since these threads are taken from the J2EE application thread pool; a very high value can cause a bottleneck on the J2EE Engine and therefore major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.
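
A hedged illustration (the figures are assumptions, not a general recommendation): a receiver RFC channel could be configured with Initial Connections = 1 and Maximum Connections = 10. This keeps the permanently allocated application threads minimal while still allowing up to 10 parallel calls per server node during peak load, well below the documented limit of 50.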

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side the parallelization depends on the configuration mode chosen: in Manual Mode the adapter works sequentially per server node; for channels in "Default Mode" it depends on the configuration of the inbound Resource Adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10 (see the example below).

The receiver side of the Java IDoc adapter works in parallel per default.
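
For example (assumed figures for illustration only): with MaxReaderThreadCount = 5 and two Java server nodes, at most 5 inbound IDoc calls are processed in parallel per server node for all Default Mode sender channels together, i.e. at most 10 across the cluster.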

The table below gives a summary of the parallelism for the different adapter types


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, this is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work; this is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log marks the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the FCA threads available for HTTP processing; the FCA threads are discussed in more detail in FCA Server Threads.

Per default all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/XISOAPAdapter/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load for such an interface, the other SOAP sender interfaces would then not be affected by a shortage of FCA threads.


To use this new feature you can specify on the sender system the following two entry points

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around when compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time either the receive queue (asynchronous messages) or the request queue (synchronous messages). The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is not persisted any longer for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc adapter is summarized in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Similar to all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations of the next chapter Messaging System Bottleneck are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP will be processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 20,000 segments for each IDoc would thus consume roughly 500 MB during processing. In such cases it is important to, for example, lower the package size and/or use large message queues as outlined in chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to those of the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important

In the ABAP stack, packaging is active per default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the DIA work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO it is important to evaluate packaging, to avoid any negative impact on the receiving ECC due to missing packaging.

Packaging on Java works in general differently than on ABAP. While in ABAP the aggregation is done on the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this, the package is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done for each server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When enabling message packaging, the message status will stay "Delivering" throughout all steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message waited almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in Note 1913972:

o messaging.system.msgcollector.enabled: Enable or disable packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk (default 60 seconds)

o messaging.system.msgcollector.maxMemTotal: Maximum memory that can be used by the message collector

o messaging.system.msgcollector.poolSize: Number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, or the configuration of the IDoc posting on the ERP receiver side (described in chapter IDoc posting on receiver side). Tuning of these threads might therefore be very important. These threads can unfortunately not yet be monitored with any PI tool or Wily Introscope.


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below. A sketch of a possible global configuration follows.
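
A minimal sketch of a global packaging configuration in the Messaging System service (apart from the documented 60-second default for bulkTimeout, the values are assumptions for illustration only, not recommendations):

messaging.system.msgcollector.enabled = true
messaging.system.msgcollector.bulkTimeout = 60
messaging.system.msgcollector.maxMemPerBulk = <size limit per bulk>
messaging.system.msgcollector.maxMemTotal = <overall memory limit for the collector>
messaging.system.msgcollector.poolSize = 5

After such a change, monitor the BULK_EXECUTOR threads and the receiver system as described above before adjusting the values further.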

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While for the SOAP adapter in XI 3.0 protocol you can define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. Per default a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework Scheduler had several performance and reliability issues in a cluster environment, especially in systems having many Java server nodes and many polling communication channels. Based on this and on the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler you have to set the parameter scheduler.relocMode of the Adapter Framework


service to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow for better balancing of the incoming load across the available server nodes if the files are coming in at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low; a value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.
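
As a simple worked example based on the values above: with a polling interval of 60 seconds and scheduler.relocMode = -15, a rebalancing of the channel is considered at most after every 15th poll, i.e. roughly once every 15 minutes.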

If many files are put into the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence a proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or into different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon under Monitoring -> Background Job Processing Monitor -> Adapter Framework Scheduler Jobs. There you can check on which server node the Communication Channel is polling (Status = "Active"). You can also see the time the channel polled the last time and when it will poll the next time.

You cannot influence the server node on which the channel is polling; this is determined via the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node, by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler:

http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs only on one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in the brackets [60000] determines the polling interval in ms, in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": Not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System can usually be closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the messaging system and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the messaging system is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the messaging system by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring -> Adapter Engine, press the Engine Status button, and choose tab "Additional Data". This will open the page below showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious it is better to use Wily Introscope since it allows easy graphical analysis of the queues and thread usage of the messaging system across Java server nodes

The starting point in Wily Introscope is usually the PI Triage view showing all the backlogs in the queues. In the screenshot below a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it only forwards messages to the adapter queue if free adapter-specific consumer threads are available. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size you can directly jump to a more detailed view, where you can see that the file adapter was causing the backlog.


To see the consumer thread usage you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were in use during the time of the backlog. Thus increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the messaging system does not have enough resources as in the case above increase the number of consumers for the queue in question Use the results of the previous check to decide which adapter queues need more consumers

The adapter-specific queues in the messaging system have to be configured in the NWA using service "XPI Service: AF Core" and property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties

o Not all adapters use the property above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads; that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the messaging system is highly critical for synchronous scenarios due to the timeouts that might occur. Therefore the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues

6.2.1 Messaging System between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call Queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps therefore indicates a bottleneck of consumer threads on the sender queues in the Messaging System.
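
As a simple illustration (the timestamps are invented for the example): if "Message successfully put into the queue" is logged at 10:15:02 and "The message was successfully retrieved from the send queue" at 10:15:27, the message waited roughly 25 seconds for a free consumer thread, which points to a backlog in the Messaging System rather than in the adapter itself.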

Instead of checking the latency of single messages we recommend using Wily Introscope to monitor the

queue behavior as shown above

Asynchronous messages only use one thread for processing in the messaging system. Multiple threads are used for synchronous messages: the adapter thread puts the message into the messaging system queue and waits until the messaging system delivers the response; the adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time that the message waited in the messaging system for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible for you to define high, medium, and low priority processing at interface level. Based on the priority, the Dispatcher Queue of the messaging system (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5.
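
To illustrate the weighting: if messages of all three priorities are waiting and free consumer threads become available for 100 messages, roughly 75 high-priority, 20 medium-priority, and 5 low-priority messages would be forwarded from the dispatcher queue to the adapter-specific queues.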

Based on this approach you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon under Configuration and Administration -> Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily as shown below

You can find more details on configuring prioritization within the AAE in the SAP Online Help: navigate to SAP NetWeaver -> SAP NetWeaver PI -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Process Integration Monitoring -> Component Monitoring -> Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one Communication Channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or a remote system with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Based on this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in the NWA in service XPI Service: Messaging System. With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters; it should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads on each server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20, as sketched below. With this configuration it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
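
A minimal sketch of this combination (the property names are as described above; the values are the example figures from the text, not a general recommendation):

In service XPI Service: Messaging System: messaging.system.queueParallelism.maxReceivers = 5

In service XPI Service: AF Core, property messaging.connectionDefinition, JDBC-specific property set: Recv.maxConsumers=20

With these assumed values, a single backlogged JDBC interface can occupy at most 5 of the 20 receive consumer threads per server node, leaving free threads for other JDBC interfaces.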

In the screenshot below you can see the impact of the maxReceivers parameter when a backlog for one interface occurs: even though more free SOAP threads are available, they are not consumed. Hence the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue where message backlogs occur. As mentioned earlier, a backlog should usually occur in the Dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog there will always be free consumer threads, and therefore the dispatcher queue can dispatch the messages to the adapter-specific queue. Hence the backlog will appear in the adapter-specific queue, so that message prioritization no longer works properly.


Per default the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and therefore additional restrictions in the number of available threads can be very critical. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. If you have many high-volume synchronous scenarios with different priorities that run in parallel, it might be advisable to limit the threads each interface can use. To do so you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv ICoAll, as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the specification of the maximum parallelization on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is considered. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so it could for example be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The interface pattern configured can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and the overhead is therefore avoided. For interfaces with good data quality and high message throughput this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in Module Processing, then you must analyze the runtime of the modules used in the Communication Channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes, for example to transform an EDI message before sending it to the partner or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules there is no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that inserts the data into the messaging system queues.

In the audit log you will get a first impression of the duration of the module In the example above you can

see that the SimpleWaitModule requires 4 seconds of processing time

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long running time, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module prints additional information to the audit log so that such steps can be detected. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.

6.4 Java only scenarios - Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using this object, the steps executed so far in the ABAP pipeline (Receiver Determination, Interface Determination, and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java only scenarios

The major advantage of AAE processing is the reduced overhead, because the context switches between ABAP and Java are avoided. Thus the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. The best tuning option is therefore to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP-internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements the following statements can be made

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, and slow (receiver) applications reduce the throughput and therefore, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is observed for small payloads (comparison of 10k, 50k, and 500k) and asynchronous messages.

6.4.2 Message Flow of Java only scenarios

All the Java-based tuning options mentioned in chapter Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail.

1) Enter the JMS sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the JMS send queue of the messaging system

4) Message is taken by JMS send consumer thread

a) No message split used

In this case the JMS consumer thread will execute all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and will then also transfer the message to the backend system (the receiving mail server in our example). Thus all the steps are executed by one thread only.

b) Message split used (1:n message relation)

In this case there is a context switch. The thread taking the message out of the queue will process the message up to the message split step. It will create the new messages and put them into the Send queue again, from where they will be taken by different threads (which will then map the child messages and finally send them to the receiving system).

As we can see in this example, for Integrated Configuration only one thread executes all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, then you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java only synchronous SOAP to SOAP scenario

In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java only scenarios

Like classical ABAP-based scenarios, Java only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since the Java only interfaces use only send queues, the restriction of consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by Single Slow/Hanging Receiver Interface is no solution.

Based on that, an additional property messaging.system.queueParallelism.queueTypes of service SAP XI AF MESSAGING was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization is usually highly critical for synchronous interfaces; we therefore generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv IcoAsync only.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you are able to do the configuration on interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish Staging (versioning) and Logging An overview is given

below

For details about the configuration please refer to the SAP online help Saving Message Versions

Similar to the LOGGING on the ABAP Integration Server persisting several versions of the Java message

can cause a high overhead on the DB and can cause a decrease in performance

The persistence steps can be seen directly in the audit log of a message

Also the Message Monitor shows the persisted versions


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. Therefore you have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver version 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs this can delay the overall message processing of the interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates as described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard in order to execute this task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP loadbalancing to initial setup).


For the example given above we could see a much better load balancing after the new load balancing rules

were implemented This can be seen in the following screenshot

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this is discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output would display a saw-tooth pattern; that is, the memory usage increases over time but goes down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in to be evaluated. This leads to very long runtimes for GCs; GCs of more than 15 minutes have been observed on swapping systems.


Different tools exist for the Garbage Collection Analysis

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section. In the Common Tasks list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found via J2EE Overview -> J2EE GC Overview and shows important KPIs for the GC, such as the count of GCs or the duration for each interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or a direct connection to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from the ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT, for example to 5000, to direct all large messages to dedicated XBTL or XBTM queues (see the note after this list).

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the Messaging System.
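
Note on the unit (an assumption worth verifying in your release): EO_MSG_SIZE_LIMIT is maintained in SXMB_ADM in kilobytes, so a value of 5000 would route messages larger than roughly 5 MB to the dedicated large message queues.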

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options.

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer processing times for GCs, which in turn can affect the PI application negatively. Increasing the maximum heap size on the J2EE Engine will automatically adapt the new area of the heap due to a dynamic configuration (in newer J2EE versions this is per default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead for J2EE cluster communication, it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business critical scenarios

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o The following options exist for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage on a PI engine Different views exist for Application and System Threads as shown in the screenshot below

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found under Availability and Performance Management -> Resource Monitoring -> History Reports. An example of the application thread usage is shown below.


Check whether the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13

Navigate to AS Java -> Threads and check for any threads in red status (more than 20 seconds running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java -> Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC)

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA server threads. The FCA server threads are responsible for receiving HTTP calls on the Java side (after the Java dispatcher is no longer available). FCA threads also use a thread pool. Fifteen FCA threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA server threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call has a long duration due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels, as described in Tuning SOAP sender adapter).

o If response times for a specific entry point are high (more than 15 seconds), additional FCA threads will be spawned to be available for parallel incoming HTTP requests using different entry points only. This ensures that in case of problems with one application not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount may be used in the J2EE Engine.

There is currently no standard monitor available for FCA Threads except the thread view in the SAP

Management Console A new dashboard will be integrated into Wily Introscope soon and will also show the

FCA Server Threads that are in use

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default for an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.

7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to speak the native PI protocol (XI-SOAP) so that no costly transformation is necessary.

In general, the ABAP proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation because the applications and use cases of the ABAP proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which must not take too much time and which is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with parameter EO_INBOUND_PARALLEL and sub parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).

7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues will be used, and tuning of these queues on interface level becomes possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be further improved; messages for other, more business-critical interfaces would eventually be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system the processing of the queues calls the central PI hub. In general the processing time there should be fast, but in the case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced (a purely illustrative configuration example is shown after this list):

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter SENDER/SENDER_BACK: should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter RECEIVER: should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <receiver ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <receiver ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.
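A purely illustrative configuration in SXMB_ADM for the receiver side could look as follows; the exact format of the interface-specific sub parameter, the category, and suitable values depend on your scenario and are described in the Notes mentioned above:

    Parameter                      Sub parameter                            Value
    EO_QUEUE_PREFIX_INTERFACE      RECEIVER                                 1
    EO_INBOUND_PARALLEL_RECEIVER   <none>                                   5
    EO_INBOUND_PARALLEL_RECEIVER   <ID of a business-critical interface>    10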

This new feature also replaces the previously existing prioritization since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are necessary for only a small payload. The larger the message payload, the smaller the relative overhead of the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI and not to the size of the file or IDoc being sent to PI. You can use the runtime header on the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), the MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 kB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.

Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In the case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

If the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to, for example, 5000 to direct all messages larger than 5 MB to dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold will be processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3 to avoid overloading the Java memory with parallel large message requests.

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited in order to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size; each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. By default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB, and message F 40 MB. Message A is smaller than the permit size; it is therefore not considered large and can be processed immediately. Message B requires 1 permit and message C requires 5. Since enough permits are available, processing starts for both (status DLNG). Message D would require all 10 available permits. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message will be put into error status (NDLV) because it exceeds the maximum number of defined permits; in that case the message has to be restarted manually. Message E requires 5 permits and can also not be scheduled yet. But since there are 4 permits left, message F (4 permits) is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.

The example above shows the potential delay a large message could face due to waiting for permits. The assumption, however, is that large messages are not time-critical, and therefore an additional delay is less critical than a potential overload of the system.
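The permit accounting described above can be summarized in a small sketch. The following Java class is purely illustrative and only mimics the behavior explained in the example; the class and method names are invented for this sketch, and it is not the actual Messaging System coding.

    /** Illustrative model of the permit-based scheduling of large messages (not the actual SAP implementation). */
    class PermitScheduler {

        private final long permitSizeBytes;   // default in the Messaging System: 10 MB
        private final int totalPermits;       // default: 10
        private int freePermits;

        PermitScheduler(long permitSizeBytes, int totalPermits) {
            this.permitSizeBytes = permitSizeBytes;
            this.totalPermits = totalPermits;
            this.freePermits = totalPermits;
        }

        /** Permits a message of the given size needs; 0 means it is not considered a large message. */
        int permitsNeeded(long messageSizeBytes) {
            if (messageSizeBytes <= permitSizeBytes) {
                return 0;
            }
            return (int) Math.ceil((double) messageSizeBytes / permitSizeBytes);
        }

        /** Returns true if the message may start processing now (DLNG), false if it has to wait (To Be Delivered). */
        synchronized boolean trySchedule(long messageSizeBytes) {
            int needed = permitsNeeded(messageSizeBytes);
            // A message needing more than all configured permits can at most wait for all of them;
            // with blacklisting enabled such a message would instead be put to error status (NDLV).
            needed = Math.min(needed, totalPermits);
            if (needed > freePermits) {
                return false;
            }
            freePermits -= needed;
            return true;
        }

        /** Releases the permits once the message has been processed. */
        synchronized void release(long messageSizeBytes) {
            int used = Math.min(permitsNeeded(messageSizeBytes), totalPermits);
            freePermits = Math.min(totalPermits, freePermits + used);
        }
    }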

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this only happens after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (for example, the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have a very complex extended receiver determination or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON → Monitoring → Adapter Engine Status. The number of threads shown corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.

9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. The hardware capacity therefore has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).
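If no suitable OS tool is at hand, even a trivial Java sketch like the following can sample the overall system CPU load while the interface is running. It is purely illustrative: the com.sun.management API may not be available on every JVM, and OS tools remain the preferred approach.

    import com.sun.management.OperatingSystemMXBean;
    import java.lang.management.ManagementFactory;

    /** Samples the overall system CPU load once per second (illustrative only). */
    public class CpuSampler {
        public static void main(String[] args) throws InterruptedException {
            OperatingSystemMXBean os =
                (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
            for (int i = 0; i < 60; i++) {                 // sample for one minute
                double load = os.getSystemCpuLoad();       // 0.0 .. 1.0, or < 0 if unavailable
                System.out.printf("system CPU load: %.0f%%%n", load * 100);
                Thread.sleep(1000);
            }
        }
    }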

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, one facing a temporary CPU overload and the other a permanent one.

From NetWeaver 7.3 on, the NWA also offers a view of the CPU activity via Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own report based on the "CPU utilization" data as shown in the screenshot.

You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process for each available CPU, the CPU resources are sufficient. Once there is an average of around three processes for each available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View → TOP CPU: Which thread uses the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing, for which SAP provides the Quick Sizer in the Service Marketplace (http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about reducing the load in this component. For example, if the J2EE Engine is the largest consumer, then it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapter to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it directly influences the Java GC behavior. Paging should therefore be avoided in any case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine; if it does, this can be seen in long GC times as described in chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6 respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section; the official documentation can be found in the Online Help. Monitoring the database via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average, and maximum processing time of the individual SQL statements and can therefore identify the expensive statements on your system.

By default the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (for example, during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.

Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads; the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on at least 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user calls to recursive calls: good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: if this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools → Administration → Monitor → Performance → Database → Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload; to ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: the default values displayed in the section Server Engine are relative values. To display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu → Performance → Database; a snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB), with min server memory (MB) < max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and catalog and package cache information, go to transaction ST04 and choose Performance → Database, section Buffer Pool (or section Cache respectively).

Buffer Pools Number: number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance → Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100. A small helper computing this formula is shown after this list.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.
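As a small illustration of the hit ratio formulas above, the following helper computes them from logical and physical read counters (purely illustrative, not an SAP tool):

    /** Illustrative helper for the buffer pool hit ratio formulas given above. */
    public class HitRatio {

        static double hitRatio(long logicalReads, long physicalReads) {
            if (logicalReads == 0) {
                return 0.0;
            }
            return (logicalReads - physicalReads) / (double) logicalReads * 100.0;
        }

        public static void main(String[] args) {
            // example: 1,000,000 logical reads and 20,000 physical reads give a hit ratio of 98.0%
            System.out.printf("data hit ratio: %.1f%%%n", hitRatio(1_000_000, 20_000));
        }
    }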

Catalog cache size: maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: maximum size of the package cache that is used to maintain the most frequently accessed sections of the packages.

Package cache quality: ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.

With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP system allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying missing indexes.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (releases lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out into the data cache rather than onto the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries must be written from the LOG_QUEUE to the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, then all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. In the case of LOG_QUEUE_OVERFLOWS (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow) you need to increase the LOG_IO_QUEUE parameter.

o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The bottleneck analysis function is available in transaction ST04 under Problem Analyses → Performance → Database Analyzer → Bottlenecks. You can use this function to activate the corresponding analysis tool, dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute; in the following screen simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR and SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSCLUR2 and SXMSCLUP2 as well.

SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool (for example SQL*Plus for Oracle) for the database tables of the J2EE schema; a small JDBC sketch is shown after this list. The main tables that could be affected by growth are:

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
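The following small JDBC sketch illustrates how such a row count check could be scripted for the Java schema tables. It is only an example: it assumes an Oracle database, a Java schema user such as SAPSR3DB (the actual schema name, password, and connection data depend on your installation), and the Oracle JDBC driver on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    /** Illustrative row count check for the PI Java schema tables (adjust URL, user, and schema). */
    public class PiTableCheck {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:oracle:thin:@//dbhost:1521/PID";   // hypothetical connection data
            String[] tables = { "BC_MSG", "BC_MSG_AUDIT", "BC_MSG_LOG_VERSION",
                                "XI_IDOC_IN_MSG", "XI_IDOC_OUT_MSG" };
            try (Connection con = DriverManager.getConnection(url, "SAPSR3DB", "secret");
                 Statement stmt = con.createStatement()) {
                for (String table : tables) {
                    try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
                        rs.next();
                        System.out.printf("%-20s %,d rows%n", table, rs.getLong(1));
                    }
                }
            }
        }
    }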

10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem; if so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration → Specific Configuration, and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default values, which are the values recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace Level → Set.

Start transaction SMGW, navigate to Goto → Parameters → Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace → Gateway → Reduce Level (or Increase Level respectively).

Start transaction SM59 and check the following RFC destinations for the flag "Trace" on the tab Special Options; remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and similar; check these as well.

10.2 Business Process Engine

You only have to check the event trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the event trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) → Logs and Traces and choose Log Configuration. In the Show drop-down list box choose Tracing Locations. Select the root location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting → Logviewer (direct link: /nwa/links). Open the view "Developer Traces" and check whether you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing it (for example, a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.

An aggregated view of Java exceptions (by number, date, instance, server node, and so on) is reported and available in the SAP Solution Manager Root Cause Analysis Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

As of SAP NetWeaver PI 7.1 the audit log is by default not persisted in the database for successful messages, in order to avoid performance overhead. The audit log is therefore only available in the cache for a limited period of time (depending on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" to false in the service XPI Service: Messaging System. Details can be found in SAP Note 1314974 - PI 7.1 AF: Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting to avoid performance problems caused by the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto → Trace File → Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto → Trace → Gateway → Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval, and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval, and search for PI-related dumps. In general the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC trace

In the PI work directory, check all files that begin with dev_rfc or dev_w for errors. A small sketch for scanning these files is shown after this list.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default trace (defaultTrace) files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the LogViewer service of the NWA. This is available via Problem Management → Logs and Traces → Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.
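As a purely illustrative aid, the following Java sketch scans a work directory for dev_rfc* and dev_w* files and prints lines containing typical error markers. The directory path and the error markers are assumptions that need to be adapted to your installation; it is not an SAP tool.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    /** Scans dev_rfc* and dev_w* files in a work directory for error markers (illustrative only). */
    public class WorkDirScan {
        public static void main(String[] args) throws IOException {
            // hypothetical path; usually /usr/sap/<SID>/<instance>/work
            Path workDir = Paths.get(args.length > 0 ? args[0] : "/usr/sap/PID/DVEBMGS00/work");
            try (Stream<Path> files = Files.list(workDir)) {
                files.filter(p -> {
                    String name = p.getFileName().toString();
                    return name.startsWith("dev_rfc") || name.startsWith("dev_w");
                }).forEach(WorkDirScan::scanFile);
            }
        }

        private static void scanFile(Path file) {
            try (Stream<String> lines = Files.lines(file)) {
                lines.filter(l -> l.contains("ERROR") || l.contains("***LOG"))
                     .forEach(l -> System.out.println(file.getFileName() + ": " + l));
            } catch (IOException | java.io.UncheckedIOException e) {
                System.err.println("Could not read " + file + ": " + e.getMessage());
            }
        }
    }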

APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during mapping or module processing. The transaction trace allows you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or whether it is caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the transaction trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right; in the example below this is around 48 seconds. From top to bottom we can see the call stack of the thread. In general we are interested in long-running blocks at the bottom of the trace view: a long-running block at the bottom means that this is the lowest-level instrumented coding, and it is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements; this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see whether the high number of database calls can be combined into one call.

Another case that is often seen is that a lookup via JDBC to a remote database or via RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting, but the tool can also be used for troubleshooting performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement it allows you to take multiple thread dumps at specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

In addition, the tool allows you to do JVM profiling, either by doing JVM performance tracing or JVM memory allocation tracing. This can help you to understand in detail which steps of the processing take a long time. As output the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE engine and are therefore not recommended for use in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

www.sap.com

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

2

DOCUMENT HISTORY

Before you start planning make sure you have the latest version of this document You can find the link to

the latest version via SAP Note 894509 ndash PI Performance Check

The following table provides an overview on the most important changes to this document

Version Date Description

30 March 2014 Reason for new version Include latest improvements in PI 731 and

PI 74 and enhanced tuning options for Java only runtime Many

sections were updated Major changes are

Additional symptom (blacklisting) for qRFC queues in READY status

Long processing time for Lean Message Search

IDoc posting configuration on receiver side

Tuning of SOAP sender adapter

More information about IDOC_AAE adapter tuning

Packaging for Java IDoc and Java Proxy adapter

New Adapter Framework Scheduler for polling adapters

new performance monitor for Adapter Engine

Enhancements of the maxReceiver parameter for individual interfaces

Include latest information for staging and logging (restructured to be part of Java only chapter)

FCA Thread tuning for incoming HTTP requests

Enhancements in the Proxy framework on senderreceiver ERPs

Large message queues on PI Adapter Engine

Generic J2EE database monitoring in NWA

Appendix section XPI_Inspector

20 October 2011 Reason for new version Adaption to PI 73

Added document history

Added options and description about AEX

Added new adapters updated existing adapter parallelization

Description of additional monitors (Java performance monitor and new ccBPM monitor)

General review of entire document with corrections

More details and performance measurements for Java Only

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

3

(ICO) scenarios

Additional new chapters

Prevent blocking of EO queues

Avoid uneven backlogs with queue balancing

Reduce the number of EOIO queues

Adapter Framework Scheduler

Avoid blocking of Java only scenarios

J2EE HTTP load balancing

Persistence of Audit Log information in PI 710 and higher

Logging Staging on the AAE (PI 73 and higher)

11 December 2009 Reason for new version Adaption to PI 71

General review of entire document with corrections

Additional new chapters

Tuning the IDoc adapter

Message Prioritization on the ABAP stack

Prioritization in the Messaging System

Avoid blocking caused by single slowhanging receiver interface

Performance of Module Processing

Advanced Adapter Engine Integrated Configuration

ABAP Proxy system tuning

Wily Transaction Trace

10 October 2007 Initial document version released for XI 30 and PI 70

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

4

TABLE OF CONTENTS

1 INTRODUCTION 6

2 WORKING WITH THIS DOCUMENT 10

3 DETERMINING THE BOTTLENECK 12 31 Integration Engine Processing Time 12 32 Adapter Engine Processing Time 13 321 Adapter Engine Performance monitor in PI 731 and higher 15 33 Processing Time in the Business Process Engine 16

4 ANALYZING THE INTEGRATION ENGINE 20 41 Work Process Overview (SM50SM66) 21 42 qRFC Resources (SARFC) 22 43 Parallelization of PI qRFC Queues 23 44 Analyzing the runtime of PI pipeline steps 29 441 Long Processing Times for ldquoPLSRV_RECEIVER_ DETERMINATIONrdquo PLSRV_INTERFACE_DETERMINATION 30 442 Long Processing Times for ldquoPLSRV_MAPPING_REQUESTrdquo 30 443 Long Processing Times for ldquoPLSRV_CALL_ADAPTERrdquo 33 444 Long Processing Times for ldquoDB_ENTRY_QUEUEINGrdquo 34 445 Long Processing Times for ldquoDB_SPLITTER_QUEUEINGrdquo 34 446 Long Processing Times for ldquoLMS_EXTRACTIONrdquo 36 447 Other step performed in the ABAP pipeline 36 45 PI Message Packaging for Integration Engine 37 46 Prevent blocking of EO queues 38 47 Avoid uneven backlogs with queue balancing 38 48 Reduce the number of parallel EOIO queues 40 49 Tuning the ABAP IDoc Adapter 41 491 ABAP basis tuning 41 492 Packaging on sender and receiver side 42 493 Configuration of IDoc posting on receiver side 42 410 Message Prioritization on the ABAP Stack 44

5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE) 45 51 Work Process Overview 45 52 Duration of Integration Process Steps 45 53 Advanced Analysis Load Created by the Business Process Engine (ST03N) 47 54 Database Reorganization 47 55 Queuing and ccBPM (or increasing Parallelism for ccBPM Processes) 48 56 Message Packaging in BPE 48

6 ANALYZING THE (ADVANCED) ADAPTER ENGINE 50 61 Adapter Performance Problem 51 611 Adapter Parallelism 51 612 Sender Adapter 54 613 Receiver Adapter 55 614 IDoc_AAE adapter tuning 56 615 Packaging for Proxy (SOAP in XI 30 protocol) and Java IDoc adapter 56 616 Adapter Framework Scheduler 58 62 Messaging System Bottleneck 60 621 Messaging System in between AFW Sender Adapter and Integration Server (Outbound) 63 622 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound) 64 623 Interface Prioritization in the Messaging System 64 624 Avoid Blocking Caused by Single SlowHanging Receiver Interface 65 625 Overhead based on interface pattern being used 67 63 Performance of Module Processing 68 64 Java only scenarios Integrated Configuration objects 69

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

5

641 General performance gain when using Java only scenarios 69 642 Message Flow of Java only scenarios 71 643 Avoid blocking of Java only scenarios 73 644 Logging Staging on the AAE (PI 73 and higher) 73 65 J2EE HTTP load balancing 75 66 J2EE Engine Bottleneck 76 661 Java Memory 76 662 Java System and Application Threads 80 663 FCA Server Threads 83 664 Switch Off VMC 84

7 ABAP PROXY SYSTEM TUNING 85 71 New enhancements in Proxy queuing 86

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS 87 81 Large message queues on PI ABAP 88 82 Large message queues on PI Adapter Engine 88

9 GENERAL HARDWARE BOTTLENECK 90 91 Monitoring CPU Capacity 90 92 Monitoring Memory and Paging Activity 91 93 Monitoring the Database 91 931 Generic J2EE database monitoring in NWA 92 932 Monitoring Database (Oracle) 93 933 Monitoring Database (MS SQL) 94 934 Monitoring Database (DB2) 95 935 Monitoring Database (MaxDB SAP DB) 97 94 Monitoring Database Tables 99

10 TRACES LOGS AND MONITORING DECREASING PERFORMANCE 101 101 Integration Engine 101 102 Business Process Engine 102 103 Adapter Framework 102 1031 Persistence of Audit Log information in PI 710 and higher 103

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS 104

APPENDIX A 105 A1 Wily Introscope Transaction Trace 105 A2 XPI inspector for troubleshooting and performance analysis 106

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

6

1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components Integration Builder

(including Enterprise Service Repository (ESR) Service Registry (SR) and Integration Directory) Integration

Server (including Integration Engine Business Process Engine and Adapter Engine) Runtime Workbench

(RWB) and System Landscape Directory (SLD) The SLD in contrast to the other components may not

necessarily be part of the PI system in your system landscape because it is used by several SAP NetWeaver

products (however it will be accessed by the PI system regularly) Additional components in your PI

landscape might be the Partner Connectivity Kit (PCK) a J2SE Adapter Engine one or several non-central

Advanced Adapter Engines or an Advanced Adapter Engine Extended An overview is given in the graphic

below The communication and accessibility of these components can be checked using the PI Readiness

Check (SAP Note 817920)

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD The IS in this

graphic stands for the Web Application Server 70 or higher In the classic environment PI is installed as a

double stack an ABAP and a Java stack (J2EE Engine) The ABAP stack is responsible for pipeline

processing in the Integration Engine (Receiver Determination Interface Determination and so on) and the

processing of integration processes in the Business Process Engine Every message has to pass through

the ABAP stack The Java stack executes the mapping of messages (with the exception of ABAP mappings)

and hosts the Adapter Framework (AFW) The Adapter Framework contains all XI adapters except the plain

HTTP WSRM and IDoc adapters

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

7

With 71 the so called Advanced Adapter Engine (AAE) was introduced The Adapter Engine was enhanced

to provide additional routing and mapping call functionality so that it is possible to process messages locally

This means that a message that is handled by a sender and receiver adapter based on J2EE does not need

to be forwarded to the ABAP Integration Engine This saves a lot of internal communication reduces the

response time and significantly increases the overall throughput The deployment options and the message

flow for 71 based systems and higher are shown below Currently not all the functionalities available in the

PI ABAP stack are available on the AAE but each new PI release closes the gap further

In SAP PI 73 and higher the Adapter Engine Extended (AEX) was introduced In addition to the AAE the

Adapter Engine Extended also allows local configuration of the PI objects The AEX can therefore be seen

as a complete PI installation running on Java only From the runtime perspective no major differences can be

seen compared to the AAE and therefore no differentiation is made in this guide

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

8

Looking at the different involved components we get a first impression of where a performance problem

might be located In the Integration Engine itself in the Business Process Engine or in one of the Advanced

Adapter Engines (central non-central plain PCK) Of course a performance problem could also occur

anywhere in between for example in the network or around a firewall Note that there is a separation

between a performance issue ldquoin PIrdquo and a performance issue in any attached systems If viewed ldquothrough

the eyesrdquo of a message (that is from the point of view of the message flow) then the PI system technically

starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the

message enters the pipeline of the Integration Engine Any delay prior to this must be handled by the

sending system The PI system technically ends as soon as the message reaches the target system for

example a given receiver adapter (or in case of HTTP communication the pipeline of the Integration Server)

has received the success message of the receiver Any delay after this point in time must be analyzed in the

receiving system

There are generally two types of performance problems The first is a more general statement that the PI

system is slow and has a low performance level or does not reach the expected throughput The second is

typically connected to a specific interface failing to meet the business expectation with regard to the

processing time The layout of this check is based on the latter First you should try to determine the

component that is responsible for the long processing time or the component that needs the highest absolute

time for processing Once this has been clarified there is a set of transactions that will help you to analyze

the origin of the performance problem

If the recommendations given in this guide are not sufficient SAP can help you to optimize the service via

SAP consulting or SAP MaxAttention services SAP might also handle smaller problems restricted to a

specific interface if you describe your problem in a SAP customer incident Support offerings like ldquoSAP

Going Live Analysis (GA)rdquo and ldquoSAP Going Live Verification (GV)rdquo can be used to ensure that the main

system parameters are configured according to SAP best practice You can order any type of service using

SAP Service Marketplace or your local SAP contacts

If you have already worked with the PI Admin Check (SAP Note 884865) you will recognize some of the

transactions This is because SAP considers the regular checking of the performance to be an important

administrational task However this check tries to show the methodology to approach performance problems

to its reader Also it offers the most common reasons for performance problems and links to possible follow-

up actions but does not refer to any regular administrational transactions


Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, messaging system, or module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information, see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3 Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine, and Adapter Engine or connected proxy system. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter.

This is not always easy to do, because a long processing time is not always an indication of a bottleneck: a longer time can make sense if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason you need to compare the value that you retrieve from Chapter 3 with values that you have obtained previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6, and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework, and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This will lead you to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases, based on where the problem occurs:

A) with regard to the interface itself, that is, it occurs even if a single message of this interface is processed,

B) or with regard to the volume of this interface, that is, many messages of this interface are processed at the same time,

C) or with regard to the overall message volume processed on PI.

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical number of messages of this interface, not simply one message as before. If the values of measurements A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements (A: processing time of a single message, B: processing time of a typical number of messages of a single interface, and C: processing time of a typical number of messages of all interfaces) should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited, and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.


Chapters 8, 9, 10, and 11 deal with more general reasons for bad performance, such as heavy tracing/logging, error situations, and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), as well as the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP, or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness, we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine as shown below (details about activation in SAP Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure), and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out in which part of the PI system the most time is lost. Chapter 4.4 will work with the individual steps.


If you do not yet know which interface is affected, you first have to get an overview. Instead of navigating to Detailed Data Aggregated in the Runtime Workbench, choose Overview Aggregated. Use the button Display Options and check the options for sender component, receiver component, sender interface, and receiver interface. Check the processing times as described above.

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Message Monitoring, choose "Adapter Engine <host>" and "Database" from the drop-down lists, and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamps indicated on the screen.

Do the above calculation for the outbound (AFW -> IS) as well as the inbound (IS -> AFW) messages.

The audit log of successful messages is no longer persisted in SAP NetWeaver PI 7.1 by default, in order to minimize the load on the database. The audit log has a high impact on the database load, since it usually consists of many rows that are inserted individually into the database. In newer releases, a cache was therefore implemented that keeps the audit log information for a period of time only. Thus, no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted in the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter "messaging.auditLog.memoryCache" from true to false for the service "XPI Service Messaging System" using NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps, like queuing, module processing, and average response time, in historical dashboards (aggregated data per interval, for all server nodes of a system or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping, and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java profiler, as outlined in the appendix section A2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher

Starting from PI 7.31 SP4, there is a new performance monitor available for the Adapter Engine. More information on the activation of the performance monitor can be found in SAP Note 1636215 - Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home", go to "Adapter Engine", and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided, you can therefore see the minimum, maximum, and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like time spent in the Messaging System queues or adapter modules, are listed. In the example below you can


see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for the further analysis.

Starting with 7.31 SP10 (7.40 SP05), this data can be exported to Excel in two ways, Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection so that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the initial selection screen of SXMB_MONI.

To judge the performance of an integration process, it is essential to understand the steps that are executed. A long-running integration process is not critical in itself, because it can wait for a trigger event or can include a synchronous call to a backend. Therefore it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left in the screenshot below, or Shift+F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore it is essential to monitor a queue backlog in the ccBPM queues, as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. On the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. They show you the processing time based on barriers. The 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time to, for example, the day before. By adjusting the selection criteria of the transaction, you can therefore get a good overview of the normal processing time of the integration process and judge whether there is a general performance problem or just a temporary one.

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge whether your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3, a new monitoring for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE -> Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view of the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait step.


4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps highlighted in bold are available in the case of synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31, an additional option of using an external virus scan during PI message processing can be activated, as described in the online help. The virus scan can be configured for multiple steps in the pipeline, e.g. after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed as qRFC Logical Units of Work (LUWs) using dialog work processes.


With 7.11 and higher versions, you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WP). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs have a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? It depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which starts the message processing in DIA WPs. The user shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (by default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds, and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (the prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past and allows analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

If you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using a Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max Resources" close to your overall number of dialog work processes? Max Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on the "Server Name".

o A good starting point is to set the values as follows:

Max. no. of logons = 90

Max. disp. of own logons = 90

Max. no. of WPs used = 90

Max. wait time = 5

Min. no. of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available in the system and the above requirements.
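
For illustration only (hypothetical numbers, and assuming the quotas above are maintained as percentages, which is how these RFC quotas are usually interpreted): on an application server with 50 dialog work processes, a value of 90 for "Max. no. of WPs used" would allow up to 45 of them to be occupied by qRFC processing at the same time, so that several work processes always remain free for other tasks.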


Note: You have to set the parameters in the SAP instance profile. Otherwise, the changes are lost after the server is restarted.

The field "Resources" shows the number of DIA WPs available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages, because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation, as follows:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage is already very high, however, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing: PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are by default shared between all interfaces running on PI. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (in red: XBTO0___0004) specifies the receiver business system. In this way, PI uses dedicated outbound queues for each receiver


system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization or for the separation of large messages. To get an overview of the available queues, use SXMB_ADM -> Manage Queues.

PI inbound and outbound queues execute different pipeline steps:

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait times in the queue are recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see the long processing times there, as discussed in chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues has a direct impact on the connected PI components and also on the backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in the case of a proxy call. Thus, when tuning PI queues you must always keep the implications for the connected systems in mind.

The tuning of the PI ABAP queue parameters is done in transaction SXMB_ADM -> Integration Engine Configuration by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across


all interfaces. Hence, a problem with one single backend system would affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO, as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.
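
As a purely illustrative sketch (the subparameter names and values below are hypothetical examples, not recommendations), entries maintained in SXMB_ADM -> Integration Engine Configuration, category TUNING, could look as follows:

   Category   Parameter               Subparameter                        Current Value
   TUNING     EO_INBOUND_PARALLEL     SENDER_BACKEND (a sender ID)        10
   TUNING     EO_OUTBOUND_PARALLEL    RECEIVER_ERP (a receiver system)    20

As described later in this chapter, the subparameter restricts the setting to a specific sender (for inbound queues) or receiver (for outbound queues), while the value defines the number of parallel queues.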

Below you can see a screenshot of SMQ2 showing PI inbound queues and outbound queues. ccBPM queues (XBQO$PE) are also displayed; they will be discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2, and execute. If you are running ABAP proxies on a separate client of the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system; see transaction ST06).

If many of the queues are EOIO queues (e.g. because the serialization is done on the material number), try to reduce the number of queues by following chapter EOIO tuning.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check whether the numbers increase, decrease, or remain the same. An increase of entries for all queues or for a specific


queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple. Possible reasons have been found to include the following:

1) A slow step in a specific interface

Bad processing times of a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step, as shown in "Analyzing the runtime of pipeline steps".

2) Backlog in Queues

Check whether inbound or outbound queues face a backlog. A backlog in a queue is generally not critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface is causing a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a sub-parameter specifying your interface to increase the parallelism for this interface.

In general, however, a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components, like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if content-based routing or an extended receiver determination is used. To understand which step is taking long, follow once more the chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing. The number of inbound queues does not increase at the same time. Wily also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm this by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3, as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for the RFC tables. If you are using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. By default, a lock is set during the processing of each message. This is no longer necessary and should


be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting parameter CACHE_ENQUEUE to 0, as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking, the queue is processed but then stops again after some time. A potential reason is described in SAP Note 1500048 - SMQ2 inbound queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in step 1) or a problem in the infrastructure (e.g. memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from the scheduling, and it therefore remains in READY. To solve such cases, the root cause of the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence, the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "monitoring threshold" configured for this queue. SAP Notes 1500048 - queues stay in ready (schedule_monitor) and 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check whether a queue stays in READY status for a long time while others are processing without any issue. Ensure that Note 1745298 is implemented and check the system log for the following entry: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor value in exceptional cases only (e.g. if the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) pushbutton once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY due to problems during the processing of an individual message. This can be prevented by following the description in chapter Prevent blocking of EO queues.


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB, as described below. Advanced users may use the performance header of the SOAP message using transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyy mm dd hh (min)(min) (sec)(sec) (microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the performance header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them.

In case PI message packaging is configured, the performance header always reflects the processing time per package. Hence, a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that each message effectively took 0.5 seconds. More details about this can be found in section PI Message Packaging.
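
If you prefer to work with the raw performance header timestamps from SXMB_MONI, a small helper like the following can make the arithmetic less error-prone. This is only a minimal sketch (the class name, the example timestamps, and the assumed package size are hypothetical); it converts the begin and end timestamps of one step into a duration and estimates the average time per message when message packaging is active:

   import java.time.Duration;
   import java.time.LocalDateTime;
   import java.time.format.DateTimeFormatter;

   public class PerfHeaderHelper {

       // The first 14 digits are yyyyMMddHHmmss (in UTC); the remaining digits are the fraction of a second.
       static LocalDateTime parse(String rawTimestamp) {
           LocalDateTime base = LocalDateTime.parse(rawTimestamp.substring(0, 14),
                   DateTimeFormatter.ofPattern("uuuuMMddHHmmss"));
           String fraction = rawTimestamp.substring(14);
           long nanos = Math.round(Double.parseDouble("0." + fraction) * 1_000_000_000L);
           return base.plusNanos(nanos);
       }

       public static void main(String[] args) {
           // Hypothetical begin/end values of one pipeline step, copied from the performance header.
           Duration step = Duration.between(parse("20110409092656165"), parse("20110409092706165"));
           System.out.println("Step duration: " + step.toMillis() + " ms");

           // With PI message packaging, the header shows the time per package,
           // so divide by the (hypothetical) number of messages in the package.
           int messagesInPackage = 100;
           System.out.println("Average per message: " + step.toMillis() / (double) messagesInPackage + " ms");
       }
   }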

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to Detailed Data Aggregated and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements, as outlined in chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you to decide whether the problem is a general design problem (a single message has a long processing step) or whether it is related to the message volume (the process step only shows large values for a high number of messages).

Each step has different follow-up actions, which are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the receiver and interface determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated at runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request correlates directly with the number and complexity of the conditions defined (a short illustrative condition is shown after this list).

No tuning options exist for the system with regard to CBR. The performance of this step can only be changed by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.

o Mapping to determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.
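
For illustration only (the namespace, element names, threshold, and receiver name are hypothetical), a content-based routing rule maintained in the condition editor of the receiver determination typically combines XPath expressions like this:

   (/ns0:Order/Header/Country = "DE") AND (/ns0:Order/Header/OrderValue > 10000)  ->  receiver BS_ERP_EUROPE

Every such condition is evaluated against the message payload at runtime, which is why the number and complexity of the defined rules directly influence the duration of this pipeline step.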

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping, you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings executed sequentially. In such a case, analysis is more difficult because it is not clear which mapping in the sequence is taking a long time.

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing sender/receiver party/service/interface/namespace and the source message payload, it is possible to check the target message (after mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible, via transaction XSLT_TOOL, to test, trace, and debug XSLT transformations on the ABAP stack.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with the values reported several days earlier to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check whether only one interface is affected or whether you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps have required around 500 seconds for processing. Comparing the data during the incident with the data from the day before will allow you to judge whether this might be a problem of the underlying J2EE Engine, as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times in a different time period and verify whether it is only a "temporary" issue; this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem, but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify whether the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application team, you then have to check whether the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace, as explained in the appendix section Wily Transaction Trace.

If not one but several interfaces are affected, there is potentially a system bottleneck; this is described in the following.

o Not enough resources (registered server programs) are available. This could either be the case if too many mapping requests are issued at the same time, or if one J2EE server node is down and has not registered any server programs.

To check whether there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting the queues with the names XBTO and XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; instead, the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW -> Goto -> Logged On Clients by filtering on the program ("TP Name") AI_RUNTIME_<SID>. In general, we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard, as described in qRFC Queue Monitoring (SMQ2).

To check whether the J2EE Engine is not available or whether the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW -> GoTo -> Trace -> Gateway -> Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests. Each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option to resolve the bottleneck is to reduce the number of outbound queues that are concurrently active. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with a high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so, you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option), as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

The call adapter step is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so network time can have an influence here.

Looking at the processing time, we have to distinguish between asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces, the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency. For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). The network latency has to be checked in case of a long call adapter step. If HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side. Enough resources must be available on the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore, the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages therefore includes the time for the transfer of the request message, the calculation of the corresponding response message on the receiver side, and the transfer back to PI. For synchronous messages, the processing time of a request in the receiving target system must therefore always be analyzed to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, this time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI, XBT1, XBT9, XBTL for EO messages and XBQI/XB2I, XBQ1/XB21, XBQ9/XB29 for EOIO messages) process the pipeline steps for the receiver determination, the interface determination, and the message split (and optionally the XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) are available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so, you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on business system level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check whether that is still the case. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

If the messages in the queues have different runtimes (e.g. due to complex extended receiver determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 when analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues only (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM -> Integration Engine Configuration -> Specific Configuration. The default value is '1', meaning that inbound and outbound queues are used (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, this time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO, XBTA, XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A, XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, outbound binding, and call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2), or described in the section above for the PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) are available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in case of an ABAP proxy system a high value for DB_SPLITTER_QUEUEING might not indicate a problem on PI at all.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue; it might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this can be ignored for the moment), the third about 2 seconds, and so on. The 100th message has to wait roughly 100 seconds until it is processed, that is, its value for DB_SPLITTER_QUEUEING is as high as 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUEING value over time. If you experience this situation, proceed with Chapter 4.4.2.
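The arithmetic behind this backlog can be sketched as follows (illustrative only; the 1-second step time and the queue depth of 100 are the assumptions from the example above, and the class is not SAP code):

public class QueueBacklogEstimate {
    public static void main(String[] args) {
        double stepTimeSec = 1.0;  // assumed duration of the mapping step per message
        int messagesInQueue = 100; // assumed backlog in one outbound queue
        // The queue is processed serially, so the n-th message waits roughly
        // (n - 1) * stepTimeSec before its own processing starts.
        for (int n = 1; n <= messagesInQueue; n++) {
            double waitSec = (n - 1) * stepTimeSec; // approximate DB_SPLITTER_QUEUEING value
            if (n == 1 || n == messagesInQueue) {
                System.out.printf("Message %d waits about %.0f seconds%n", n, waitSec);
            }
        }
    }
}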

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a certain number of messages per time unit.

In section Adapter Parallelism, default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). For such adapters it does not make sense to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS - the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources


available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional entry for LMS is written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis:

With a higher trace level, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES

This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES

This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept to a minimum, and very deep and complex XPath expressions should be avoided.
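To illustrate the impact of XPath complexity, the hypothetical payload and expressions below (not taken from this guide) contrast a direct, absolute path with a '//' search that has to scan the whole document; on large payloads the second form is the kind of expression to avoid:

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class XpathCostExample {
    public static void main(String[] args) throws Exception {
        String payload = "<Order><Header><OrderID>4711</OrderID></Header></Order>";
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Cheap: absolute path straight to the element to be indexed
        String direct = xpath.evaluate("/Order/Header/OrderID",
                new InputSource(new StringReader(payload)));
        // Expensive on large documents: '//' forces a scan over every node
        String scanned = xpath.evaluate("//*[local-name()='OrderID']",
                new InputSource(new StringReader(payload)));
        System.out.println(direct + " / " + scanned); // 4711 / 4711
    }
}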

If you want to minimize the impact of LMS on the PI message processing time, you can set the extraction method to use an external job. In that case the messages will be indexed only after processing, and indexing will therefore have no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for those responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. Per default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps takes long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated per default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping or routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime for individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very high runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used solely in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system through PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow the monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package is disassembled and all messages are processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this will reduce the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

The PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: Maximum number of messages in a package (default 100)

2) Maximum package size: Sum of all message sizes in kilobytes (default 1 MB)

3) Delay time: Time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could for example define a specific packaging for interfaces with very small messages that allows up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages.
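The interplay of the three parameters can be sketched as follows (an illustration of the documented behavior, not SAP code; the message count and size thresholds match the defaults above, while the 5-second delay is an assumed example of an increased waiting time):

public class PackagingThresholds {
    static final int MAX_MESSAGES = 100;     // message count (default 100)
    static final long MAX_SIZE_KB = 1024;    // maximum package size (default 1 MB)
    static final long MAX_DELAY_MS = 5_000;  // assumed increased waiting time (default is 0)

    // A package is closed as soon as it reaches the message count or the size limit;
    // otherwise the queue is processed once the configured delay has passed, with
    // whatever has accumulated (with the default delay of 0 there is no waiting at all).
    static boolean closePackage(int messages, long sizeKb, long waitedMs) {
        return messages >= MAX_MESSAGES
                || sizeKb >= MAX_SIZE_KB
                || waitedMs >= MAX_DELAY_MS;
    }

    public static void main(String[] args) {
        System.out.println(closePackage(100, 200, 1000)); // true: message count reached
        System.out.println(closePackage(10, 1500, 1000)); // true: size limit exceeded
        System.out.println(closePackage(10, 200, 6000));  // true: waiting time expired
        System.out.println(closePackage(10, 200, 1000));  // false: keep collecting
    }
}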

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and configuration of message packaging. More information is also available at http://help.sap.com -> SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using Quality of Service Exactly Once (EO) are independent of each other. In case of an error in one message, there is no business reason to stop the processing of other messages. But exactly this happens when an EO queue goes into error due to an error in the processing of a single message. The queue will then be retried automatically in configurable intervals. This retry causes a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems and are activated per default on PI 7.3 systems. By specifying a receiver ID as sub-parameter, this behavior can also be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend) the queue would then go into SYSFAIL, so that not too many messages go into error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.
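A minimal configuration sketch based on the description above (the threshold of 50 and the 15-minute job interval are illustrative assumptions, not SAP defaults or recommendations) could look like this:

EO_RETRY_AUTOMATIC = 0        (no automatic queue retry; erroneous messages are restarted by the job instead)
EO_RETRY_AUT_COUNT = 50       (example: put the queue into SYSFAIL only after 50 messages have failed)
Job RSXMB_RESTART_MESSAGES    (scheduled periodically, e.g. every 15 minutes, to restart failed EO messages)
Optional: a receiver ID as sub-parameter of EO_RETRY_AUTOMATIC to limit the behavior to individual interfaces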

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages in queues. Per default, in all PI versions the messages are assigned to the different queues randomly. In case of different runtimes of LUWs, caused e.g. by different message sizes or different runtimes in mappings, this can cause an uneven distribution of messages within the different queues. This can increase the latency of the messages waiting at the end of such a queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. Hence it tries to achieve an equal balancing during inbound processing: a queue with a higher backlog will get fewer new messages assigned, while queues with fewer entries will get more messages assigned. It is therefore different from the old BALANCING parameter (of category TUNING), which was used


to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), the messages are distributed randomly across the queues that are available. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queue relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the most-filled queue. If there are queues with a lower fill level than defined here, only these are taken into consideration for distribution. If all queues have a higher fill level, all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements for an even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution will be checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level for each queue will then be written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Relative to the most-filled queue, XBTO__B therefore has a fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more new messages assigned.

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is.
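The selection logic from this example can be sketched as follows (an illustration of the documented behavior, not SAP code; the queue names and fill levels are the ones used above):

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueueBalancingSketch {
    public static void main(String[] args) {
        // Fill levels as read after every n-th message (n = EO_QUEUE_BALANCING_READ)
        Map<String, Integer> fillLevels = new LinkedHashMap<>();
        fillLevels.put("XBTO__A", 500);
        fillLevels.put("XBTO__B", 150);
        fillLevels.put("XBTO__C", 50);
        int balancingSelectPercent = 20; // EO_QUEUE_BALANCING_SELECT

        int max = fillLevels.values().stream().max(Integer::compare).orElse(0);
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : fillLevels.entrySet()) {
            // relative fill level compared to the most-filled queue
            double relative = max == 0 ? 0 : 100.0 * e.getValue() / max;
            if (relative < balancingSelectPercent) {
                candidates.add(e.getKey());
            }
        }
        // If no queue is below the threshold, all queues are considered
        if (candidates.isEmpty()) {
            candidates.addAll(fillLevels.keySet());
        }
        System.out.println("Queues considered for new messages: " + candidates); // [XBTO__C]
    }
}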


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially important not to have a lot of queues containing only one message, since this causes a high overhead in the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and will cause significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM -> Configure Filter for Queue Prioritization -> Number of EOIO queues.

During runtime a new set of queues with the name XB2* will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages will use the same EOIO queue, so PI message packaging will work better and the reloading of the queues by the QIN scheduler will also show much better performance.

In case of errors, the affected messages are removed from the XB2* queues and moved to the standard XBQ* queues. All other messages for the same serialization context are moved to the XBQ* queue directly, to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues will not be blocked, and messages for other serialization contexts will not be delayed.


4.9 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volumes, so tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historical backlogs on tRFC.

In order to control the resources used when sending the IDocs from the sender system to PI or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max. Conn. or Max. Runtime values in SMQS, you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This mitigates the risk of system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had an option for packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system, so only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring message packaging, since it helps transferring data for the IDoc adapter as well as for the ABAP proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in report RSEOUT00, multiple IDocs are bundled in one tRFC call. At the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the following SDN blog: IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter: long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process rolls out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE side.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, like high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system, and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible based on the available resources on the receiver system, without the need to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less-critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than one might expect, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user "WF-BATCH" and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other; rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply for larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate for example to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and might be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well; check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes; use chapter 5.4 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, which does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds per CPU, the WF-BATCH user used up 36 seconds; on a host with 4 CPUs, for example, that would be 36 / (4 x 3600), or roughly 0.25% of the available CPU capacity.

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparison with the other users (as well as the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it will be high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this Note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configure transactional behavior of ccBPM process steps to reduce number of persistency steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides -> SAP NetWeaver 7.0 -> End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the largest amount of processing time in many scenarios within BPE. Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following characteristics are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE: Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example a file adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example JMS_http://sap.com/xi/XI/SystemReceive for the JMS receiver asynchronous queue).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example File) and leaves the PI system using a J2EE Engine adapter (for example JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is not persisted for successful messages in the database by default, to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem or bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not solve such a performance problem or bottleneck. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective senderreceiver agreements to use them in parallel

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve better separation of interfaces


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side these adapters use the Adapter Framework Scheduler, which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling, and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Parallel processing would in any case only result in locking problems, for example because the channels would execute the same SELECT statement on the database or pick up files with the same file name. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side the adapters work sequentially on each server node by default. For example, for JDBC only one UPDATE statement can be executed per communication channel (independent of the number of consumer threads configured in the Messaging System), and all other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, enter in the field "Maximum Concurrency" the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter per default uses a push mechanism on the PI sender side. This means the data is pushed by the sending MQ provider. Per default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI communication channel and PI will be the initiator of the communication. Here, too, the Adapter Framework Scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster; therefore scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts during message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitations in the number of requests it can execute in parallel; the limiting factor is the number of FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning the SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the messaging system and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial connections are allocated from the application thread pool directly and are therefore not available for any other tasks; the number of initial connections should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

Also, this should be done carefully, since these threads are taken from the J2EE application thread pool; a very high value there can cause a bottleneck on the J2EE Engine and therefore major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In Manual Mode, the adapter works sequentially per server node. For channels in "Default Mode", it depends on the configuration of the inbound Resource Adapter (RA): via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel per default.

The table below gives a summary of the parallelism for the different adapter types


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, it is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work; this is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log marks the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning the SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the FCA threads available for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

Per default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL (http://<host>:<port>/XISOAPAdapter/MessageServlet). If one interface faces a very high load or slow backend connections, this can block the available FCA threads and can have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load for such an interface, the other SOAP sender interfaces would not be affected by a shortage of FCA threads.


To use this new feature you can specify the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found in the SAP online documentation.
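As an illustration only (the /XISOAPAdapter path prefix and the split of traffic are assumptions based on the standard SOAP sender URL; verify the exact paths in SAP Note 1903431), a high-volume interface could be given its own entry point like this:

http://<host>:<port>/XISOAPAdapter/MessageServlet           (default entry point shared by all SOAP sender interfaces)
http://<host>:<port>/XISOAPAdapter/MessageServletInternal   (e.g. dedicated to a high-volume internal interface)
http://<host>:<port>/XISOAPAdapter/MessageServletExternal   (e.g. reserved for external partner traffic)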

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time the receive queue for asynchronous messages or the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor") and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for prolonged processing time. Check Performance of Module Processing for more details.

Note: From PI 7.1 on, the audit log is no longer persisted for successful messages for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc adapter is given in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Similar to all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations of the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can in addition configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP are processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages on sender or receiver side does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory

allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments per IDoc (10,000 segments in total) would therefore consume roughly 50 MB during processing. In such cases it is important to e.g. lower the package size and/or use large message queues, as outlined in chapter Large message queues on PI Adapter Engine.
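A back-of-the-envelope estimate of this memory demand can be sketched as follows (the 5 KB per segment value is the rule of thumb quoted above, not an exact figure):

public class IdocMemoryEstimate {
    static final int KB_PER_SEGMENT = 5; // rough rule of thumb from the text above

    static long estimateMemoryMb(int idocsPerPackage, int segmentsPerIdoc) {
        long totalSegments = (long) idocsPerPackage * segmentsPerIdoc;
        return totalSegments * KB_PER_SEGMENT / 1024; // KB -> MB
    }

    public static void main(String[] args) {
        // 5 IDocs x 2000 segments each ~= 48 MB of temporary memory during processing
        System.out.println(estimateMemoryMb(5, 2000) + " MB");
    }
}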

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important:

In the ABAP stack, packaging is active per default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the DIA work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO, it is important to evaluate packaging to avoid any negative impact on the receiving ECC due to missing packaging.

The packaging on Java works in general differently than on ABAP. While in ABAP the aggregation is done at the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this, the package is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done per server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When enabling message packaging, the message stays in status "Delivering" throughout all the steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message waited almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters of the Messaging System service, as described in Note 1913972 (an example is sketched after the list):

o messagingsystemmsgcollectorenabled: Enable or disable packaging globally

o messagingsystemmsgcollectormaxMemPerBulk: The maximum size of a bulk message

o messagingsystemmsgcollectorbulkTimeout: The wait time per bulk - default 60 seconds

o messagingsystemmsgcollectormaxMemTotal: Maximum memory which can be used by the message collector

o messagingsystemmsgcollectorpoolSize: Number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, or the configuration of the IDoc posting on the ERP receiver side (described in chapter Configuration of IDoc posting on receiver side). Therefore tuning of these threads might be very important. These threads can unfortunately not yet be monitored with any PI tool or Wily Introscope.


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.
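A hedged example of such a global configuration is shown below; the property keys are written exactly as listed above (check SAP Note 1913972 for the exact notation and defaults), and the values are illustrative only:

# Messaging System service properties (example values, not recommendations)
messagingsystemmsgcollectorenabled = true
# wait up to 60 seconds per bulk (the default mentioned above)
messagingsystemmsgcollectorbulkTimeout = 60
# upper limits for one bulk message and for the whole message collector
messagingsystemmsgcollectormaxMemPerBulk = <size>
messagingsystemmsgcollectormaxMemTotal = <size>
# BULK_EXECUTOR threads sending packages to the backend (example value)
messagingsystemmsgcollectorpoolSize = 5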

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory. While the SOAP adapter in XI 3.0 protocol lets you define three different packaging KPIs, this is currently not possible for the IDoc adapter, where you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which node the job is scheduled.

The Adapter Framework Scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Based on this and on the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler you have to set the property scheduler.relocMode of the Adapter Framework


service to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow for better balancing of the incoming load across the available server nodes if the files arrive at regular intervals. The balancing is achieved by performing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases; for very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.
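As a minimal illustration (assuming a File sender channel polling every 60 seconds), the corresponding Adapter Framework service property could look as follows; the value is an example only:

scheduler.relocMode = -15 (the channel may be rebalanced to another server node after every 15th polling interval)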

In case many files are put into the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence, proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon under Monitoring -> Background Job Processing Monitor -> Adapter Framework Scheduler Jobs. There you can check on which server node the communication channel is polling (Status = "Active"). You can also see when the channel polled last and when it will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler: http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs on only one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in brackets [60000] is the polling interval in ms, in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": No longer scheduled (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving a message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System can usually be closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the Messaging System and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the Messaging System were not fast enough, but actually the receiver adapter is the limiting factor.

o Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

o Decide whether a specific receiver adapter is causing long waiting times in the Messaging System by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

o Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring -> Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows easy graphical analysis of the queues and thread usage of the Messaging System across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, which shows all the backlogs in the queues. In the screenshot below a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it forwards messages to the adapter queue as soon as free threads are available among the adapter-specific consumer threads. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size view you can directly jump to a more detailed view, where you can see that the File adapter was causing the backlog.


To see the consumer thread usage you can then follow the link to the File adapter. In the screenshot below you can see that all consumer threads were exhausted during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the Messaging System does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the Messaging System have to be configured in the NWA using service "XPI Service: AF Core" and property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the property described above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging / Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case adding additional threads will not speed up processing significantly; see chapter Adapter Parallelism for details.

o A backlog of messages in the Messaging System is highly critical for synchronous scenarios due to the timeouts that might occur. Therefore the number of synchronous threads always has to be tuned so that no bottleneck occurs. A backlog in synchronous queues is often caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System in Between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call Queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck for consumer threads on the sender queues in the Messaging System.

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages use only one thread for processing in the Messaging System. Multiple threads are used for synchronous messages: the adapter thread puts the message into the Messaging System queue and waits until the Messaging System delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the Call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the Send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the entry "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time the message waited in the Messaging System for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the Messaging System (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are: High 75, Medium 20, Low 5.

Based on this approach you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon under Configuration and Administration -> Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring message prioritization within the AAE in the SAP Online Help; navigate to SAP NetWeaver -> SAP NetWeaver PI -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Process Integration Monitoring -> Component Monitoring -> Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the Messaging System. For example, even though you have configured 20 threads for the JDBC receiver, one communication channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the Messaging System but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or for a remote system with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Because of this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in the NWA in service XPI Service: Messaging System. With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters; it should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads on each server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
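The following sketch combines the two settings described above; the values are examples only and have to be aligned with the thread and memory capacity of your system. The maxReceivers restriction is maintained in service XPI Service: Messaging System, the increased consumer threads in the messaging.connectionDefinition property of XPI Service: AF Core (queue name pattern as explained in the previous chapter):

messaging.system.queueParallelism.maxReceivers = 5

(name=JDBC_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=20, Call.maxConsumers=5, Rqst.maxConsumers=5)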

In the screenshot below you can see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though more free SOAP threads are available, they are not consumed. Hence, the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue in which message backlogs occur. As mentioned earlier, a backlog should usually occur in the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog there will always be free consumer threads, and therefore the dispatcher queue can dispatch the message to the adapter-specific queue. Hence, the backlog will appear in the adapter-specific queue, so that the message prioritization no longer works properly.


By default the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and additional restrictions in the number of available threads can therefore be very critical. It is usually advisable not to limit the threads per interface but to increase the overall number of available threads. In case you have many high-volume synchronous scenarios with different priorities running in parallel, it might be advisable to limit the threads each interface can use. To do so you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv ICoAll, as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF Receiver Parallelism per Interface an enhancement was introduced that allows the maximum parallelization to be specified on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI you can specify rules that determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The configured interface pattern can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received. When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and the overhead is therefore avoided. For interfaces with good data quality and high message throughput this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in the module processing, then you must analyze the runtime of the modules used in the communication channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes, for example to transform an EDI message before sending it to the partner or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules there is no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that inserts the data into the Messaging System queues.

The audit log gives you a first impression of the duration of a module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If one module has a very long runtime, it is easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a lookup to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module writes additional information to the audit log so that such steps can be detected. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.

6.4 Java only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps so far executed in the ABAP pipeline (receiver determination, interface determination, and mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java only scenarios

The major advantage of AAE processing is the reduced overhead, since the context switches between ABAP and Java are eliminated. Thus the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. The best tuning option is therefore to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, and slow (receiver) applications reduce the throughput and, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is seen for small payloads (comparison of 10k, 50k, and 500k) and asynchronous messages.

6.4.2 Message Flow of Java only scenarios

All the Java-based tuning options mentioned in chapters Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter.

2) Put into the dispatcher queue of the Messaging System.

3) Forwarded to the JMS Send queue of the Messaging System.

4) Message is taken by a JMS Send consumer thread.

a. No message split used:
In this case the JMS consumer thread performs all the steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and then also transfers the message to the backend system (the mail server in our example). Thus, all the steps are executed by one thread only.

b. Message split used (1:n message relation):
In this case there is a context switch. The thread taking the message out of the queue processes the message up to the message split step. It creates the new messages and puts them into the Send queue again, from where they are taken by different threads (which then map the child messages and finally send them to the receiving system).

As we can see in this example, for an Integrated Configuration only one thread performs all the different steps for a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario. In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since Java-only interfaces only use Send queues, the restriction of consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface is no solution.

Because of that, the additional property messaging.system.queueParallelism.queueTypes of service SAP XI AF Messaging was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF Receiver Parallelism per Interface. A restriction of the parallelization can usually be highly critical for synchronous interfaces. Therefore we generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv IcoAsync only.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11 only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you will be able to do the configuration on interface level. The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish between staging (versioning) and logging. An overview is given below.

For details about the configuration please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and a decrease in performance.

The persistence steps can be seen directly in the audit log of a message. The Message Monitor also shows the persisted versions.


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. You therefore have to find a balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs this can delay the overall message processing of an interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure that the messages are distributed equally across the available server nodes. In the meantime these load balancing rules can also be applied using the CTC templates described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard so that the task is executed automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP load balancing to initial setup).


For the example given above, a much better load balancing could be observed after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above only balance the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this is discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards the SAP VM is used on all platforms. Therefore the analysis can be done with the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

o Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern, that is, the memory usage increases over time but goes back down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

o Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.


Different tools exist for the garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section; in the Common Tasks list you find the entry Java Memory Analysis, which allows you to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found via J2EE Overview -> J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the used memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or it can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 71-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from the ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the Messaging System.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options:

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size of the J2EE Engine will automatically adapt the new area of the heap due to a dynamic configuration (in newer J2EE versions this is set by default to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication it is generally recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are several options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage on a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found at Availability and Performance Management -> Resource Monitoring -> History Reports. An example of the application thread usage is shown below.


o Check whether the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

o Has the thread usage increased over the last days/weeks?

o Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java -> Threads and check for any threads in red status (more than 20 seconds running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java -> Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA server threads. The FCA server threads are responsible for receiving HTTP calls on the Java side (after the Java Dispatcher is no longer available). FCA threads also use a thread pool. Fifteen FCA threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA server threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like web services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole duration and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (more than 15 seconds), additional FCA threads are spawned, but they are only available for parallel incoming HTTP requests using different entry points. This ensures that in case of problems with one application, not all other applications are blocked permanently.

o An overall maximum of 20 x FCAServerThreadCount threads may be used in the J2EE Engine (with the recommended FCAServerThreadCount of 50 this corresponds to at most 1000 FCA threads).

There is currently no standard monitor available for FCA threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA server threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default in an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names. A sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy uses XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and which is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely that a backlog situation occurs in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course, such a long-running message blocks the queue, and all messages behind it face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues are used, and tuning of these queues on interface level becomes possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be improved further; messages for more business-critical interfaces might otherwise be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system the processing of the queues calls the central PI hub. In general the processing time there should be fast, but in case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names in the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter SENDER / SENDER_BACK: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub-parameter: Determines the general number of queues per interface.

- Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

- Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub-parameter: Determines the general number of queues per interface.

- Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

- Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.
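A minimal sketch of such a configuration in SXMB_ADM is shown below; the ID pattern and the values are hypothetical and only illustrate the mechanism:

EO_QUEUE_PREFIX_INTERFACE SENDER = 1
EO_QUEUE_PREFIX_INTERFACE RECEIVER = 1
EO_INBOUND_PARALLEL_SENDER (no sub-parameter) = 5 (default number of queues per sender interface)
EO_INBOUND_PARALLEL_RECEIVER <sender ID pattern, hypothetical: *CriticalInterface*> = 10 (more queues for a business-critical interface group)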

This new feature also replaces the previously existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface; many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the relative overhead of the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI, not to the size of the file or IDoc sent to PI. You can use the runtime header on the ABAP stack to check the message size. Below you see an example of a very small message. While the MessageSizePayload field gives the size of the payload in bytes (here 433 bytes), MessageSizeTotal gives the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below the mapping reduces the payload size. The last two lines give the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.


Based on the above observations we highly recommend using a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from the ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and on the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3 to avoid overloading the Java memory with parallel large message requests.
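A sketch of the two settings (the values are examples only; the size limit is maintained in KB, so 5000 corresponds to roughly 5 MB):

EO_MSG_SIZE_LIMIT = 5000 (messages above approximately 5 MB are routed to the large message queues)
EO_MSG_SIZE_LIMIT_PARALLEL = 2 (two parallel large message queues)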

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not the size of a single large message alone that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. Per default the permit size is 10 MB and 10 permits are available. This means that large messages are processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B has 10 MB, message C has 50 MB, message D has 150 MB, message E has 50 MB, and message F has 40 MB. Message A is smaller than the permit size, is therefore not considered large, and can be processed immediately. Message B requires 1 permit and message C requires 5; since enough permits are available, processing starts (status DLNG). Message D would require more permits than the 10 that exist; since the permits are not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV) because it exceeds the maximum number of defined permits; in that case the message has to be restarted manually. Message E requires 5 permits and can also not be scheduled yet. But since 4 permits are left, message F is put to DLNG. Due to their smaller size, messages B and F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5


permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.

The example above shows the potential delay a large message can face due to the waiting time for permits. The assumption, however, is that large messages are not time-critical, and therefore an additional delay is less critical than a potential overload of the system.

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this is only done after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the file adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have very complex extended receiver determinations or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON under Monitoring -> Adapter Engine Status. The number of threads shown there corresponds to the number of consumed permits. In newer Wily versions there is also a dashboard showing the number of consumed permits; the Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus, the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity, it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

From NetWeaver 7.3 on, the NWA also offers a view of the CPU activity via Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process per available CPU, the CPU resources are sufficient. Once there is an average of around three processes per available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging. (A simple way to evaluate this rule of thumb is shown in the sketch after this list.)

Detailed Analysis View -> TOP CPU: Which thread uses the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer in the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapters to restrict these activities.
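As referenced in the load average item above, the rule of thumb (roughly one waiting process per CPU is fine, around three indicates a bottleneck) can be evaluated with standard JVM means on the host running the J2EE Engine. The sketch below only illustrates that calculation; it uses the 1-minute load average reported by the operating system.

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Rough evaluation of the load average rule of thumb described in the list above.
public class LoadAverageCheck {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        int cpus = os.getAvailableProcessors();
        double load = os.getSystemLoadAverage();   // 1-minute load average; -1 if not available (e.g. Windows)
        if (load < 0) {
            System.out.println("Load average not available on this platform; use OS tools instead");
            return;
        }
        double loadPerCpu = load / cpus;
        System.out.printf("Load average %.2f on %d CPUs = %.2f per CPU%n", load, cpus, loadPerCpu);
        if (loadPerCpu <= 1.0) {
            System.out.println("CPU resources look sufficient");
        } else if (loadPerCpu >= 3.0) {
            System.out.println("Likely CPU bottleneck; check CPU usage and paging as described above");
        } else {
            System.out.println("Processes are queuing; keep monitoring during peak load");
        }
    }
}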

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it directly influences the Java GC behavior. Therefore, paging should be avoided in any case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times as described in Chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6 respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only installations (AEX/PO) or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section. The official documentation can be found in the Online Help. Monitoring the database via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionalities here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average, and maximum processing time of the individual SQL statements, and can therefore identify the expensive statements on your system.

By default the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.

Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database (a small sketch evaluating some of these indicators follows the checklist):

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads. The lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%; the statistics should be based on 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user to recursive calls: good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines, because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: if this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.
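The sketch below shows how the derived indicators from this checklist can be computed from the raw counters visible in ST04. The method names and the example values in main() are invented for illustration only; the thresholds are the ones named above.

// Derived Oracle indicators from the raw ST04 counters, with the thresholds named in the
// checklist above. Method names and the example values in main() are invented for illustration.
public class OracleDbIndicators {

    static double dataBufferQualityPercent(long physicalReads, long totalReads) {
        // the lower the share of physical reads, the better the buffer quality (target > 94%)
        return (1.0 - (double) physicalReads / totalReads) * 100.0;
    }

    static double userToRecursiveCallRatio(long userCalls, long recursiveCalls) {
        return (double) userCalls / recursiveCalls;      // should be greater than 2
    }

    static double readsPerUserCall(long totalReads, long userCalls) {
        return (double) totalReads / userCalls;          // above ~30 blocks hints at expensive SQL
    }

    static double busyWaitSharePercent(double busyWaitTime, double cpuTime) {
        // around 60/40 busy wait vs CPU time is fine; 80/20 indicates room for improvement
        return busyWaitTime / (busyWaitTime + cpuTime) * 100.0;
    }

    public static void main(String[] args) {
        long totalReads = 25_000_000L, physicalReads = 1_200_000L;
        long userCalls = 800_000L, recursiveCalls = 300_000L;
        System.out.printf("Data buffer quality: %.1f%% (target > 94%%)%n",
                dataBufferQualityPercent(physicalReads, totalReads));
        System.out.printf("User/recursive call ratio: %.2f (target > 2)%n",
                userToRecursiveCallRatio(userCalls, recursiveCalls));
        System.out.printf("Reads per user call: %.1f (expensive SQL above ~30)%n",
                readsPerUserCall(totalReads, userCalls));
        System.out.printf("Busy wait share: %.0f%% (around 60%% is fine)%n",
                busyWaitSharePercent(60.0, 40.0));
    }
}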

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools -> Administration -> Monitor -> Performance -> Database -> Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: The default values displayed in the section Server Engine are relative values. To display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu -> Performance -> Database. A snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB), with min server memory (MB) < max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and catalog and package cache information, go to transaction ST04 and choose Performance -> Database, section Buffer Pool (or section Cache, respectively).


Buffer Pools Number: the number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance -> Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: this represents the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.


Catalog cache size: maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: maximum size of the package cache that is used to maintain the most frequently accessed sections of the package.

Package cache quality: ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.

With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS allows you to view in the SAP system all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying that suitable indexes are missing.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (releases lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries must be written from the LOG_QUEUE to the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions then wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. If LOG_QUEUE_OVERFLOWS occur (transaction DB50 -> Current Status -> Activities Overview -> LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses -> Performance Database Analyzer -> Bottlenecks. You can use this function to activate the corresponding analysis tool, dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR and SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQLPLUS for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are listed below (a simple row-count sketch follows this list):

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
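If no database cockpit is at hand, the row counts of the Java-schema tables listed above can also be collected with a small JDBC utility. The sketch below is only an illustration: the JDBC URL, database user, and password are placeholders that have to be replaced with the values of your own installation, and the Oracle driver is used merely as an example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Counts the rows of the Java-schema PI tables listed above. JDBC URL, user, password and
// driver are placeholders for your own installation; Oracle is used only as an example.
public class PiJavaTableCounts {
    public static void main(String[] args) throws Exception {
        String[] tables = {"BC_MSG", "BC_MSG_AUDIT", "BC_MSG_LOG_VERSION",
                           "XI_IDOC_IN_MSG", "XI_IDOC_OUT_MSG"};   // IDoc tables only if persistence is active
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/PID", "SAPSR3DB", "<password>");
             Statement stmt = con.createStatement()) {
            for (String table : tables) {
                try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
                    rs.next();
                    System.out.printf("%-20s %,d entries%n", table, rs.getLong(1));
                }
            }
        }
    }
}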


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration -> Specific Configuration, and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace Level -> Set.

Start transaction SMGW, navigate to Goto -> Parameters -> Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace -> Gateway -> Reduce Level (or Increase Level, respectively).

Start transaction SM59, check the following RFC destinations for the flag "Trace" on the tab Special Options, and remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and similar messages; check those as well.


10.2 Business Process Engine

You only have to check the event trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the event trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve the performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but that way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) -> Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting -> Logviewer (direct link nwa/links). Open the view "Developer Trace" and check whether you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing it (e.g. a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP. (A small sketch for counting recurring exceptions in a downloaded trace file follows below.)
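If you have downloaded a trace file and want a quick, rough count of which exception types recur most often, a small ad-hoc utility can help. The sketch below is not an SAP tool; the file name and the simple token matching are assumptions and may need adjusting to your trace format.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Counts how often each exception or error class name appears in a downloaded trace file.
// The file name and the simple token matching are assumptions; adjust them to your trace format.
public class TraceExceptionCounter {
    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : Files.readAllLines(Paths.get("default.trc"))) {
            for (String token : line.split("[\\s:(\\[]+")) {
                if (token.endsWith("Exception") || token.endsWith("Error")) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        counts.entrySet().stream()
              .sorted((a, b) -> b.getValue() - a.getValue())
              .limit(20)                                           // top 20 recurring entries
              .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}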


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis -> Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

With SAP NetWeaver PI 7.1, the audit log is no longer persisted in the database for successful messages by default, in order to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" to false in the service XPI Service: Messaging System. Details can be found in SAP Note 1314974 - PI 7.1 AF Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid performance problems caused by the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto -> Trace File -> Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto -> Trace -> Gateway -> Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval, and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval, and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts / CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC Trace

In the PI work directory, check all files which begin with dev_rfc or dev_w for errors.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. This is available via Problem Management -> Logs and Traces -> Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace allows you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or whether it is caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window is displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below around 48 seconds. From top to bottom we can see the call stack of the thread. In general we are interested in long-running threads at the bottom of the trace view. A long-running block at the bottom means that this is the lowest-level coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see whether the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup via JDBC to a remote database or via RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting. However, the tool can also be used for troubleshooting performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement it allows you to take multiple thread dumps in specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).
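For readers who have never looked at a thread dump, the following minimal sketch shows the kind of information such a dump contains (thread names, states, locks, and stack traces). It simply dumps the threads of the JVM it runs in; it is an illustration only and not a replacement for the XPI Inspector, which collects the dumps on the PI server nodes themselves.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal programmatic thread dump of the local JVM, taken three times in 10-second intervals,
// to show the kind of data a thread dump contains.
public class SimpleThreadDump {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        for (int i = 1; i <= 3; i++) {
            System.out.println("==== Thread dump " + i + " ====");
            for (ThreadInfo info : tmx.dumpAllThreads(true, true)) {
                System.out.print(info);          // thread name, state, locks and stack trace
            }
            Thread.sleep(10_000);
        }
    }
}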


In addition, the tool allows you to do JVM profiling, either by doing JVM performance tracing or JVM memory allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output, the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE Engine and are therefore not recommended to be used in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

www.sap.com


SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

4

TABLE OF CONTENTS

1 INTRODUCTION 6

2 WORKING WITH THIS DOCUMENT 10

3 DETERMINING THE BOTTLENECK 12 31 Integration Engine Processing Time 12 32 Adapter Engine Processing Time 13 321 Adapter Engine Performance monitor in PI 731 and higher 15 33 Processing Time in the Business Process Engine 16

4 ANALYZING THE INTEGRATION ENGINE 20 41 Work Process Overview (SM50SM66) 21 42 qRFC Resources (SARFC) 22 43 Parallelization of PI qRFC Queues 23 44 Analyzing the runtime of PI pipeline steps 29 441 Long Processing Times for ldquoPLSRV_RECEIVER_ DETERMINATIONrdquo PLSRV_INTERFACE_DETERMINATION 30 442 Long Processing Times for ldquoPLSRV_MAPPING_REQUESTrdquo 30 443 Long Processing Times for ldquoPLSRV_CALL_ADAPTERrdquo 33 444 Long Processing Times for ldquoDB_ENTRY_QUEUEINGrdquo 34 445 Long Processing Times for ldquoDB_SPLITTER_QUEUEINGrdquo 34 446 Long Processing Times for ldquoLMS_EXTRACTIONrdquo 36 447 Other step performed in the ABAP pipeline 36 45 PI Message Packaging for Integration Engine 37 46 Prevent blocking of EO queues 38 47 Avoid uneven backlogs with queue balancing 38 48 Reduce the number of parallel EOIO queues 40 49 Tuning the ABAP IDoc Adapter 41 491 ABAP basis tuning 41 492 Packaging on sender and receiver side 42 493 Configuration of IDoc posting on receiver side 42 410 Message Prioritization on the ABAP Stack 44

5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE) 45 51 Work Process Overview 45 52 Duration of Integration Process Steps 45 53 Advanced Analysis Load Created by the Business Process Engine (ST03N) 47 54 Database Reorganization 47 55 Queuing and ccBPM (or increasing Parallelism for ccBPM Processes) 48 56 Message Packaging in BPE 48

6 ANALYZING THE (ADVANCED) ADAPTER ENGINE 50 61 Adapter Performance Problem 51 611 Adapter Parallelism 51 612 Sender Adapter 54 613 Receiver Adapter 55 614 IDoc_AAE adapter tuning 56 615 Packaging for Proxy (SOAP in XI 30 protocol) and Java IDoc adapter 56 616 Adapter Framework Scheduler 58 62 Messaging System Bottleneck 60 621 Messaging System in between AFW Sender Adapter and Integration Server (Outbound) 63 622 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound) 64 623 Interface Prioritization in the Messaging System 64 624 Avoid Blocking Caused by Single SlowHanging Receiver Interface 65 625 Overhead based on interface pattern being used 67 63 Performance of Module Processing 68 64 Java only scenarios Integrated Configuration objects 69

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

5

641 General performance gain when using Java only scenarios 69 642 Message Flow of Java only scenarios 71 643 Avoid blocking of Java only scenarios 73 644 Logging Staging on the AAE (PI 73 and higher) 73 65 J2EE HTTP load balancing 75 66 J2EE Engine Bottleneck 76 661 Java Memory 76 662 Java System and Application Threads 80 663 FCA Server Threads 83 664 Switch Off VMC 84

7 ABAP PROXY SYSTEM TUNING 85 71 New enhancements in Proxy queuing 86

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS 87 81 Large message queues on PI ABAP 88 82 Large message queues on PI Adapter Engine 88

9 GENERAL HARDWARE BOTTLENECK 90 91 Monitoring CPU Capacity 90 92 Monitoring Memory and Paging Activity 91 93 Monitoring the Database 91 931 Generic J2EE database monitoring in NWA 92 932 Monitoring Database (Oracle) 93 933 Monitoring Database (MS SQL) 94 934 Monitoring Database (DB2) 95 935 Monitoring Database (MaxDB SAP DB) 97 94 Monitoring Database Tables 99

10 TRACES LOGS AND MONITORING DECREASING PERFORMANCE 101 101 Integration Engine 101 102 Business Process Engine 102 103 Adapter Framework 102 1031 Persistence of Audit Log information in PI 710 and higher 103

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS 104

APPENDIX A 105 A1 Wily Introscope Transaction Trace 105 A2 XPI inspector for troubleshooting and performance analysis 106

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

6

1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components Integration Builder

(including Enterprise Service Repository (ESR) Service Registry (SR) and Integration Directory) Integration

Server (including Integration Engine Business Process Engine and Adapter Engine) Runtime Workbench

(RWB) and System Landscape Directory (SLD) The SLD in contrast to the other components may not

necessarily be part of the PI system in your system landscape because it is used by several SAP NetWeaver

products (however it will be accessed by the PI system regularly) Additional components in your PI

landscape might be the Partner Connectivity Kit (PCK) a J2SE Adapter Engine one or several non-central

Advanced Adapter Engines or an Advanced Adapter Engine Extended An overview is given in the graphic

below The communication and accessibility of these components can be checked using the PI Readiness

Check (SAP Note 817920)

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD The IS in this

graphic stands for the Web Application Server 70 or higher In the classic environment PI is installed as a

double stack an ABAP and a Java stack (J2EE Engine) The ABAP stack is responsible for pipeline

processing in the Integration Engine (Receiver Determination Interface Determination and so on) and the

processing of integration processes in the Business Process Engine Every message has to pass through

the ABAP stack The Java stack executes the mapping of messages (with the exception of ABAP mappings)

and hosts the Adapter Framework (AFW) The Adapter Framework contains all XI adapters except the plain

HTTP WSRM and IDoc adapters

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

7

With 71 the so called Advanced Adapter Engine (AAE) was introduced The Adapter Engine was enhanced

to provide additional routing and mapping call functionality so that it is possible to process messages locally

This means that a message that is handled by a sender and receiver adapter based on J2EE does not need

to be forwarded to the ABAP Integration Engine This saves a lot of internal communication reduces the

response time and significantly increases the overall throughput The deployment options and the message

flow for 71 based systems and higher are shown below Currently not all the functionalities available in the

PI ABAP stack are available on the AAE but each new PI release closes the gap further

In SAP PI 73 and higher the Adapter Engine Extended (AEX) was introduced In addition to the AAE the

Adapter Engine Extended also allows local configuration of the PI objects The AEX can therefore be seen

as a complete PI installation running on Java only From the runtime perspective no major differences can be

seen compared to the AAE and therefore no differentiation is made in this guide

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

8

Looking at the different involved components we get a first impression of where a performance problem

might be located In the Integration Engine itself in the Business Process Engine or in one of the Advanced

Adapter Engines (central non-central plain PCK) Of course a performance problem could also occur

anywhere in between for example in the network or around a firewall Note that there is a separation

between a performance issue ldquoin PIrdquo and a performance issue in any attached systems If viewed ldquothrough

the eyesrdquo of a message (that is from the point of view of the message flow) then the PI system technically

starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the

message enters the pipeline of the Integration Engine Any delay prior to this must be handled by the

sending system The PI system technically ends as soon as the message reaches the target system for

example a given receiver adapter (or in case of HTTP communication the pipeline of the Integration Server)

has received the success message of the receiver Any delay after this point in time must be analyzed in the

receiving system

There are generally two types of performance problems The first is a more general statement that the PI

system is slow and has a low performance level or does not reach the expected throughput The second is

typically connected to a specific interface failing to meet the business expectation with regard to the

processing time The layout of this check is based on the latter First you should try to determine the

component that is responsible for the long processing time or the component that needs the highest absolute

time for processing Once this has been clarified there is a set of transactions that will help you to analyze

the origin of the performance problem

If the recommendations given in this guide are not sufficient SAP can help you to optimize the service via

SAP consulting or SAP MaxAttention services SAP might also handle smaller problems restricted to a

specific interface if you describe your problem in a SAP customer incident Support offerings like ldquoSAP

Going Live Analysis (GA)rdquo and ldquoSAP Going Live Verification (GV)rdquo can be used to ensure that the main

system parameters are configured according to SAP best practice You can order any type of service using

SAP Service Marketplace or your local SAP contacts

If you have already worked with the PI Admin Check (SAP Note 884865) you will recognize some of the

transactions This is because SAP considers the regular checking of the performance to be an important

administrational task However this check tries to show the methodology to approach performance problems

to its reader Also it offers the most common reasons for performance problems and links to possible follow-

up actions but does not refer to any regular administrational transactions

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

9

Wily Introscope is a prerequisite for analyzing performance problems at the Java side It allows you to

monitor the resource usage of multiple J2EE server nodes and provides information about all the important

components on the Java stack like mapping runtime messaging system or module processor Furthermore

the data collected by Wily is stored for several days so it is still possible to analyze a performance problem at

a later date Wily Introscope is provided free-of-charge for SAP customers It is delivered as part of Solution

Manager Diagnostics but can also be installed separately For more information see

httpservicesapcomdiagnostics or SAP Note 797147

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

10

2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system start with Chapter 3

Determining the Bottleneck It helps you to determine the processing time for the different runtimes within PI

Integration Engine Business Process Engine and Adapter Engine or connected Proxy System Once you

have identified the area in which the bottleneck is most probably located continue with the relevant chapter

This is not always easy to do because a long processing time is not always an indication for a bottleneck It

can make sense to do this if a complicated step is involved (Business Process Engine) if an extensive

mapping is to be executed (IS) or if the payload is quite large (all runtimes) For this reason you need to

compare the value that you retrieve from Chapter 3 with values that you have received previously for

example The history data provided for the Java components by Wily Introscope is a big help If this is not

possible you will have to work with a hypothesis

Once the area of concern has been identified (or your first assumption leads you there) Chapters 4 5 6

and 7 will help you to analyze the Integration Engine the Business Process Engine the PI Adapter

Framework and the ABAP Proxy runtime

After (or preferably during) the analysis of the different process components it is important to keep in mind

that the bottlenecks you have observed could also be caused by other interfaces processing at the same

time It will lead you to slightly different conclusions and tuning measures You therefore have to distinguish

the following cases based on where the problem occurs

A with regard to the interface itself that is it occurs even if a single message of this interface is processed

B or with regard to the volume of this interface that is many messages of this interface are processed at the same time

C or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed

while no other messages of this interface are processed and while no other interfaces are running Then

compare this value with B) the processing time for a typical amount of messages of this interface not simply

one message as before If the values of measurement A) and B) are similar repeat the procedure with C) a

typical volume of all interfaces that is during a representative timeframe on your productive PI system or

with the help of a tailored volume test

These three measurements ndash A) processing time of a single message B) processing time of a typical

amount of messages of a single interface and C) processing time of a typical amount of messages of all

interfaces ndash should enable you to distinguish between the three possible situations

o A specific interface has long processing steps that need to be identified and improved The tuning options are usually limited and a re-design of the interface might be required

o The mass processing of an interface leads to high processing times This situation typically calls for tuning measures

o The long processing time is a result of the overall load on the system This situation can be solved by tuning measures and by taking advantage of PI features for example to establish separation of interfaces If the bottleneck is hardware-related it could also require a re-sizing of the hardware that is used

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

11

Chapters 8 9 10 and 11 deal with more general reasons for bad performance such as heavy

tracinglogging error situations and general hardware problems They should be taken into account if the

reason for slow processing cannot be found easily or if situation C from above applies (long processing times

due to a high overall load)

Important

Chapter 9 provides the basic checks for the hardware of the PI server It should not only be used when

analyzing a hardware bottleneck due to a high overall load andor an insufficient sizing but should also be

used after every change made to the configuration of PI to ensure that the hardware is able to handle the

new situation This is important because a classical PI system uses three runtime engines (IS BPE AFW)

Tuning one engine for high throughput might have a direct impact on the others With every tuning action

applied you have to be aware of the consequences on the other runtimes or the available hardware

resources

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

12

3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server processing time in

the Business Process Engine (if ccBPM is used) as well as the processing time in the Adapter Framework (if

adapters except IDoc plain HTTP or ABAP proxies are used) In the subsequent chapters you will find a

detailed description of how to obtain these processing times

For reasons of completeness we will also have a look at the ABAP Proxy runtime and the available

performance evaluations

31 Integration Engine Processing Time

You can use an aggregated view of the performance (details about activation in Note 820622 - Standard jobs

for XI performance monitoring) data for the Integration Engine as shown below

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR In the browser window follow the link to

the Runtime Workbench In there click ldquoPerformance Monitoringrdquo Change the display to ldquoDetailed Data

Aggregatedrdquo select lsquoISrsquo as the Data Source (or lsquoPMIrsquo if you have configured the Process Monitoring

Infrastructurersquo) and ldquoXIIntegrationServerltyour_hostgtrdquo as the component Then choose an appropriate time

interval for example the last day You have to enter the details of the specific interface you want to monitor

here

The interesting value is the ldquoProcessing Time [s]rdquo in the fifth column It is the sum of the processing time of the single steps in the Integration Server For now you are not interested in the single steps that are listed on the right side since you are still trying to find out the part of the PI system where the most time is lost Chapter 44 will work with the individual steps

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

13

If you do not know which interface is affected yet you first have to get an overview Instead of navigating to

Detailed Data Aggregated in the Runtime Workbench choose Overview Aggregated Use the button Display

Options and check the options for sender component receiver component sender interface and receiver

interface Check the processing times as described above

32 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR In the browser window follow the link to

the Runtime Workbench In there click Message Monitoring choose Adapter Engine lthostgt from Database

from the drop down lists and press Display Enter your interface details and ldquoStartrdquo the selection Select one

message using the radio button and press Details

Calculate the difference between the start and end timestamp indicated on the screen

Do the above calculation for the outbound (AFW IS) as well as the inbound (IS AFW) messages

The audit log of successful message is no longer persisted in SAP NetWeaver PI 71 by default in order to minimize the load on the database The audit log has a high impact on the database load since it usually consists of many rows that are inserted individually in the database In newer releases therefore a cache was implemented that keeps the audit log information for a period of time only Thus no detailed information is available any longer from a historical point of view Instead you should use Wily Introscope for historical analyses of the performance of successful messages

The audit log can be persisted on the database to allow historical analysis of performance problems

on the Java stack To do so change the parameter ldquomessagingauditLogmemoryCacherdquo from

true to false for service ldquoXPI Service Messaging Systemrdquo using NetWeaver Administrator (NWA) as described in SAP Note 1314974 Note Only do this temporarily if you have identified the bottleneck on the AFW First try using Wily to determine the root cause of the problem

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

14

Due to above mentioned limitations Wily Introscope is the right tool for performance analysis of the Adapter Engine Wily persists the most important steps like queuing module processing and average response time in historical dashboards (aggregated data per intervals for all server nodes of a system or all systems with filters possible etc)

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

15

Especially for Java-Only runtime using Integrated Configuration Wily shows the complete processing time after the initial sender adapter steps (receiver determination mapping and time in the backend call) as can be seen in the Message Processing Time dashboard below

Using the ldquoShow Minimum and Maximumrdquo functionality deviations from the average processing time can be seen In the example below a message processing step takes up to 22 seconds The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below Java IDoc adapter is used to transfer the data to ERP)

In case of high processing times in one of your steps further analysis might be required using thread dumps or the Java Profiler as outlined in the appendix section A2 XPI inspector for troubleshooting and performance analysis

321 Adapter Engine Performance monitor in PI 731 and higher

Starting from PI 731 SP4 there is a new performance monitor available for the Adapter Engine More

information on the activation of the performance monitor can be found at Note 1636215 ndash Performance

Monitoring for the Adapter Engine Similar to the ABAP monitor in the RWB the data is displayed in an

aggregated format On the PI start page use the link ldquoConfiguration and Monitoring Homerdquo Go to ldquoAdapter

Enginerdquo and select ldquoPerformance Monitorrdquo The data is persisted on an hourly basis for the last 24 hours and

on a daily basis for the last 7 days The data is further grouped on interface level and per Java server node

With the information provided you can therefore see the minimum maximum and average response time of

an individual interface on a specific Java server node All individual steps of the message processing like

time spent in the Messaging System queues or Adapter modules are listed In the example below you can

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

16

see that most of the time (5 seconds) is spent in a module called SimpleWaitModule This would be the entry

point for the further analysis

Starting with 731 SP10 (740 SP05) this data can be exported to excel in two ways Overview or Detailed (details in SAP Note 1886761)

33 Processing Time in the Business Process Engine

Procedure

Log on to your integration server and call transaction SXMB_MONI Adjust your selection in a way that you

select one or more messages of the respective interface Once the messages are listed navigate to the

Outbound column (or Inbound if your integration process is the sender) and click on PE Alternatively you

can select the radio button for ldquoProcess Viewrdquo on the entranceselection screen of SXMB_MONI

To judge the performance of an integration process it is essential to understand the steps that are executed A long-running integration process in itself is not critical because it can wait for a trigger event or can include a synchronous call to a backend Therefore it is important to understand the steps that are included before analyzing an individual workflow log For example the screenshot below shows a long waiting time after the mapping This is perhaps due to an integrated wait step

Calculate the time difference between the first and the last entry Note that this time for the workflow does not include the duration of RFC calls for example To see this processing time navigate to the ldquoList with Technical Detailsrdquo (second button from the left on the screenshot below or shift + F9) Repeat this step for several messages to get an overview about the most time consuming steps

Please Note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues Therefore it is essential to monitor a queue backlog in ccBPM queues as discussed in section ldquoQueuing and ccBPM (or increasing Parallelism for ccBPM Processes)rdquo

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

17

Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM

processes running on your PI system In the initial screen you have to choose ldquo(Sub-) Workflow and the time

range you would like to look at

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

18

The results here allow you to compare the average performance of a process It shows you the processing

time based on barriers The 50 barrier means that 50 of the messages were processed faster than the

value specified Furthermore you also see a comparison of the processing time to eg the day before By

adjusting the selection criteria of the transaction you can therefore get a good overview about the normal

processing time of the Integration process and can judge if there is a general performance problem or just a

temporary one

The number of process instances per integration process can be easily checked via transaction

SWI2_FREQ The data shown allows you to check the volume of the Integration Processes and allow you to

judge if your performance problem is caused by an increase of volume which could cause a higher latency in

the ccBPM queues

New in PI 73 and higher

Starting from PI 7.3 a new monitoring for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE → Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view on the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait Step.


4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps highlighted in bold are available for synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different times in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31 an additional option of using an external virus scan during PI message processing can be activated as described in the Online Help. The virus scan can be configured at multiple steps in the pipeline, e.g. after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed as qRFC Logical Units of Work (LUWs) using dialog work processes.


With 7.11 and higher versions you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WPs). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU Time (clock symbol) to check that not all DIA WPs are used. In case all DIA WPs show a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? It depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which starts the message processing in DIA WPs. The user shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (per default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) will be indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report SDFMON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes, and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency for the data collection can be as low as 10 seconds and the data can be viewed after the configurable timeframe of the analysis. Note: SDFMON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past and allows analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore tuning the RFC layer is one of the most important tasks to achieve good PI performance.

In case you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using a Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max Resources" close to your overall number of dialog work processes? Max Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows:

Max no of logons = 90

Max disp of own logons = 90

Max no of WPs used = 90

Max wait time = 5

Min no of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available in the system and the above requirements.


Note: You have to set the parameters in the SAP instance profile. Otherwise the changes are lost after the server is restarted.

The field "Resources" shows the number of DIA WPs available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation, as follows:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage is already very high, however, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing: PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (in red: XBTO0___0004) specifies the receiver business system. This way PI uses dedicated outbound queues for each receiver


system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization or for the separation of large messages. To get an overview of the available queues, use SXMB_ADM → Manage Queues.

PI inbound and outbound queues execute different pipeline steps:

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages will wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait time in the queue is recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see long processing times there, as discussed in chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues has a direct impact on the connected PI components and also on backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in case of a proxy call. Thus, when tuning PI queues you must always keep the implications for the connected systems in mind.

The tuning of PI ABAP queue parameters is done in transaction SXMB_ADM → Integration Engine Configuration by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across


all interfaces. Hence a problem with one single backend system will affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2 showing the PI inbound queues and outbound queues. Also ccBPM queues (XBQO$PE) are displayed; they will be discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2 and execute. If you are running ABAP proxies on a separate client on the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPUs of the system; see transaction ST06).

In case many of the queues are EOIO queues (e.g. because the serialization is done on the material number), try to reduce the number of queues by following chapter EOIO tuning.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check if the numbers increase, decrease, or remain the same. An increase of entries for all queues or a specific


queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple. Possible reasons have been found to include:

1) A slow step in a specific interface

A bad processing time of a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step as shown in "Analyzing the runtime of pipeline steps".

2) Backlog in Queues

Check if inbound or outbound queues face a backlog. A backlog in a queue is generally not critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface is having a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a sub-parameter specifying your interface to increase the parallelism for this interface.

In general, a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if content-based routing or an extended receiver determination is used. To understand which step is taking long, follow once more chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing. The number of inbound queues does not increase at the same time. Wily also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfcinb_sched_resource_threshold to 3 as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. Per default a lock is set during the processing of each message. This is no longer necessary and should


be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting parameter CACHE_ENQUEUE to 0 as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound Queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in step 1) or due to a problem in the infrastructure (e.g. memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from the scheduling and therefore it remains in READY. To solve such cases, the root cause of the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured for this queue. Notes 1500048 - queues stay in ready (schedule_monitor) and 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor in exceptional cases only (e.g. if the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) push button once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY due to problems during the processing of an individual message. This can be prevented by following the description in chapter Prevent blocking of EO queues.


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB, which will be described below. Advanced users may use the Performance Header of the SOAP message in transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyy mm dd hh (min)(min) (sec)(sec) followed by the fraction of a second, that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; a conversion to system time must therefore be done when analyzing them.
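As an illustration only, the following minimal Java sketch parses such a performance header timestamp and converts it from UTC into the local system time zone. It uses just the standard java.time API; the fraction handling simply treats all digits after the seconds as a fraction of a second, which matches the example value above.

import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class PerfHeaderTimestamp {

    private static final DateTimeFormatter TS = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Converts a performance header timestamp (stored in UTC) into the system time zone.
    static ZonedDateTime toSystemTime(String raw) {
        // First 14 digits: date and time down to the second (yyyyMMddHHmmss)
        LocalDateTime utc = LocalDateTime.parse(raw.substring(0, 14), TS);
        // Remaining digits: fraction of a second (e.g. "165" in the example above)
        String fraction = raw.length() > 14 ? raw.substring(14) : "0";
        long nanos = (long) (Double.parseDouble("0." + fraction) * 1_000_000_000L);
        return utc.plusNanos(nanos)
                  .atZone(ZoneOffset.UTC)
                  .withZoneSameInstant(ZoneId.systemDefault());
    }

    public static void main(String[] args) {
        System.out.println(toSystemTime("20110409092656165"));  // prints the local equivalent
    }
}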

In case PI message packaging is configured, the performance header always reflects the processing time per package. Hence a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that each message effectively took 0.5 seconds. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to Detailed Data Aggregated and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements, as outlined in chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you to decide if the problem is a general design problem (a single message has a long processing step) or if it is related to the message volume (this process step shows large values only for a high number of messages).

Each step has different follow-up actions, which are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated during runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request directly correlates with the number and complexity of the conditions defined (a small sketch of such an XPath evaluation follows after this list).

No tuning options exist for the system with regard to CBR. The performance of this step can only be changed by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.

o Mapping to determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.
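To illustrate why the number and complexity of CBR conditions matter, the following standalone Java sketch evaluates a routing condition with the standard javax.xml.xpath API. The payload structure, field names, and receiver name are purely hypothetical; the point is only that every condition is an XPath evaluation against the message payload at runtime, so each additional rule adds work per message.

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class CbrConditionDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical order payload; element names are examples only
        String payload =
            "<Order><Header><Country>DE</Country><Value>15000</Value></Header></Order>";

        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8)));

        // Two conditions combined with a logical AND, similar to a CBR receiver rule.
        // Each additional or more complex condition means more XPath work per message,
        // which is why the receiver determination time grows with the rule set.
        XPath xpath = XPathFactory.newInstance().newXPath();
        boolean matches = (Boolean) xpath.evaluate(
                "/Order/Header[Country = 'DE' and number(Value) > 10000]",
                doc, XPathConstants.BOOLEAN);

        System.out.println("Route to hypothetical receiver DE_HIGH_VALUE: " + matches);
    }
}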

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings executed sequentially. In such a case the analysis is more difficult, because it is not clear which mapping in the sequence is taking a long time.

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing the sender/receiver party, service, interface, and namespace as well as the source message payload, it is possible to check the target message (after mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible to test, trace, and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment you should compare the mapping runtime during the time of the problem with values reported several days earlier to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check if only one interface is affected or if you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps required around 500 seconds for processing. Comparing the data during the incident with the data from the day before will allow you to judge if this might be a problem of the underlying J2EE Engine, as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times with a different time period and verify if it is only a "temporary" issue; this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify if the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application, you then have to check if the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace, as explained in the appendix section Wily Transaction Trace (a simplified sketch of such a lookup follows below).
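The following simplified Java sketch shows why lookups dominate mapping runtime: every uncached lookup is a remote round trip whose latency is added to the PLSRV_MAPPING_REQUEST step of each message. It deliberately uses plain JDBC instead of the SAP mapping and lookup APIs, and the connection string, table, and field names are hypothetical; a small cache keeps repeated lookups for the same key from hitting the backend again.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.HashMap;
import java.util.Map;

public class CustomerLookupSketch {

    // Simple cache: repeated lookups for the same key are served locally instead of
    // triggering another remote round trip during the mapping of every message.
    private final Map<String, String> cache = new HashMap<>();

    String lookupCustomerName(String customerId) throws Exception {
        String cached = cache.get(customerId);
        if (cached != null) {
            return cached;
        }
        // Hypothetical JDBC target: every call below is a remote round trip whose
        // latency is added to the mapping time of the message being processed.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:sqlserver://backend:1433;databaseName=CRM", "user", "secret");
             PreparedStatement stmt = con.prepareStatement(
                     "SELECT name FROM customers WHERE id = ?")) {
            stmt.setString(1, customerId);
            try (ResultSet rs = stmt.executeQuery()) {
                String name = rs.next() ? rs.getString(1) : "";
                cache.put(customerId, name);
                return name;
            }
        }
    }
}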

If not one but several interfaces are affected, a potential system bottleneck exists; this is described in the following.

o Not enough resources (registered server programs) available. That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check if there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting queues with the names XBTO and XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; instead, the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW → Goto → Logged On Clients and filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).

To check if the J2EE Engine is not available or if the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW → GoTo → Trace → Gateway → Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests. Each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option is to reduce the number of outbound queues that are concurrently active to solve the bottleneck. This will certainly result in higher backlogs in the queue and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface only, by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

Call adapter is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so network time can have an influence here.

Looking at the processing time, we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

In asynchronous interfaces the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency. For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). Network latency has to be checked in case of a long call adapter step. In case HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side. Enough resources must be available at the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages thus includes the time for the transfer of the request message, the calculation of the corresponding response message at the receiver side, and the transfer back to PI. Therefore the processing time of a request at the receiving target system must always be analyzed for synchronous messages to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI, XBT1 ... XBT9, XBTL for EO messages and XBQI/XB2I, XBQ1/XB21 ... XBQ9/XB29 for EOIO messages) process the pipeline steps for the Receiver Determination, the Interface Determination, and the Message Split (and optionally XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) are available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on business system level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended receiver determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM → Integration Engine Configuration → Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, the time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO, XBTA, XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A, XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, outbound binding, and call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2) or described in the section above for PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) are available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in the case of an ABAP proxy system a high value for DB_SPLITTER_QUEUING does not necessarily indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this should be ignored at the moment), the third is finished only after about 3 seconds, and so on. The 100th message has to wait about 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUING grows to roughly 100 seconds (see the short sketch after this list). For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUING value over time. If you experience this situation, proceed with chapter 4.4.2.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).
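The linear growth of the waiting time described above can be illustrated with a tiny Java sketch. It is purely illustrative arithmetic, not a PI API: with an assumed processing time of one second per message, message k in the backlog waits roughly k-1 seconds before its turn.

public class QueueWaitSketch {
    public static void main(String[] args) {
        double secondsPerMessage = 1.0;  // assumed mapping + call adapter time per message

        // A message can only start once all messages in front of it are done,
        // so DB_SPLITTER_QUEUING grows linearly with the queue position.
        for (int position : new int[] {1, 2, 3, 100}) {
            double waitSeconds = (position - 1) * secondsPerMessage;
            System.out.printf("Message %3d waits about %.0f seconds%n", position, waitSeconds);
        }
    }
}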

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a specific amount of messages per time unit.

In section Adapter Parallelism, the default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). It does not make sense for such adapters to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS; the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources


available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional entry for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis:

When using a higher trace level, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES

This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES

This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept to a minimum, and very deep and complex XPath expressions should be avoided.

If you want to minimize the impact of LMS on the PI message processing time, you can set the extraction method to use an external job. In that case the messages are indexed only after processing, so LMS has no performance impact during runtime. Of course this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for the people responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. Per default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps is taking long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated per default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping or routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime of individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable to asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very high runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used in the PI system alone: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow the monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs for a message processed in a package, the package is disassembled and all messages are processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this will reduce the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: maximum number of messages in a package (default 100)

2) Maximum package size: sum of the sizes of all messages in kilobytes (default 1 MB)

3) Delay time: time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could for example define a specific packaging for interfaces with very small messages to allow up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages. The sketch below illustrates how the three parameters interact.
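The following Java sketch is a purely conceptual model, not SAP's implementation, of how the message count and package size parameters bound a package: a package is closed as soon as either limit would be exceeded, while the delay time only matters when fewer messages than the configured count are waiting in the queue.

import java.util.ArrayList;
import java.util.List;

public class PackagingSketch {

    static final int MAX_COUNT = 100;      // "Message count" (default 100)
    static final int MAX_SIZE_KB = 1024;   // "Maximum package size" (default 1 MB)
    // The "Delay time" only matters when fewer than MAX_COUNT messages are waiting:
    // the queue is then processed after that waiting time with a smaller package.

    record Message(String id, int sizeKb) {}

    static List<List<Message>> buildPackages(List<Message> queue) {
        List<List<Message>> packages = new ArrayList<>();
        List<Message> current = new ArrayList<>();
        int currentSizeKb = 0;
        for (Message m : queue) {
            boolean countReached = current.size() >= MAX_COUNT;
            boolean sizeExceeded = !current.isEmpty() && currentSizeKb + m.sizeKb() > MAX_SIZE_KB;
            if (countReached || sizeExceeded) {
                packages.add(current);           // close the current package
                current = new ArrayList<>();
                currentSizeKb = 0;
            }
            current.add(m);
            currentSizeKb += m.sizeKb();
        }
        if (!current.isEmpty()) {
            packages.add(current);               // remaining messages form the last package
        }
        return packages;
    }
}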

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and the configuration of message packaging. More information is also available at http://help.sap.com → SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using the quality of service Exactly Once are independent of each other. In case of an error in one message there is no business reason to stop the processing of the other messages. But exactly this happens when an EO queue goes into error due to an error in the processing of a single message. The queue is then retried automatically in configurable intervals. This retry will cause a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems and are activated per default on PI 7.3 systems. By specifying a receiver ID as a sub-parameter, this behavior can be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend), the queue then goes into SYSFAIL so that not too many messages go into error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages across queues. Per default, in all PI versions the messages are assigned to the different queues randomly. In case of different runtimes of LUWs, caused e.g. by different message sizes or different runtimes in mappings, this can lead to an uneven distribution of messages within the different queues. This can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. Hence it tries to achieve an equal balancing during inbound processing. A queue with a higher backlog will get fewer new messages assigned; queues with fewer entries will get more messages assigned. This is therefore different from the old BALANCING parameter (of category TUNING), which was used


to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), the messages are distributed randomly across the queues that are available. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the most heavily filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements regarding even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution will be checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue will then be written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20 (percent). At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

Based on this example you can see that it is important to find the correct values As a general guideline to

minimize performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value The

correct value of EO_QUEUE_BALANCING_SELECT depends on the criticality of a delay caused by a

backlog
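The fill levels in this example are calculated relative to the most filled queue (a sketch of the calculation described above, not the actual system coding):

   relative fill level = entries in queue / entries in most filled queue * 100
   XBTO__A: 500 / 500 * 100 = 100%
   XBTO__B: 150 / 500 * 100 = 30%
   XBTO__C:  50 / 500 * 100 = 10%

With EO_QUEUE_BALANCING_SELECT = 20, only queues below a relative fill level of 20% (here XBTO__C) are preferred when new messages are assigned.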


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have a lot of queues containing only one message, since this causes a high overhead on the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and will cause significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM → Configure Filter for Queue Prioritization → Number of EOIO queues.

During runtime a new set of queues with the name XB2 will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages will use the same EOIO queue, therefore PI message packaging will work better, and the reloading of the queues by the QIN scheduler will show much better performance.

In case of errors the erroneous messages are removed from the XB2 queues and moved to the standard XBQ queues. All other messages for the same serialization context are moved to the XBQ queue directly to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues are not blocked and messages for other serialization contexts are not delayed.


4.9 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter has to deal with very high message volume, so tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly dialog work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historic backlogs on tRFC.

In order to control the resources used when sending the IDocs from the sender system to PI or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max. Conn. or Max. Runtime values in SMQS you can limit or increase the number of dialog work processes used by the tRFC layer to send the IDocs for a given destination. This will mitigate the risk of system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had the option of packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring Message Packaging, since this helps transferring data for the IDoc adapter as well as for the ABAP Proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in the report RSEOUT00, multiple IDocs are bundled into one tRFC call. At the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc Communication Channel. The necessary configuration is described in detail in the following SDN blog - IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process will roll out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this we generally recommend to use "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, like high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518 a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible, based on the available resources on the receiver system, without the requirement to schedule many background jobs. It is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less-critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java see Interface Prioritization in the Messaging System).

For more information about PI prioritization see http://help.sap.com and navigate to SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than previously expected, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user "WF-BATCH" and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other. Rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply for larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate for example to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and might be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well. Check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes. Use chapter 5.4 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu or a specific day or week from the Workload menu). In the Analysis View navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example an analysis time frame of 1 hour was chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds the WF-BATCH user used up 36 seconds.
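Expressed as a share, this corresponds to 36 / 3600 = 1% of the capacity of one CPU for that hour (and an accordingly smaller share of the total capacity if several CPUs are available).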

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparing it with the other users (as well as the CPU utilization from transaction ST06) you can determine whether the Integration Server is able to handle this load with the available number of CPUs or not. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries for the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configurable transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the largest amount of processing time in many scenarios within BPE.

Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can, however, also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests that have been performed have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see Chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example a File adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue) and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example JMS_http://sap.com/xi/XI/System Receive for the asynchronous JMS receive queue).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example File) and leaves the PI system using a J2EE Engine adapter (for example JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on interface priority)

10) Retrieved from the receive queue (based on maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log of successful messages is by default not persisted in the database, to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4 on. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, a performance problem or bottleneck in the adapter is relatively easy to detect. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not always solve a performance problem or bottleneck. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve a better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side these adapters use an Adapter Framework scheduler which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling and no parallelization can be achieved; therefore scaling via additional server nodes is not possible. More details are presented in section Adapter Framework Scheduler. Since the channels would execute the same SELECT statement on the database or pick up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data that is to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side the adapters work sequentially on each server node by default. For example, for JDBC only one UPDATE statement can be executed for each Communication Channel (independent of the number of consumer threads configured in the Messaging System), and all the other messages for the same Communication Channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow a better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the Communication Channel, enter in the field "Maximum Concurrency" the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter by default uses a push mechanism on the PI sender side. This means the data is pushed by the sending MQ provider. By default, every Communication Channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so that the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so you can specify a polling interval in the PI Communication Channel, and PI will be the initiator of the communication. Here, too, the Adapter Framework scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts during message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitation in the number of requests it can execute in parallel. The limiting factor here is the FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning SOAP sender adapter. On the receiver side the parallelism depends on the number of threads defined in the messaging system and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated directly from the application thread pool and are therefore not available for any other tasks. The number of initial threads should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

This should also be done carefully, since these threads are taken from the J2EE application thread pool. A very high value there can cause a bottleneck on the J2EE Engine and therefore major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side the parallelization depends on the configuration mode chosen. In "Manual Mode" the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound Resource Adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, it is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work. This is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log will be the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the available FCA threads for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and could have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load for such an interface, the other SOAP sender interfaces would then not be affected by a shortage of FCA threads.


To use this new feature you can specify the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time either the receive queue for asynchronous messages or the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor") and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc_AAE adapter is given in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Similar to all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter Messaging System Bottleneck are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP are processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be limited further.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments for each IDoc would consume roughly 500 MB during processing. In such cases it is important to e.g. lower the package size and/or use large message queues as outlined in chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important:

In the ABAP stack, packaging is active by default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the dialog work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO, it is important to evaluate packaging to avoid any negative impact on the receiving ECC due to missing packaging.

The packaging on Java works in general differently than on ABAP. While in ABAP the aggregation is done on the individual qRFC queue level (which always contains messages to one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this the message is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done for each server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message status will stay in status "Delivering" throughout all the steps described above. In the audit log you can see the time spent in packaging. The audit log shown below shows e.g. that the message was waiting almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in Note 1913972:

o messaging.system.msgcollector.enabled: Enable or disable packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk - default 60 seconds

o messaging.system.msgcollector.maxMemTotal: Maximum memory which can be used by the message collector

o messaging.system.msgcollector.poolSize: Number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, and the configuration of the IDoc posting on the receiver side (described in chapter Configuration of IDoc posting on receiver side). Tuning of these threads can therefore be very important. These threads can unfortunately not yet be monitored with any PI tool or Wily


Introscope. To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.
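A hypothetical global configuration in the NWA could look as follows (the property names are those listed above; the values are illustrations only and have to be derived from your message sizes and the performance of the receiving backend):

   messaging.system.msgcollector.enabled     = true
   messaging.system.msgcollector.bulkTimeout = 60    (default wait time per bulk in seconds)
   messaging.system.msgcollector.poolSize    = <number of parallel BULK_EXECUTOR sender threads>

Increase the poolSize only if the BULK_EXECUTOR threads are permanently busy in the thread monitoring mentioned above.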

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While the SOAP adapter with XI 3.0 protocol lets you define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Based on this and the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler you have to set the parameter scheduler.relocMode of the Adapter Framework


service property to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow a better balancing of the incoming load across the available server nodes if the files are coming in at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second) even lower values (e.g. -50) can be configured.

In case many files are put into the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence a proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon → Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node the Communication Channel is polling (Status = "Active"). You can also see the time the channel polled the last time and will poll the next time.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler: http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs on only one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in brackets [60000] determines the polling interval in ms - in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": Not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine the time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System can usually be closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the messaging system and remain there for a long time, since the adapter is not ready to process the next one yet. It looks like the messaging system is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the messaging system by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button and choose tab "Additional Data". This will open the page below, showing the number of messages in a queue, the threads currently working and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows an easy graphical analysis of the queues and thread usage of the messaging system across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard showing all the backlogs in the queues. In the screenshot below a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it will forward messages to the adapter queue as soon as free adapter-specific consumer threads are available. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size graph you can directly jump to a more detailed view, where you can see that the file adapter was causing the backlog.


To see the consumer thread usage you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were in use during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the messaging system does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the messaging system have to be configured in the NWA using service "XPI Service: AF Core" and property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the parameter above. Some special adapters like CIDX, RNIF or Java Proxy can be changed by using the service "XPI Service: Messaging System" and property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the messaging system is highly critical for synchronous scenarios due to timeouts that might occur. Therefore the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System in between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "call queue" if the message was synchronous.

To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck of consumer threads on the sender queues in the Messaging System.

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the messaging system. Multiple threads are used for synchronous messages: the adapter thread puts the message into the messaging system queue and waits until the messaging system delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time that the message waited in the messaging system for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible for you to define high, medium and low priority processing at interface level. Based on the priority, the dispatcher queue of the messaging system (which is the first entry point for all messages) will forward the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75%, Medium 20%, Low 5%.
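As a purely illustrative calculation of this weighting: if 20 messages could be forwarded to a queue with free consumer threads, roughly 15 of them would be taken from high-priority interfaces, 4 from medium-priority and 1 from low-priority interfaces (the exact reload algorithm may differ).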

Based on this approach you can ensure that more resources are available for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon → Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help: navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one Communication Channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or remote systems with a small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.

Based on this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in the NWA in the service XPI Service: Messaging System. With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads on each server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information, see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
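As a schematic illustration of this combination (the maxReceivers property is maintained in the service XPI Service: Messaging System; the consumer thread count of the JDBC receive queue is maintained separately in the messaging connection definition and is shown here only as a placeholder):

   messaging.system.queueParallelism.maxReceivers = 5    (threads one interface may use per server node)
   JDBC receive queue consumer threads            = 20   (overall receiver threads of the JDBC adapter)

With these example values, 20 / 5 = 4 different receiver interfaces can be served in parallel on one server node before the JDBC receive queue runs out of consumer threads.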

In the screenshot below you can see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though there are more free SOAP threads available, they are not consumed. Hence the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue where message backlogs occur. As mentioned earlier, a backlog should usually occur on the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog there will always be free consumer threads, and therefore the dispatcher queue can dispatch the messages to the adapter-specific queue. Hence the backlog will appear in the adapter-specific queue, so that message prioritization no longer works properly.

Per default the maxReceivers parameter is only relevant for asynchronous message processing (ICo and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and additional restrictions in the number of available threads can therefore be very critical. It is usually advisable not to limit the threads per interface but to increase the overall number of available threads. If you have many high-volume synchronous scenarios with different priorities that run in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv, ICoAll as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the maximum parallelization to be specified at a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The interface pattern configured can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI 3.0-Compatible)", no check of the data type is performed and the overhead is therefore avoided. For interfaces with good data quality and high message throughput, this interface pattern should therefore be chosen.

6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in the module processing, then you must analyze the runtime of the modules used in the Communication Channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes, for example to transform an EDI message before sending it to the partner or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule; the next module, called CallSAPAdapter, is a standard module that inserts the data into the messaging system queues.

The audit log gives you a first impression of the duration of each module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one module that has been running for a very long time, it is easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.

Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module prints out such information in the audit log so that these steps can be detected. If not, use the Wily Introscope transaction trace as explained in the appendix Wily Transaction Trace.
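If you develop adapter modules yourself, it helps the analysis if the module writes its own runtime into the audit log. The following is a minimal sketch of such a process() method. It assumes the standard PI 7.1x adapter module API; the package and class names shown are release-dependent, and EJB declarations and deployment descriptors are omitted, so this is an illustration rather than a complete, deployable module:

   import com.sap.aii.af.lib.mp.module.*;                                  // Module, ModuleContext, ModuleData, ModuleException
   import com.sap.aii.af.service.auditlog.Audit;                           // audit log access (release-dependent)
   import com.sap.engine.interfaces.messaging.api.Message;
   import com.sap.engine.interfaces.messaging.api.MessageKey;
   import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

   // Sketch only: measures the module runtime and writes it to the audit log.
   public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData)
           throws ModuleException {
       long start = System.currentTimeMillis();
       try {
           Message msg = (Message) inputModuleData.getPrincipalData();
           MessageKey key = new MessageKey(msg.getMessageId(), msg.getMessageDirection());

           // ... the actual module logic (transformations, look-ups, ...) goes here ...

           long duration = System.currentTimeMillis() - start;
           Audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                   "Module processing took " + duration + " ms");
           return inputModuleData;
       } catch (Exception e) {
           throw new ModuleException("Module processing failed", e);
       }
   }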

6.4 Java-only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps so far executed in the ABAP pipeline (Receiver Determination, Interface Determination and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java-only scenarios

The major advantage of AAE processing is the reduced overhead, because the context switches between ABAP and Java are avoided. Thus the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. The best tuning option is therefore to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.

Based on SAP-internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.

Based on these measurements the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a given scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks and slow (receiver) applications reduce the throughput and, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is seen for small payloads (comparison of 10 kB, 50 kB and 500 kB) and asynchronous messages.

6.4.2 Message Flow of Java-only scenarios

All the Java-based tuning options mentioned in the chapters Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter.

2) Put into the dispatcher queue of the messaging system.

3) Forwarded to the JMS send queue of the messaging system.

4) Message is taken by a JMS send consumer thread.

a. No message split used

In this case the JMS consumer thread performs all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination and mapping) and then also transfers the message to the backend system (the mail receiver in our example). Thus all the steps are executed by one thread only.

b. Message split used (1:n message relation)

In this case there is a context switch. The thread taking the message out of the queue processes the message up to the message split step. It creates the new messages and puts them into the send queue again, from where they are taken by different threads (which then map the child messages and finally send them to the receiving system).

As you can see in this example, for an Integrated Configuration only one thread performs all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the send queue (call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.

The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario.

In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.

6.4.3 Avoid blocking of Java-only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since the Java-only interfaces only use send queues, restricting the consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface is no solution.

Because of that, an additional property messaging.system.queueParallelism.queueTypes of the service SAP XI AF Messaging was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06 as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization can usually be highly critical for synchronous interfaces. We therefore generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv, IcoAsync only.
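A minimal illustration of the recommended combination (service SAP XI AF Messaging; the exact value syntax is described in SAP Notes 1493502 and 1916598, so the values below are examples only):

   messaging.system.queueParallelism.queueTypes   = Recv, IcoAsync
   messaging.system.queueParallelism.maxReceivers = 5

With such a setting, the per-interface restriction applies to classical receiver queues and to asynchronous Integrated Configuration messages, while synchronous Java-only calls remain unrestricted.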

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting with PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you are able to do the configuration at interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).

In the Messaging System we generally distinguish between Staging (versioning) and Logging. An overview is given below.

For details about the configuration please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the DB and a decrease in performance.

The persistence steps can be seen directly in the audit log of a message.

The Message Monitor also shows the persisted versions.

While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. You therefore have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs this can delay the overall message processing of an interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure that messages are distributed equally across the available server nodes. In the meantime these load balancing rules can also be applied using the CTC templates as described in Notes 1756963 and 1757922. This has also been included in the PI initial setup wizard so that the task is executed automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP load balancing to initial setup).

For the example given above we could see a much better load balancing after the new rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above only balance the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this will be discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards the SAP VM is used on all platforms. Therefore the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern, that is, the memory usage increases over time but goes down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references. If there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.
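For orientation, verbose GC output typically looks like the following lines (illustrative values in the common HotSpot notation; the exact format written by the SAP JVM differs slightly):

   [GC 1530112K->412672K(2097152K), 0.1853010 secs]
   [Full GC 412672K->389120K(2097152K), 2.7310290 secs]

The first line shows a minor collection that reduces the heap occupancy from about 1.5 GB to about 400 MB of a 2 GB heap and pauses the server node for roughly 0.2 seconds. As a rough estimate of the GC overhead, divide the pause time by the interval between collections: a 2.7 second full GC every 60 seconds, for example, would mean the node is unavailable for message processing for about 4.5% of the time.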

Different tools exist for the garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section. In the Common Tasks list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found under J2EE Overview → J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.

3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available here. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.

4) SAP JVM Profiler

The SAP JVM Profiler allows you to analyze the extensive GC information stored in the sapjvm_gcprf files of the server node folders, or to connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.

Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces use the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from the ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the PI Adapter Engine.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options:

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size of the J2EE Engine will automatically adapt the new generation area of the heap due to a dynamic configuration (in newer J2EE versions this is by default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication, it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.

o In general SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are three options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage of a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management → Java System Reports, choosing the report System Health and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found under Availability and Performance Management → Resource Monitoring → History Reports. An example for the application thread usage is shown below.

Check if the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java → Threads and check for any threads in red status (more than 20 seconds running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.

If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java → Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA server threads. The FCA server threads are responsible for receiving HTTP calls on the Java side (after the Java Dispatcher is no longer available). FCA threads also use a thread pool. Fifteen FCA threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA server threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like web services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).

The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (more than 15 seconds), additional FCA threads are spawned that are available only for parallel incoming HTTP requests using different entry points. This ensures that, in case of problems with one application, not all other applications are blocked permanently.

o An overall maximum of 20 x FCAServerThreadCount threads may be used in the J2EE Engine.
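As a simple illustration with the recommended value: with FCAServerThreadCount = 50, a single entry point (for example the SOAP adapter servlet) is served by at most 50 FCA threads, while the engine as a whole will not use more than 20 x 50 = 1000 FCA server threads.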

There is currently no standard monitor available for FCA threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon that will also show the FCA server threads in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default on an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.

7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS of release 6.20 or higher includes an ABAP proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP) so that no costly transformation is necessary.

In general, the ABAP proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput on the ABAP proxy backend side.

In general you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP proxy backend. As in PI, ABAP proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which must not take too much time and which is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).
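For illustration, increasing the parallelization of the receiver proxy queues could look as follows in SXMB_ADM (Integration Engine Configuration; the values are examples only and must be aligned with the available qRFC resources):

   Category: TUNING   Parameter: EO_INBOUND_PARALLEL   Subparameter: SENDER     Value: 10
   Category: TUNING   Parameter: EO_INBOUND_PARALLEL   Subparameter: RECEIVER   Value: 30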

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).

7.1 New enhancements in proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues are used, and tuning of these queues at interface level becomes possible. This can be very helpful in cases where one receiver interface shows very long posting times in the application coding that cannot be improved further. Messages of other, more business-critical interfaces may be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general the processing time there should be fast, but in case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names in the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter SENDER / SENDER_BACK: should be set to 1 to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter RECEIVER: should be set to 1 to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services. A schematic example is shown below.
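A schematic example of the sender-side settings (the sender ID shown is hypothetical and the values are examples only):

   EO_QUEUE_PREFIX_INTERFACE    SENDER          1    (switch on interface-specific sender queues)
   EO_INBOUND_PARALLEL_SENDER   (no subparam.)  5    (default number of queues per interface)
   EO_INBOUND_PARALLEL_SENDER   <sender ID>     10   (more queues for one business-critical interface)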

This new feature also replaces the previously existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, which have a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the overhead due to the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI and not to the size of the file or IDoc being sent to PI. You can use the runtime header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), MessageSizeTotal describes the total message size (header + payload); in the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.
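Using the numbers of this example for a rough calculation: with a total size of about 14 KB and a payload of 433 bytes, more than 13 KB, i.e. well over 90% of the transferred data, consists of PI header information, which illustrates why very small messages achieve a comparatively poor throughput per byte of business data.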

Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces use the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from the ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3, to avoid overloading the Java memory with parallel large message requests.
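For illustration, a possible configuration in SXMB_ADM could look as follows (the size limit is specified in KB; the values are examples only):

   Category: TUNING   Parameter: EO_MSG_SIZE_LIMIT            Value: 5000   (messages larger than 5 MB use the XBTL/XBTM queues)
   Category: TUNING   Parameter: EO_MSG_SIZE_LIMIT_PARALLEL   Value: 2      (two parallel large-message queues)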

8.2 Large message queues on the PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited, to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. By default the permit size is 10 MB and 10 permits are available. This means that large messages are processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Let us assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB and message F 40 MB. Message A is smaller than the permit size; it is therefore not considered large and can be processed immediately. Message B requires 1 permit and message C requires 5. Since enough permits are available, processing starts (status DLNG). For message D, all available 10 permits would be required. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV), since it exceeds the maximum number of defined permits; in that case the message has to be restarted manually. Message E requires 5 permits and can also not be scheduled. But since there are 4 permits left, message F is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits.

Only after message E and message C have finished can message D be scheduled, consuming all available permits.
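The permit accounting described in this example can be summarized in a small sketch (a simplification for illustration only, assuming the default permit size of 10 MB and 10 available permits; this is not the actual implementation of the Messaging System):

   // Simplified model of the large-message permit handling (default values assumed).
   static final long PERMIT_SIZE_BYTES = 10L * 1024 * 1024; // default permit size: 10 MB
   static final int  TOTAL_PERMITS     = 10;                // default number of permits
   static int freePermits = TOTAL_PERMITS;

   // Number of permits a message of the given size needs (0 = not a large message).
   static int permitsNeeded(long messageSizeBytes) {
       if (messageSizeBytes < PERMIT_SIZE_BYTES) {
           return 0; // small messages are processed immediately without a permit
       }
       return (int) Math.ceil((double) messageSizeBytes / PERMIT_SIZE_BYTES);
   }

   // Try to schedule a message; returns true if it may start processing now.
   static synchronized boolean trySchedule(long messageSizeBytes, boolean blacklistingEnabled) {
       int needed = permitsNeeded(messageSizeBytes);
       if (needed > TOTAL_PERMITS) {
           if (blacklistingEnabled) {
               return false; // message is set to error status (NDLV), manual restart required
           }
           needed = TOTAL_PERMITS; // otherwise it has to wait until all permits are free
       }
       if (needed > freePermits) {
           return false; // message stays in status "To Be Delivered" until permits are released
       }
       freePermits -= needed;
       return true;
   }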

The example above shows the potential delay a large message can face due to the waiting time for permits. The assumption, however, is that large messages are not time-critical, so that an additional delay is less critical than a potential overload of the system.

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this is only done after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have very complex extended receiver determinations or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON under Monitoring → Adapter Engine Status. The number of threads shown corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.

9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity, it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD host agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

The NWA also offers a view of the CPU activity from NetWeaver 7.3 on via Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.

You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to the "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of a bottleneck. It describes the number of processes per CPU that are waiting in a queue before they are assigned to a free CPU. As long as the average remains at one process per available CPU, the CPU resources are sufficient. Once there is an average of around three processes per available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View → TOP CPU: Which process uses the highest amount of CPU? Is it the J2EE Engine, the work processes or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer on the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming processes and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, then it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: the BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapters to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack, since it directly influences the Java GC behavior. Paging should therefore be avoided in any case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times, as described in chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6 respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section. The official documentation can be found in the Online Help. Monitoring the DB via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average and maximum processing time of the individual SQL statements, and can therefore identify the expensive statements on your system.

By default, the recorded period always starts at the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.

Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads to the total number of reads: the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user to recursive calls: good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines, because more and more SQL statements get parsed in the meantime.

Number of reads per user call: if this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools → Administration → Monitor → Performance → Database → Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: the default values displayed in the section Server Engine are relative values; to display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu → Performance → Database. A snapshot is collected every 2 hours.

The memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) <> max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage as well as catalog and package cache information, go to transaction ST04 and choose Performance → Database, section Buffer Pool (or section Cache, respectively).


Buffer Pools Number: number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance → Buffer Pools to get the size and buffer quality of every single buffer pool.

Overall buffer quality: the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.

Catalog cache size: maximum size of the catalog cache, which is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: maximum size of the package cache, which is used to maintain the most frequently accessed sections of the packages.

Package cache quality: ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.

With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP system allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying missing indexes.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (versions lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which log entries from the LOG_IO_QUEUE need to be written to the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write log entries must wait until free memory becomes available again in the LOG_IO_QUEUE. In case of LOG_QUEUE_OVERFLOWS (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with SAP Note 819324 (FAQ: MaxDB SQL optimization), check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.
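As referenced in the data cache item above, the current cache size can be checked with the MaxDB dbmcli tool. The following is only a sketch; the database name and DBM user are placeholders, the exact parameter name depends on your MaxDB version, and a change of the parameter only takes effect after a restart of the database:

   # Read the current value of the cache size parameter (placeholders in angle brackets)
   dbmcli -d <DB_NAME> -u <dbm_user>,<password> param_directget CACHE_SIZE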

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses -> Performance -> Database Analyzer -> Bottlenecks. You can use this function to activate the corresponding analysis tool, the Database Analyzer (dbanalyzer).

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for 'SAP DB bottleneck analysis messages'.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking 'Properties' in transaction DB50).
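A command line for starting the Database Analyzer manually could look as sketched below; the database name, user, interval, and output directory are placeholders, and the exact option syntax should be verified against the MaxDB documentation for your release:

   # Collect performance data every 900 seconds for 96 intervals (option syntax is an assumption)
   dbanalyzer -d <DB_NAME> -u <user>,<password> -t 900,96 -o <output_directory>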

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button 'Number of Entries'. The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)
If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR and SXMSCLUP (cleaned up by XML message archiving/deletion)
If you use the switch procedure, you have to check SXMSPCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)
If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQL*Plus for Oracle, for the database tables of the J2EE schema. Main tables that could be affected by growth (a SQL sketch for counting their entries follows below):

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
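For the Java schema tables listed above, a minimal SQL sketch (shown here for Oracle with SQL*Plus; the schema name SAPSR3DB is only an example and depends on your installation) could look like this:

   -- Count the entries of the central Adapter Engine message tables (schema name is an example)
   SELECT COUNT(*) FROM SAPSR3DB.BC_MSG;
   SELECT COUNT(*) FROM SAPSR3DB.BC_MSG_AUDIT;
   SELECT COUNT(*) FROM SAPSR3DB.BC_MSG_LOG_VERSION;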


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration -> Specific Configuration, and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default values, which are the values recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace Level -> Set.

Start transaction SMGW, navigate to Goto -> Parameters -> Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace -> Gateway -> Reduce Level (or Increase Level, respectively).

Start transaction SM59 and check the following RFC destinations for the flag 'Trace' on the tab Special Options; remove the flag if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and similar.


10.2 Business Process Engine

You only have to check the Event Trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default; higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) -> Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting -> Logviewer (direct link: nwa/links). Open the view 'Developer Trace' and check whether you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing it (for example, a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, and so on) is reported and available in the SAP Solution Manager Root Cause Analysis 'Exception Analysis' functionality.

10.3.1 Persistence of Audit Log information in PI 7.1 and higher

With SAP NetWeaver PI 7.1 the audit log is not persisted for successful messages in the database by default, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (based on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter 'messaging.auditLog.memoryCache' to false in the service 'XPI Service: Messaging System'. Details can be found in SAP Note 1314974 - PI 7.1 AF Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting to avoid any performance problems from the additional persistence.

After implementing SAP Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto -> Trace File -> Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (CTRL + Shift + F10 or Goto -> Trace -> Gateway -> Display File) and check for errors.

System Log

Start transaction SM21 choose an appropriate time interval and search the log entries for errors Repeat the procedure for remote systems if you are using dialog instances

ABAP Runtime Errors

Start transaction ST22 choose an appropriate time interval and search for PI-related dumps In general the number of ABAP dumps should be very small on a PI system and therefore all occurring dumps should be analyzed

Alerts CCMS

Start transaction RZ20 and search for recent alerts

Work Process and RFC trace

In the PI work directory check all files which begin with dev_rfc or dev_w for errors

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well (a sample command line for these file checks follows after this list).

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. This is available via Problem Management -> Logs and Traces -> Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.
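For the file-based checks above (work process traces, RFC traces, and the J2EE server output files), a quick way to scan for errors at operating system level is a simple text search in the instance work directory, for example (the path is only an example and depends on your installation):

   cd /usr/sap/<SID>/<instance>/work
   # List work process and RFC trace files that contain the word "error" (case-insensitive)
   grep -il "error" dev_w* dev_rfc*
   # Show error and exception lines in the J2EE server output file
   grep -iE "error|exception" dev_server0.out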


APPENDIX A

A1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace will allow you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or caused by a look-up to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window is displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below, around 48 seconds. From top to bottom we can see the call stack of the thread. In general, we are interested in long-running blocks at the bottom of the trace view. A long-running block at the bottom means that this is the lowest level of coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see whether the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup using JDBC to a remote database or RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace, which also gives you some details about the statement that was executed.

A2 XPI inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting. But the tool can also be used for troubleshooting of performance issues.

General information about the tool can be found in SAP Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector

To analyze performance problems, typically the example '51 - Performance Problem' is used. As a basic measurement it allows you to take multiple thread dumps in specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from SAP Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling, either by doing JVM Performance tracing or JVM Memory Allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE engine and are therefore not recommended to be used in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

wwwsapcom

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

4

TABLE OF CONTENTS

1 INTRODUCTION 6

2 WORKING WITH THIS DOCUMENT 10

3 DETERMINING THE BOTTLENECK 12 31 Integration Engine Processing Time 12 32 Adapter Engine Processing Time 13 321 Adapter Engine Performance monitor in PI 731 and higher 15 33 Processing Time in the Business Process Engine 16

4 ANALYZING THE INTEGRATION ENGINE 20 41 Work Process Overview (SM50SM66) 21 42 qRFC Resources (SARFC) 22 43 Parallelization of PI qRFC Queues 23 44 Analyzing the runtime of PI pipeline steps 29 441 Long Processing Times for ldquoPLSRV_RECEIVER_ DETERMINATIONrdquo PLSRV_INTERFACE_DETERMINATION 30 442 Long Processing Times for ldquoPLSRV_MAPPING_REQUESTrdquo 30 443 Long Processing Times for ldquoPLSRV_CALL_ADAPTERrdquo 33 444 Long Processing Times for ldquoDB_ENTRY_QUEUEINGrdquo 34 445 Long Processing Times for ldquoDB_SPLITTER_QUEUEINGrdquo 34 446 Long Processing Times for ldquoLMS_EXTRACTIONrdquo 36 447 Other step performed in the ABAP pipeline 36 45 PI Message Packaging for Integration Engine 37 46 Prevent blocking of EO queues 38 47 Avoid uneven backlogs with queue balancing 38 48 Reduce the number of parallel EOIO queues 40 49 Tuning the ABAP IDoc Adapter 41 491 ABAP basis tuning 41 492 Packaging on sender and receiver side 42 493 Configuration of IDoc posting on receiver side 42 410 Message Prioritization on the ABAP Stack 44

5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE) 45 51 Work Process Overview 45 52 Duration of Integration Process Steps 45 53 Advanced Analysis Load Created by the Business Process Engine (ST03N) 47 54 Database Reorganization 47 55 Queuing and ccBPM (or increasing Parallelism for ccBPM Processes) 48 56 Message Packaging in BPE 48

6 ANALYZING THE (ADVANCED) ADAPTER ENGINE 50 61 Adapter Performance Problem 51 611 Adapter Parallelism 51 612 Sender Adapter 54 613 Receiver Adapter 55 614 IDoc_AAE adapter tuning 56 615 Packaging for Proxy (SOAP in XI 30 protocol) and Java IDoc adapter 56 616 Adapter Framework Scheduler 58 62 Messaging System Bottleneck 60 621 Messaging System in between AFW Sender Adapter and Integration Server (Outbound) 63 622 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound) 64 623 Interface Prioritization in the Messaging System 64 624 Avoid Blocking Caused by Single SlowHanging Receiver Interface 65 625 Overhead based on interface pattern being used 67 63 Performance of Module Processing 68 64 Java only scenarios Integrated Configuration objects 69

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

5

641 General performance gain when using Java only scenarios 69 642 Message Flow of Java only scenarios 71 643 Avoid blocking of Java only scenarios 73 644 Logging Staging on the AAE (PI 73 and higher) 73 65 J2EE HTTP load balancing 75 66 J2EE Engine Bottleneck 76 661 Java Memory 76 662 Java System and Application Threads 80 663 FCA Server Threads 83 664 Switch Off VMC 84

7 ABAP PROXY SYSTEM TUNING 85 71 New enhancements in Proxy queuing 86

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS 87 81 Large message queues on PI ABAP 88 82 Large message queues on PI Adapter Engine 88

9 GENERAL HARDWARE BOTTLENECK 90 91 Monitoring CPU Capacity 90 92 Monitoring Memory and Paging Activity 91 93 Monitoring the Database 91 931 Generic J2EE database monitoring in NWA 92 932 Monitoring Database (Oracle) 93 933 Monitoring Database (MS SQL) 94 934 Monitoring Database (DB2) 95 935 Monitoring Database (MaxDB SAP DB) 97 94 Monitoring Database Tables 99

10 TRACES LOGS AND MONITORING DECREASING PERFORMANCE 101 101 Integration Engine 101 102 Business Process Engine 102 103 Adapter Framework 102 1031 Persistence of Audit Log information in PI 710 and higher 103

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS 104

APPENDIX A 105 A1 Wily Introscope Transaction Trace 105 A2 XPI inspector for troubleshooting and performance analysis 106

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

6

1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components Integration Builder

(including Enterprise Service Repository (ESR) Service Registry (SR) and Integration Directory) Integration

Server (including Integration Engine Business Process Engine and Adapter Engine) Runtime Workbench

(RWB) and System Landscape Directory (SLD) The SLD in contrast to the other components may not

necessarily be part of the PI system in your system landscape because it is used by several SAP NetWeaver

products (however it will be accessed by the PI system regularly) Additional components in your PI

landscape might be the Partner Connectivity Kit (PCK) a J2SE Adapter Engine one or several non-central

Advanced Adapter Engines or an Advanced Adapter Engine Extended An overview is given in the graphic

below The communication and accessibility of these components can be checked using the PI Readiness

Check (SAP Note 817920)

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD The IS in this

graphic stands for the Web Application Server 70 or higher In the classic environment PI is installed as a

double stack an ABAP and a Java stack (J2EE Engine) The ABAP stack is responsible for pipeline

processing in the Integration Engine (Receiver Determination Interface Determination and so on) and the

processing of integration processes in the Business Process Engine Every message has to pass through

the ABAP stack The Java stack executes the mapping of messages (with the exception of ABAP mappings)

and hosts the Adapter Framework (AFW) The Adapter Framework contains all XI adapters except the plain

HTTP WSRM and IDoc adapters

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

7

With 71 the so called Advanced Adapter Engine (AAE) was introduced The Adapter Engine was enhanced

to provide additional routing and mapping call functionality so that it is possible to process messages locally

This means that a message that is handled by a sender and receiver adapter based on J2EE does not need

to be forwarded to the ABAP Integration Engine This saves a lot of internal communication reduces the

response time and significantly increases the overall throughput The deployment options and the message

flow for 71 based systems and higher are shown below Currently not all the functionalities available in the

PI ABAP stack are available on the AAE but each new PI release closes the gap further

In SAP PI 73 and higher the Adapter Engine Extended (AEX) was introduced In addition to the AAE the

Adapter Engine Extended also allows local configuration of the PI objects The AEX can therefore be seen

as a complete PI installation running on Java only From the runtime perspective no major differences can be

seen compared to the AAE and therefore no differentiation is made in this guide

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

8

Looking at the different involved components we get a first impression of where a performance problem

might be located In the Integration Engine itself in the Business Process Engine or in one of the Advanced

Adapter Engines (central non-central plain PCK) Of course a performance problem could also occur

anywhere in between for example in the network or around a firewall Note that there is a separation

between a performance issue ldquoin PIrdquo and a performance issue in any attached systems If viewed ldquothrough

the eyesrdquo of a message (that is from the point of view of the message flow) then the PI system technically

starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the

message enters the pipeline of the Integration Engine Any delay prior to this must be handled by the

sending system The PI system technically ends as soon as the message reaches the target system for

example a given receiver adapter (or in case of HTTP communication the pipeline of the Integration Server)

has received the success message of the receiver Any delay after this point in time must be analyzed in the

receiving system

There are generally two types of performance problems The first is a more general statement that the PI

system is slow and has a low performance level or does not reach the expected throughput The second is

typically connected to a specific interface failing to meet the business expectation with regard to the

processing time The layout of this check is based on the latter First you should try to determine the

component that is responsible for the long processing time or the component that needs the highest absolute

time for processing Once this has been clarified there is a set of transactions that will help you to analyze

the origin of the performance problem

If the recommendations given in this guide are not sufficient SAP can help you to optimize the service via

SAP consulting or SAP MaxAttention services SAP might also handle smaller problems restricted to a

specific interface if you describe your problem in a SAP customer incident Support offerings like ldquoSAP

Going Live Analysis (GA)rdquo and ldquoSAP Going Live Verification (GV)rdquo can be used to ensure that the main

system parameters are configured according to SAP best practice You can order any type of service using

SAP Service Marketplace or your local SAP contacts

If you have already worked with the PI Admin Check (SAP Note 884865) you will recognize some of the

transactions This is because SAP considers the regular checking of the performance to be an important

administrational task However this check tries to show the methodology to approach performance problems

to its reader Also it offers the most common reasons for performance problems and links to possible follow-

up actions but does not refer to any regular administrational transactions

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

9

Wily Introscope is a prerequisite for analyzing performance problems at the Java side It allows you to

monitor the resource usage of multiple J2EE server nodes and provides information about all the important

components on the Java stack like mapping runtime messaging system or module processor Furthermore

the data collected by Wily is stored for several days so it is still possible to analyze a performance problem at

a later date Wily Introscope is provided free-of-charge for SAP customers It is delivered as part of Solution

Manager Diagnostics but can also be installed separately For more information see

httpservicesapcomdiagnostics or SAP Note 797147

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

10

2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system start with Chapter 3

Determining the Bottleneck It helps you to determine the processing time for the different runtimes within PI

Integration Engine Business Process Engine and Adapter Engine or connected Proxy System Once you

have identified the area in which the bottleneck is most probably located continue with the relevant chapter

This is not always easy to do because a long processing time is not always an indication for a bottleneck It

can make sense to do this if a complicated step is involved (Business Process Engine) if an extensive

mapping is to be executed (IS) or if the payload is quite large (all runtimes) For this reason you need to

compare the value that you retrieve from Chapter 3 with values that you have received previously for

example The history data provided for the Java components by Wily Introscope is a big help If this is not

possible you will have to work with a hypothesis

Once the area of concern has been identified (or your first assumption leads you there) Chapters 4 5 6

and 7 will help you to analyze the Integration Engine the Business Process Engine the PI Adapter

Framework and the ABAP Proxy runtime

After (or preferably during) the analysis of the different process components it is important to keep in mind

that the bottlenecks you have observed could also be caused by other interfaces processing at the same

time It will lead you to slightly different conclusions and tuning measures You therefore have to distinguish

the following cases based on where the problem occurs

A with regard to the interface itself that is it occurs even if a single message of this interface is processed

B or with regard to the volume of this interface that is many messages of this interface are processed at the same time

C or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed

while no other messages of this interface are processed and while no other interfaces are running Then

compare this value with B) the processing time for a typical amount of messages of this interface not simply

one message as before If the values of measurement A) and B) are similar repeat the procedure with C) a

typical volume of all interfaces that is during a representative timeframe on your productive PI system or

with the help of a tailored volume test

These three measurements ndash A) processing time of a single message B) processing time of a typical

amount of messages of a single interface and C) processing time of a typical amount of messages of all

interfaces ndash should enable you to distinguish between the three possible situations

o A specific interface has long processing steps that need to be identified and improved The tuning options are usually limited and a re-design of the interface might be required

o The mass processing of an interface leads to high processing times This situation typically calls for tuning measures

o The long processing time is a result of the overall load on the system This situation can be solved by tuning measures and by taking advantage of PI features for example to establish separation of interfaces If the bottleneck is hardware-related it could also require a re-sizing of the hardware that is used

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

11

Chapters 8 9 10 and 11 deal with more general reasons for bad performance such as heavy

tracinglogging error situations and general hardware problems They should be taken into account if the

reason for slow processing cannot be found easily or if situation C from above applies (long processing times

due to a high overall load)

Important

Chapter 9 provides the basic checks for the hardware of the PI server It should not only be used when

analyzing a hardware bottleneck due to a high overall load andor an insufficient sizing but should also be

used after every change made to the configuration of PI to ensure that the hardware is able to handle the

new situation This is important because a classical PI system uses three runtime engines (IS BPE AFW)

Tuning one engine for high throughput might have a direct impact on the others With every tuning action

applied you have to be aware of the consequences on the other runtimes or the available hardware

resources

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

12

3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server processing time in

the Business Process Engine (if ccBPM is used) as well as the processing time in the Adapter Framework (if

adapters except IDoc plain HTTP or ABAP proxies are used) In the subsequent chapters you will find a

detailed description of how to obtain these processing times

For reasons of completeness we will also have a look at the ABAP Proxy runtime and the available

performance evaluations

31 Integration Engine Processing Time

You can use an aggregated view of the performance (details about activation in Note 820622 - Standard jobs

for XI performance monitoring) data for the Integration Engine as shown below

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR In the browser window follow the link to

the Runtime Workbench In there click ldquoPerformance Monitoringrdquo Change the display to ldquoDetailed Data

Aggregatedrdquo select lsquoISrsquo as the Data Source (or lsquoPMIrsquo if you have configured the Process Monitoring

Infrastructurersquo) and ldquoXIIntegrationServerltyour_hostgtrdquo as the component Then choose an appropriate time

interval for example the last day You have to enter the details of the specific interface you want to monitor

here

The interesting value is the ldquoProcessing Time [s]rdquo in the fifth column It is the sum of the processing time of the single steps in the Integration Server For now you are not interested in the single steps that are listed on the right side since you are still trying to find out the part of the PI system where the most time is lost Chapter 44 will work with the individual steps

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

13

If you do not know which interface is affected yet you first have to get an overview Instead of navigating to

Detailed Data Aggregated in the Runtime Workbench choose Overview Aggregated Use the button Display

Options and check the options for sender component receiver component sender interface and receiver

interface Check the processing times as described above

32 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR In the browser window follow the link to

the Runtime Workbench In there click Message Monitoring choose Adapter Engine lthostgt from Database

from the drop down lists and press Display Enter your interface details and ldquoStartrdquo the selection Select one

message using the radio button and press Details

Calculate the difference between the start and end timestamp indicated on the screen

Do the above calculation for the outbound (AFW IS) as well as the inbound (IS AFW) messages

The audit log of successful message is no longer persisted in SAP NetWeaver PI 71 by default in order to minimize the load on the database The audit log has a high impact on the database load since it usually consists of many rows that are inserted individually in the database In newer releases therefore a cache was implemented that keeps the audit log information for a period of time only Thus no detailed information is available any longer from a historical point of view Instead you should use Wily Introscope for historical analyses of the performance of successful messages

The audit log can be persisted on the database to allow historical analysis of performance problems

on the Java stack To do so change the parameter ldquomessagingauditLogmemoryCacherdquo from

true to false for service ldquoXPI Service Messaging Systemrdquo using NetWeaver Administrator (NWA) as described in SAP Note 1314974 Note Only do this temporarily if you have identified the bottleneck on the AFW First try using Wily to determine the root cause of the problem

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

14

Due to above mentioned limitations Wily Introscope is the right tool for performance analysis of the Adapter Engine Wily persists the most important steps like queuing module processing and average response time in historical dashboards (aggregated data per intervals for all server nodes of a system or all systems with filters possible etc)

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

15

Especially for Java-Only runtime using Integrated Configuration Wily shows the complete processing time after the initial sender adapter steps (receiver determination mapping and time in the backend call) as can be seen in the Message Processing Time dashboard below

Using the ldquoShow Minimum and Maximumrdquo functionality deviations from the average processing time can be seen In the example below a message processing step takes up to 22 seconds The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below Java IDoc adapter is used to transfer the data to ERP)

In case of high processing times in one of your steps further analysis might be required using thread dumps or the Java Profiler as outlined in the appendix section A2 XPI inspector for troubleshooting and performance analysis

321 Adapter Engine Performance monitor in PI 731 and higher

Starting from PI 731 SP4 there is a new performance monitor available for the Adapter Engine More

information on the activation of the performance monitor can be found at Note 1636215 ndash Performance

Monitoring for the Adapter Engine Similar to the ABAP monitor in the RWB the data is displayed in an

aggregated format On the PI start page use the link ldquoConfiguration and Monitoring Homerdquo Go to ldquoAdapter

Enginerdquo and select ldquoPerformance Monitorrdquo The data is persisted on an hourly basis for the last 24 hours and

on a daily basis for the last 7 days The data is further grouped on interface level and per Java server node

With the information provided you can therefore see the minimum maximum and average response time of

an individual interface on a specific Java server node All individual steps of the message processing like

time spent in the Messaging System queues or Adapter modules are listed In the example below you can

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

16

see that most of the time (5 seconds) is spent in a module called SimpleWaitModule This would be the entry

point for the further analysis

Starting with 731 SP10 (740 SP05) this data can be exported to excel in two ways Overview or Detailed (details in SAP Note 1886761)

33 Processing Time in the Business Process Engine

Procedure

Log on to your integration server and call transaction SXMB_MONI Adjust your selection in a way that you

select one or more messages of the respective interface Once the messages are listed navigate to the

Outbound column (or Inbound if your integration process is the sender) and click on PE Alternatively you

can select the radio button for ldquoProcess Viewrdquo on the entranceselection screen of SXMB_MONI

To judge the performance of an integration process it is essential to understand the steps that are executed A long-running integration process in itself is not critical because it can wait for a trigger event or can include a synchronous call to a backend Therefore it is important to understand the steps that are included before analyzing an individual workflow log For example the screenshot below shows a long waiting time after the mapping This is perhaps due to an integrated wait step

Calculate the time difference between the first and the last entry Note that this time for the workflow does not include the duration of RFC calls for example To see this processing time navigate to the ldquoList with Technical Detailsrdquo (second button from the left on the screenshot below or shift + F9) Repeat this step for several messages to get an overview about the most time consuming steps

Please Note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues Therefore it is essential to monitor a queue backlog in ccBPM queues as discussed in section ldquoQueuing and ccBPM (or increasing Parallelism for ccBPM Processes)rdquo

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

17

Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM

processes running on your PI system In the initial screen you have to choose ldquo(Sub-) Workflow and the time

range you would like to look at

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

18

The results here allow you to compare the average performance of a process It shows you the processing

time based on barriers The 50 barrier means that 50 of the messages were processed faster than the

value specified Furthermore you also see a comparison of the processing time to eg the day before By

adjusting the selection criteria of the transaction you can therefore get a good overview about the normal

processing time of the Integration process and can judge if there is a general performance problem or just a

temporary one

The number of process instances per integration process can be easily checked via transaction

SWI2_FREQ The data shown allows you to check the volume of the Integration Processes and allow you to

judge if your performance problem is caused by an increase of volume which could cause a higher latency in

the ccBPM queues

New in PI 73 and higher

Starting from PI 73 a new monitoring for ccBPM processes is available This monitor can be started from

transaction SXMB_MONI_BPE Integration Process Monitoring (also available in ldquoConfiguration and

Monitoring Homerdquo on PI start page) This is new browser based view that allows a simplified and aggregated

view on the PI Integration Processes

On the initial screen you get an overview about all the Integration Processes executed in the selected time

interval Therefore you can immediately see the volume of each Integration Process

From there you can navigate to the Integration process facing the performance issues and look at the

individual process instances and the start and end time Furthermore there is a direct entry point to see the

PI messages that are assigned to this process

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

19

Choosing one special process instance ID you will see the time spend in each individual processing step

within the process In the example below you can see that most of the time is spend in the Wait Step

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

20

4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine there

are several transactions that can help you to analyze the reason for this To understand why this selection of

transactions helps to analyze the problem it is important to know that the processing within the Integration

Engine is done within the pipeline The central pipeline of the Integration Server executes the following main

steps

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

PLSRV_XML_VALIDATION_RS_INBXML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT Validation Outbound Channel Response

The last three steps highlighted in bold are available in case of synchronous messages only and reflect the

time spent in the mapping of the synchronous response message

The steps indicated in red are new in SAP NetWeaver PI 71 and higher and can be used to validate the

payload of a PI message against an XML schema These steps are optional and can be executed at different

times in the PI pipeline processing for example before and after a mapping (as shown above)

With PI 731 an additional option of using an external Virus Scan during PI message processing can be

activated as described in the Online Help The Virus Scan can be configured on multiple steps in the pipeline

ndash eg after the mapping (Virus Scan Outbound Channel Request) when receiving a synchronous response

(Virus Scan Inbound Channel Response) and after the response mapping (Virus Scan Outbound Channel

Response)

It is important to know is that these pipeline steps are executed based on qRFC Logical Unit of Work (LUW)

by using dialog work processes

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

21

With 711 and higher versions you can activate the Lean Message Search (LMS) to be able to search for

payload content This is described in more detail in chapter Long Processing Times for

ldquoLMS_EXTRACTIONrdquo

41 Work Process Overview (SM50SM66)

The work process overview is the central transaction to get an overview of the current processes running in

the Integration Engine For message processing PI is only using Dialog work processes (DIA WP) Therefore

it is essential to ensure that enough DIA WPs available

The process overview is only a snapshot analysis You therefore have to refresh the data multiple times to

get a feeling for the dynamics of the processes The most important questions to be asked are as follows

How many work processes are being used on average Use the CPU Time (clock symbol) to check that not all DIA WPs are used In case all DIA WPs have a high CPU time this indicates a WP bottleneck

Which users are using these processes It depends on the usage of your PI system All asynchronous messages are processed by the QIN scheduler who will start the message processing in DIA WPs The user that is shown in SM66 will be the one that triggered the QIN scheduler This can eg be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (per default PIAFUSER)

Activities related to the Business Process Engine (ccBPM) will be indicated by user WF-BATCH If you see high WF-BATCH activity take a look at chapter 51

Are there any long running processes and if so what are these processes doing with regard to the reports used and the tables accessed

o As a rule of thumb for the initial configuration the number of DIA work processes should be around six times the value of CPU in your PI system (rdispwp_no_dia = 6 to 8 times CPUs cores based on SAP Note 1375656 - SAP NetWeaver PI System Parameters)

If you would like to get an overview for an extended period of time without actually refreshing the transaction

at regular intervals use report SDFMON and schedule the Daily Monitoring It allows you to collect metrics

such as CPU usage amount of free Dialog work processes and most importantly a snapshot of the work

process activity as provided in transactions SM50 and SM66 The frequency for the data collection can be as

low as 10 seconds and the data can be viewed after the configurable timeframe of the analysis Note

SDFMON is only intended for analysis of performance issues and should only be scheduled for a limited

period of time

If you have Solution Manager Diagnostics installed and all the agents connected to PI you can also monitor

the ABAP resource usage using Wily Introscope As you can see in the dashboard below you can use it to

monitor the number of free work processes (prerequisite is that the SMD agent is running on the PI system)

The major advantage of Wily Introscope is that this information is also available from the past and allows

analysis after the problem has occurred in the system

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

22

42 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group not all dialog work processes can be used for

qRFC processing in the Integration Server As stated earlier all (ABAP based) asynchronous messages are

processed using qRFC Therefore tuning the RFC layer is one of the most important tasks to achieve good

PI performance

In case you have a high volume of (usually runtime critical) synchronous scenarios you have to ensure that

enough DIA WPs are kept free by the asynchronous interfaces running at the same time Since this is a very

difficult tuning exercise we usually recommend implementing runtime critical synchronous interfaces using

Java only configuration (ICO) whenever possible If this is not possible you have to ensure via SARFC tuning

that enough work processes are kept free to process the synchronous messages

The current resource situation in the system can be monitored using transaction SARFC

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group

assigned in transaction SMQR

Call transaction SARFC and refresh several times since the values are only snapshot values

Is the value for ldquoMax Resourcesrdquo close to your overall number of dialog work processes Max Resources is a fixed value and describes how many DIA work processes may be used for RFC communication If the value is too low your RFC parameters need tuning to provide the Integration Server with enough resource for the pipeline processing The RFC parameters can be displayed by double-clicking on ldquoServer Namerdquo

o A good starting point is to set the values as follows

Max no of logons = 90

Max disp of own logons = 90

Max no of WPs used = 90

Max wait time = 5

Min no of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 ndash SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces) To avoid any bottleneck and blocking situation free DIA WPs should always be available The value should be adjusted based on the overall number of DIA WPs available at the system and the above requirements

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

23

Note You have to set the parameters using the SAP instance profile Otherwise the changes are lost

after the server is restarted

The field ldquoResourcesrdquo shows the number of DIA WPs available for RFC processing Is this value reasonably high for example above zero all the time If the ldquoResourcesrdquo value is zero then the Integration Server cannot process XML messages because no dialog work process is free to execute the necessary qRFC There are several conclusions that can be drawn from this observation as follows

1) Depending on the CPU usage (see Chapter ldquoMonitoring CPU Capacityrdquo) it might be necessary at this time to increase the number of dialog work processes in your system If the CPU usage however is already very high increasing the number of DIA will not solve the bottleneck

2) Depending on the number of concurrent PI queues (see Chapter qRFC Queues (SMQ2)) it might be necessary to decrease the degree of parallelism in your Integration Server because the resources are obviously not sufficient to handle the load The next section describes how to tune PI inbound and outbound queues

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution

Manager Diagnostic and Wily Introscope (as can be seen in the screenshot below) This allows easy

monitoring of the RFC resources across all available PI application servers

43 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning Therefore it

is essential to understand the PI queuing concept

PI generally uses two types of queues for message processing ndash PI inbound and PI outbound queues Both

types are technical qRFC inbound queues and can therefore be monitored using SMQ2 PI inbound queues

are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default The

PI outbound queues are named XBTO (EO) and XBQO (EOIO) The queue suffix (in red XBTO0___0004)

specifies the receiver business system This way PI is using dedicated outbound queues for each receiver

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

24

system All interfaces belonging to the same receiver business system are contained in the same outbound

queue Furthermore there are dedicated queues for prioritization of separation of large messages To get an

overview about the available queues use SXMB_ADM Manage Queues

PI inbound and outbound queues execute different pipeline steps

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

Before the steps above are executed the messages are placed in a qRFC queue (SMQ2) The messages

will wait till they are first in queue and till a free DIA work process is assigned to a queue The wait time in the

queue are recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and

DB_SPLITTER_QUEUING (PI outbound queue) Most often you will see the long processing times there as

discussed in chapter Long Processing Times for ldquoDB_ENTRY_QUEUEINGrdquo and Long Processing Times for

ldquoDB_SPLITTER_QUEUEINGrdquo

Looking at the steps above it is clear that tuning the queues will have a direct impact on the connected PI

components and also backend systems For example by increasing the number of parallel outbound queues

more mappings will be executed in parallel which will in turn put a greater load on the Java stack or more

messages will be forwarded in parallel to the backend system in case of a Proxy call Thus when tuning PI

queues you must always keep the implications for the connected systems in mind

The tuning of PI ABAP queue parameters is done in transaction SXMB_ADM Integration Engine

Configuration and by selecting the category TUNING

For productive usage we always recommend to use inbound and outbound queues (parameter

EO_INBOUND_TO_OUTBOUND=1) Otherwise only inbound queues will be used which are shared across

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

25

all interfaces Hence a problem with one single backend system will affect all interfaces running on the

system

The principle of less is sometimes more also applies for tuning the number of parallel PI queues Increasing

the number of parallel queues will result in more parallel queues with fewer entries per queue In theory this

should result in a lower latency if enough DIA WPs are available

But practically this is not true for high volume systems The main reason for this is the overhead involved in

the reloading of queues (see details below) Furthermore important tuning measures like PI packaging (see

Chapter ldquoPI Message Packagingrdquo) aim to increase the throughput based on a higher number of messages in

the queue Thus from a throughput perspective it is definitely advisable to configure fewer queues

If you have very high runtime requirements you should prioritize these interfaces and assign a different

parallelism for high priority queues only This can be done using parameters

EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO as described in

section ldquoMessage Prioritization on the ABAP Stackrdquo

To tune the parallelism of inbound and outbound queues the relevant parameters are

EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used

Below you can see a screenshot of SMQ2 You can see the PI inbound queues and outbound queues Also

ccBPM queues (XBQO$PE) are displayed and will be discussed in section 5

Procedure

Log on to your Integration Server call transaction SMQ2 and execute If you are running ABAP proxies on a

separate client on the same system enter lsquorsquo for the client Transaction SMQ2 provides snapshots only and

must therefore be refreshed several times to get viable information

o The first value to check is the number of queues concurrently active over a period of time Since each queue needs a dialog work process to be worked on the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see Chapter qRFC Resources (SARFC)) The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system see transaction ST06)

In case many of the queues are EOIO queues (eg because the serialization is done on the material number) try to reduce the queues by following Chapter EOIO tuning

o The second value of importance is the number of entries per queue Hit refresh a few times to check if the numbers increasedecreaseremain the same An increase of entries for all queues or a specific

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

26

queue points to a bottleneck or a general problem on the system The conclusion that can be drawn from this is not simple Possible reasons have been found to include

1) A slow step in a specific interface

A long processing time for a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time of each step as shown in "Analyzing the runtime of PI pipeline steps".

2) Backlog in Queues

Check if inbound or outbound queues face a backlog. A backlog in a queue is generally not critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface has a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a sub-parameter for the receiver of this interface to increase its parallelism, as sketched below.
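A hedged example of such an interface-specific setting (RECEIVER_BS is a hypothetical receiver business system; the value is illustrative) could be maintained in SXMB_ADM as follows:

   Category   Parameter               Subparameter   Current Value
   TUNING     EO_OUTBOUND_PARALLEL    RECEIVER_BS    20

With this entry, messages for the receiver RECEIVER_BS would be distributed across 20 outbound queues, while all other receivers keep the globally configured value.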

In general, a backlog can also be caused by a long processing time in one of the pipeline steps as discussed in 1), or by a performance problem with one of the connected components such as the PI Java stack or the connected backend systems. Due to the steps performed in the outbound queues (mapping and call adapter), a backlog is more likely to be seen in the outbound queues. Inbound queue processing is only expensive if content-based routing or an extended receiver determination is used. To understand which step takes long, once more follow chapter "Analyzing the runtime of PI pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing, while the number of inbound queues does not increase at the same time. Wily also offers dashboards for the number of entries by queue name and for the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is normal to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm this by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3 as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If you are using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. Per default a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0 as described in SAP Note 1058915. In addition, during the processing of an individual message the runtime repeatedly calls an enqueue, which leads to a deterioration of throughput. This can be avoided by setting the parameter CACHE_ENQUEUE to 0 as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound Queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur because one individual step takes long (as discussed in point 1) or because of a problem in the infrastructure (e.g. memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from scheduling and it therefore remains in READY. To solve such cases, the root cause for the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. The queue then stays in status READY even though enough work processes are available. This "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured for this queue. Note 1500048 - queues stay in ready (schedule_monitor) and Note 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps), or increase the schedule_monitor threshold in exceptional cases only (if, for example, the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) pushbutton once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY status due to problems during the processing of an individual message. This can be prevented by following the description in Chapter Prevent blocking of EO queues.


44 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB, as described below. Advanced users may use the Performance Header of the SOAP message via transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec)(microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them (a small parsing example is shown below).
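As a small illustration (not an SAP-delivered utility; the class name and output format are invented for this example), the following Java sketch parses such a performance-header timestamp and converts it from UTC to the local system time zone:

import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class PerfHeaderTimestamp {
    public static void main(String[] args) {
        // Timestamp taken from the example above: yyyyMMddHHmmss plus a fractional-second part
        String raw = "20110409092656165";
        LocalDateTime utc = LocalDateTime.parse(raw.substring(0, 14),
                DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        String fraction = raw.substring(14); // "165", the sub-second digits
        // Performance header values are stored in UTC, so convert to the system time zone
        System.out.println("UTC:   " + utc + "." + fraction);
        System.out.println("Local: " + utc.atZone(ZoneId.of("UTC"))
                .withZoneSameInstant(ZoneId.systemDefault()));
    }
}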

In case PI message packaging is configured, the performance header always reflects the processing time per package. Hence a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds on average. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench and click Performance Monitoring. Change the display to "Detailed Data Aggregated" and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps: does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times of the single steps for different measurements, as outlined in Chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you decide whether the problem is a general design problem (a single message has a long processing step) or related to the message volume (the step only shows large values for a high number of messages).

Each step has different follow-up actions, which are described next.

441 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the receiver and interface determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations in which the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated at runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields (an example is shown below). The processing time of such a request directly correlates with the number and complexity of the conditions defined.

No system-side tuning options exist for CBR. The performance of this step can only be improved by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.
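For illustration only, a hypothetical routing condition on a purchase order payload could combine two XPath expressions with a logical operator (the element names and values are invented):

   /ns0:PurchaseOrder/Header/Country = DE  AND  /ns0:PurchaseOrder/Header/TotalAmount > 10000

Every such condition has to be evaluated against the message payload at runtime, so each additional rule adds to the duration of the receiver determination step.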

o Mapping to determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.

442 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings executed sequentially. In such a case the analysis is more difficult, because it is not clear which mapping in the sequence takes a long time.

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing the sender/receiver party, service, interface, and namespace as well as the source message payload, it is possible to check the target message (after mapping execution) and a detailed trace output (similar to the contents of the message trace in SXMB_MONI with TRACE_LEVEL = 3). The transaction can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible to test, trace, and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with values reported several days earlier to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program belonging to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check whether only one interface is affected or whether you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps required around 500 seconds for processing. Comparing the data during the incident with the data from the day before allows you to judge whether this might be a problem of the underlying J2EE engine as described in section J2EE Engine Bottleneck.

If only one mapping faces performance problems, just one line would stick out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times with a different time period and verify whether it is only a "temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem, but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify whether the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application you then have to check whether the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace as explained in the appendix section Wily Transaction Trace.

If not one but several interfaces are affected, a potential system bottleneck exists; this is described in the following.

o Not enough resources (registered server programs) available. This can be the case if too many mapping requests are issued at the same time, or if one J2EE server node is down and has not registered any server programs.

To check whether there were too many mapping requests for the available registered server programs, compare the number of concurrently active outbound queues with the number of registered server programs. The number of concurrently active outbound queues can be monitored with transaction SMQ2 by counting the queues with the names XBTO and XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition, you have to take into account mapping steps that are executed for synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW -> Goto -> Logged On Clients, filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of concurrently active queues. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).

To check whether the J2EE Engine is not available or the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW -> Goto -> Trace -> Gateway -> Display File) and the RFC developer traces (dev_rfc files in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests; each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

The other option is to reduce the number of concurrently active outbound queues. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues just for this interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

443 Long Processing Times for "PLSRV_CALL_ADAPTER"

The call adapter step is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so that the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so that network time can have an influence here.

Looking at the processing time, we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency: For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). Network latency has to be checked in case of a long call adapter step. If HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side: Enough resources must be available at the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages thus includes the transfer of the request message, the calculation of the corresponding response message at the receiver side, and the transfer back to PI. Therefore, for synchronous messages, the processing time of a request at the receiving target system must always be analyzed to find the most costly processing steps.


444 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors the time also includes the wait time until the restart of the LUW in the queue. The inbound queues (XBTI, XBT1, XBT9, XBTL for EO messages and XBQI/XB2I, XBQ1/XB21, XBQ9/XB29 for EOIO messages) process the pipeline steps for the receiver determination, the interface determination, and the message split (and optionally XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that simply increasing the parameter EO_INBOUND_PARALLEL might not always be the solution, as enough work processes also have to be available to process the queues in parallel. The number of work processes is in turn restricted by the number of CPUs available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on business system level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check whether that is still the case. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue; it might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended receiver determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 73 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 445 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND determines whether only inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM -> Integration Engine Configuration -> Specific Configuration. The default value is '1', meaning inbound and outbound queues are used (the recommended behavior).

445 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors the time also includes the wait time until the restart of the LUW in the queue.

The outbound queues (XBTO, XBTA, XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A, XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, the outbound binding, and the call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2), or to those described in the section above for the PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes is in turn limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism; in the case of an ABAP proxy system, a high value for DB_SPLITTER_QUEUING might therefore not indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue; it might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this can mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this is ignored here), the third message about 2 seconds, and so on, so that the 100th message has to wait roughly 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUING grows to about 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUING value over time. If you experience this situation, proceed with Chapter 442.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 73 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a specific number of messages per time unit.

In section Adapter Parallelism, the default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). For such adapters it does not make sense to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS - the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources available on PI are aligned with those on the sending/receiving ABAP proxy system.

446 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime Enhancement of performance measurement, an additional entry for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis:

When using trace level 2, additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES: This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES: This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 730).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept to a minimum, and very deep and complex XPath expressions should be avoided.

If you want to minimize the impact of LMS on the PI message processing time, you can set the extraction method to use an external job. In that case the messages are only indexed after processing, and LMS therefore has no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of the job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for the person responsible for monitoring, the messages should be indexed using an external job.

447 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. Per default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps takes long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

45 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 70 SP13 and is activated per default in PI 71 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 56), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping or routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of the messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime of individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very strict runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used solely in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system through PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package is disassembled and all messages are processed as single messages. Of course, in a case where many errors occur (for example due to the interface design), this reduces the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the number of HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: maximum number of messages in a package (default 100)

2) Maximum package size: sum of the sizes of all messages in kilobytes (default 1 MB)

3) Delay time: time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could, for example, define a specific packaging configuration for interfaces with very small messages to allow up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages, as sketched below.
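A hedged example of such a configuration (the values are purely illustrative and would be maintained with SXMS_BCONF / SXMS_BCM) could look as follows for an interface with many very small messages and no strict latency requirements:

   Message count:         1000 messages per package
   Maximum package size:  1000 kB
   Delay time:            10 seconds

With these values a queue waits up to 10 seconds for further messages before a package containing fewer than 1000 messages is processed.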

See SAP Note 1037176 - XI Runtime Message Packaging for more details about the necessary prerequisites and the configuration of message packaging. More information is also available at http://help.sap.com -> SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 71 -> SAP NetWeaver Process Integration 71 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Message Packaging.

46 Prevent blocking of EO queues

In general, messages for interfaces using the quality of service Exactly Once (EO) are independent of each other. In case of an error in one message, there is no business reason to stop the processing of other messages. But exactly this happens when an EO queue goes into error due to a failure in the processing of a single message. The queue is then automatically retried in configurable intervals. This retry causes a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime Dump stops XI queue. After applying these Notes, EO queues no longer go into SYSFAIL status. These Notes are recommended for all PI systems, and the behavior is activated per default on PI 73 systems. By specifying a receiver ID as a sub-parameter, this behavior can be configured for individual interfaces only.

In PI 73 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend) the queue then goes into SYSFAIL, so that not too many messages go into error. This also reduces the number of alerts for Message-Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted. A configuration sketch is shown below.
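A minimal sketch of the corresponding settings in SXMB_ADM (the threshold of 50 is a hypothetical value) could look as follows:

   Parameter            Value   Effect
   EO_RETRY_AUTOMATIC   0       no automatic retry of failed EO queues; messages are restarted by the job RSXMB_RESTART_MESSAGES
   EO_RETRY_AUT_COUNT   50      the queue is stopped (SYSFAIL) only after 50 messages have gone into error (PI 73 and higher)

A receiver ID can be specified as a sub-parameter to restrict this behavior to individual interfaces.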

47 Avoid uneven backlogs with queue balancing

From PI 73 on, a new mechanism is available to address the distribution of messages across queues. Per default, in all PI versions, messages are assigned to the different queues randomly. In case of different runtimes of LUWs, caused by e.g. different message sizes or different mapping runtimes, this can lead to an uneven distribution of messages across the queues and can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. It thereby tries to achieve an equal balancing during inbound processing: a queue that has a higher backlog will get fewer new messages assigned, while queues with fewer entries will get more messages assigned. This is therefore different from the old BALANCING parameter (of category TUNING), which was used to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), the messages are distributed randomly across the available queues. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies a relative fill level of the queues in percent. Relative here means in relation to the most-filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements for an even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution is checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue is then written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time there are three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is. The selection logic of the example is also sketched in the short code example below.
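The selection logic of this example can be sketched in a few lines of purely illustrative Java code (this is not PI coding; the queue names and figures are the ones used above):

import java.util.LinkedHashMap;
import java.util.Map;

public class QueueBalancingExample {
    public static void main(String[] args) {
        // Current number of entries per outbound queue, taken from the example above
        Map<String, Integer> entries = new LinkedHashMap<>();
        entries.put("XBTO__A", 500);
        entries.put("XBTO__B", 150);
        entries.put("XBTO__C", 50);

        int thresholdPercent = 20; // corresponds to EO_QUEUE_BALANCING_SELECT
        int max = entries.values().stream().max(Integer::compare).orElse(1);

        // A queue is preferred for new messages if its fill level relative to the
        // most-filled queue is below the configured threshold
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            int relativeFill = 100 * e.getValue() / max;
            System.out.println(e.getKey() + ": " + relativeFill + "% -> "
                    + (relativeFill < thresholdPercent ? "preferred" : "not preferred"));
        }
    }
}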


48 Reduce the number of parallel EOIO queues

As discussed in Chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have a lot of queues containing only one message, since this causes a high overhead on the QIN scheduler due to the very frequent reloading of queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason is, for example, that the serialization has to be done on document number to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and causes significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 71, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM -> Configure Filter for Queue Prioritization -> Number of EOIO queues.

During runtime a new set of queues with the name prefix XB2 is used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages use the same EOIO queue; therefore PI message packaging works better and the reloading of the queues by the QIN scheduler shows much better performance.

In case of errors, the messages are removed from the XB2 queues and moved to the standard XBQ queues. All other messages for the same serialization context are moved to the XBQ queue directly to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues are not blocked and messages for other serialization contexts are not delayed.


49 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volume, so tuning it is essential.

491 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historical backlogs on the tRFC layer.

In order to control the resources used when sending the IDocs from the sender system to PI or from PI to the receiver backend, you can also consider registering the RFC destinations in transaction SMQS in PI, as known from standard ALE tuning. By changing the Max Conn or the Max Runtime values in SMQS you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This mitigates the risk of a system overload on the sender and receiver side.


492 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had the option of packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring message packaging, since it helps transferring data for the IDoc adapter as well as for the ABAP proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in the report RSEOUT00, multiple IDocs are bundled into one tRFC call. On the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior changed with SAP NetWeaver PI 711 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the SDN blog "IDoc Package Processing Using Sender IDoc Adapter in PI EhP1". A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

493 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process rolls out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, like high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system, and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible based on the available resources on the receiver system, without the requirement to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


410 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queues by default, which means that critical and less-critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 71 -> SAP NetWeaver Process Integration 71 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Prioritized Message Processing.

To further balance the available resources between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if Chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. This type of runtime engine is very susceptible to an inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than one might expect, and more CPU time is needed for the additional steps.

51 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check whether enough resources are available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running.

o Look for the user WF-BATCH and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other; rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable maximum number of connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

52 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 73 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 73, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes.

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate, for example, with a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and can be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well; check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 51). Alternatively, the underlying database might slow down the integration processes; use chapter 54 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 442, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


53 Advanced Analysis Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log in to your Integration Server and call transaction ST03N Switch to ldquoExpert Moderdquo and choose the

appropriate timeframe (this could be the ldquoLast Minutersquos Loadrdquo from the Detailed Analysis menu or a specific

day or week from the Workload menu) In the Analysis View navigate to User and Settlement Statistics and

then to User Profile The user you are interested in is the WF-BATCH user who does all ccBPM-related work

Compare the total CPU Time (in seconds) to the time frame (in seconds) of the analysis In the above example an analysis time frame of 1 hour has been chosen which corresponds to 3600 seconds To be exact these 3600 seconds have to be multiplied by the number of available CPUs Out of these 3600CPU seconds the WF-BATCH user used up 36 seconds

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server By comparison with the other users (as well as the CPU utilization from transaction ST06) you can determine if the Integration Server is able to handle this load with the available number of CPUs or not This does not help you to optimize the load but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see Chapter 51)

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries for the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PE_WS per workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configure transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify if the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the largest amount of processing time in many scenarios within BPE. Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests that have been performed have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System that is responsible for the persistence and queuing of messages. Within the flow of a message, it is placed in between the sender adapter (for example, a file adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example, a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example, JMS_http://sap.com/xi/XI/SystemReceive for the asynchronous receive queue of the JMS adapter).

o A configurable number of consumer threads are assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly First-In-First-Out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But, as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example, File) and leaves the PI system using a J2EE Engine adapter (for example, JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is no longer persisted in the database by default for successful messages, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (based on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent. With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem/bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not solve a performance problem/bottleneck. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node that will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the adapter framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side, these adapters use an Adapter Framework Scheduler, which assigns a server node that does the polling at the specified interval. Thus, only one server node in the J2EE cluster does the polling and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details will be presented in section Adapter Framework Scheduler. For example, since the channels would be doing the same SELECT statement on the database or picking up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side, the adapters work sequentially on each server node by default. For example, only one UPDATE statement can be executed for JDBC for each communication channel (independent of the number of consumer threads configured in the Messaging System), and all the other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter uses a push mechanism on the PI sender side by default. This means the data is pushed by the sending MQ provider. By default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so that the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI communication channel and PI will be the initiator of the communication. Here, too, the Adapter Framework Scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore, scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts during message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore, no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitations in the number of requests it can execute in parallel. The limiting factor here is the FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the messaging system and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated from the application thread pool directly and are therefore not available for any other tasks; the number of initial threads should thus be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

This should also be done carefully, since these threads will be taken from the J2EE application thread pool. A very high value there can therefore cause a bottleneck on the J2EE engine and lead to major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In Manual Mode, the adapter works sequentially per server node. For channels in "Default Mode", it depends on the configuration of the inbound Resource Adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA, you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence, this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently, the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into the queue of the Messaging System. If the message is synchronous, then it is the Call queue; if the message is asynchronous, then it is the Send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work. This is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log will be the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the available FCA threads for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they are all using the same URL http://<host>:<port>/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load for this interface, the other SOAP sender interfaces would not be affected by a shortage of FCA threads.


To use this new feature, you can specify on the sender system the following two entry points:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around when compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time either the Receive or the Request queue: the Receive queue for asynchronous and the Request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc adapter is summarized in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Similar to all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender, you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP are processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter, the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments for each IDoc would consume roughly 500 MB during processing. In such cases it is important to, for example, lower the package size and/or use large message queues, as outlined in chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important:

In the ABAP stack, packaging is active by default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the DIA work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO, it is important to evaluate packaging to avoid any negative impact on the receiving ECC due to missing packaging.

The packaging on Java generally works differently than on ABAP. While in ABAP the aggregation is done at the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore, packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or data size is reached. After this, the message is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done for each server node individually. If you have a high number of server nodes, packaging works less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message status stays in status "Delivering" throughout all the steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message was waiting almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in Note 1913972:

o messaging.system.msgcollector.enabled: Enable or disable packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk - default 60 seconds

o messaging.system.msgcollector.maxMemTotal: Maximum memory which can be used by the message collector

o messaging.system.msgcollector.poolSize: Number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, or the configuration of the IDoc posting on the ERP receiver side (described in chapter IDoc posting on receiver side). Therefore, tuning of these threads might be very important. These threads can unfortunately not be monitored yet with any PI tool or Wily


Introscope. To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While in the SOAP adapter with XI 3.0 protocol you can define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Based on this, and on the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler, you have to set the parameter scheduler.relocMode of the Adapter Framework


service property to a negative value. For instance, by setting the value to -15, after every 15th polling interval a rebalancing of the channel within the J2EE cluster might happen. This can allow for better balancing of the incoming load across the available server nodes if the files are coming in at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.

In case many files are put into the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence, a proper load balancing cannot be achieved by the AFW scheduler. In such a case, the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31, you can monitor the Adapter Framework Scheduler in pimon → Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node the communication channel is polling (Status="Active"). You can also see the time the channel last polled and will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency with which a channel can potentially move to another server node, by tuning the parameter relocMode as outlined above.

Prior to PI 7.31, you have to use the following URL to monitor the Adapter Framework Scheduler: http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs only on one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in the brackets [60000] determines the polling interval in ms - in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": Not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine the time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System can usually be closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the messaging system and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the messaging system is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the messaging system by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture, you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows easy graphical analysis of the queues and thread usage of the messaging system across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard showing all the backlogs in the queues. In the screenshot below, a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it only forwards messages to the adapter queue if free adapter-specific consumer threads are available. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size, you can directly jump to a more detailed view where you can see that the file adapter was causing the backlog.


To see the consumer thread usage, you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were occupied during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the messaging system does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the messaging system have to be configured in the NWA using the service "XPI Service: AF Core" and the property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details, see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the parameter above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and the property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging / Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability, as well as the memory of the J2EE Engine, after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the messaging system is highly critical for synchronous scenarios due to timeouts that might occur. Therefore, the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System in between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck for consumer threads on the sender queues in the Messaging System.

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior, as shown above.

Asynchronous messages only use one thread for processing in the messaging system. Multiple threads are used for synchronous messages: the adapter thread puts the message into the messaging system queue and waits until the messaging system delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the Call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the Send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System in between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time that the message waited in the messaging system for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System now makes it possible for you to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the messaging system (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5.

Based on this approach, you can ensure that more resources can be used for high-priority interfaces. The screenshot below shows the UI for message prioritization available in pimon → Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help; navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one communication channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or remote systems with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Based on this, a parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in NWA in service "XPI Service: Messaging System". With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads for each server node) and increase the overall number of threads on the Receive queue for the adapter in question (for example JDBC) to 20. With this configuration, it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information, see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
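To make this combination concrete, a purely illustrative configuration (values are examples, not recommendations; property names and notation as in the surrounding text) would be: messaging.system.queueParallelism.maxReceivers = 5 in service "XPI Service: Messaging System", together with Recv.maxConsumers = 20 for the JDBC queue in the messaging.connectionDefinition property described in chapter 6.2. With these values, a single JDBC receiver interface can occupy at most 5 of the 20 JDBC consumer threads per server node, leaving capacity for three further JDBC interfaces to run in parallel.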

In the screenshot below you see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though there are more free SOAP threads available, they are not consumed. Hence, the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue where message backlogs occur. As mentioned earlier, a backlog should usually occur in the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers, you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog, there will always be free consumer threads and therefore the dispatcher queue can dispatch the message to the adapter-specific queue. Hence, the backlog will appear in the adapter-specific queue, so that message prioritization no longer works properly.


By default, the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and therefore additional restrictions in the number of available threads can be very critical. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. In case you have many high-volume synchronous scenarios with different priorities that run in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv, ICoAll, as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the specification of the maximum parallelization at a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value will be considered. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that, for example, it could be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA → SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on interface pattern being used

The interface pattern configured can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and therefore this overhead is avoided. For interfaces with good data quality and high message throughput, this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in Module Processing, then you must analyze the runtime of the modules used in the communication channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes - for example, to transform the EDI message before sending it to the partner, or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is also no standard way to approach performance problems.

In the audit log shown below, you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule; the next module, CallSAPAdapter, is a standard module that inserts the data into the messaging system queues.

In the audit log you will get a first impression of the duration of the module. In the example above, you can see that the SimpleWaitModule requires 4 seconds of processing time.
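If a custom module is suspected, it helps when the module itself reports how long its work took, so that its share of the channel runtime becomes visible without comparing audit log timestamps manually. The following is only a minimal sketch of such a self-timing module, not an example taken from this guide; it assumes the standard PI adapter module API (interface com.sap.aii.af.lib.mp.module.Module) and uses plain java.util.logging for the output.

// Minimal sketch of a self-timing adapter module.
// Assumption: the standard PI module API (com.sap.aii.af.lib.mp.module.*) is available;
// real modules are deployed as stateless session beans, and error handling is reduced here.
import java.util.logging.Logger;

import com.sap.aii.af.lib.mp.module.Module;
import com.sap.aii.af.lib.mp.module.ModuleContext;
import com.sap.aii.af.lib.mp.module.ModuleData;
import com.sap.aii.af.lib.mp.module.ModuleException;

public class TimedModuleBean implements Module {

    private static final Logger LOG = Logger.getLogger(TimedModuleBean.class.getName());

    public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData)
            throws ModuleException {
        long start = System.currentTimeMillis();
        try {
            // ... the actual module logic (transformation, look-up, header changes) goes here ...
            return inputModuleData;
        } finally {
            // Report the runtime of this module; writing the same information to the PI audit log
            // (via the messaging audit API) would make it visible directly in message monitoring.
            LOG.info("TimedModuleBean finished after " + (System.currentTimeMillis() - start) + " ms");
        }
    }
}

Measuring in a finally block ensures the duration is reported even if the module logic throws an exception, so slow and failing calls both become visible.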

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is very easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case, the module prints out information in addition to the audit log to detect such steps. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.

6.4 Java-only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps executed so far in the ABAP pipeline (Receiver Determination, Interface Determination, and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java-only scenarios

The major advantage of AAE processing is the reduced overhead due to fewer context switches between ABAP and Java. Thus, the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. Therefore, the best tuning option is to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP-internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, or slow (receiver) applications reduce the throughput and, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is observed for small payloads (comparison 10k, 50k, 500k) and asynchronous messages.

6.4.2 Message Flow of Java-only scenarios

All the Java-based tuning options mentioned in chapters Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the JMS send queue of the messaging system

4) Message is taken by JMS send consumer thread

a) No message split used

In this case, the JMS consumer thread will do all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and will then also transfer the message to the backend system (remote database in our example). Thus, all the steps are executed by one thread only.

b) Message split used (1:n message relation)

In this case there is a context switch. The thread taking the message out of the queue will process the message up to the message split step. It will create the new messages and put them into the Send queue again, from where they will be taken by different threads (which will then map the child messages and finally send them to the receiving system).

As we can see in this example, for an Integrated Configuration only one thread executes all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, then you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario. In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java-only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since Java-only interfaces only use Send queues, the restriction of consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by Single Slow/Hanging Receiver Interface is no solution.

Because of this, an additional property messaging.system.queueParallelism.queueTypes of service SAP XI AF MESSAGING was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization can usually be highly critical for synchronous interfaces. Therefore, we generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv, IcoAsync only.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you will be able to do the configuration at interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System, we generally distinguish between staging (versioning) and logging. An overview is given below.

For details about the configuration, please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and can cause a decrease in performance.

The persistence steps can be seen directly in the audit log of a message.

The Message Monitor also shows the persisted versions.


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. Therefore, you have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past, we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs, this can cause a delay in the overall message processing of the interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates, as described in Notes 1756963 and 1757922. This has also been included in the PI initial setup wizard in order to execute this task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP loadbalancing to initial setup).


For the example given above, we could see a much better load balancing after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course, the CPU is also a limiting factor, but this will be discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore, the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbosegc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory. Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern: the memory usage increases over time but goes back down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.
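The exact format of the verbose GC output depends on the JVM version, but a simple scan of std_server<n>.out already reveals long pauses and engine restarts. The following Java sketch is only an illustration under the assumption that GC lines contain a pause time in the form "0.1234567 secs" and that a node (re)start is marked by the string "is starting" (see the Procedure below); adjust the patterns to the actual output of your SAP JVM.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: flag long GC pauses and server node restarts in std_server<n>.out.
// The assumed line formats are placeholders; verify them against your log file.
public class GcLogScan {
    public static void main(String[] args) throws IOException {
        Path log = Paths.get(args.length > 0 ? args[0] : "std_server0.out");
        Pattern pause = Pattern.compile("(\\d+\\.\\d+)\\s*secs"); // e.g. "0.3400514 secs"
        double thresholdSecs = 5.0;                               // report pauses above 5 s
        int lineNo = 0;
        try (BufferedReader in = Files.newBufferedReader(log)) {
            String line;
            while ((line = in.readLine()) != null) {
                lineNo++;
                if (line.contains("is starting")) {               // engine (re)start marker
                    System.out.printf("line %d: server node (re)start detected%n", lineNo);
                }
                if (line.contains("GC")) {                        // verbose GC output line
                    Matcher m = pause.matcher(line);
                    if (m.find() && Double.parseDouble(m.group(1)) > thresholdSecs) {
                        System.out.printf("line %d: long GC pause of %s secs%n", lineNo, m.group(1));
                    }
                }
            }
        }
    }
}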


Different tools exist for garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start it, call the Root Cause Analysis section. In the Common Task list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Wily Introscope also offers different dashboards that can be used to check the garbage collection behavior on the J2EE Engine. The dashboard can be found via J2EE Overview → J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GC is not available there. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows you to analyze the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or to connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the PI Adapter Engine.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed on the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options.

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size on the J2EE Engine will automatically adapt the new area of the heap due to a dynamic configuration (in newer J2EE versions this is by default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. Prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead for J2EE cluster communication it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale Up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained in a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are several options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage on a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health, looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found at Availability and Performance Management → Resource Monitoring → History Reports. An example for the application thread usage is shown below.


Check if the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool to analyze different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java → Threads and check for any threads in red status (> 20 secs running time). This indicates long-running threads and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java → Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).
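As a side note, a comparable series of thread dumps can also be produced from within a JVM using the standard java.lang.management API. The sketch below is purely illustrative (it dumps the threads of the JVM it runs in, not of a remote PI server node) and is not a replacement for the Management Console or the TDV workflow described above.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustration: take 3 thread dumps of the current JVM, 30 seconds apart,
// printing each thread's state and stack trace - roughly the information
// contained in a thread dump taken via the SAP Management Console.
public class ThreadDumpSketch {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        for (int dump = 1; dump <= 3; dump++) {
            System.out.println("==== thread dump " + dump + " ====");
            for (ThreadInfo info : tmx.dumpAllThreads(true, true)) {
                System.out.println(info.getThreadName() + " [" + info.getThreadState() + "]");
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("    at " + frame);
                }
            }
            if (dump < 3) Thread.sleep(30_000);   // wait 30 seconds between dumps
        }
    }
}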

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA Server Threads. The FCA Server Threads are responsible for receiving HTTP calls on the Java side (now that the Java Dispatcher is no longer available). FCA Threads also use a thread pool. Fifteen FCA Threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA Server Threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads will be spawned so that parallel incoming HTTP requests using different entry points can still be served. This ensures that problems with one application do not permanently block all other applications.

o An overall maximum of 20 x FCAServerThreadCount threads may be used in the J2EE Engine (with the recommended value of 50, this means at most 1,000 FCA threads).

There is currently no standard monitor available for FCA Threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA Server Threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default for an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS release 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP) so that no costly transformation is necessary.

In general, the ABAP proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation because the applications and use cases of the ABAP proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput on the ABAP proxy backend side.

In general you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP proxy backend. As in PI, ABAP proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and which is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues are used and tuning of these queues on interface level becomes possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be further improved: messages for more business-critical interfaces would eventually be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system the processing of the queues calls the central PI hub. In general the processing time there should be fast, but in case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter SENDER/SENDER_BACK: should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub-parameter determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER: should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub-parameter determines the general number of queues per interface.

Using a sub-parameter <receiver ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <receiver ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

This new feature also replaces the currently existing prioritization since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the relative overhead of the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI, not the size of the file or IDoc being sent to PI. You can use the runtime header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.
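To make the overhead explicit, the relative share of the PI header can be estimated as (MessageSizeTotal - MessageSizePayload) / MessageSizeTotal. The small sketch below is a hypothetical helper that simply applies this arithmetic to the example values above (433 bytes payload, roughly 14 KB total).

// Rough illustration of the header overhead for the small message above.
public class HeaderOverhead {
    public static void main(String[] args) {
        long payloadBytes = 433;        // MessageSizePayload from the runtime header
        long totalBytes = 14 * 1024;    // MessageSizeTotal, approximately 14 KB
        double overheadPercent = 100.0 * (totalBytes - payloadBytes) / totalBytes;
        System.out.printf("PI header overhead: %.1f%% of the total message size%n", overheadPercent);
        // prints roughly 97%, which is why very small messages are inefficient
    }
}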


Based on the above observations we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In the case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3 to avoid overloading the Java memory with parallel large message requests.

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited to avoid overloading the Java heap. This is based on so-called permits that define a message size threshold. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. By default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Assume 6 messages are waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB and message F 40 MB. Message A is smaller than the permit size, is therefore not considered large, and can be processed immediately. Message B requires 1 permit and message C requires 5; since enough permits are available, processing starts (status DLNG). Message D, however, would require all 10 available permits. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV) since it exceeds the maximum number of defined permits; in that case the message would have to be restarted manually. Message E requires 5 permits and can also not be scheduled yet. But since there are 4 permits left, message F (4 permits) is put to DLNG. Due to their smaller size, messages B and F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.

The example above shows the potential delay a large message can face due to the waiting time for permits. The assumption, however, is that large messages are not time-critical, so that an additional delay is less critical than a potential overload of the system.
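The following sketch replays the permit arithmetic of the example with the default values (permit size 10 MB, 10 permits). It is a simplified illustration of the scheduling rule described above, not SAP's actual implementation; the status texts are taken from the example.

// Simplified permit logic of the large message handling, default values.
public class PermitSketch {
    static final int PERMIT_SIZE_MB = 10;
    static final int TOTAL_PERMITS = 10;
    static int freePermits = TOTAL_PERMITS;

    // Permits needed: messages below the permit size are not considered "large".
    static int permitsNeeded(int sizeMb) {
        return sizeMb < PERMIT_SIZE_MB ? 0 : sizeMb / PERMIT_SIZE_MB;
    }

    static void schedule(String name, int sizeMb) {
        int needed = permitsNeeded(sizeMb);
        if (needed == 0) {
            System.out.println(name + " (" + sizeMb + " MB): not large, delivered immediately");
        } else if (needed > TOTAL_PERMITS) {
            // With blacklisting enabled the message goes to error status (NDLV);
            // without blacklisting it would wait until all permits are free.
            System.out.println(name + ": needs " + needed + " permits, exceeds maximum -> NDLV");
        } else if (needed <= freePermits) {
            freePermits -= needed;
            System.out.println(name + ": takes " + needed + " permits, delivering (DLNG), "
                    + freePermits + " permits left");
        } else {
            System.out.println(name + ": needs " + needed + " permits, only "
                    + freePermits + " free -> stays To Be Delivered");
        }
    }

    public static void main(String[] args) {
        schedule("A", 5);     // not large
        schedule("B", 10);    // 1 permit
        schedule("C", 50);    // 5 permits
        schedule("D", 150);   // more permits than exist -> blacklisted (if enabled)
        schedule("E", 50);    // 5 permits needed, only 4 free -> waits
        schedule("F", 40);    // 4 permits -> fits
    }
}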

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this only happens after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have a very complex extended receiver determination or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON → Monitoring → Adapter Engine Status. The number of threads shown corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity it is not enough to look at the average that is displayed in the "Previous Hours" section of transaction ST06. Instead you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, one facing a temporary CPU overload and the other a permanent one.

From NetWeaver 7.3 on, the NWA also offers a view on the CPU activity via Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to the "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process per available CPU, the CPU resources are sufficient. Once there is an average of around three processes per available CPU, there is a bottleneck in the CPU resources. In combination with a high CPU usage, a high value here can indicate that too many processes are active on the server. In combination with a low CPU usage, a high value here can indicate that the main memory is too small and the processes are waiting due to excessive paging.

Detailed Analysis View → TOP CPU: Which thread uses the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer in the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: the BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapters to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it directly influences the Java GC behavior. Therefore paging should be avoided in any case on a Java-based system.

The OS tools are the most reliable way to monitor the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine; if it does, this can be seen in long GC times as described in chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6 respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO installations or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section; the official documentation can be found in the online help. Monitoring the DB via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here, so only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count as well as the total, average and maximum processing time of the individual SQL statements, and can therefore identify the expensive statements on your system.

By default the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.

Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads; the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on at least 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user and recursive calls: good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: if this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools → Administration → Monitor → Performance → Database → Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: the values displayed in the section Server Engine are relative values by default; to display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu → Performance → Database; a snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB), where min server memory (MB) < max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and the catalog and package cache information, go to transaction ST04 and choose Performance → Database, section Buffer Pool (or section Cache, respectively).



Buffer Pools Number: the number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance → Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.


Catalog cache size: maximum size of the catalog cache, which is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: maximum size of the package cache, which is used to maintain the most frequently accessed sections of the package.

Package cache quality: ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 System allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying missing indexes.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (versions lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries must be written from the LOG_QUEUE to the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. If LOG_QUEUE_OVERFLOWS occur (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses → Performance → Database Analyzer → Bottlenecks. You can use this function to activate the corresponding analysis tool, the dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place regularly. For troubleshooting see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion). If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR, SXMSCLUP (cleaned up by XML message archiving/deletion). If you use the switch procedure, you have to check SXMSPCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries). If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring).

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI).

SWWWIHEAD (cleaned up by work item archiving/deletion).

Use an appropriate database tool, for example SQLPLUS for Oracle, for the database tables of the J2EE schema (a sketch of such a check is shown after this list). The main tables that could be affected by growth are:

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled).

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION.

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG.

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
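As an illustration, the JDBC sketch below counts the entries of the Java schema tables named above. The connection data and the schema name SAPSR3DB are placeholders (assumptions, not taken from this guide) and have to be replaced with the values of your installation; the Oracle JDBC driver must be on the classpath, and the same check can of course be done directly in SQLPLUS or any other database client.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hedged example: count the entries of the PI Java schema tables listed above.
public class PiTableCheck {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:oracle:thin:@<dbhost>:1521:<SID>";     // adjust to your database
        String[] tables = {"BC_MSG", "BC_MSG_AUDIT", "BC_MSG_LOG_VERSION",
                           "XI_IDOC_IN_MSG", "XI_IDOC_OUT_MSG"};
        try (Connection con = DriverManager.getConnection(url, "<user>", "<password>");
             Statement stmt = con.createStatement()) {
            for (String table : tables) {
                try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM SAPSR3DB." + table)) {
                    rs.next();
                    System.out.println(table + ": " + rs.getLong(1) + " entries");
                }
            }
        }
    }
}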


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem; if so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration → Specific Configuration and search for the following entries:

Category   Parameter      Subparameter   Current Value    Default
RUNTIME    TRACE_LEVEL    <none>         <your value>     1
RUNTIME    LOGGING        <none>         <your value>     0
RUNTIME    LOGGING_SYNC   <none>         <your value>     0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace Level → Set.

Start transaction SMGW, navigate to Goto → Parameters → Display and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace → Gateway → Reduce Level (or Increase Level, respectively).

Start transaction SM59, check the following RFC destinations for the flag "Trace" on the tab Special Options, and remove the flag if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and similar; check these as well.


10.2 Business Process Engine

You only have to check the event trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the event trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator, navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) → Logs and Traces and choose Log Configuration. In the Show drop-down list box choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting → Logviewer (direct link nwa/links). Open the view "Developer Trace" and check whether you have very frequent reoccurring exceptions that fill up the trace. Analyze the exceptions and check what is causing them (e.g. a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

With SAP NetWeaver PI 7.1 the audit log is by default not persisted in the database for successful messages, in order to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" to false in the service XPI Service: Messaging System. Details can be found in SAP Note 1314974 - PI 7.1 AF: Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting to avoid performance problems caused by the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto → Trace File → Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto → Trace → Gateway → Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC trace

In the PI work directory, check all files which begin with dev_rfc or dev_w for errors.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. This is available via Problem Management → Logs and Traces → Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during mapping or module processing. The transaction trace allows you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the transaction trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window is displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below around 48 seconds. From top to bottom we can see the call stack of the thread. In general we are interested in long-running blocks at the bottom of the trace view: a long-running block at the bottom means that this is the lowest-level coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see whether the high number of database calls can be summarized into one call.


Another case that is often seen is that a lookup using JDBC to a remote database or RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace, which also gives you some details about the statement that was executed.

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting, but it can also be used for troubleshooting performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement it allows you to take multiple thread dumps in specific time intervals. These thread dumps can later be analyzed by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling by doing either JVM performance tracing or JVM memory allocation tracing. This can help to understand in detail which steps of the processing take a long time. As output the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE Engine and are therefore not recommended for use in production. Instead it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

www.sap.com


6.4.1 General performance gain when using Java only scenarios 69
6.4.2 Message Flow of Java only scenarios 71
6.4.3 Avoid blocking of Java only scenarios 73
6.4.4 Logging / Staging on the AAE (PI 7.3 and higher) 73
6.5 J2EE HTTP load balancing 75
6.6 J2EE Engine Bottleneck 76
6.6.1 Java Memory 76
6.6.2 Java System and Application Threads 80
6.6.3 FCA Server Threads 83
6.6.4 Switch Off VMC 84

7 ABAP PROXY SYSTEM TUNING 85
7.1 New enhancements in Proxy queuing 86

8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS 87
8.1 Large message queues on PI ABAP 88
8.2 Large message queues on PI Adapter Engine 88

9 GENERAL HARDWARE BOTTLENECK 90
9.1 Monitoring CPU Capacity 90
9.2 Monitoring Memory and Paging Activity 91
9.3 Monitoring the Database 91
9.3.1 Generic J2EE database monitoring in NWA 92
9.3.2 Monitoring Database (Oracle) 93
9.3.3 Monitoring Database (MS SQL) 94
9.3.4 Monitoring Database (DB2) 95
9.3.5 Monitoring Database (MaxDB / SAP DB) 97
9.4 Monitoring Database Tables 99

10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE 101
10.1 Integration Engine 101
10.2 Business Process Engine 102
10.3 Adapter Framework 102
10.3.1 Persistence of Audit Log information in PI 7.10 and higher 103

11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS 104

APPENDIX A 105
A.1 Wily Introscope Transaction Trace 105
A.2 XPI Inspector for troubleshooting and performance analysis 106


1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components Integration Builder

(including Enterprise Service Repository (ESR) Service Registry (SR) and Integration Directory) Integration

Server (including Integration Engine Business Process Engine and Adapter Engine) Runtime Workbench

(RWB) and System Landscape Directory (SLD) The SLD in contrast to the other components may not

necessarily be part of the PI system in your system landscape because it is used by several SAP NetWeaver

products (however it will be accessed by the PI system regularly) Additional components in your PI

landscape might be the Partner Connectivity Kit (PCK) a J2SE Adapter Engine one or several non-central

Advanced Adapter Engines or an Advanced Adapter Engine Extended An overview is given in the graphic

below The communication and accessibility of these components can be checked using the PI Readiness

Check (SAP Note 817920)

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD. The IS in this graphic stands for the Web Application Server 7.0 or higher. In the classic environment PI is installed as a double stack: an ABAP and a Java stack (J2EE Engine). The ABAP stack is responsible for pipeline processing in the Integration Engine (Receiver Determination, Interface Determination, and so on) and for the processing of integration processes in the Business Process Engine. Every message has to pass through the ABAP stack. The Java stack executes the mapping of messages (with the exception of ABAP mappings) and hosts the Adapter Framework (AFW). The Adapter Framework contains all XI adapters except the plain HTTP, WSRM, and IDoc adapters.


With 7.1 the so-called Advanced Adapter Engine (AAE) was introduced. The Adapter Engine was enhanced to provide additional routing and mapping call functionality so that it is possible to process messages locally. This means that a message that is handled by a sender and receiver adapter based on J2EE does not need to be forwarded to the ABAP Integration Engine. This saves a lot of internal communication, reduces the response time, and significantly increases the overall throughput. The deployment options and the message flow for 7.1 based systems and higher are shown below. Currently not all the functionalities available in the PI ABAP stack are available on the AAE, but each new PI release closes the gap further.

In SAP PI 7.3 and higher the Adapter Engine Extended (AEX) was introduced. In addition to the AAE, the Adapter Engine Extended also allows local configuration of the PI objects. The AEX can therefore be seen as a complete PI installation running on Java only. From the runtime perspective no major differences can be seen compared to the AAE, and therefore no differentiation is made in this guide.


Looking at the different involved components we get a first impression of where a performance problem might be located: in the Integration Engine itself, in the Business Process Engine, or in one of the Advanced Adapter Engines (central, non-central, plain, PCK). Of course, a performance problem could also occur anywhere in between, for example in the network or around a firewall. Note that there is a separation between a performance issue "in PI" and a performance issue in any attached systems. If viewed "through the eyes" of a message (that is, from the point of view of the message flow), then the PI system technically starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the message enters the pipeline of the Integration Engine. Any delay prior to this must be handled by the sending system. The PI system technically ends as soon as the message reaches the target system, for example when a given receiver adapter (or, in case of HTTP communication, the pipeline of the Integration Server) has received the success message of the receiver. Any delay after this point in time must be analyzed in the receiving system.

There are generally two types of performance problems The first is a more general statement that the PI

system is slow and has a low performance level or does not reach the expected throughput The second is

typically connected to a specific interface failing to meet the business expectation with regard to the

processing time The layout of this check is based on the latter First you should try to determine the

component that is responsible for the long processing time or the component that needs the highest absolute

time for processing Once this has been clarified there is a set of transactions that will help you to analyze

the origin of the performance problem

If the recommendations given in this guide are not sufficient, SAP can help you to optimize the service via SAP Consulting or SAP MaxAttention services. SAP might also handle smaller problems restricted to a specific interface if you describe your problem in an SAP customer incident. Support offerings like "SAP Going Live Analysis (GA)" and "SAP Going Live Verification (GV)" can be used to ensure that the main system parameters are configured according to SAP best practice. You can order any type of service using the SAP Service Marketplace or your local SAP contacts.

If you have already worked with the PI Admin Check (SAP Note 884865) you will recognize some of the

transactions This is because SAP considers the regular checking of the performance to be an important

administrational task However this check tries to show the methodology to approach performance problems

to its reader Also it offers the most common reasons for performance problems and links to possible follow-

up actions but does not refer to any regular administrational transactions


Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, the messaging system, or the module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system start with Chapter 3

Determining the Bottleneck It helps you to determine the processing time for the different runtimes within PI

Integration Engine Business Process Engine and Adapter Engine or connected Proxy System Once you

have identified the area in which the bottleneck is most probably located continue with the relevant chapter

This is not always easy to do because a long processing time is not always an indication for a bottleneck It

can make sense to do this if a complicated step is involved (Business Process Engine) if an extensive

mapping is to be executed (IS) or if the payload is quite large (all runtimes) For this reason you need to

compare the value that you retrieve from Chapter 3 with values that you have received previously for

example The history data provided for the Java components by Wily Introscope is a big help If this is not

possible you will have to work with a hypothesis

Once the area of concern has been identified (or your first assumption leads you there) Chapters 4 5 6

and 7 will help you to analyze the Integration Engine the Business Process Engine the PI Adapter

Framework and the ABAP Proxy runtime

After (or preferably during) the analysis of the different process components it is important to keep in mind

that the bottlenecks you have observed could also be caused by other interfaces processing at the same

time It will lead you to slightly different conclusions and tuning measures You therefore have to distinguish

the following cases based on where the problem occurs

A with regard to the interface itself that is it occurs even if a single message of this interface is processed

B or with regard to the volume of this interface that is many messages of this interface are processed at the same time

C or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed

while no other messages of this interface are processed and while no other interfaces are running Then

compare this value with B) the processing time for a typical amount of messages of this interface not simply

one message as before If the values of measurement A) and B) are similar repeat the procedure with C) a

typical volume of all interfaces that is during a representative timeframe on your productive PI system or

with the help of a tailored volume test

These three measurements - A) processing time of a single message, B) processing time of a typical amount of messages of a single interface, and C) processing time of a typical amount of messages of all interfaces - should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved The tuning options are usually limited and a re-design of the interface might be required

o The mass processing of an interface leads to high processing times This situation typically calls for tuning measures

o The long processing time is a result of the overall load on the system This situation can be solved by tuning measures and by taking advantage of PI features for example to establish separation of interfaces If the bottleneck is hardware-related it could also require a re-sizing of the hardware that is used


Chapters 8, 9, 10, and 11 deal with more general reasons for bad performance, such as heavy tracing/logging, error situations, and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), as well as the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP, or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine as shown below (details about its activation in SAP Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure), and "XIIntegrationServer/<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out the part of the PI system where the most time is lost. Chapter 4.4 will work with the individual steps.


If you do not know which interface is affected yet you first have to get an overview Instead of navigating to

Detailed Data Aggregated in the Runtime Workbench choose Overview Aggregated Use the button Display

Options and check the options for sender component receiver component sender interface and receiver

interface Check the processing times as described above

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window follow the link to the Runtime Workbench. In there, click Message Monitoring, choose Adapter Engine <host> and From Database from the drop-down lists, and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamps indicated on the screen.

Do the above calculation for the outbound (AFW → IS) as well as the inbound (IS → AFW) messages.
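If you copy the two timestamps out of the monitor, the elapsed time is a simple difference. A minimal sketch with made-up timestamp values:

    # Minimal sketch: elapsed time between the start and end timestamps shown in
    # message monitoring; the timestamp values below are invented for illustration.
    from datetime import datetime

    start = datetime.fromisoformat("2014-03-10 09:26:56.165")
    end = datetime.fromisoformat("2014-03-10 09:26:58.890")

    print((end - start).total_seconds(), "seconds in the Adapter Engine")  # 2.725 seconds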

The audit log of successful messages is no longer persisted in SAP NetWeaver PI 7.1 by default, in order to minimize the load on the database. The audit log has a high impact on the database load, since it usually consists of many rows that are inserted individually into the database. In newer releases a cache was therefore implemented that keeps the audit log information for a period of time only. Thus no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted on the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter messaging.auditLog.memoryCache from true to false for the service "XPI Service: Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps like queuing, module processing, and average response time in historical dashboards (aggregated data per interval for all server nodes of a system or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping, and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java Profiler, as outlined in the appendix section A.2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance monitor in PI 7.31 and higher

Starting from PI 7.31 SP4 there is a new performance monitor available for the Adapter Engine. More information on the activation of the performance monitor can be found in SAP Note 1636215 - Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page use the link "Configuration and Monitoring Home", go to "Adapter Engine", and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum, and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like time spent in the Messaging System queues or in adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 7.31 SP10 (7.40 SP05) this data can be exported to Excel in two ways: Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection in a way that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the entrance/selection screen of SXMB_MONI.

To judge the performance of an integration process it is essential to understand the steps that are executed. A long-running integration process is not critical in itself, because it can wait for a trigger event or can include a synchronous call to a backend. Therefore it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore it is essential to monitor a queue backlog in ccBPM queues, as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. In the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. It shows you the processing time based on barriers. The 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time to, for example, the day before. By adjusting the selection criteria of the transaction you can therefore get a good overview of the normal processing time of the integration process and can judge if there is a general performance problem or just a temporary one.
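The barrier values are simple percentiles over the observed process durations. The sketch below illustrates the idea with made-up numbers; it is not how SWI2_DURA computes its values internally.

    # Illustration of the "barrier" concept: the value below which a given share of
    # the observed durations falls. The durations are invented example values in seconds.
    durations = [0.8, 1.1, 1.3, 1.4, 1.6, 2.0, 2.4, 3.1, 5.9, 12.5]

    def barrier(values, pct):
        # Nearest-rank percentile of the given values.
        ordered = sorted(values)
        rank = max(1, round(pct / 100.0 * len(ordered)))
        return ordered[rank - 1]

    print(barrier(durations, 50))   # 1.6 -> half of the processes were faster than this
    print(barrier(durations, 90))   # 5.9 -> a few slow instances push the upper barriers up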

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge if your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3 a new monitoring for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE → Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view on the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait Step.


4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine there

are several transactions that can help you to analyze the reason for this To understand why this selection of

transactions helps to analyze the problem it is important to know that the processing within the Integration

Engine is done within the pipeline The central pipeline of the Integration Server executes the following main

steps

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps highlighted in bold are available for synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31 an additional option of using an external virus scan during PI message processing can be activated, as described in the Online Help. The virus scan can be configured at multiple steps in the pipeline - for example after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed based on qRFC Logical Units of Work (LUW) using dialog work processes.


With 7.11 and higher versions you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing PI only uses dialog work processes (DIA WP). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs show a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? It depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which will start the message processing in DIA WPs. The user shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (by default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six to eight times the number of CPU cores in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI you can also monitor

the ABAP resource usage using Wily Introscope As you can see in the dashboard below you can use it to

monitor the number of free work processes (prerequisite is that the SMD agent is running on the PI system)

The major advantage of Wily Introscope is that this information is also available from the past and allows

analysis after the problem has occurred in the system


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group not all dialog work processes can be used for

qRFC processing in the Integration Server As stated earlier all (ABAP based) asynchronous messages are

processed using qRFC Therefore tuning the RFC layer is one of the most important tasks to achieve good

PI performance

In case you have a high volume of (usually runtime critical) synchronous scenarios you have to ensure that

enough DIA WPs are kept free by the asynchronous interfaces running at the same time Since this is a very

difficult tuning exercise we usually recommend implementing runtime critical synchronous interfaces using

Java only configuration (ICO) whenever possible If this is not possible you have to ensure via SARFC tuning

that enough work processes are kept free to process the synchronous messages

The current resource situation in the system can be monitored using transaction SARFC

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group

assigned in transaction SMQR

Call transaction SARFC and refresh several times since the values are only snapshot values

Is the value for "Max. Resources" close to your overall number of dialog work processes? Max. Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows

Max no of logons = 90

Max disp of own logons = 90

Max no of WPs used = 90

Max wait time = 5

Min no of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available on the system and the above requirements.


Note You have to set the parameters using the SAP instance profile Otherwise the changes are lost

after the server is restarted

The field "Resources" shows the number of DIA WPs currently available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, then the Integration Server cannot process XML messages, because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation, as follows:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity") it might be necessary to increase the number of dialog work processes in your system. If the CPU usage, however, is already very high, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)) it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution

Manager Diagnostic and Wily Introscope (as can be seen in the screenshot below) This allows easy

monitoring of the RFC resources across all available PI application servers

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing - PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (in red: XBTO0___0004) specifies the receiver business system. This way PI uses dedicated outbound queues for each receiver system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization or separation of large messages. To get an overview of the available queues use SXMB_ADM → Manage Queues.
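The naming convention above can be summarized in a few lines. The sketch below is purely illustrative; the actual queue suffix (for example 0___0004) is assigned by PI per receiver business system and is not computed like this.

    # Illustration of the PI queue naming convention described above (not an SAP API).
    def queue_prefix(direction: str, quality_of_service: str) -> str:
        prefixes = {
            ("inbound", "EO"): "XBTI",
            ("inbound", "EOIO"): "XBQI",
            ("outbound", "EO"): "XBTO",
            ("outbound", "EOIO"): "XBQO",
        }
        return prefixes[(direction, quality_of_service)]

    # All EO interfaces pointing to one receiver business system share one outbound queue,
    # e.g. XBTO0___0004; the suffix identifies the receiver system.
    print(queue_prefix("outbound", "EO") + "0___0004")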

PI inbound and outbound queues execute different pipeline steps

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages will wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait times in the queue are recorded in the performance header as DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see the long processing times there, as discussed in chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above it is clear that tuning the queues will have a direct impact on the connected PI

components and also backend systems For example by increasing the number of parallel outbound queues

more mappings will be executed in parallel which will in turn put a greater load on the Java stack or more

messages will be forwarded in parallel to the backend system in case of a Proxy call Thus when tuning PI

queues you must always keep the implications for the connected systems in mind

The tuning of PI ABAP queue parameters is done in transaction SXMB_ADM Integration Engine

Configuration and by selecting the category TUNING

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across all interfaces. Hence a problem with one single backend system will affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO, as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2. You can see the PI inbound queues and outbound queues. Also ccBPM queues (XBQO$PE) are displayed; they are discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2, and execute. If you are running ABAP proxies on a separate client on the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time Since each queue needs a dialog work process to be worked on the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see Chapter qRFC Resources (SARFC)) The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system see transaction ST06)

In case many of the queues are EOIO queues (eg because the serialization is done on the material number) try to reduce the queues by following Chapter EOIO tuning

o The second value of importance is the number of entries per queue. Hit refresh a few times to check if the numbers increase, decrease, or remain the same. An increase of entries for all queues or for a specific queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple. Possible reasons have been found to include:

1) A slow step in a specific interface

Bad processing time of a single message or a whole interface can be caused by expensive processing steps as for example the mapping step or receiver determination This can be confirmed by looking at the processing time for each step as shown in ldquoAnalyzing the runtime of pipeline stepsrdquo

2) Backlog in Queues

Check if inbound or outbound queues face a backlog A backlog in a queue is generally nothing critical since the queues ensure that the PI components as well as the backend systems are not overloaded For instance batch triggered interfaces are usually causing high backlogs that get processed over time Only in case the backlog prevents you to meet the business requirements for an interface this should be analyzed

If the backlog is caused by a high volume of messages arriving in a short period of time one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available) If one interface is having a backlog in the outbound queues you could eg specify EO_OUTBOUND_PARALLEL with a sub parameter specifying your interface to increase the parallelism for this interface

But in general a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components, like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if Content-Based Routing or Extended Receiver Determination is used. To understand which step is taking long, follow once more chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, explaining the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing. The number of inbound queues does not increase at the same time. It also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here, it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues use the filter button in SMQ2 as shown below

It is a normal situation to see many queues in READY status for a limited amount of time The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below) A waiting time of up to 5 seconds per queuing step is considered normal during normal load situation (even if there is no backlog in the queue and enough resources are available) If you observe a situation where many queues are in READY status for longer period of time the following situations can apply

A resource bottleneck with regard to RFC resources Confirm by following section qRFC Resources (SARFC)

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain amount of queues has finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3, as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables: If using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. By default a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting parameter CACHE_ENQUEUE to 0, as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in SAP Note 1500048 - SMQ2 inbound queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in point 1) or due to a problem with the infrastructure (e.g. memory problems on Java or HTTP 200 lost in the network). After 30 minutes the QIN scheduler removes this queue from the scheduling and therefore it remains in READY. To solve such cases, the root cause for the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "monitoring threshold" configured for this queue. SAP Notes 1500048 - queues stay in ready (schedule_monitor) and 1745298 - SMQR Inbound queue scheduler: identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor value in exceptional cases only (e.g. if the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) push button once to see only queues with error status Errors delay the queue processing within the Integration Server and may decrease the throughput if for example multiple retries occur Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI

Often you see EO queues in SYSFAIL or RETRY due to the problems during the processing of an individual message This can be prevented by following the description of Chapter Prevent blocking of EO queues


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB and is described below. Advanced users may use the Performance Header of the SOAP message in transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec) followed by the fractional seconds; that is, the timestamp corresponds to April 9, 2011 at 09:26:56.165. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them.
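A small sketch of how such a performance-header timestamp can be converted to a readable UTC value and to system time (the timestamp string is taken from the example above; the local time zone is whatever the machine running the script uses):

    # Parse a performance-header timestamp such as 20110409092656165 (UTC) and
    # convert it to the local system time zone. The digits after the seconds are
    # treated as the fractional-second part and padded for Python's microsecond field.
    from datetime import datetime, timezone

    raw = "20110409092656165"
    stamp = datetime.strptime(raw[:14], "%Y%m%d%H%M%S").replace(
        microsecond=int(raw[14:].ljust(6, "0")) if raw[14:] else 0,
        tzinfo=timezone.utc,
    )

    print(stamp)               # 2011-04-09 09:26:56.165000+00:00 (UTC)
    print(stamp.astimezone())  # same instant shown in the local system time zone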

In case PI message packaging is configured, the performance header will always reflect the processing time per package. Hence a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds on average. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to Detailed Data Aggregated and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements as outlined in Chapter

Pipeline Steps (SXMB_MONI or RWB) For example is a single step only long if many messages

are processed or if a single message is processed This helps you to decide if the problem is a

general design problem (single message has long processing step) or if it is related to the message

volume (only for a high number of messages this process step has large values)

Each step has different follow-up actions that are described next

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination In these

steps the receiver system and interface is calculated Normally this is very fast but PI offers the possibility of

enhanced receiver determinations In these cases the calculation is based on the payload of a message

There are different implementation options

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated at runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request directly correlates with the number and complexity of the conditions defined (see the sketch after this list for an illustration of such a condition).

No tuning options exist for the system with regard to CBR. The performance of this step can only be changed by reducing the number of rules, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.

o Mapping to determine Receivers

A standard PI mapping (ABAP Java XSLT) can also be used to determine the receiver If you observe high runtimes in such a receiver determination follow the steps outlined in the next section since the same mapping runtime is used
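The sketch below illustrates what a content-based routing condition conceptually does: XPath expressions are evaluated against the message payload and combined with logical operators. The payload, field names, and receiver name are invented for this illustration; real conditions are maintained in the Integration Directory, not coded like this.

    # Conceptual illustration of content-based routing (not an SAP API).
    from lxml import etree

    payload = etree.fromstring(
        b"<Order><Country>DE</Country><Amount>15000</Amount></Order>"
    )

    country_is_de = payload.xpath("/Order/Country/text()") == ["DE"]
    amount_large = float(payload.xpath("/Order/Amount/text()")[0]) > 10000

    if country_is_de and amount_large:       # two XPath conditions combined with AND
        receiver = "BS_ERP_EUROPE"           # hypothetical receiver business system
        print("route to", receiver)

The more such conditions have to be evaluated per message, the longer the receiver determination step takes, which is why reducing the number and complexity of rules is the main lever here.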

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping you must understand which runtime is used Mappings can be implemented

in ABAP as graphical mappings in the Enterprise Service Builder as self-developed Java mappings or

XSLT mappings One interface can also be configured to use a sequence of mappings executed

sequentially In such a case analysis is more difficult because it is not clear which mapping in the sequence

is taking a long time

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP Any

type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST Knowing

sender/receiver party/service/interface/namespace and the source message payload, it is possible to check the

target message (after mapping execution) and detailed trace output (similar to contents of trace of the

message in SXMB_MONI with TRACE_LEVEL = 3) It can also be used for debugging at runtime by using

the standard debugging functionality

For ABAP based XSLT mappings it is also possible via transaction XSLT_TOOL to test trace and debug

XSLT transformations on the ABAP stack

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with values reported several days earlier, to get a better understanding.


If no AAE is used the starting point of a mapping is the Integration Engine that will send the request by RFC

destination AI_RUNTIME_JCOSERVER to the gateway There it will be picked up by a registered server

program The registered server program belongs to the J2EE Engine The request will be forwarded to the

J2EE Engine by a JCo call and then executed by the Java runtime When the mapping has been executed

the result is sent back to the ABAP pipeline There are therefore multiple places to check when trying to

determine why the mapping step took so long

Before analyzing the mapping runtime of the PI system check if only one interface is affected or if you face a

long mapping runtime for different interfaces To do so check the mapping runtime of messages being

processed at the same time in the system

The best tool for such an analysis is Wily Introscope which offers a dashboard for all mappings being

executed at a given time Each line in the dashboard represents one mapping and shows the average

response time and the number of invocations

In the screenshot below you can see that many different mapping steps have required around 500 seconds

for processing Comparing the data during the incident with the data from the day before will allow you to

judge if this might be a problem of the underlying J2EE engine as described in section J2EE Engine

Bottleneck

If there is only one mapping that faces performance problems there would be just one line sticking out in the

Wily graphs If you face a general problem that affects different interfaces you can choose a longer

timeframe that allows you to compare the processing times in a different time period and verify if it is only a

"temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected then it is very unlikely to be a system problem but

rather a problem in the implementation of the mapping of that specific interface


Check the message size of the mapping in the runtime header using SXMB_MONI Verify if the message size is larger than usual (which would explain the longer runtime)

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping Together with the application you then have to check if the connection to the backend is working properly Such a mapping with lookups can be analyzed using Wily Transaction Trace as explained in the appendix section Wily Transaction Trace

If not one but several interfaces are affected, a potential system bottleneck occurs; this is described in the following.

o Not enough resources (registered server programs) available. That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check if there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting the queues with the names XBTO & XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition to this, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued, but the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW → Goto → Logged On Clients and filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard, as described in qRFC Queue Monitoring (SMQ2).

To check if the J2EE Engine is not available or if the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW → GoTo → Trace → Gateway → Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests: each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option to solve the bottleneck is to reduce the number of outbound queues that are concurrently active. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface only, by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

Call adapter is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so that the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so that network time can have an influence here.

Looking at the processing time, we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces, the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency: For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). Network latency has to be checked in case of a long call adapter step. If HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side: Enough resources must be available at the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore, the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages therefore covers the transfer of the request message, the calculation of the corresponding response message at the receiver side, and the transfer back to PI. For synchronous messages, the processing time of a request at the receiving target system must consequently always be analyzed to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI*, XBT1* to XBT9*, XBTL* for EO messages and XBQI*/XB2I*, XBQ1*/XB21* to XBQ9*/XB29* for EOIO messages) process the pipeline steps for the Receiver Determination, the Interface Determination, and the Message Split (and optionally XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on Business System level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case you have a different runtime of the messages in the queues (e.g. due to complex extended Receiver Determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUW by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM -> Integration Engine Configuration -> Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The outbound queues (XBTO*, XBTA*, XBTZ*, XBTM* for EO messages and XBQO*/XB2O*, XBQA*/XB2A*, XBQZ*/XB2Z* for EOIO messages) process the pipeline steps for the mapping, the outbound binding, and the call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2) or described in the section above for PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes is in turn limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in the case of an ABAP proxy system a high value for DB_SPLITTER_QUEUEING might not indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the others (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this can be ignored for the moment), the third has to wait about 2 seconds, and so on. The 100th message has to wait roughly 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUEING grows to about 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUEING value over time. If you experience this situation, proceed with Chapter 4.4.2 (a small calculation sketch follows after this list).

In case you have a different runtime of the messages in the queues (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)
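To make the backlog arithmetic above more tangible, the following small sketch (Python, purely illustrative and not SAP code) computes the approximate wait time of the n-th message in a single queue, assuming a constant per-message step time of one second as in the example.

def estimated_wait_seconds(position_in_queue: int, avg_step_seconds: float = 1.0) -> float:
    # The n-th message has to wait for the (n - 1) messages queued ahead of it.
    return (position_in_queue - 1) * avg_step_seconds

# Example values from the text: 1-second mapping, 100 messages in the queue
for n in (1, 2, 3, 100):
    print(f"message {n:3d}: ~{estimated_wait_seconds(n):.0f} s of DB_SPLITTER_QUEUEING")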

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a specific number of messages per time unit.

In section Adapter Parallelism, default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). It does not make sense for such adapters to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS; the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying SAP Note 1761133 - PI runtime: Enhancement of performance measurement, an additional entry for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis.

When using trace level 2, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES: This timestamp describes the evaluation of the message according to user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES: This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept at a minimum, and very deep and complex XPath expressions should be avoided.
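The fragment below is only an illustration (not SAP code) of why short, specific XPath expressions are cheaper than deep, generic ones: the generic expression has to visit every element of the payload, so its cost grows with the payload size. The element names and the use of the lxml library are assumptions made for the example.

from lxml import etree

payload = etree.fromstring(b"<Order><Header><OrderID>4711</OrderID></Header></Order>")

# A direct, shallow path touches only the nodes on the path ...
cheap = payload.xpath("/Order/Header/OrderID/text()")
# ... while a generic scan evaluates the predicate against every element.
costly = payload.xpath("//*[contains(name(), 'ID')]/text()")
print(cheap, costly)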

If you want to minimize the impact of LMS on the PI message processing time, you can define the extraction method to use an external job. In that case the messages will be indexed only after processing, so there is no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for the person responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. By default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps is taking long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for the Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated by default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see Chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping/routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime for individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable to asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very high runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used solely in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package will be disassembled and all messages will be processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this will reduce the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

The PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: the maximum number of messages in a package (default: 100)

2) Maximum package size: the sum of the sizes of all messages in kilobytes (default: 1 MB)

3) Delay time: the time to wait before the queue is processed if the number of messages does not reach the message count (default: 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could for example define a specific packaging for interfaces with very small messages to allow up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages.
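The decision driven by these three parameters can be pictured with the following sketch (Python, purely illustrative; it is not the actual implementation behind SXMS_BCONF, and the function and parameter names are invented). The defaults mirror the values listed above.

def package_ready(messages: int, size_kb: float, waited_seconds: float,
                  max_count: int = 100, max_size_kb: float = 1024,
                  max_delay_seconds: float = 0.0) -> bool:
    # A package is processed as soon as one of the three limits is reached.
    return (messages >= max_count
            or size_kb >= max_size_kb
            or waited_seconds >= max_delay_seconds)

# With the default delay of 0 the queue is processed immediately with whatever
# messages are available; raising max_count or max_delay_seconds creates bigger
# packages at the price of a higher latency for the individual message.
print(package_ready(messages=37, size_kb=220, waited_seconds=0))  # True with the defaults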

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and configuration of message packaging. More information is also available at http://help.sap.com -> SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using quality of service Exactly Once (EO) are independent of each other. In case of an error in one message, there is no business reason to stop the processing of other messages. But exactly this happens in case an EO queue goes into error due to an error in the processing of a single message. The queue will then be retried automatically in configurable intervals. This retry will cause a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems, and the behavior is active by default on PI 7.3 systems. By specifying a receiver ID as a sub-parameter, this behavior can also be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend), the queue would then go into SYSFAIL, so that not too many messages go into error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages to queues. By default, in all PI versions the messages are assigned to the different queues randomly. In case of different runtimes of the LUWs, caused e.g. by different message sizes or different runtimes in mappings, this can lead to an uneven distribution of messages across the different queues. This can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. Hence it tries to achieve an equal balancing during inbound processing: a queue which has a higher backlog will get fewer new messages assigned, while queues with fewer entries will get more messages assigned. It is therefore different from the old BALANCING parameter of category TUNING, which was used to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), then the messages are distributed randomly across the queues that are available. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every n-th message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the most filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements on an even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution will be checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue will then be written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B contains 150, and XBTO__C contains 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is.
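The selection logic described above can be summarized in a small sketch (Python, purely illustrative and not SAP code; the function name is invented). It reuses the fill levels from the example.

def balancing_candidates(fill_levels: dict, select_threshold_pct: float) -> list:
    # Queues whose fill level relative to the most filled queue is below the
    # EO_QUEUE_BALANCING_SELECT threshold are preferred for new messages.
    max_fill = max(fill_levels.values()) or 1
    below = [name for name, count in fill_levels.items()
             if 100.0 * count / max_fill < select_threshold_pct]
    # If no queue is below the threshold, all queues remain candidates.
    return below or list(fill_levels)

# Example from the text: threshold 20, XBTO__A = 500, XBTO__B = 150, XBTO__C = 50
print(balancing_candidates({"XBTO__A": 500, "XBTO__B": 150, "XBTO__C": 50}, 20))
# -> ['XBTO__C']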


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have a lot of queues containing only one message, since this causes a high overhead in the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data, this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and will cause significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM -> Configure Filter for Queue Prioritization -> Number of EOIO queues.

During runtime a new set of queues with the name XB2* will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization of all EOIO interfaces is limited. Thus more messages will be using the same EOIO queue, therefore PI message packaging will work better, and the reloading of the queues by the QIN scheduler will also show much better performance.

In case of errors, the affected messages will be removed from the XB2* queues and moved to the standard XBQ* queues. All other messages for the same serialization context will be moved to the XBQ* queue directly, to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues will not be blocked and messages for other serialization contexts will not be delayed.


4.9 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volume, so tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter will only put the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historic backlogs on tRFC.

In order to control the resources used when sending the IDocs from the sender system to PI, or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max. Conn. or the Max. Runtime values in SMQS, you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This will mitigate the risk of system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had the option of packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. Therefore we highly recommend configuring message packaging, since this helps transferring data for the IDoc adapter as well as for the ABAP proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in the report RSEOUT00, multiple IDocs are bundled in one tRFC call. At the PI side these packages will be disassembled by the IDoc adapter and the messages will be processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the SDN blog IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter: long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc will be posted directly when it is received. For this, a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process will roll out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads will be kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, like high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs will be posted. By doing so, the IDocs will be posted based on resource availability on the receiver system, and no additional background jobs will be required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs will be posted as soon as possible based on the available resources on the receiver system, without the requirement to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than previously expected, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user "WF-BATCH" and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other. Rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable maximum number of connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 there is a new monitor available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter; use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate for example to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and might be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well; check the hardware resources (Chapter 9) and carefully monitor the work process availability (Chapter 5.1). Alternatively, the underlying database might slow down the integration processes; use Chapter 5.4 to check this possibility.

o If it is a specific process step that sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at Chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log in to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu, or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the timeframe (in seconds) of the analysis. In the above example, an analysis timeframe of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 x (number of CPUs) CPU seconds, the WF-BATCH user used up 36 seconds (see the sketch at the end of this section).

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparison with the other users (as well as the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs or not. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see Chapter 5.1).

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.
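To put the ST03N figures into perspective, the share of the total CPU capacity consumed by WF-BATCH can be calculated as in the following sketch (Python, purely illustrative and not SAP code; the four CPUs are an assumed value).

def cpu_share_percent(cpu_seconds_used: float, window_seconds: float, num_cpus: int) -> float:
    # Total capacity of the window is the window length times the number of CPUs.
    return 100.0 * cpu_seconds_used / (window_seconds * num_cpus)

# 36 CPU seconds used by WF-BATCH in a 1-hour window on an assumed 4-CPU host
print(f"{cpu_share_percent(36, 3600, 4):.2f} % of the available CPU capacity")  # 0.25 %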

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries for the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, then a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configurable transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides -> SAP NetWeaver 7.0 -> End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify if the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the most processing time in many scenarios within BPE. Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be. Tests that have been performed have shown a high potential for throughput improvements, up to factor 47 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE: Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see Chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message, it is placed between the sender adapter (for example a file adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example JMS_http://sap.com/xi/XI/SystemReceive for the asynchronous receive queue of the JMS adapter).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in-first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But, as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example File) and leaves the PI system using a J2EE Engine adapter (for example JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is by default not persisted in the database for successful messages, to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem or bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not solve a performance problem or bottleneck in these cases. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node that will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve better separation of interfaces


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side, these adapters use an Adapter Framework scheduler, which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Since parallel channels would execute the same SELECT statement on the database or pick up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data that is to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side, the adapters work sequentially on each server node by default. For example, only one UPDATE statement can be executed by the JDBC adapter for each communication channel (independent of the number of consumer threads configured in the Messaging System), and all other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter uses a push mechanism on the PI sender side by default. This means the data is pushed by the sending MQ provider. By default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With SAP Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI communication channel and PI will be the initiator of the communication. Here, too, the Adapter Framework scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore, scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts of the message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore, no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitation in the number of requests it can execute in parallel; the limiting factor here is the number of FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the Messaging System and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated directly from the application thread pool and are therefore not available for any other tasks; the number of initial threads should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

Also, this should be done carefully, since these threads are taken from the J2EE application thread pool. A very high value there can therefore cause a bottleneck on the J2EE Engine and major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In Manual Mode the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound resource adapter (RA): via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, it is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the Messaging System to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work; this is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log will be the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning the SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself.

The incoming SOAP requests are limited by the FCA threads available for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load on such an interface, the other SOAP sender interfaces would not be affected by a shortage of FCA threads.


To use this new feature, you can specify the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.
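As an illustration (a sketch assuming the standard XISOAPAdapter servlet path and the usual channel addressing; verify the exact URL format in Note 1903431), a single high-volume interface could be moved to its own entry point while all other interfaces keep the default URL:

   Default entry point (shared FCA Threads):
   http://<host>:<port>/XISOAPAdapter/MessageServlet?channel=:<service>:<channel>

   Dedicated entry point for the high-volume interface:
   http://<host>:<port>/XISOAPAdapter/MessageServletInternal?channel=:<service>:<channel>

A shortage of FCA Threads caused by the high-volume interface then no longer blocks the senders that still use the default MessageServlet entry point.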

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around when compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time the receive queue for asynchronous and the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection …".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs such a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here too, custom modules can be the reason for prolonged processing times. Check Performance of Module Processing for more details.

Note: From 7.1 the audit log is not persisted any longer for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc adapter is summarized in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Like all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from the ERP system are processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing; a package of 5 IDocs with 2000 segments per IDoc would therefore consume roughly 50 MB during processing. In such cases it is important to, for example, lower the package size and/or use large message queues as outlined in the chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31 packaging was also introduced for the Java based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important:

In the ABAP stack, packaging is active per default (in 7.1 and higher), as described in the chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the dialog work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO it is important to evaluate packaging, to avoid any negative impact on the receiving ECC due to missing packaging.

Packaging on Java works in general differently than on ABAP. While in ABAP the aggregation is done on the individual qRFC queue level (a queue always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the wait time for the package is exceeded or the maximum number of messages or data size is reached. After this, the package is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done per server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message status stays "Delivering" throughout all the steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message was waiting almost one minute before the package was built, and that 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in Note 1913972:

o messaging.system.msgcollector.enabled: Enables or disables packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk - default 60 seconds

o messaging.system.msgcollector.maxMemTotal: The maximum memory that can be used by the message collector

o messaging.system.msgcollector.poolSize: The number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, and the configuration of the IDoc posting on the receiver side (described in chapter IDoc posting on receiver side). Tuning of these threads can therefore be very important. Unfortunately, these threads cannot yet be monitored with any PI tool or with Wily Introscope.
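A minimal sketch of a global packaging configuration (illustrative values only; the parameter names are those listed above, maintained on the Messaging System service in NWA - check Note 1913972 for the authoritative defaults and units):

   messaging.system.msgcollector.enabled = true      (packaging switched on globally)
   messaging.system.msgcollector.bulkTimeout = 60    (wait up to 60 seconds before a package is closed)
   messaging.system.msgcollector.poolSize = 10       (10 bulk sender threads per server node)

A larger poolSize lets more packages be delivered in parallel, but it also increases the parallel load on the receiving backend.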


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While the SOAP adapter in XI 3.0 protocol lets you define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. Per default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems having many Java server nodes and many polling communication channels. Based on this and the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler, you have to set the parameter scheduler.relocMode of the Adapter Framework service to a negative value.


For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow a better balancing of the incoming load across the available server nodes if the files arrive at regular intervals. The balancing is achieved by a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases; for very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.

In case many files are placed in the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence a proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon under Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node a communication channel is polling (Status = "Active"). You can also see the time the channel polled the last time and when it will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler:

http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs only on one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in brackets [60000] determines the polling interval in ms – in this case 60 seconds. The Status column shows the status of the channel:

o "ON": currently polling

o "on": currently waiting for the next polling


o "off": not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System is usually closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the Messaging System and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the Messaging System is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the Messaging System by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows an easy graphical analysis of the queues and thread usage of the Messaging System across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, showing all the backlogs in the queues. In the screenshot below a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it only forwards messages to the adapter queue if free adapter-specific consumer threads are available. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size you can directly jump to a more detailed view, where you can see that the File adapter was causing the backlog.


To see the consumer thread usage, you can then follow the link to the File adapter. In the screenshot below you can see that all consumer threads were in use during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the Messaging System does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter specific queues in the Messaging System have to be configured in the NWA using the service "XPI Service: AF Core" and the property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the property above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and the property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the Messaging System is highly critical for synchronous scenarios due to the timeouts that might occur. Therefore the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call Queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck for consumer threads on the sender queues in the Messaging System.
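As an illustration (the timestamps are invented for this example), an audit log showing

   10:15:02.113  Message successfully put into the queue
   10:16:47.890  The message was successfully retrieved from the send queue

means the message waited roughly 1 minute and 45 seconds for a free send consumer thread, which points to a backlog on that queue rather than in the sender adapter itself.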

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the Messaging System. Multiple threads are used for synchronous messages: the adapter thread puts the message into the Messaging System queue and waits until the Messaging System delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time that the message waited in the Messaging System for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high volume interfaces run at the same time as business critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible for you to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the Messaging System (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High: 75, Medium: 20, Low: 5.

Based on this approach you can ensure that more resources are used for high priority interfaces. The screenshot below shows the UI for message prioritization available in pimon under Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help: navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the Messaging System. For example, even though you have configured 20 threads for the JDBC receiver, one communication channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the Messaging System but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or remote systems with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Based on this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in NWA in the service "XPI Service: Messaging System". With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceiver parameter to 5 (so that each interface can use 5 consumer threads per server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration, four interfaces can get resources in parallel before all threads are blocked. For more information see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
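A minimal sketch of this combination (illustrative values; the JDBC-specific property set follows the adapter-specific queue naming described above and should be verified against Note 791655 before applying):

   XPI Service: Messaging System
   messaging.system.queueParallelism.maxReceivers = 5

   XPI Service: AF Core, messaging.connectionDefinition (additional property set)
   (name=JDBC_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=20, Call.maxConsumers=5, Rqst.maxConsumers=5)

With these settings, up to four JDBC receiver interfaces can each occupy 5 of the 20 receive consumer threads in parallel on a server node before the queue is exhausted.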

In the screenshot below you see the impact of the maxReceivers parameter when a backlog for one interface occurs: even though more SOAP threads are available, they are not consumed. Hence the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue in which message backlogs occur. As mentioned earlier, a backlog should usually occur in the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter specific queue. When setting maxReceiver you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog, there will always be free consumer threads, and therefore the dispatcher queue can dispatch the message to the adapter specific queue. Hence the backlog will appear in the adapter specific queue, so that message prioritization no longer works properly.


Per default, the maxReceiver parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and additional restrictions in the number of available threads can therefore be very critical. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. In case you have many high volume synchronous scenarios with different priorities running in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv, ICoAll as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the maximum parallelization to be specified on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceiver settings. This means that the backlog will now again be placed in the dispatcher queue, and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The interface pattern configured can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI30-Compatible)", no check of the data type is performed, and the overhead is therefore avoided. For interfaces with good data quality and high message throughput, this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in module processing, then you must analyze the runtime of the modules used in the communication channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes – for example to transform an EDI message before sending it to the partner, or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is also no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule; the next module, CallSAPAdapter, is a standard module that inserts the data into the Messaging System queues.

The audit log gives you a first impression of the duration of each module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is very easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case, the module writes additional information to the audit log so that such steps can be identified. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.
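As an illustration of such instrumentation, the following sketch (hypothetical module name and look-up step; it assumes the standard adapter module API in com.sap.aii.af.lib.mp.module and the messaging audit log API, and omits the EJB session bean plumbing that a real custom module needs for deployment) writes the duration of an expensive step to the audit log so that it becomes visible in the message details:

   import com.sap.aii.af.lib.mp.module.Module;
   import com.sap.aii.af.lib.mp.module.ModuleContext;
   import com.sap.aii.af.lib.mp.module.ModuleData;
   import com.sap.aii.af.lib.mp.module.ModuleException;
   import com.sap.engine.interfaces.messaging.api.Message;
   import com.sap.engine.interfaces.messaging.api.MessageKey;
   import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;
   import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;
   import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

   // Hypothetical example module that measures one expensive step
   public class TimedLookupModule implements Module {

       public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData)
               throws ModuleException {
           try {
               // The PI message currently being processed by the module chain
               Message msg = (Message) inputModuleData.getPrincipalData();
               MessageKey key = new MessageKey(msg.getMessageId(), msg.getMessageDirection());
               AuditAccess audit =
                   PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();

               long start = System.currentTimeMillis();
               // ... expensive step, e.g. a JCo or JDBC look-up, would go here ...
               long durationMs = System.currentTimeMillis() - start;

               // Write the measured duration into the audit log of this message
               audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                   "Look-up finished after " + durationMs + " ms");

               return inputModuleData;
           } catch (Exception e) {
               throw new ModuleException("TimedLookupModule: " + e.getMessage(), e);
           }
       }
   }

With such an entry in place, the audit log shows directly how much of the module runtime is spent in the remote call, instead of having to derive it from a transaction trace.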

6.4 Java only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided only Java-based adapters are used. A new configuration object, called Integrated Configuration, is used for this. When using it, the steps executed so far in the ABAP pipeline (receiver determination, interface determination, and mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java only scenarios

The major advantage of AAE processing is the reduced overhead, since the context switches between ABAP and Java are avoided. Thus the overall throughput can be increased significantly, and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. Therefore the best tuning option is to change a scenario that uses Java based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, and slow (receiver) applications reduce the throughput and therefore, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is seen for small payloads (comparison: 10k, 50k, 500k) and asynchronous messages.

6.4.2 Message Flow of Java only scenarios

All the Java based tuning options mentioned in chapter Analyzing the (Advanced) Adapter Engine and in Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail.

1) Enter the JMS sender adapter.

2) Put into the dispatcher queue of the Messaging System.

3) Forwarded to the JMS send queue of the Messaging System.

4) Message is taken by a JMS send consumer thread.

a. No message split used: In this case the JMS consumer thread performs all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and then also transfers the message to the receiving backend system. Thus all the steps are executed by one thread only.

b. Message split used (1:n message relation): In this case there is a context switch. The thread taking the message out of the queue processes the message up to the message split step. It creates the new messages and puts them into the send queue again, from where they are taken by different threads (which then map the child messages and finally send them to the receiving system).

As we can see in this example, for an Integrated Configuration only one thread executes all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the send queue (the call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, then you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java only synchronous SOAP to SOAP scenario.

In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java only scenarios

Like classical ABAP based scenarios, Java only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since Java only interfaces use only send queues, the restriction of consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface is no solution.

Based on that, an additional property messaging.system.queueParallelism.queueTypes of the service SAP XI AF MESSAGING was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization can usually be highly critical for synchronous interfaces. We therefore generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv, IcoAsync only.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you will be able to do the configuration on interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish between Staging (versioning) and Logging. An overview is given below.

For details about the configuration, please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and a decrease in performance.

The persistence steps can be seen directly in the audit log of a message.

The Message Monitor also shows the persisted versions.


While these additional monitoring capabilities are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. You therefore have to find a balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past, we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs, this can delay the overall message processing of the interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard in order to execute this task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP loadbalancing to initial setup).


For the example given above, we could see a much better load distribution after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this is discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern: the memory usage increases over time but then goes back down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped back in to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.


Different tools exist for garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section. In the Common Tasks list you will find the entry Java Memory Analysis, which lets you upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots, with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found under J2EE Overview → J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the used memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or it can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure:

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory:

1) In case the interfaces use the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to, for example, 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the Messaging System.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options:

1) Increase the Java heap of your server nodes (see the sketch after this list). If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size on the J2EE Engine automatically adapts the new area of the heap due to dynamic configuration (in newer J2EE versions this is set per default to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication, it is generally recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. You can find details on the scaling of PI with multiple instances in the guide How To Scale Up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business critical scenarios.
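For illustration only (a sketch, assuming the heap settings are maintained as VM parameters of the server node in the ConfigTool; the values are examples and must match the available physical memory):

   -Xmx3072m   (maximum heap size raised from the 2 GB default to 3 GB)
   -Xms3072m   (initial heap size set equal to the maximum, which is common practice for server VMs)

After such a change, monitor the GC duration and frequency as described above to confirm that the larger heap does not lead to excessively long garbage collections.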

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o You have the following options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage of a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management → Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found under Availability and Performance Management → Resource Monitoring → History Reports. An example of the application thread usage is shown below.


Check whether the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool to analyze different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java → Threads and check for any threads in red status (> 20 seconds running time). This indicates long running threads, and you should check the related thread stacks. The example below shows a long running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, taken every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java → Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread, responsible for HTTP traffic, was introduced with SAP NetWeaver PI 7.1: FCA Server Threads. The FCA Server Threads are responsible for receiving HTTP calls on the Java side (now that the Java Dispatcher is no longer available). FCA Threads also use a thread pool. Fifteen FCA Threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA Server Threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels, as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads will be spawned, but they are only available for parallel incoming HTTP requests using different entry points. This ensures that, in case of problems with one application, not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount may be used in the J2EE Engine.

There is currently no standard monitor available for FCA Threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA Server Threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default on an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender side) or triggers (receiver side) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy uses XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).
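As an illustration (hypothetical values; the parameters are maintained in SXMB_ADM → Integration Engine Configuration on the proxy backend, typically in the TUNING category):

   Parameter              Subparameter   Value
   EO_INBOUND_PARALLEL    SENDER         20     (raise the sender XBTS queues from 10 to 20)
   EO_INBOUND_PARALLEL    RECEIVER       40     (raise the receiver XBTR queues from 20 to 40)

Before raising these values, verify in SARFC that the qRFC resources of the proxy system can handle the additional parallel work processes.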

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with long application processing times (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface specific queues are used and tuning of these queues on interface level is possible. This can be very helpful in cases where one receiver interface shows very long posting times in the application coding that cannot be further improved; messages for more business critical interfaces would otherwise be blocked by this interface due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general, the processing time there should be fast, but in case of high volume interfaces you might want to slow down less business critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface specific identifier to the queue name (an illustrative configuration is shown after the parameter lists below). In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter SENDER / SENDER_BACK: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub parameter determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can e.g. be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing e.g. the same service name, or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter RECEIVER: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub parameter determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can e.g. be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing e.g. the same service name, or the same interface name but different services.
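As an illustration only (the values and the sender ID are examples, not recommendations; see the two Notes above for the exact configuration options), the new queuing could be set up as follows:

Parameter EO_QUEUE_PREFIX_INTERFACE, sub parameter SENDER, value 1 (activate interface-specific sender queues)
Parameter EO_QUEUE_PREFIX_INTERFACE, sub parameter RECEIVER, value 1 (activate interface-specific receiver queues)
Parameter EO_INBOUND_PARALLEL_SENDER, without sub parameter, value 2 (default number of queues per sender interface)
Parameter EO_INBOUND_PARALLEL_SENDER, sub parameter <sender ID of a high-priority interface>, value 5 (additional queues for this interface)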

This new feature also replaces the currently existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can cause a decrease in the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are necessary for only a small payload. The larger the message payload, the smaller the overhead due to the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on the ABAP stack or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an Out-of-Memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI and not the size of the file or IDoc being sent to PI. You can use the Runtime Header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), the MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping. In the example below the mapping reduces the payload size. The last two lines give the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.


Based on the above observations, we highly recommend using a reasonable message size for your interfaces. During the design and implementation of an interface, aim for a message size of 1 to 5 MB if possible. To achieve this, messages can be collected at the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

81 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 (KB) to direct all messages larger than 5 MB to dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold will be processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3 to avoid overloading the Java memory due to parallel large message requests.
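A sketch of such a configuration in SXMB_ADM (the values are illustrative only and depend on your message size profile):

Category TUNING, parameter EO_MSG_SIZE_LIMIT, value 5000 (messages larger than 5 MB go to the dedicated large message queues)
Category TUNING, parameter EO_MSG_SIZE_LIMIT_PARALLEL, value 2 (two parallel large message queues)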

82 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues, no adapter behind) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. Per default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To show this, let us look at an example using the default values. Let us assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B has 10 MB, message C has 50 MB, message D 150 MB, message E 50 MB and message F 40 MB. Message A is smaller than the permit size, is therefore not considered large and can be processed immediately. Message B requires 1 permit, message C requires 5. Since enough permits are available, processing will start (status DLNG). For message D, more than the 10 permits that exist in total would be required. Since these permits are not available, it cannot be scheduled. If blacklisting is enabled, the message will be put to error status (NDLV) since it exceeds the maximum number of defined permits. In that case the message would have to be restarted manually. Message E requires 5 permits and can also not be scheduled. But since there are 4 permits left, message F is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.

The example above shows the potential delay a large message could face due to the waiting time for the permits. But the assumption is that large messages are not time-critical, and therefore an additional delay is less critical than a potential overload of the system.
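To make the permit accounting more tangible, the following minimal Java sketch illustrates the logic described above. It is an illustration only, not the actual Messaging System implementation; the class, the method names and the rounding of the permit calculation are assumptions.

// Illustrative sketch of the permit-based scheduling of large messages.
// Default values taken from the text above: permit size 10 MB, 10 permits.
public class LargeMessagePermits {
    private static final long PERMIT_SIZE_BYTES = 10L * 1024 * 1024; // assumed default: 10 MB
    private static final int MAX_PERMITS = 10;                       // assumed default: 10 permits
    private int usedPermits = 0;

    // Number of permits a message of the given size needs (0 = not a "large" message).
    int permitsNeeded(long messageSizeBytes) {
        if (messageSizeBytes < PERMIT_SIZE_BYTES) {
            return 0; // below the permit threshold: processed immediately, no permit consumed
        }
        return (int) Math.ceil((double) messageSizeBytes / PERMIT_SIZE_BYTES);
    }

    // Mirrors the DLNG / waiting / NDLV outcomes from the example above.
    synchronized String schedule(long messageSizeBytes, boolean blacklistingEnabled) {
        int needed = permitsNeeded(messageSizeBytes);
        if (needed > MAX_PERMITS) {
            if (blacklistingEnabled) {
                return "NDLV"; // exceeds the maximum number of permits: error status, manual restart
            }
            if (usedPermits == 0) {
                usedPermits = MAX_PERMITS; // scheduled only once all permits are free
                return "DLNG";
            }
            return "waiting";
        }
        if (needed <= MAX_PERMITS - usedPermits) {
            usedPermits += needed;
            return "DLNG"; // enough free permits: message is scheduled for processing
        }
        return "waiting"; // stays in the queue until permits are released
    }

    // Called when a large message finishes processing.
    synchronized void release(long messageSizeBytes) {
        usedPermits = Math.max(0, usedPermits - permitsNeeded(messageSizeBytes));
    }
}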

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. Per default this is only done after the Receiver Determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have a very complex extended receiver determination or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 73 and higher).

The number of permits consumed can be monitored in PIMON -> Monitoring -> Adapter Engine Status. The number of threads corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus, the hardware capacity has to be monitored closely.

91 Monitoring CPU Capacity

When monitoring the CPU capacity it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if using hardware virtualization).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

Starting with NetWeaver 73, the NWA also offers a view on the CPU activity via Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own report based on the "CPU utilization" data as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process for each available CPU, the CPU resources are sufficient. Once there is an average of around three processes for each available CPU, there is a bottleneck at the CPU resources (for example, on a host with 8 CPUs a load average of 8 is still acceptable, while a sustained load average of around 24 indicates a CPU bottleneck). In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View -> TOP CPU: Which thread uses up the highest amount of CPU? Is it the J2EE Engine, the work processes or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quicksizer in the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, then it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out if it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapter to restrict these activities.

92 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it directly influences the Java GC behavior. Therefore, paging should be avoided in every case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times as described in Chapter 661.

93 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6, respectively).

931 Generic J2EE database monitoring in NWA

The next sections of this chapter will be mainly based on ABAP transactions to do a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section. The official documentation can be found in the Online Help. Monitoring the DB via the DBACOCKPIT transaction (ABAP based) is also possible for Java-only systems from SAP Solution Manager (DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionalities here. Instead, only a few key capabilities will be demonstrated. If you want to e.g. see the number of select, update or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count, the total, average and maximum processing time of the individual SQL statements and can therefore identify the expensive statements on your system.

Per default the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

932 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04


Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database.

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads: the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%; the statistics should be based on 15 million total reads. This number of reads ensures that the database is in an equilibrated state (see the example calculation after this list).

Ratio of user and recursive calls: a good performance is indicated by ratio values greater than 2. Otherwise, the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: if this value exceeds 30 blocks for each user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.
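As an illustrative calculation (assuming the usual hit-ratio formula for the data buffer quality): with 1,500,000 total reads and 60,000 physical reads, the quality is (1 - 60,000 / 1,500,000) * 100 = 96%, which satisfies the 94% recommendation above; 150,000 physical reads for the same number of total reads would only give 90% and would call for further analysis.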

933 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools -> Administration -> Monitor -> Performance -> Database -> Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: The values displayed in section Server Engine are relative values by default. To display the absolute values, press the button Absolute values. Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu -> Performance -> Database. A snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used and shows the following:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) <> max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the address windowing extension functionality of Windows 2000 is enabled.

934 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage, catalog and package cache information, go to transaction ST04 and choose Performance -> Database, section Buffer Pool (or section Cache, respectively).



Buffer Pools Number Number of buffer pools configured in this system

Buffer Pools Total Size The total size of all configured buffer pools in KB If more than one buffer pool is used choose Performance Buffer Pools to get the size and buffer quality for every single buffer pool

Overall buffer quality This represents the ratio of physical reads to logical reads of all buffer pools

Data hit ratio In addition to overall buffer quality you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100

Index hit ratio In addition to overall buffer quality you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100

Data or Index logical reads The total number of read requests for data or index pages that went through the buffer pool

Data or Index physical reads The total number of read requests that required IO to place data or index pages in the buffer pool

Data synchronous reads or writes Read or write requests performed by db2agents


Catalog cache size Maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog

Catalog cache quality Ratio of catalog entries (inserts) to reused catalog entries (lookups)

Catalog cache overflows Number of times that an insert in the catalog cache failed because the catalog cache was full (increase catalog cache size)

Package cache size Maximum size of the package cache that is used to maintain the most frequently accessed sections of the package

Package cache quality Ratio of package entries (inserts) to reused package entries (lookups)

Package cache overflows Number of times that an insert in the package cache failed because the package cache was full (increase package cache size)

935 Monitoring Database (MaxDB SAP DB)

Procedure

As with the other database types call transaction ST04 to display the most important performance

parameters


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 System allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes

The IO Activity section lists physical and logical read and write accesses to the database

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database

The Logging Activity section combines information about the log area

The Scan and sort activity section can be helpful in identifying that suitable indexes are missing

The Cache Activity section provides information about the usage of the caches and the associated hit rates

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check if the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (versions lower than 7.4) or Cache_Size (version 7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so as to be able to write several log entries together asynchronously with one I/O on the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which you need to write log entries from the LOG_IO_QUEUE onto the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, then all transactions that want to write log entries must wait until free memory becomes available again in the LOG_IO_QUEUE. In the case of LOG_QUEUE_OVERFLOWS (transaction DB50 -> Current Status -> Activities Overview -> LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with Note 819324 - FAQ MaxDB SQL optimization check which SQL statements are responsible for the most disk accesses and whether they can be optimized You should also optimize SQL statements that have a poor runtime and poor selectivity

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses -> Performance -> Database Analyzer -> Bottlenecks. You can use this function to activate the corresponding analysis tool dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

94 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 – Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are: SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR, SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 – Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQL*Plus for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are:

PI message tables BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables if the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
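For example, in SQL*Plus a statement such as select count(*) from SAPSR3DB.BC_MSG shows the current number of entries in the central message table; the schema name SAPSR3DB is only an example and depends on your installation. Running such a count regularly, or comparing it before and after an archiving/deletion run, makes it easy to see whether the cleanup jobs keep up with the message volume.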


10 TRACES LOGS AND MONITORING DECREASING PERFORMANCE

101 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM) and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, then this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration -> Specific Configuration and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default

RUNTIME    TRACE_LEVEL    <none>         <your value>    1

RUNTIME    LOGGING        <none>         <your value>    0

RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the Trace Level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace Level -> Set.

Start transaction SMGW, navigate to Goto -> Parameters -> Display and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace -> Gateway -> Reduce Level (or Increase Level, respectively).

Start transaction SM59 and check the following RFC destinations for the flag "Trace" on the tab Special Options; remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used, for sending out IDocs and similar.


102 Business Process Engine

You only have to check the Event Trace in the Business Process Engine

Procedure

Call transaction SWELS and check if the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

103 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve the performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator, navigate to Problem Management (or Troubleshooting in NW 73 and higher) -> Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting -> Logviewer (direct link: nwa/links). Open the view "Developer Trace" and check if you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing this (e.g. a wrong configuration of a Communication Channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis Exception Analysis functionality.

1031 Persistence of Audit Log information in PI 710 and higher

With SAP NetWeaver PI 71, the audit log is not persisted for successful messages in the database by default, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (based on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter messaging.auditLog.memoryCache to false in the service "XPI Service Messaging System". Details can be found in SAP Note 1314974 - PI 71 AF Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid any performance problems from the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto -> Trace File -> Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto -> Trace Gateway -> Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts CCMS

Start transaction RZ20 and search for recent alerts

Work Process and RFC trace

In the PI work directory check all files which begin with dev_rfc or dev_w for errors

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. This is available via Problem Management -> Logs and Traces -> Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace will allow you to drill down further into Java performance problems and to distinguish if it is a pure coding problem or caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, then you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right – in the example below around 48 seconds. From top to bottom we can see the call stack of the thread. In general we are interested in long-running blocks at the bottom of the trace view. A long-running block at the bottom means that this is the lowest-level coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements – this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see if the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup using JDBC to a remote database or RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A2 XPI inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting. But the tool can also be used for troubleshooting of performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java port>/xpi_inspector

To analyze performance problems, typically the example "51 – Performance Problem" is used. As a basic measurement, it allows you to take multiple thread dumps in specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling by either doing JVM Performance tracing or JVM Memory Allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output, the tool provides prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE engine and are therefore not recommended to be used in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.


1 INTRODUCTION

SAP NetWeaver Process Integration (PI) consists of the following functional components: Integration Builder (including Enterprise Service Repository (ESR), Service Registry (SR) and Integration Directory), Integration Server (including Integration Engine, Business Process Engine and Adapter Engine), Runtime Workbench (RWB) and System Landscape Directory (SLD). The SLD, in contrast to the other components, may not necessarily be part of the PI system in your system landscape, because it is used by several SAP NetWeaver products (however, it will be accessed by the PI system regularly). Additional components in your PI landscape might be the Partner Connectivity Kit (PCK), a J2SE Adapter Engine, one or several non-central Advanced Adapter Engines or an Advanced Adapter Engine Extended. An overview is given in the graphic below. The communication and accessibility of these components can be checked using the PI Readiness Check (SAP Note 817920).

Processing at runtime is carried out by the Integration Server (IS) with the aid of the SLD. The IS in this graphic stands for the Web Application Server 70 or higher. In the classic environment, PI is installed as a double stack: an ABAP and a Java stack (J2EE Engine). The ABAP stack is responsible for pipeline processing in the Integration Engine (Receiver Determination, Interface Determination, and so on) and the processing of integration processes in the Business Process Engine. Every message has to pass through the ABAP stack. The Java stack executes the mapping of messages (with the exception of ABAP mappings) and hosts the Adapter Framework (AFW). The Adapter Framework contains all XI adapters except the plain HTTP, WSRM and IDoc adapters.


With 71, the so-called Advanced Adapter Engine (AAE) was introduced. The Adapter Engine was enhanced to provide additional routing and mapping call functionality, so that it is possible to process messages locally. This means that a message that is handled by a sender and receiver adapter based on J2EE does not need to be forwarded to the ABAP Integration Engine. This saves a lot of internal communication, reduces the response time and significantly increases the overall throughput. The deployment options and the message flow for 71-based systems and higher are shown below. Currently not all the functionalities available in the PI ABAP stack are available on the AAE, but each new PI release closes the gap further.

In SAP PI 73 and higher, the Adapter Engine Extended (AEX) was introduced. In addition to the AAE, the Adapter Engine Extended also allows local configuration of the PI objects. The AEX can therefore be seen as a complete PI installation running on Java only. From the runtime perspective no major differences can be seen compared to the AAE, and therefore no differentiation is made in this guide.


Looking at the different involved components, we get a first impression of where a performance problem might be located: in the Integration Engine itself, in the Business Process Engine or in one of the Advanced Adapter Engines (central, non-central, plain, PCK). Of course, a performance problem could also occur anywhere in between, for example in the network or around a firewall. Note that there is a separation between a performance issue "in PI" and a performance issue in any attached systems. If viewed "through the eyes" of a message (that is, from the point of view of the message flow), then the PI system technically starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the message enters the pipeline of the Integration Engine. Any delay prior to this must be handled by the sending system. The PI system technically ends as soon as the message reaches the target system, for example as soon as a given receiver adapter (or, in case of HTTP communication, the pipeline of the Integration Server) has received the success message of the receiver. Any delay after this point in time must be analyzed in the receiving system.

There are generally two types of performance problems. The first is a more general statement that the PI system is slow and has a low performance level or does not reach the expected throughput. The second is typically connected to a specific interface failing to meet the business expectation with regard to the processing time. The layout of this check is based on the latter. First you should try to determine the component that is responsible for the long processing time, or the component that needs the highest absolute time for processing. Once this has been clarified, there is a set of transactions that will help you to analyze the origin of the performance problem.

If the recommendations given in this guide are not sufficient, SAP can help you to optimize the system via SAP Consulting or SAP MaxAttention services. SAP might also handle smaller problems restricted to a specific interface if you describe your problem in an SAP customer incident. Support offerings like "SAP Going Live Analysis (GA)" and "SAP Going Live Verification (GV)" can be used to ensure that the main system parameters are configured according to SAP best practice. You can order any type of service using the SAP Service Marketplace or your local SAP contacts.

If you have already worked with the PI Admin Check (SAP Note 884865), you will recognize some of the transactions. This is because SAP considers the regular checking of the performance to be an important administrative task. However, this check tries to show its reader the methodology for approaching performance problems. Also, it offers the most common reasons for performance problems and links to possible follow-up actions, but does not refer to any regular administrative transactions.


Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, the messaging system or the module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information, see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3 Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine and Adapter Engine or connected Proxy system. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter. This is not always easy to do, because a long processing time is not always an indication of a bottleneck: it can be plausible if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason you need to compare the value that you retrieve from Chapter 3 with values that you have received previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6 and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This will lead you to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases, based on where the problem occurs:

A) with regard to the interface itself, that is, it occurs even if a single message of this interface is processed

B) or with regard to the volume of this interface, that is, many messages of this interface are processed at the same time

C) or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical amount of messages of this interface, not simply one message as before. If the values of measurement A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements – A) processing time of a single message, B) processing time of a typical amount of messages of a single interface and C) processing time of a typical amount of messages of all interfaces – should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited, and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.


Chapters 8, 9, 10 and 11 deal with more general reasons for bad performance, such as heavy tracing/logging, error situations and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used) as well as the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness, we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

31 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine as shown below (details about its activation can be found in Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure) and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out the part of the PI system where the most time is lost. Chapter 44 will work with the individual steps.


If you do not know which interface is affected yet, you first have to get an overview. Instead of navigating to Detailed Data Aggregated in the Runtime Workbench, choose Overview Aggregated. Use the button Display Options and check the options for sender component, receiver component, sender interface and receiver interface. Check the processing times as described above.

32 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Message Monitoring, choose Adapter Engine <host> and from Database from the drop-down lists and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamp indicated on the screen.

Do the above calculation for the outbound (AFW -> IS) as well as the inbound (IS -> AFW) messages.

The audit log of successful messages is no longer persisted in SAP NetWeaver PI 71 by default, in order to minimize the load on the database. The audit log has a high impact on the database load since it usually consists of many rows that are inserted individually into the database. In newer releases, therefore, a cache was implemented that keeps the audit log information for a period of time only. Thus, no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted on the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter messaging.auditLog.memoryCache from true to false for the service "XPI Service Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps like queuing, module processing and average response time in historical dashboards (aggregated data per interval, for all server nodes of a system or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java Profiler, as outlined in the appendix section A2 XPI inspector for troubleshooting and performance analysis.

321 Adapter Engine Performance monitor in PI 731 and higher

Starting from PI 731 SP4 there is a new performance monitor available for the Adapter Engine. More information on the activation of the performance monitor can be found in Note 1636215 – Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home". Go to "Adapter Engine" and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like the time spent in the Messaging System queues or in adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 731 SP10 (740 SP05), this data can be exported to Excel in two ways: Overview or Detailed (details in SAP Note 1886761).

33 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection in a way that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the entrance/selection screen of SXMB_MONI.

To judge the performance of an integration process it is essential to understand the steps that are executed. A long-running integration process in itself is not critical because it can wait for a trigger event or can include a synchronous call to a backend. Therefore it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore it is essential to monitor a queue backlog in ccBPM queues as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. In the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. They show you the processing time based on barriers: the 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore you also see a comparison of the processing time to, e.g., the day before. By adjusting the selection criteria of the transaction you can therefore get a good overview of the normal processing time of the integration process and can judge if there is a general performance problem or just a temporary one.

The number of process instances per integration process can be easily checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge if your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3 a new monitoring for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE, Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view of the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and the start and end time. Furthermore there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait Step.


4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps highlighted in bold are available in case of synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different times in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31 an additional option of using an external virus scan during PI message processing can be activated as described in the Online Help. The virus scan can be configured for multiple steps in the pipeline - e.g. after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed as qRFC Logical Units of Work (LUW) using dialog work processes.


With 7.11 and higher versions you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing PI is only using dialog work processes (DIA WP). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU Time (clock symbol) to check that not all DIA WPs are used. In case all DIA WPs have a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? It depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which will start the message processing in DIA WPs. The user that is shown in SM66 will be the one that triggered the QIN scheduler. This can, e.g., be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (per default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) will be indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes, and most importantly a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available from the past and allows analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

In case you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using a Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max Resources" close to your overall number of dialog work processes? Max Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows:

Max. no. of logons = 90%

Max. disp. of own logons = 90%

Max. no. of WPs used = 90%

Max. wait time = 5

Min. no. of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available on the system and the above requirements.


Note: You have to set the parameters using the SAP instance profile. Otherwise the changes are lost after the server is restarted.

The field "Resources" shows the number of DIA WPs available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, then the Integration Server cannot process XML messages because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation, as follows:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage however is already very high, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing - PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI* (EO) or XBQI* (EOIO) and are shared between all interfaces running on PI by default. The PI outbound queues are named XBTO* (EO) and XBQO* (EOIO). The queue suffix (in red: XBTO0___0004) specifies the receiver business system. This way PI is using dedicated outbound queues for each receiver system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore there are dedicated queues for prioritization or separation of large messages. To get an overview of the available queues use SXMB_ADM, Manage Queues.

PI inbound and outbound queues execute different pipeline steps:

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages will wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait time in the queue is recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see the long processing times there, as discussed in chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues will have a direct impact on the connected PI components and also on backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in case of a proxy call. Thus, when tuning PI queues you must always keep the implications for the connected systems in mind.

The tuning of PI ABAP queue parameters is done in transaction SXMB_ADM, Integration Engine Configuration, by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across all interfaces. Hence a problem with one single backend system would affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2. You can see the PI inbound queues and outbound queues. Also ccBPM queues (XBQO$PE) are displayed; they will be discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2 and execute. If you are running ABAP proxies on a separate client on the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system; see transaction ST06).

In case many of the queues are EOIO queues (e.g. because the serialization is done on the material number), try to reduce the queues by following chapter EOIO tuning.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check if the numbers increase, decrease, or remain the same. An increase of entries for all queues or a specific queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple. Possible reasons have been found to include:

1) A slow step in a specific interface

Bad processing time of a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step as shown in "Analyzing the runtime of PI pipeline steps".

2) Backlog in queues

Check if inbound or outbound queues face a backlog. A backlog in a queue is generally nothing critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only in case the backlog prevents you from meeting the business requirements for an interface should this be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface is having a backlog in the outbound queues, you could, e.g., specify EO_OUTBOUND_PARALLEL with a sub-parameter specifying your interface to increase the parallelism for this interface.

In general, however, a backlog can also be caused by a long processing time in one of the pipeline steps as discussed in 1), or by a performance problem with one of the connected components like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if Content-Based Routing or Extended Receiver Determination is used. To understand which step is taking long, follow once more chapter "Analyzing the runtime of PI pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing. The number of inbound queues does not increase at the same time. It also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see the second point below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3 as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. Per default, a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message, the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting parameter CACHE_ENQUEUE to 0, as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are getting processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound Queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in step 1) or a problem with the infrastructure (e.g. memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from the scheduling and therefore it remains in READY. To solve such cases, the root cause for the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured for this queue. Note 1500048 - queues stay in ready (schedule_monitor) and Note 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor in exceptional cases only (if, e.g., the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) pushbutton once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY due to problems during the processing of an individual message. This can be prevented by following the description in chapter Prevent blocking of EO queues.


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB, which will be described below. Advanced users may use the Performance Header of the SOAP message using transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec)(microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them.
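
If you have to compare many of these raw timestamps, a small helper can do the parsing and the UTC-to-system-time conversion for you. The following Java snippet is a minimal illustrative sketch (it is not part of any SAP API; class and method names are made up) that interprets a timestamp in the format described above:

import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class PerfHeaderTimestamp {

    // Illustrative only: parses a performance-header style timestamp
    // (yyyyMMddHHmmss plus a fractional part) and converts UTC to local time.
    static ZonedDateTime parse(String raw, ZoneId systemZone) {
        String seconds = raw.substring(0, 14);   // yyyyMMddHHmmss
        String fraction = raw.substring(14);     // e.g. "165"
        LocalDateTime utc = LocalDateTime.parse(seconds,
                DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        System.out.println("Fractional part: ." + fraction);
        return utc.atOffset(ZoneOffset.UTC).atZoneSameInstant(systemZone);
    }

    public static void main(String[] args) {
        // Timestamp from the example above: April 9, 2011, 09:26:56 UTC
        System.out.println(parse("20110409092656165", ZoneId.of("Europe/Berlin")));
    }
}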

In case PI message packaging is configured, the performance header will always reflect the processing time per package. Hence a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds on average. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to "Detailed Data Aggregated" and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements as outlined in chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you to decide if the problem is a general design problem (a single message has a long processing step) or if it is related to the message volume (only for a high number of messages does this process step show large values).

Each step has different follow-up actions that are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated during runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request directly correlates to the number and complexity of the conditions defined.

No tuning options exist for the system in regard to CBR. The performance of this step can only be changed by reducing the number of rules, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.
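
To illustrate what the runtime has to do for each routing rule, the following standalone Java sketch evaluates two XPath conditions combined with a logical operator against a small payload. The payload, element names, and conditions are purely hypothetical, and this is not the PI implementation; it only shows why many or deeply nested conditions increase the processing time of this step.

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class CbrConditionDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload and routing condition, for illustration only
        String payload = "<Order><Header><Country>DE</Country>"
                + "<Amount>1500</Amount></Header></Order>";
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Two conditions combined with a logical AND, similar to what a
        // receiver rule with multiple XPath expressions has to evaluate
        boolean matches = (Boolean) xpath.evaluate(
                "/Order/Header/Country = 'DE' and /Order/Header/Amount > 1000",
                new InputSource(new StringReader(payload)),
                XPathConstants.BOOLEAN);

        // Each additional rule or receiver multiplies this evaluation effort
        System.out.println("Route to receiver A: " + matches);
    }
}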

o Mapping to determine receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receivers. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings executed sequentially. In such a case the analysis is more difficult because it is not clear which mapping in the sequence is taking a long time.

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing sender/receiver party/service/interface/namespace and the source message payload, it is possible to check the target message (after mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible to test, trace, and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with values reported several days earlier to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which will send the request via RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it will be picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request will be forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check if only one interface is affected or if you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps have required around 500 seconds for processing. Comparing the data during the incident with the data from the day before will allow you to judge if this might be a problem of the underlying J2EE Engine as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times in a different time period and verify if it is only a "temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify if the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application you then have to check if the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace as explained in the appendix section Wily Transaction Trace.

If not one but several interfaces are affected, a potential system bottleneck occurs; this is described in the following:

o Not enough resources (registered server programs) available. That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check if there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting queues with the names XBTO* and XBQO*/XB2O*, XBTA* and XBQA*/XB2A* (high priority), XBTZ* and XBQZ*/XB2Z* (low priority), and XBTM* (large messages). In addition to this you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; instead, the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW, Goto, Logged On Clients, filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: a wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests. Each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option is to reduce the number of outbound queues that are concurrently active to solve the bottleneck. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

The call adapter step is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so that the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so that network time can have an influence here.

Looking at the processing time we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency. For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). The network latency has to be checked in case of a long call adapter step. In case HTTP load balancers are used, they are considered part of the network. A rough transfer-time estimate is sketched after this list.

o Insufficient resources on the receiver side. Enough resources must be available on the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore the analysis of long call adapter steps always has to include the relevant backend system.
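
For a first rough estimate of what is physically possible on the line, a simple back-of-envelope model (serialization time plus round-trip latency) already helps to decide whether a long call adapter step can be explained by the network at all. The following Java sketch is purely illustrative; the bandwidth and RTT figures are assumptions, not measured values:

public class TransferTimeEstimate {

    // Rough model: serialization time plus round-trip latency.
    // Bandwidth and RTT values are assumptions for illustration only.
    static double estimateSeconds(double messageMb, double bandwidthMbitPerS, double rttMs) {
        double serialization = (messageMb * 8) / bandwidthMbitPerS; // size in Mbit / throughput
        return serialization + (rttMs / 1000.0);
    }

    public static void main(String[] args) {
        // A 20 MB message over an assumed 50 Mbit/s intercontinental link with 250 ms RTT
        System.out.printf("Estimated transfer time: %.1f s%n",
                estimateSeconds(20, 50, 250));
    }
}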

For synchronous messages (request/response behavior) the call adapter step also includes the processing time in the backend to generate the response message. Therefore the call adapter step for synchronous messages includes the time for the transfer of the request message, the calculation of the corresponding response message on the receiver side, and the transfer back to PI. Consequently, for synchronous messages the processing time of a request in the receiving target system must always be analyzed to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI*, XBT1* to XBT9*, XBTL* for EO messages and XBQI*/XB2I*, XBQ1*/XB21* to XBQ9*/XB29* for EOIO messages) process the pipeline steps for the Receiver Determination, the Interface Determination, and the Message Split (and optionally XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation of inbound queues only works on business system level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended receiver determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM, Integration Engine Configuration, Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, the time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO*, XBTA*, XBTZ*, XBTM* for EO messages and XBQO*/XB2O*, XBQA*/XB2A*, XBQZ*/XB2Z* for EOIO messages) process the pipeline steps for the mapping, outbound binding, and call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2) or described in the section above for PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so that in case of an ABAP proxy system a high value for DB_SPLITTER_QUEUING might not indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this should be ignored at the moment), the third takes 3 seconds, and so on. The 100th message has to wait 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUING is as high as 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUING value over time. If you experience this situation, proceed with chapter 4.4.2.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a certain number of messages per time unit.

In section Adapter Parallelism the default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). It does not make sense for such adapters to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS - the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

The Lean Message Search can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional header for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis:

When using trace level 2, additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES

This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES

This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept at a minimum, and very deep and complex XPath expressions should be avoided.

If you want to minimize the impact of LMS on the PI message processing time, you can define the extraction method to use an external job. In that case the messages will be indexed after processing only, and LMS will therefore have no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for those responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. Per default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps takes long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated per default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping/routing), a package of messages will be sent, which reduces the number of context switches that are required. Furthermore, access to the database is more efficient since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime of individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very high runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging must not solely be used in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, then the package will be disassembled and all messages will be processed as single messages. Of course, in a case where many errors occur (for example due to the interface design), this will reduce the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes because they send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: maximum number of messages in a package (default 100)

2) Maximum package size: sum of all message sizes in kilobytes (default 1 MB)

3) Delay time: time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could for example define a specific packaging for interfaces with very small messages to allow up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages.
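
The interplay of the count and size limits can be pictured with a small sketch. The following Java snippet is purely illustrative (it is not the Integration Engine implementation and does not model the delay time); it only cuts a queue backlog into packages according to the two default limits:

import java.util.ArrayList;
import java.util.List;

public class PackagingSketch {

    record Msg(String id, long sizeKb) {}

    // Illustrative only: cut a queue backlog into packages that respect the
    // two default limits (100 messages or 1 MB per package).
    static List<List<Msg>> buildPackages(List<Msg> backlog, int maxCount, long maxSizeKb) {
        List<List<Msg>> packages = new ArrayList<>();
        List<Msg> current = new ArrayList<>();
        long currentKb = 0;
        for (Msg m : backlog) {
            boolean full = current.size() >= maxCount || currentKb + m.sizeKb() > maxSizeKb;
            if (full && !current.isEmpty()) {
                packages.add(current);            // processed as one LUW
                current = new ArrayList<>();
                currentKb = 0;
            }
            current.add(m);
            currentKb += m.sizeKb();
        }
        if (!current.isEmpty()) packages.add(current);
        return packages;
    }
}

With a backlog of 1000 messages of 5 KB each and the default limits, this yields 10 packages of 100 messages, i.e. 10 LUWs instead of 1000.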

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and the configuration of message packaging. More information is also available at http://help.sap.com, SAP NetWeaver, SAP NetWeaver PI/Mobile/IdM 7.1, SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1, SAP NetWeaver Process Integration Library, Function-Oriented View, Process Integration, Integration Engine, Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages of interfaces using the Quality of Service Exactly Once are independent of each other. In case of an error in one message, there is no business reason to stop the processing of the other messages. But exactly this happens in case an EO queue goes into error due to an error in the processing of a single message. The queue will then be retried automatically in configurable intervals. This retry causes a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced in SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems and are activated per default on PI 7.3 systems. By specifying a receiver ID as sub-parameter, this behavior can be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend) the queue would then go into SYSFAIL, so that not too many messages go into error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages across queues. By default, in all PI versions, messages are assigned to the different queues randomly. If LUWs have different runtimes, caused for example by different message sizes or different mapping runtimes, this can lead to an uneven distribution of messages across the queues and can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue and thereby tries to achieve an equal balancing during inbound processing. A queue with a higher backlog gets fewer new messages assigned; queues with fewer entries get more messages assigned. This is therefore different from the old BALANCING parameter (of category TUNING), which was used to rebalance messages already assigned to a queue.


The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), the messages are distributed randomly across the available queues. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies a relative fill level of the queues in percent. Relative here means in relation to the most-filled queue. If there are queues with a lower fill level than defined here, only these are taken into consideration for distribution. If all queues have a higher fill level, all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be chosen based on the message throughput and the specific requirements for an even distribution. For higher-volume systems, a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution is checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue is then written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time there are three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Relative to XBTO__A, XBTO__B therefore has a fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

This example shows that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is.
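The selection logic can be illustrated with a small sketch (Java, illustrative only and not the actual implementation; it simply reproduces the relative fill-level rule described above, using the numbers from the example):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative only: queues whose fill level, relative to the fullest queue,
// is below the EO_QUEUE_BALANCING_SELECT percentage are preferred for new messages.
class QueueBalancingSketch {
    static List<String> candidateQueues(Map<String, Integer> fillLevels, int selectPercent) {
        int max = Collections.max(fillLevels.values());
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : fillLevels.entrySet()) {
            if (e.getValue() * 100 < max * selectPercent) {
                candidates.add(e.getKey());
            }
        }
        // if no queue is below the threshold, all queues are considered
        return candidates.isEmpty() ? new ArrayList<>(fillLevels.keySet()) : candidates;
    }

    public static void main(String[] args) {
        // the example above: XBTO__A = 500, XBTO__B = 150 (30 %), XBTO__C = 50 (10 %)
        Map<String, Integer> fill = Map.of("XBTO__A", 500, "XBTO__B", 150, "XBTO__C", 50);
        System.out.println(candidateQueues(fill, 20));   // prints [XBTO__C]
    }
}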


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially important not to have many queues containing only one message, since this causes a high overhead in the QIN scheduler due to very frequent reloading of the queues. Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason is, for example, that serialization has to be done per document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and will cause significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM → Configure Filter for Queue Prioritization → Number of EOIO Queues.

At runtime a new set of queues with the name XB2 is used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages use the same EOIO queue, so PI message packaging works better and the reloading of the queues by the QIN scheduler also shows much better performance.

In case of errors, the affected messages are removed from the XB2 queues and moved to the standard XBQ queues. All other messages for the same serialization context are moved to the XBQ queue directly to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues are not blocked and messages for other serialization contexts are not delayed.


4.9 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volumes, and tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly dialog work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historical backlogs on the tRFC layer.

In order to control the resources used when sending the IDocs from the sender system to PI, or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max Conn or Max Runtime values in SMQS you can limit or increase the number of dialog work processes used by the tRFC layer to send the IDocs for a given destination. This mitigates the risk of a system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already offered an option for packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring Message Packaging, since it helps to transfer data efficiently for the IDoc adapter as well as for the ABAP Proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, provided the IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in report RSEOUT00, multiple IDocs are bundled into one tRFC call. On the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. This therefore mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the IDocs to be processed as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the SDN blog IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter: long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. This requires a free dialog work process.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously via report RBDAPP01, which is run as a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process rolls out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, which can lead to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, so this operation can take a long time and consume unnecessary resources on the sender side as well. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

For this reason we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, such as high-volume replication of master data, background processing of IDocs should be chosen.

With SAP Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. The IDocs are thus posted based on resource availability on the receiver system, and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible, based on the available resources on the receiver system, without the need to schedule many background jobs. This new option is therefore the recommended way of posting IDocs in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 – Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps you analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration processes. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to the BPE and every message that is sent from the BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than might be expected, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check whether there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running.

o Look for the user WF-BATCH and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other; rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations for synchronous send/receive steps in the integration process, which correlate, for example, with a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance-related notes and can be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be considered for a performance decrease over time. Firstly, the load on PI might have increased over time as well. Check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes. Use chapter 5.4 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu, or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example, an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds, the WF-BATCH user used up 36 seconds.

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparing it with the other users (as well as with the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you optimize the load, but it does support the decision on how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by the BPE, but if the number of entries is high for the above tables, it will be high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configurable transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check whether the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

In many BPE scenarios, inbound processing takes up most of the processing time.

Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can, however, also increase the runtime of individual messages (latency) due to the delay introduced by the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following characteristics are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together within a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

Collect scenarios, for example, are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvement.

Tests have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework (AFW) in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example, a file adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example, a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example, JMS_http://sap.com/xi/XI/SystemReceive for the JMS asynchronous receive queue).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, so five messages can be processed in parallel. But, as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system through a J2EE Engine adapter (for example, File) and leaves the PI system through a J2EE Engine adapter (for example, JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All of the following analysis is based on the information provided in the audit log. As of SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, to avoid performance overhead. The audit log is therefore only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem or bottleneck in an adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. In such cases, increasing the number of threads working on a queue in the Messaging System will not solve a performance problem or bottleneck.

There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve a better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

On the sender side, these adapters use the Adapter Framework Scheduler, which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Parallel processing would in any case only result in locking problems, since the channels would execute the same SELECT statement on the database or pick up files with the same file name. To increase the throughput of such interfaces, the polling interval has to be reduced to avoid a big backlog of data to be polled. If the volume is still too high, you should consider creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

On the receiver side, the adapters work sequentially on each server node by default. For example, for JDBC only one UPDATE statement can be executed per communication channel (independent of the number of consumer threads configured in the Messaging System), and all other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism on the receiver side. In the Processing tab of the communication channel, enter the number of messages to be processed in parallel by the receiver channel in the field "Maximum Concurrency". For example, if you enter the value 2, two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

By default the JMS adapter uses a push mechanism on the PI sender side, which means the data is pushed by the sending MQ provider. By default every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With SAP Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. You can then specify a polling interval in the PI communication channel, and PI becomes the initiator of the communication. Here, too, the Adapter Framework Scheduler is used, which implies sequential processing per channel. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster, so scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts of message processing (the pure sending of the message to the JMS provider) are synchronized. No action is therefore necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side generally has no limitation on the number of requests it can execute in parallel; the limiting factor is the number of FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning the SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the Messaging System and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated directly from the application thread pool and are therefore not available for any other tasks, so the number of initial threads should be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be increased. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient … is exhausted. The current pool size limit (max connections) is 1 connections.

This should also be done carefully, since these threads are taken from the J2EE application thread pool. A very high value can therefore cause a bottleneck on the J2EE Engine and thus major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In "Manual Mode" the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound resource adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode; this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below summarizes the parallelism of the different adapter types.


6.1.2 Sender Adapter

The first step(s) taken by a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, this is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the Messaging System to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work; this is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log marks the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection … Returning to application".

6.1.2.1 Tuning the SOAP sender adapter

As described above, there is in general no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the FCA threads available for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/MessageServlet. If one interface is facing a very high load or slow backend connections, this can block the available FCA threads and can have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load on this interface, the other SOAP sender interfaces are then not affected by a shortage of FCA threads.


To use this new feature, you can specify one of the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.
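For illustration only (host, port, channel identifiers, and the exact URL pattern are assumptions and should be verified against SAP Note 1903431 and your own system), a high-volume interface could then be isolated on one of the new entry points while all other senders keep the default MessageServlet URL:

http://<host>:<port>/XISOAPAdapter/MessageServletInternal?channel=:<sender_service>:<channel_name>
http://<host>:<port>/XISOAPAdapter/MessageServletExternal?channel=:<sender_service>:<channel_name>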

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time the receive queue for asynchronous messages or the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection …".

Depending on the type of adapter, you now have to investigate why the receiver adapter needs such a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc_AAE adapter is given in SAP Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Like all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios – it therefore only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. All messages received by PI in one RFC call from ERP are then processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages on the sender or receiver side does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter, the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 20,000 segments each (100,000 segments in total) would therefore consume roughly 500 MB during processing. In such cases it is important, for example, to lower the package size and/or use large message queues as outlined in chapter Large message queues on PI Adapter Engine.
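Based on the rough 5 KB per segment figure, the expected memory footprint of an IDoc package can be estimated with a simple back-of-the-envelope calculation (illustrative sketch only):

// Rough estimate only, using the ~5 KB per IDoc segment figure mentioned above.
class IdocMemoryEstimate {
    static long estimatedKb(int idocsPerPackage, int segmentsPerIdoc) {
        return (long) idocsPerPackage * segmentsPerIdoc * 5;   // KB during processing
    }

    public static void main(String[] args) {
        // 5 IDocs with 20,000 segments each: ~500,000 KB, i.e. roughly 500 MB
        System.out.println(estimatedKb(5, 20000) + " KB");
    }
}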

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to those of the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users working on those systems.


Important

On the ABAP stack, packaging is active by default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the dialog work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO, it is important to evaluate packaging to avoid any negative impact on the receiving ECC system due to missing packaging.

Packaging on Java generally works differently than on ABAP. While in ABAP the aggregation is done at the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this, the package is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done per server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message stays in status "Delivering" throughout all the steps described above. In the audit log you can see the time a message spends in packaging. The audit log shown below shows, for example, that the message waited almost one minute before the package was built and that 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in SAP Note 1913972:

o messagingsystemmsgcollectorenabled: enable or disable packaging globally

o messagingsystemmsgcollectormaxMemPerBulk: the maximum size of a bulk message

o messagingsystemmsgcollectorbulkTimeout: the wait time per bulk (default: 60 seconds)

o messagingsystemmsgcollectormaxMemTotal: the maximum memory that can be used by the message collector

o messagingsystemmsgcollectorpoolSize: the number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, and the configuration of IDoc posting on the ERP receiver side (described in chapter Configuration of IDoc posting on receiver side). Tuning of these threads can therefore be very important. Unfortunately these threads cannot yet be monitored with any PI tool or Wily Introscope.


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While in the SOAP adapter with XI 3.0 protocol you can define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Based on this and the symptoms described in SAP Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler, you have to set the parameter schedulerrelocMode of the Adapter Framework service to a negative value.


For instance, if you set the value to -15, a rebalancing of the channel within the J2EE cluster may happen after every 15th polling interval. This can allow a better balancing of the incoming load across the available server nodes if the files arrive at regular intervals. The balancing is achieved by means of a servlet call: based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.
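The meaning of the negative value can be pictured with a tiny sketch (illustrative only; it reflects the interpretation given above, not the actual scheduler code):

// Illustrative only: with relocMode = -15, a polling channel may be relocated
// (via a servlet call and HTTP load balancing) after every 15th polling interval.
class RelocModeSketch {
    static boolean mayRelocate(long completedPollingIntervals, int relocMode) {
        if (relocMode >= 0) {
            return false;                    // 0 or positive: no periodic relocation in this sketch
        }
        int interval = Math.abs(relocMode);  // e.g. -15 -> every 15th polling interval
        return completedPollingIntervals > 0 && completedPollingIntervals % interval == 0;
    }
}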

If many files are placed in the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only, so proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting with PI 7.31, you can monitor the Adapter Framework Scheduler in pimon under Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node the communication channel is polling (Status = "Active"). You can also see when the channel polled the last time and when it will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency with which a channel can potentially move to another server node, by tuning the parameter relocMode as outlined above.

Prior to PI 7.31, you have to use the following URL to monitor the Adapter Framework Scheduler:

http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The type value determines whether the channel runs on only one server node (type = L (local)) or on all server nodes (type = G (global)). The time in brackets [60000] determines the polling interval in ms – in this case 60 seconds. The Status column shows the status of the channel:

o "ON": currently polling

o "on": currently waiting for the next polling


o "off": no longer scheduled (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System is usually closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, the messages naturally get queued in the Messaging System and remain there for a long time, since the adapter is not yet ready to process the next one. It looks as if the Messaging System is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the Messaging System by executing the check described in chapter 6.1.3. An additional hint of a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the dropdown list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, which allows an easy graphical analysis of the queues and thread usage of the Messaging System across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, which shows all the backlogs in the queues. In the screenshot below, a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it forwards messages to the adapter-specific queue whenever free consumer threads are available there. The analysis must start with the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size graph, you can jump directly to a more detailed view, where you can see that the file adapter was causing the backlog.


To see the consumer thread usage, you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were in use during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the Messaging System does not have enough resources, as in the case above, increase the number of consumer threads for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the Messaging System have to be configured in the NWA using the service "XPI Service: AF Core" and the property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue follows the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details, see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the property above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed using the service "XPI Service: Messaging System" and the property messaging.connectionParams, by adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging / Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the Messaging System is highly critical for synchronous scenarios due to the timeouts that might occur. Therefore the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the entry "The message was successfully retrieved from the send queue". Note that the queue name would be "Call Queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck for consumer threads on the sender queues in the Messaging System.
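For a single message, the wait time can simply be computed as the difference of the two audit-log timestamps; the small sketch below (Java, purely illustrative, with a made-up timestamp format) shows the idea:

import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative only: subtract the "put into queue" timestamp from the
// "retrieved from queue" timestamp to get the wait time for a consumer thread.
class QueueWaitSketch {
    public static void main(String[] args) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        LocalDateTime putIntoQueue = LocalDateTime.parse("2014-03-01 10:15:02", fmt);
        LocalDateTime retrieved    = LocalDateTime.parse("2014-03-01 10:16:47", fmt);
        Duration wait = Duration.between(putIntoQueue, retrieved);
        System.out.println("Waited " + wait.getSeconds() + " s for a free consumer thread");
    }
}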

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the Messaging System. Multiple threads are used for synchronous messages: the adapter thread puts the message into the Messaging System queue and waits until the Messaging System delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the entry "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time the message waited in the Messaging System for a free consumer thread. A large time difference between these two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility of prioritizing interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible to define high, medium, and low priority processing at interface level. Based on the priority, the Dispatcher Queue of the messaging system (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5.
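The weighting can be illustrated with a small sketch (plain Java for illustration only, not SAP code; the class name, message counts, and dispatch loop are hypothetical):

import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

public class WeightedDispatchSketch {
    public static void main(String[] args) {
        // Waiting messages per priority (hypothetical backlog of 200 messages each).
        Map<String, Queue<String>> waiting = new LinkedHashMap<>();
        waiting.put("HIGH", fill("H", 200));
        waiting.put("MEDIUM", fill("M", 200));
        waiting.put("LOW", fill("L", 200));

        Map<String, Integer> weight = Map.of("HIGH", 75, "MEDIUM", 20, "LOW", 5);
        int freeConsumerThreads = 100;   // free threads reported by the adapter-specific queue

        for (Map.Entry<String, Queue<String>> entry : waiting.entrySet()) {
            // Share of the free threads this priority may use in one dispatch cycle.
            int share = freeConsumerThreads * weight.get(entry.getKey()) / 100;
            int forwarded = Math.min(share, entry.getValue().size());
            for (int i = 0; i < forwarded; i++) {
                entry.getValue().poll();   // "forward" the message to the adapter-specific queue
            }
            System.out.println(entry.getKey() + ": forwarded " + forwarded + " messages");
        }
    }

    private static Queue<String> fill(String prefix, int count) {
        Queue<String> queue = new ArrayDeque<>();
        for (int i = 0; i < count; i++) queue.add(prefix + i);
        return queue;
    }
}

With 100 free consumer threads, 75 high, 20 medium, and 5 low priority messages are forwarded in this cycle.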

Based on this approach you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon under Configuration and Administration -> Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help: navigate to SAP NetWeaver -> SAP NetWeaver PI -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Process Integration Monitoring -> Component Monitoring -> Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one Communication Channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or a remote system with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Because of this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in NWA in the service "XPI Service: Messaging System". With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads per server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information, see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
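The thread accounting behind this example can be shown with a small sketch (plain Java, illustrative only; the numbers are the example values above):

public class MaxReceiverSketch {
    public static void main(String[] args) {
        int consumerThreads = 20;  // threads configured for the adapter-specific receive queue
        int maxReceivers    = 5;   // maximum threads one receiver interface may occupy

        int parallelInterfaces = consumerThreads / maxReceivers;
        System.out.println("Interfaces served in parallel before all threads are blocked: "
                + parallelInterfaces);

        // If one interface hangs, it can block at most maxReceivers threads:
        int threadsLeftForOthers = consumerThreads - maxReceivers;
        System.out.println("Threads still available for other interfaces: " + threadsLeftForOthers);
    }
}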

In the screenshot below you can see the impact of the maxReceivers parameter when a backlog for one interface occurs: even though more free SOAP threads are available, they are not consumed. Hence the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue in which message backlogs occur. As mentioned earlier, a backlog should usually occur on the Dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog, there will always be free consumer threads, and therefore the dispatcher queue can dispatch the messages to the adapter-specific queue. Hence the backlog will appear in the adapter-specific queue, so that message prioritization no longer works properly.


Per default, the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and therefore additional restrictions in the number of available threads can be very critical. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. In case you have many high-volume synchronous scenarios with different priority that run in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to "Recv, ICoAll" as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the specification of the maximum parallelization on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The configured interface pattern can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received. When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and the overhead is therefore avoided. For interfaces with good data quality and high message throughput, this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in the module processing, you must analyze the runtime of the modules used in the Communication Channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes; for example, to transform an EDI message before sending it to the partner, or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that inserts the data into the messaging system queues.

The audit log gives you a first impression of the duration of a module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.

(Screenshot: Wily Introscope dashboard "Module Processing")


Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it takes that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module prints additional information to the audit log so that such steps can be detected. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.
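If you develop custom modules, it helps to have the module itself write the runtime of expensive steps (such as remote look-ups) into the audit log. The following is a minimal sketch only, assuming the standard PI adapter module API (com.sap.aii.af.lib.mp.module) and the messaging audit log API; the EJB packaging, deployment descriptors, and error handling required for a real module are omitted, and the class name and log text are hypothetical:

import com.sap.aii.af.lib.mp.module.Module;
import com.sap.aii.af.lib.mp.module.ModuleContext;
import com.sap.aii.af.lib.mp.module.ModuleData;
import com.sap.aii.af.lib.mp.module.ModuleException;
import com.sap.engine.interfaces.messaging.api.Message;
import com.sap.engine.interfaces.messaging.api.MessageKey;
import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

public class TimedLookupModule implements Module {

    public ModuleData process(ModuleContext context, ModuleData inputModuleData)
            throws ModuleException {
        try {
            Message message = (Message) inputModuleData.getPrincipalData();
            MessageKey key = new MessageKey(message.getMessageId(), message.getMessageDirection());
            AuditAccess audit = PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();

            long start = System.currentTimeMillis();
            // ... module-specific work, e.g. a JCo or JDBC look-up ...
            long duration = System.currentTimeMillis() - start;

            // Write the duration into the audit log so slow steps are visible in message monitoring.
            audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                    "Look-up finished in " + duration + " ms");

            return inputModuleData;
        } catch (Exception e) {
            throw new ModuleException("TimedLookupModule failed", e);
        }
    }
}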

6.4 Java-only scenarios - Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run on the Java stack only, using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps executed so far in the ABAP pipeline (Receiver Determination, Interface Determination, and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java-only scenarios

The major advantage of AAE processing is the reduced overhead, because the context switches between ABAP and Java are avoided. Thus the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve best performance. The best tuning option is therefore to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available for a certain scenario).

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, or slow (receiver) applications reduce the throughput and therefore, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is observed for small payloads (comparison of 10 kB, 50 kB, and 500 kB) and asynchronous messages.

6.4.2 Message Flow of Java-only scenarios

All the Java-based tuning options mentioned in chapter Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter.

2) Put into the dispatcher queue of the messaging system.

3) Forwarded to the JMS send queue of the messaging system.

4) Message is taken by a JMS send consumer thread.

a. No message split used:

In this case the JMS consumer thread will do all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and will then also transfer the message to the backend system (the mail receiver in our example). Thus all the steps are executed by one thread only.

b. Message split used (1:n message relation):

In this case there is a context switch. The thread taking the message out of the queue will process the message up to the message split step. It will create the new messages and put them into the Send queue again, from where they will be taken by different threads (which will then map the child messages and finally send them to the receiving system).

As we can see in this example, for an Integrated Configuration only one thread executes all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario. In the highlighted areas you can see that all the steps are very fast except the mapping call, which takes around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java-only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since Java-only interfaces only use send queues, the restriction of consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface is no solution.

Because of this, an additional property messaging.system.queueParallelism.queueTypes of the service SAP XI AF Messaging was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization can usually be highly critical for synchronous interfaces. Therefore we generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to "Recv, IcoAsync" only.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since for instance the result of a mapping could not be verified.

Starting with PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you are able to do the configuration on interface level. The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish between Staging (versioning) and Logging. An overview is given below.

For details about the configuration, please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the DB and a decrease in performance.

The persistence steps can be seen directly in the audit log of a message. The Message Monitor also shows the persisted versions.


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. You therefore have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs this can delay the overall message processing of the interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure that the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates as described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard in order to execute the task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP load balancing to initial setup).


For the example given above, we could see a much better load balancing after the new rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this is discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore the analysis can be done with the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done by looking at the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc. Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern; that is, the memory usage increases over time but goes back down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a long duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.


Different tools exist for the garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start it, call the Root Cause Analysis section. In the Common Tasks list you find the entry Java Memory Analysis, which lets you upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found via J2EE Overview -> J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or you can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out (a small sketch of this check follows after this list). Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory:

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on PI Adapter Engine.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options:

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size of the J2EE Engine will automatically adapt the new area of the heap due to a dynamic configuration (in newer J2EE versions this is per default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead for J2EE cluster communication, it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale Up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.
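The restart and out-of-memory check described in the first two bullets above can be automated; the following is a minimal sketch (plain Java; the file name is passed as an argument and the search patterns are the ones mentioned in the text):

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class StdServerLogCheck {
    public static void main(String[] args) throws Exception {
        Path logFile = Path.of(args.length > 0 ? args[0] : "std_server0.out");
        List<String> lines = Files.readAllLines(logFile);

        long starts = lines.stream().filter(l -> l.contains("is starting")).count();
        long oom    = lines.stream().filter(l -> l.contains("OutOfMemory")).count();

        // The first "is starting" entry is the initial start; every further entry marks a restart.
        System.out.println("Server node starts found        : " + starts);
        System.out.println("Restarts after a fatal situation: " + Math.max(0, starts - 1));
        System.out.println("OutOfMemory entries found       : " + oom);
    }
}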

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are two options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage on a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found under Availability and Performance Management -> Resource Monitoring -> History Reports. An example for the application thread usage is shown below.


Check if the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java -> Threads and check for any threads in red status (> 20 seconds running time). This indicates long-running threads and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user doing the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java -> Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA Server Threads. The FCA Server Threads are responsible for receiving HTTP calls on the Java side (after the Java Dispatcher is no longer available). FCA Threads also use a thread pool. Fifteen FCA Threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA Server Threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes long due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP Adapter servlet (shared by all channels, as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads are spawned that are only available for parallel incoming HTTP requests using different entry points. This ensures that, in case of problems with one application, not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount may be used in the J2EE Engine; with the recommended value of 50, this corresponds to at most 1000 FCA threads.

There is currently no standard monitor available for FCA Threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA Server Threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default for an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS >= 6.20 includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course, such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (Receiver) and Note 1831889 (Sender). After implementing these two Notes, interface-specific queues are used, and tuning of these queues on interface level is possible. This can be very helpful in cases where one receiver interface shows very long posting times in the application coding that cannot be further improved; messages for other, more business-critical interfaces might otherwise be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general, the processing time there should be fast, but in case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter SENDER / SENDER_BACK: should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub-parameter: determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER: should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub-parameter: determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services.

This new feature also replaces the previously existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules to use the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the relative overhead due to the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that also reduces the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI and not to the size of the file or IDoc being sent to PI. You can use the runtime header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below, the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.


Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected at the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 (the value is given in KB, so 5000 corresponds to 5 MB) to direct all larger messages to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3, to avoid overloading the Java memory with parallel large message requests.

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not the size of a single large message alone that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited, to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well, to determine the degree of parallelization. Per default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB, and message F 40 MB. Message A is smaller than the permit size, is therefore not considered large, and can be processed immediately. Message B requires 1 permit, message C requires 5. Since enough permits are available, processing starts (status DLNG). For message D, all available 10 permits would be required. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV) because it exceeds the maximum number of defined permits; in that case the message has to be restarted manually. Message E requires 5 permits and can also not be scheduled yet. But since there are 4 permits left, message F (4 permits) is put to DLNG. Due to their smaller size, messages B and F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.

The example above shows the potential delay a large message can face due to the waiting time for permits. The assumption, however, is that large messages are not time-critical, and therefore an additional delay is less critical than a potential overload of the system.
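The permit accounting from the example can be reproduced with a small sketch (plain Java, illustrative only; the way PI computes the number of permits per message is simplified here to the message size divided by the permit size, rounded up):

import java.util.LinkedHashMap;
import java.util.Map;

public class PermitSketch {

    static final long PERMIT_SIZE_MB = 10;
    static final int MAX_PERMITS = 10;

    /** Permits a message of the given size needs (0 = not a large message). */
    static int permitsNeeded(long sizeMb) {
        if (sizeMb < PERMIT_SIZE_MB) {
            return 0;                     // small message, processed immediately
        }
        return (int) Math.ceil((double) sizeMb / PERMIT_SIZE_MB);
    }

    public static void main(String[] args) {
        Map<String, Long> waiting = new LinkedHashMap<>();
        waiting.put("A", 5L);
        waiting.put("B", 10L);
        waiting.put("C", 50L);
        waiting.put("D", 150L);
        waiting.put("E", 50L);
        waiting.put("F", 40L);

        int freePermits = MAX_PERMITS;
        for (Map.Entry<String, Long> msg : waiting.entrySet()) {
            int needed = permitsNeeded(msg.getValue());
            if (needed == 0) {
                System.out.println(msg.getKey() + ": small message, delivered immediately");
            } else if (needed > MAX_PERMITS) {
                System.out.println(msg.getKey() + ": needs " + needed
                        + " permits, exceeds the maximum (NDLV if blacklisting is enabled)");
            } else if (needed <= freePermits) {
                freePermits -= needed;
                System.out.println(msg.getKey() + ": scheduled (DLNG), " + freePermits + " permits left");
            } else {
                System.out.println(msg.getKey() + ": waits until " + needed + " permits are free");
            }
        }
    }
}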

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. Per default this happens only after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have very complex extended receiver determinations or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON -> Monitoring -> Adapter Engine Status. The number of threads corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

From NetWeaver 7.3 on, the NWA also offers a view of the CPU activity via Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of a bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process for each available CPU, the CPU resources are sufficient. Once there is an average of around three processes for each available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View -> TOP CPU: Which process uses the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the checks above. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer in the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming processes and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do this by looking at the CPU usage per user: the BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapters to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack, since it directly influences the Java GC behavior. Therefore paging should be avoided in any case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen:

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times as described in chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6, respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions to do a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section; the official documentation can be found in the Online Help. Monitoring the DB via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionalities here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count as well as the total, average, and maximum processing time of the individual SQL statements, and can therefore identify the expensive statements on your system.

Per default, the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.


Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads. The lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on at least 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user calls to recursive calls: Good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines, because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: If this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: Values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: A ratio of 60:40 generally indicates a well-tuned system. Significantly higher values (for example 80:20) indicate room for improvement.

The DD-cache quality should be better than 80%.

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools -> Administration -> Monitor -> Performance -> Database -> Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: The default values displayed in the section Server Engine are relative values; to display the absolute values, press the button Absolute values. Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload); if it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu -> Performance -> Database; a snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and the catalog and package cache information, go to transaction ST04 and choose Performance -> Database, section Buffer Pool (or section Cache, respectively).


Buffer Pools Number: Number of buffer pools configured in this system.

Buffer Pools Total Size: The total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance -> Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: The ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: In addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100 (a short calculation sketch follows after this list).

Index hit ratio: In addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: The total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: The total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: Read or write requests performed by db2agents.
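Using the formulas above, the hit ratios can be computed as in the following sketch (plain Java; the read counters are made-up example values):

public class Db2HitRatioSketch {
    public static void main(String[] args) {
        long dataLogicalReads   = 5_000_000;   // example values, read from ST04 in practice
        long dataPhysicalReads  = 150_000;
        long indexLogicalReads  = 8_000_000;
        long indexPhysicalReads = 80_000;

        double dataHitRatio  = (dataLogicalReads - dataPhysicalReads)  * 100.0 / dataLogicalReads;
        double indexHitRatio = (indexLogicalReads - indexPhysicalReads) * 100.0 / indexLogicalReads;

        System.out.printf("Data hit ratio : %.2f%%%n", dataHitRatio);
        System.out.printf("Index hit ratio: %.2f%%%n", indexHitRatio);
    }
}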


Catalog cache size: Maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: Ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: Number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: Maximum size of the package cache that is used to maintain the most frequently accessed sections of the package cache.

Package cache quality: Ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: Number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 System allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying that suitable indexes are missing.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (releases lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries must be written from the LOG_IO_QUEUE to the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_IO_QUEUE overflows occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. If LOG_IO_QUEUE overflows occur (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with SAP Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses → Performance → Database Analyzer → Bottlenecks. You can use this function to activate the corresponding analysis tool, dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in transaction DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR and SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQL*Plus for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are:

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
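A minimal sketch of such a check with SQL*Plus, assuming an Oracle database and that the Java schema is called SAPSR3DB (this schema name is an assumption for illustration; use the schema name of your own installation):

    -- Count the entries of the central PI message table in the Java schema
    SELECT COUNT(*) FROM SAPSR3DB.BC_MSG;
    -- Repeat for the other tables listed above, for example:
    SELECT COUNT(*) FROM SAPSR3DB.BC_MSG_AUDIT;
    SELECT COUNT(*) FROM SAPSR3DB.BC_MSG_LOG_VERSION;

Run the counts at regular intervals and compare the results over time, just as with the ABAP tables in SE16.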


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration → Specific Configuration, and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default values, which are the values recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace Level → Set.

Start transaction SMGW, navigate to Goto → Parameters → Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace → Gateway → Reduce Level (or Increase Level, respectively).

Start transaction SM59, check the following RFC destinations for the flag "Trace" on the tab Special Options, and remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and similar; check those as well.


10.2 Business Process Engine

You only have to check the Event Trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) → Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting → Logviewer (direct link nwa/links). Open the view "Developer Trace" and check whether you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing it (for example, a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, and so on) is reported and available in the SAP Solution Manager Root Cause Analysis → Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

With SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, in order to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (based on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" to false in service XPI Service Messaging System. Details can be found in SAP Note 1314974 - PI 7.1 AF: Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid any performance problems from the additional persistence.

After implementing SAP Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto → Trace File → Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto → Trace → Gateway → Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval, and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval, and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC trace

In the PI work directory, check all files that begin with dev_rfc or dev_w for errors.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the LogViewer service of the NWA. This is available via Problem Management → Logs and Traces → Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace allows you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example, for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below, around 48 seconds. From top to bottom we can see the call stack of the thread. In general, we are interested in long-running threads at the bottom of the trace view. A long-running block at the bottom means that this is the lowest-level coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case, you have to review the coding of the mapping to see if the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup using JDBC to a remote database, or RFC to an ABAP system, takes a long time in the mapping or adapter module. In such a case, there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting, but the tool can also be used for troubleshooting of performance issues.

General information about the tool can be found in SAP Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement, it allows you to take multiple thread dumps at specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from SAP Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling by either doing JVM Performance tracing or JVM Memory Allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output, the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE Engine and are therefore not recommended for use in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

www.sap.com


With 7.1 the so-called Advanced Adapter Engine (AAE) was introduced. The Adapter Engine was enhanced to provide additional routing and mapping call functionality so that it is possible to process messages locally. This means that a message that is handled by a sender and receiver adapter based on J2EE does not need to be forwarded to the ABAP Integration Engine. This saves a lot of internal communication, reduces the response time, and significantly increases the overall throughput. The deployment options and the message flow for 7.1-based systems and higher are shown below. Currently not all the functionality available on the PI ABAP stack is available on the AAE, but each new PI release closes the gap further.

In SAP PI 7.3 and higher the Advanced Adapter Engine Extended (AEX) was introduced. In addition to the AAE functionality, the AEX also allows local configuration of the PI objects. The AEX can therefore be seen as a complete PI installation running on Java only. From the runtime perspective, no major differences can be seen compared to the AAE, and therefore no differentiation is made in this guide.


Looking at the different components involved, we get a first impression of where a performance problem might be located: in the Integration Engine itself, in the Business Process Engine, or in one of the Advanced Adapter Engines (central, non-central, plain PCK). Of course, a performance problem could also occur anywhere in between, for example in the network or around a firewall. Note that there is a separation between a performance issue "in PI" and a performance issue in any attached systems. If viewed "through the eyes" of a message (that is, from the point of view of the message flow), the PI system technically starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the message enters the pipeline of the Integration Engine. Any delay prior to this must be handled by the sending system. The PI system technically ends as soon as the message reaches the target system, that is, as soon as a given receiver adapter (or, in the case of HTTP communication, the pipeline of the Integration Server) has received the success message of the receiver. Any delay after this point in time must be analyzed in the receiving system.

There are generally two types of performance problems. The first is a more general statement that the PI system is slow, has a low performance level, or does not reach the expected throughput. The second is typically connected to a specific interface failing to meet the business expectation with regard to the processing time. The layout of this check is based on the latter. First you should try to determine the component that is responsible for the long processing time, or the component that needs the highest absolute time for processing. Once this has been clarified, there is a set of transactions that will help you to analyze the origin of the performance problem.

If the recommendations given in this guide are not sufficient, SAP can help you to optimize your system via SAP Consulting or SAP MaxAttention services. SAP might also handle smaller problems restricted to a specific interface if you describe your problem in an SAP customer incident. Support offerings like "SAP Going Live Analysis (GA)" and "SAP Going Live Verification (GV)" can be used to ensure that the main system parameters are configured according to SAP best practice. You can order any type of service using the SAP Service Marketplace or via your local SAP contacts.

If you have already worked with the PI Admin Check (SAP Note 884865), you will recognize some of the transactions. This is because SAP considers the regular checking of the performance to be an important administrative task. However, this check tries to show the reader a methodology for approaching performance problems. It also covers the most common reasons for performance problems and links to possible follow-up actions, but does not refer to any regular administrative transactions.


Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, the messaging system, or the module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information, see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3 Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine, and Adapter Engine or connected proxy system. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter. This is not always easy to do, because a long processing time is not always an indication of a bottleneck. A long processing time can be expected if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason, you need to compare the value that you retrieve from Chapter 3 with values that you have received previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6, and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework, and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This will lead you to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases, based on where the problem occurs:

A) with regard to the interface itself, that is, it occurs even if a single message of this interface is processed

B) or with regard to the volume of this interface, that is, many messages of this interface are processed at the same time

C) or with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical amount of messages of this interface, not simply one message as before. If the values of measurements A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements - A) processing time of a single message, B) processing time of a typical amount of messages of a single interface, and C) processing time of a typical amount of messages of all interfaces - should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.


Chapters 8, 9, 10, and 11 deal with more general reasons for bad performance, such as excessive tracing/logging, error situations, and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), and the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP, or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness, we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine, as shown below (details about its activation can be found in SAP Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure), and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now, you are not interested in the single steps that are listed on the right side, since you are still trying to find out the part of the PI system where the most time is lost. Chapter 4.4 will work with the individual steps.


If you do not yet know which interface is affected, you first have to get an overview. Instead of navigating to Detailed Data Aggregated in the Runtime Workbench, choose Overview Aggregated. Use the button Display Options and check the options for sender component, receiver component, sender interface, and receiver interface. Check the processing times as described above.

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Message Monitoring, choose Adapter Engine <host> and from Database from the drop-down lists, and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamp indicated on the screen.

Do the above calculation for the outbound (AFW → IS) as well as the inbound (IS → AFW) messages.

The audit log of successful messages is no longer persisted by default in SAP NetWeaver PI 7.1, in order to minimize the load on the database. The audit log has a high impact on the database load, since it usually consists of many rows that are inserted individually into the database. In newer releases, therefore, a cache was implemented that keeps the audit log information only for a period of time. Thus, no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted on the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter "messaging.auditLog.memoryCache" from true to false for the service "XPI Service Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps, like queuing, module processing, and average response time, in historical dashboards (aggregated data per interval for all server nodes of a system, or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping, and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java Profiler, as outlined in the appendix section A.2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher

Starting from PI 7.31 SP4, a new performance monitor is available for the Adapter Engine. More information on the activation of the performance monitor can be found in SAP Note 1636215 - Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home", go to "Adapter Engine", and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum, and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like time spent in the Messaging System queues or in adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 7.31 SP10 (7.40 SP05), this data can be exported to Excel in two ways, Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection so that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound, if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the entrance/selection screen of SXMB_MONI.

To judge the performance of an integration process, it is essential to understand the steps that are executed. A long-running integration process is not critical in itself, because it can wait for a trigger event or can include a synchronous call to a backend. Therefore it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore it is essential to monitor a queue backlog in the ccBPM queues, as discussed in the section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. On the initial screen, you have to choose "(Sub-) Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. They show you the processing time based on barriers: the 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time to, for example, the day before. By adjusting the selection criteria of the transaction, you can therefore get a good overview of the normal processing time of the integration process and can judge whether there is a general performance problem or just a temporary one.

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge whether your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3, a new monitor for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE → Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view of the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below, you can see that most of the time is spent in the Wait step.


4 ANALYZING THE INTEGRATION ENGINE

If the chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Step             Description
PLSRV_XML_VALIDATION_RQ_INB       XML Validation Inbound Channel Request
PLSRV_RECEIVER_DETERMINATION      Receiver Determination
PLSRV_INTERFACE_DETERMINATION     Interface Determination
PLSRV_RECEIVER_MESSAGE_SPLIT      Branching of Messages
PLSRV_MAPPING_REQUEST             Mapping
PLSRV_OUTBOUND_BINDING            Technical Routing
PLSRV_XML_VALIDATION_RQ_OUT       XML Validation Outbound Channel Request
PLSRV_CALL_ADAPTER                Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)
PLSRV_XML_VALIDATION_RS_INB       XML Validation Inbound Channel Response
PLSRV_MAPPING_RESPONSE            Response Message Mapping
PLSRV_XML_VALIDATION_RS_OUT       XML Validation Outbound Channel Response

The last three steps highlighted in bold are available for synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31, an additional option of using an external virus scan during PI message processing can be activated, as described in the Online Help. The virus scan can be configured at multiple steps in the pipeline - for example after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed based on qRFC Logical Units of Work (LUWs) using dialog work processes.


With 7.11 and higher versions, you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in the chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50 / SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WPs). Therefore it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs have a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? This depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which starts the message processing in DIA WPs. The user that is shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user of a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (by default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times CPUs/cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters); a short worked example follows below.
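For illustration (the core count is assumed for this example only): on an application server with 8 CPU cores, this rule of thumb gives rdisp/wp_no_dia = 8 x 6 = 48 up to 8 x 8 = 64 dialog work processes, provided the CPU and memory sizing of the host can actually support this number of work processes.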

If you would like to get an overview over an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes, and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds, and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (prerequisite: the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past, which allows analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

If you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max. Resources" close to your overall number of dialog work processes? Max. Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows:

Max. no. of logons = 90%

Max. disp. of own logons = 90%

Max. no. of WPs used = 90%

Max. wait time = 5

Min. no. of free WPs = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available on the system and the above requirements.


Note: You have to set the parameters using the SAP instance profile. Otherwise, the changes are lost after the server is restarted.
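As an illustration only, the corresponding instance profile entries could look roughly like the sketch below. The mapping of the SARFC fields to profile parameter names is an assumption here and should be verified against SAP Note 1375656 for your release before you change anything:

    # Max. no. of logons (%)
    rdisp/rfc_max_login = 90
    # Max. disp. of own logons (%)
    rdisp/rfc_max_own_login = 90
    # Max. no. of WPs used (%)
    rdisp/rfc_max_own_used_wp = 90
    # Max. wait time
    rdisp/rfc_max_wait_time = 5
    # Min. no. of free WPs (choose 3-10 depending on the size of the application server)
    rdisp/rfc_min_wait_dia_wp = 5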

The field "Resources" shows the number of DIA WPs currently available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages, because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage, however, is already very high, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune the PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning. Therefore it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing - PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (in red: XBTO0___0004) specifies the receiver business system; this way, PI uses dedicated outbound queues for each receiver system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization or for the separation of large messages. To get an overview of the available queues, use SXMB_ADM → Manage Queues.

PI inbound and outbound queues execute different pipeline steps.

PI Inbound Queues:

PLSRV_XML_VALIDATION_RQ_INB       XML Validation Inbound Channel Request
PLSRV_RECEIVER_DETERMINATION      Receiver Determination
PLSRV_INTERFACE_DETERMINATION     Interface Determination
PLSRV_RECEIVER_MESSAGE_SPLIT      Branching of Messages

PI Outbound Queues:

PLSRV_MAPPING_REQUEST             Mapping
PLSRV_OUTBOUND_BINDING            Technical Routing
PLSRV_XML_VALIDATION_RQ_OUT       XML Validation Outbound Channel Request
PLSRV_CALL_ADAPTER                Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait times in the queue are recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see the long processing times there, as discussed in the chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues will have a direct impact on the connected PI components and also on backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in the case of a proxy call. Thus, when tuning the PI queues you must always keep the implications for the connected systems in mind.

The tuning of the PI ABAP queue parameters is done in transaction SXMB_ADM → Integration Engine Configuration by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise only inbound queues will be used, which are shared across all interfaces; hence a problem with one single backend system will affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory, this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO, as described in the section "Message Prioritization on the ABAP Stack".

To tune the parallelism of the inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2 showing the PI inbound queues and outbound queues. ccBPM queues (XBQO$PE) are also displayed; these will be discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2, and execute. If you are running ABAP proxies on a separate client of the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system; see transaction ST06).

In case many of the queues are EOIO queues (for example, because the serialization is done on the material number), try to reduce the number of queues by following the chapter EOIO tuning.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check whether the numbers increase, decrease, or remain the same. An increase of entries for all queues or a specific queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple; possible reasons have been found to include:

1) A slow step in a specific interface

A bad processing time of a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step, as shown in "Analyzing the runtime of pipeline steps".

2) Backlog in Queues

Check whether inbound or outbound queues face a backlog. A backlog in a queue is generally nothing critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface has a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a subparameter specifying your interface to increase the parallelism for this interface; a hypothetical example is shown below.
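Purely as an illustration of what such an entry in SXMB_ADM → Integration Engine Configuration (category TUNING) might look like - the subparameter and value below are invented for this sketch and must be replaced by the identifier of your own interface/receiver and a parallelism that your hardware can support:

    Category:      TUNING
    Parameter:     EO_OUTBOUND_PARALLEL
    Subparameter:  RECEIVER_BS_ERP    (hypothetical subparameter identifying the affected interface)
    Value:         10                 (parallel queues for this entry instead of the global default)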

In general, however, a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components, like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if content-based routing or an extended receiver determination is used. To understand which step is taking long, follow once more the chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing, while the number of inbound queues does not increase at the same time. Wily also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here, it is also possible to distinguish between a blocking situation and a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3 as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. Per default a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting parameter CACHE_ENQUEUE to 0 as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are getting processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound Queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur due to one individual step taking long (as discussed in step 1) or due to a problem in the infrastructure (e.g. memory problems on Java or an HTTP 200 lost in the network). After 30 minutes the QIN scheduler removes this queue from the scheduling and therefore it remains in READY. To solve such cases the root cause for the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured on this queue. Note 1500048 - queues stay in ready (schedule_monitor) and Note 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor in exceptional cases only (e.g. if the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) pushbutton once to see only queues with error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY due to problems during the processing of an individual message. This can be prevented by following the description in Chapter Prevent blocking of EO queues.


4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB and is described below. Advanced users may use the Performance Header of the SOAP message via transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec)(microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time, therefore a conversion to system time must be done when analyzing them.

In case PI message packaging is configured, the performance header always reflects the processing time per package. Hence a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to Detailed Data Aggregated and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, the DB_ENTRY_QUEUEING step starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements, as outlined in Chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you to decide whether the problem is a general design problem (a single message has a long processing step) or whether it is related to the message volume (only for a high number of messages does this process step show large values).

Each step has different follow-up actions; these are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination. In these steps the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated during runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields (see the illustrative routing condition after this list). The processing time of such a request directly correlates to the number and complexity of the conditions defined.

No tuning options exist for the system with regard to CBR. The performance of this step can only be changed by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR using many rules.

o Mapping to determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.
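As a purely illustrative example (namespace, element names, values, and receiver are invented), a receiver rule combining two XPath conditions could look like this:

   (/p1:SalesOrder/Header/SalesOrg = "1000") AND (/p1:SalesOrder/Header/OrderValue > "50000")  =>  Receiver BS_ERP_EU

Each condition of this kind is evaluated against the message payload at runtime, so a rule set with dozens of such expressions applied to a large document multiplies the parsing effort accordingly.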

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping, you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings executed sequentially. In such a case the analysis is more difficult because it is not clear which mapping in the sequence is taking a long time.

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing sender/receiver party/service/interface/namespace and the source message payload, it is possible to check the target message (after mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings it is also possible to test, trace, and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with values reported several days earlier to get a better understanding.


If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via the RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check if only one interface is affected or if you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps have required around 500 seconds for processing. Comparing the data during the incident with the data from the day before will allow you to judge whether this might be a problem of the underlying J2EE Engine, as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times in a different time period and verify whether it is only a "temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem, but rather a problem in the implementation of the mapping of that specific interface.


Check the message size of the mapping in the runtime header using SXMB_MONI. Verify whether the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application team you then have to check whether the connection to the backend is working properly. Such a mapping with lookups can be analyzed using Wily Transaction Trace, as explained in the appendix section Wily Transaction Trace.
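If the mapping is a self-developed Java mapping, a simple way to narrow down whether a lookup is the expensive part is to write the elapsed time of the lookup into the mapping trace. The following minimal sketch assumes the standard PI Java mapping API (AbstractTransformation); callBackendLookup is a hypothetical placeholder for whatever RFC or JDBC lookup the mapping performs:

import com.sap.aii.mapping.api.AbstractTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
import com.sap.aii.mapping.api.TransformationInput;
import com.sap.aii.mapping.api.TransformationOutput;

public class OrderMappingWithTiming extends AbstractTransformation {

    public void transform(TransformationInput in, TransformationOutput out)
            throws StreamTransformationException {

        long start = System.currentTimeMillis();
        String lookupResult = callBackendLookup();           // placeholder for the real RFC/JDBC lookup
        long lookupMillis = System.currentTimeMillis() - start;

        // the duration becomes visible in the trace of the mapping step
        getTrace().addInfo("Backend lookup took " + lookupMillis + " ms, result=" + lookupResult);

        // ... the actual payload transformation from in.getInputPayload().getInputStream()
        //     to out.getOutputPayload().getOutputStream() goes here ...
    }

    private String callBackendLookup() {
        // hypothetical helper - replace with the real lookup implementation
        return "";
    }
}

If the trace shows that the lookup dominates the mapping time, the analysis has to continue in the connected backend (or network) rather than in the PI mapping runtime itself.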

If not one but several interfaces are affected, a potential system bottleneck exists; this is described in the following.

o Not enough resources (registered server programs) available. That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check whether there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting queues with the names XBTO* and XBQO*/XB2O*, XBTA* and XBQA*/XB2A* (high priority), XBTZ* and XBQZ*/XB2Z* (low priority), and XBTM* (large messages). In addition to this, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued, but the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW -> Goto -> Logged On Clients, filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general, we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of queues that were concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).

To check whether the J2EE Engine is not available or the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW -> Goto -> Trace -> Gateway -> Display File) and the RFC developer traces (the dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

The increase of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests; each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

Another option to solve the bottleneck is to reduce the number of outbound queues that are concurrently active. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this specific interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

Call adapter is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so network time can have an influence here.

Looking at the processing time, we have to distinguish asynchronous (EO and EOIO) and synchronous (BE) interfaces.

In asynchronous interfaces, the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency. For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). Network latency has to be checked in case of a long call adapter step. In case HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side. Enough resources must be available at the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages therefore includes the time for the transfer of the request message, the calculation of the corresponding response message at the receiver side, and the transfer back to PI. Consequently, for synchronous messages the processing time of a request at the receiving target system must always be analyzed to find the most costly processing steps.


4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before processing started. In case of errors, the time also includes the wait time for the restart of the LUW in the queue. The inbound queues (XBTI*, XBT1*, XBT9*, XBTL* for EO messages and XBQI*/XB2I*, XBQ1*/XB21*, XBQ9*/XB29* for EOIO messages) process the pipeline steps for the Receiver Determination, the Interface Determination, and the Message Split (and optionally XML inbound validation). Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available

o A backlog in the queues caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel). The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL (see the configuration sketch after this list). By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that this separation of inbound queues only works on Business System level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended Receiver Determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)
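A hypothetical sketch of such a separation (the sender service name BS_HR_MASTERDATA and the values are invented; the exact subparameter format should be taken from the parameter documentation of your release):

   Category: TUNING   Parameter: EO_INBOUND_PARALLEL                                      Value: 20   (all other senders)
   Category: TUNING   Parameter: EO_INBOUND_PARALLEL   Subparameter: BS_HR_MASTERDATA     Value: 5    (non-critical master data sender)

The non-critical sender is then limited to its own, smaller set of inbound queues and can no longer occupy all dialog work processes during a mass load.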

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM -> Integration Engine Configuration -> Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors, the time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO*, XBTA*, XBTZ*, XBTM* for EO messages and XBQO*/XB2O*, XBQA*/XB2A*, XBQZ*/XB2Z* for EOIO messages) process the pipeline steps for the mapping, the outbound binding, and the call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2) or described in the section above for the PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available

o A backlog in the queues caused by one of the following reasons:

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in the case of an ABAP proxy system a high value for DB_SPLITTER_QUEUING does not necessarily indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this should be ignored at the moment), the third takes 3 seconds, and so on. The 100th message has to wait 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUING is as high as 100. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUING value over time. If you experience this situation, proceed with Chapter 4.4.2.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUW by the QIN scheduler, as described in qRFC Queues (SMQ2)

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a specific number of messages per time unit.

In section Adapter Parallelism, default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). For such adapters it does not make sense to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS - the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems, the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional header for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis:

When using a higher trace level, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES: This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES: This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept at a minimum, and very deep and complex XPath expressions should be avoided.
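For illustration only (element names are invented): an extraction such as /Order/Header/OrderNumber addresses a single node close to the document root and is cheap to evaluate, whereas an expression like //Item[Condition/Type='ZPRO'][last()]/Price has to be evaluated against every repeating Item node of the payload and becomes expensive for large documents.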

If you want to minimize the impact of LMS on the PI message processing time, you can define the extraction method to use an external job. In that case the messages are indexed only after processing and LMS therefore has no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of the job SXMS_EXTRACT_MESSAGES), so newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for the person responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. Per default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus scan

In case one of these steps is taking long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for the Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated per default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping or routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to the database is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime for individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very high runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used solely in the PI system: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package is disassembled and all messages are processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this reduces the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the number of HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: the maximum number of messages in a package (default 100)

2) Maximum package size: the sum of all message sizes in kilobytes (default 1 MB)

3) Delay time: the time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could, for example, define a specific packaging configuration for interfaces with very small messages to allow up to 1,000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages.
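A made-up example of such an interface-specific setting (all values are illustrative only): keep the default of 100 messages / 1 MB / 0 s delay for general traffic, while a small-message, non-time-critical interface is configured with 1,000 messages / 1 MB / 60 s delay. With the larger package, 1,000 messages are processed in 1 LUW instead of 10, at the price of up to 60 seconds of additional latency for the first message of a package.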

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and the configuration of message packaging. More information is also available at http://help.sap.com -> SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using the Quality of Service Exactly Once are independent of each other. In case of an error in one message, there is no business reason to stop the processing of other messages. But exactly this happens in case an EO queue goes into error due to an error in the processing of a single message. The queue will then be retried automatically in configurable intervals. This retry will cause a delay for all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems, and the behavior is activated per default on PI 7.3 systems. By specifying a receiver ID as a sub-parameter, this behavior can be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend), the queue then goes into SYSFAIL so that not too many messages end up in error. This also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.
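A hypothetical combination of these settings (values are examples only): EO_RETRY_AUTOMATIC = 0 so that failed EO messages no longer block the queue but are picked up by the restart job RSXMB_RESTART_MESSAGES (scheduled, for example, every 20 minutes), together with EO_RETRY_AUT_COUNT = 50 so that a queue is still stopped once 50 messages have failed, which protects the error monitoring against flooding during a permanent receiver outage.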

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages in queues. Per default, in all PI versions the messages are assigned randomly to the different queues. In case of different runtimes of LUWs, caused e.g. by different message sizes or different runtimes in mappings, this can lead to an uneven distribution of messages within the different queues. This can increase the latency of the messages waiting at the end of such a queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue. Hence it tries to achieve an equal balancing during inbound processing: a queue which has a higher backlog will get fewer new messages assigned, while queues with fewer entries will get more messages assigned. It is therefore different from the old BALANCING parameter of category TUNING, which was used to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), then the messages are distributed randomly across the queues that are available. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every nth message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the maximum filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database access and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be chosen based on the message throughput and the specific requirements for an even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. In case you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution will be checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level for each queue will then be written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is.
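The selection logic described above can be illustrated with a small sketch (a simplified model for illustration, not the actual system code): queues whose fill level relative to the fullest queue is below EO_QUEUE_BALANCING_SELECT are preferred for new messages.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class QueueBalancingSketch {

    /** Returns the queues eligible for new messages according to the rule described above. */
    static List<String> eligibleQueues(Map<String, Integer> fillLevels, int selectPercent) {
        int max = fillLevels.values().stream().max(Integer::compare).orElse(0);
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : fillLevels.entrySet()) {
            // relative fill level in percent, compared to the fullest queue
            int relative = (max == 0) ? 0 : (entry.getValue() * 100) / max;
            if (relative < selectPercent) {
                candidates.add(entry.getKey());
            }
        }
        // if no queue is below the threshold, all queues are considered
        return candidates.isEmpty() ? new ArrayList<>(fillLevels.keySet()) : candidates;
    }

    public static void main(String[] args) {
        // the example from the text: XBTO__A = 500, XBTO__B = 150, XBTO__C = 50, threshold 20 percent
        Map<String, Integer> fill = Map.of("XBTO__A", 500, "XBTO__B", 150, "XBTO__C", 50);
        System.out.println(eligibleQueues(fill, 20));   // prints [XBTO__C]
    }
}

In the example only XBTO__C stays below the 20% threshold, so new messages are assigned to it until the fill levels are read again after the next EO_QUEUE_BALANCING_READ messages.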


4.8 Reduce the number of parallel EOIO queues

As discussed in Chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have a lot of queues containing only one message, since this causes a high overhead in the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and will cause significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM -> Configure Filter for Queue Prioritization -> Number of EOIO queues.

During runtime a new set of queues with the name prefix XB2 will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages will be using the same EOIO queue, so PI message packaging will work better and the reloading of the queues by the QIN scheduler will also show much better performance.

In case of errors, the affected messages are removed from the XB2 queues and moved to the standard XBQ queues. All other messages for the same serialization context are moved to the XBQ queue directly, to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues will not be blocked and messages for other serialization contexts will not be delayed.


4.9 Tuning the ABAP IDoc Adapter

The IDoc adapter very often deals with very high message volume, so tuning it is essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historical backlogs on the tRFC layer.

In order to control the resources used when sending the IDocs from the sender system to PI, or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS on PI, as known from standard ALE tuning. By changing the Max Conn or Max Runtime values in SMQS you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This will mitigate the risk of system overload on the sender and receiver side.
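As an illustration (the destination name and values are invented), the registration in transaction SMQS on the PI system could look like this:

   Destination:  ERPCLNT100_IDOC_RECEIVER     Type: R (registered)
   Max Conn:     10     (raise to e.g. 20 to allow more parallel tRFC sends to this receiver)
   Max Runtime:  60

Lowering Max Conn throttles the IDoc transfer towards a receiver that cannot cope with the load; raising it increases parallelism at the cost of more dialog work processes on PI.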


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already had the option of packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system. Thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring Message Packaging, since this helps transferring data for the IDoc adapter as well as for the ABAP proxy.

For the sender IDoc adapter there was previously only the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in report RSEOUT00, multiple IDocs are bundled into one tRFC call. On the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. This therefore mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the following SDN blog: IDoc Package Processing Using Sender IDoc Adapter in PI EhP1. A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter: long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this, a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously via a background job running report RBDAPP01.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process is rolled out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, which leads to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, such as high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible based on the available resources on the receiver system, without the requirement to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queues by default, which means that critical and less critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver -> SAP NetWeaver PI/Mobile/IdM 7.1 -> SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 -> SAP NetWeaver Process Integration Library -> Function-Oriented View -> Process Integration -> Integration Engine -> Prioritized Message Processing.

To balance the available resources further between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if Chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration processes. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than previously expected, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running.

o Look for the user "WF-BATCH" and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other. Rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply for larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate, for example, to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and might be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well; check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes; use chapter 5.4 to check this possibility.

o If it is a specific process step that sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu, or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds, the WF-BATCH user used up 36 seconds.

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparison with the other users (as well as the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see Chapter 5.1).
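To make this concrete with invented figures: on a host with 4 CPUs, the analysis hour corresponds to 4 x 3600 = 14400 CPU seconds, so the 36 CPU seconds consumed by WF-BATCH amount to roughly 0.25% of the available capacity - negligible in this case, whereas a value of several thousand CPU seconds would indicate that ccBPM dominates the server.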

Experienced ABAP administrators should also analyze the database time and the wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find high numbers of entries.


55 Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)

Until recently ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for

each workflow) If there was a high message throughput for this workflow then a high backlog occurred for

these queues A couple of enhancements have therefore been implemented to improve scalability if the

ccBPM runtime

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configure transactional behavior of ccBPM process steps to reduce number of persistency steps

Details about the configuration and tuning of the ccBPM runtime are described at

httpswwwsdnsapcomirjsdnhowtoguides SAP Netweaver 70 End-to-End Process Integration

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the largest share of processing time in many scenarios within BPE. Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can, however, also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests that have been performed have shown a high potential for throughput improvements, up to a factor of 4.7 in tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see Chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message, it is placed between the sender adapter (for example, a file adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example, a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example, JMS_http://sap.com/xi/XI/SystemReceive for the asynchronous receive queue of the JMS adapter).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example, File) and leaves the PI system using a J2EE Engine adapter (for example, JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem/bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not always solve a performance problem/bottleneck. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel

2) Add a second server node that will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the Adapter Framework Scheduler assigns only one server node to a polling communication channel

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve better separation of interfaces


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side, these adapters use the Adapter Framework Scheduler, which assigns a server node that does the polling at the specified interval. Thus, only one server node in the J2EE cluster does the polling and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Parallel polling would not help anyway: since the channels would be executing the same SELECT statement on the database or picking up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data that is to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side, the adapters work sequentially on each server node by default. For example, only one UPDATE statement can be executed by the JDBC adapter for each Communication Channel (independent of the number of consumer threads configured in the Messaging System), and all the other messages for the same Communication Channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the Communication Channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter by default uses a push mechanism on the PI sender side. This means the data is pushed by the sending MQ provider. By default, every Communication Channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so that the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI Communication Channel and PI will be the initiator of the communication. Here too the Adapter Framework Scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore, scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts during message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore, no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitations in the number of requests it can execute in parallel; the limiting factor here is the FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the messaging system and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated from the application thread pool directly and are therefore not available for any other tasks. Therefore, the number of initial threads should be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections

This too should be done carefully, since these threads will be taken from the J2EE application thread pool. A very high value there can therefore cause a bottleneck on the J2EE Engine and major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In Manual Mode the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound Resource Adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence, this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, then it is the call queue; if the message is asynchronous, then it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work. This is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log will be the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning the SOAP Sender Adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the available FCA threads for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they are all using the same URL http://<host>:<port>/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point using its own set of FCA threads. In case of high load for this interface, the other SOAP sender interfaces would not be affected by a shortage of FCA threads.


To use this new feature, you can specify the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around when compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time either the receive queue for asynchronous messages or the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message will then be forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE Adapter Tuning

A comprehensive and up-to-date description of the tuning of the IDoc adapter is summarized in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Similar to all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations of the next chapter Messaging System Bottleneck are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP will be processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter, the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that will influence the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments for each IDoc would consume roughly 500 MB during processing. In such cases it is important to, for example, lower the package size and/or use large message queues, as outlined in chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 Protocol) and Java IDoc Adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important

In the ABAP stack, packaging is active by default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the DIA work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO, it is important to evaluate packaging to avoid any negative impact on the receiving ECC due to missing packaging.

The packaging on Java generally works differently than on ABAP. While in ABAP the aggregation is done on the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore, packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this, the message is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done for each server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message will stay in status "Delivering" throughout all the steps described above. In the audit log you can see the time it spends in packaging. The audit log shown below shows, for example, that the message was waiting almost one minute before the package was built and that 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in the Messaging System service, as described in Note 1913972:

o messaging.system.msg.collector.enabled: Enable or disable packaging globally

o messaging.system.msg.collector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msg.collector.bulkTimeout: The wait time per bulk - default 60 seconds

o messaging.system.msg.collector.maxMemTotal: Maximum memory that can be used by the message collector

o messaging.system.msg.collector.poolSize: Number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, or the configuration of the IDoc posting on the ERP receiver side (described in chapter IDoc posting on receiver side). Therefore, tuning of these threads might be very important. Unfortunately, these threads cannot yet be monitored with any PI tool or Wily Introscope.


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.
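
As an illustration, a global packaging configuration could then look as follows; the values shown are purely illustrative and have to be chosen based on your message volume and the recommendations in SAP Note 1913972:

messaging.system.msg.collector.enabled = true
messaging.system.msg.collector.bulkTimeout = 60     (wait time per bulk in seconds)
messaging.system.msg.collector.poolSize = 5         (number of parallel BULK_EXECUTOR threads)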

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While in the SOAP adapter with XI 3.0 protocol you can define three different packaging KPIs, this is currently not possible for the IDoc adapter. There you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems having many Java server nodes and many polling communication channels. Based on this and the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler, you have to set the parameter scheduler.relocMode of the Adapter Framework


service property to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow for better balancing of the incoming load across the available server nodes if the files are coming in at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.
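
For illustration only, using the example value from this section, the Adapter Framework service property would then be set as follows:

scheduler.relocMode = -15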

In case many files are put into the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence, a proper load balancing cannot be achieved by the AFW scheduler. In such a case, the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31, you can monitor the Adapter Framework Scheduler in pimon → Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node the Communication Channel is polling (Status = "Active"). You can also see the last time the channel polled and when it will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node, by tuning the parameter relocMode as outlined above.

Prior to PI 7.31, you have to use the following URL to monitor the Adapter Framework Scheduler: http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs only on one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in the brackets [60000] determines the polling interval in ms - in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": Not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine the time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System can usually be closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the messaging system and remain there for a long time, since the adapter is not ready to process the next one yet. It looks like the messaging system is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the messaging system by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This will open the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture, you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows easy graphical analysis of the queues and thread usage of the messaging system across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, showing all the backlogs in the queues. In the screenshot below, a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it forwards messages to the adapter queue as soon as any of the adapter-specific consumer threads are free. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size, you can directly jump to a more detailed view, where you can see that the file adapter was causing the backlog.


To see the consumer thread usage, you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were in use during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the messaging system does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the messaging system have to be configured in the NWA using the service "XPI Service: AF Core" and the property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details, see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the parameter above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and the property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by a high degree of logging, as described in chapter Logging / Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads; that is, monitor the CPU usage and application thread availability, as well as the memory of the J2EE Engine, after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the messaging system is highly critical for synchronous scenarios due to timeouts that might occur. Therefore, the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System in between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck of consumer threads on the sender queues in the Messaging System.
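
As a simple example (the timestamps are illustrative only): if the audit log shows "Message successfully put into the queue" at 10:00:00 and "The message was successfully retrieved from the send queue" only at 10:00:45, the message waited 45 seconds for a free consumer thread, which points to a backlog on the send queue of this adapter type.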

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the messaging system. Multiple threads are used for synchronous messages: the adapter thread puts the message into the messaging system queue and waits until the messaging system delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System in Between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time that the message waited in the messaging system for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System now makes it possible for you to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the messaging system (which is the first entry point for all messages) will forward the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5.
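
As a simplified, illustrative example of this weighting: if messages of all three priorities are waiting in the dispatcher queue and 100 messages can be forwarded to the adapter-specific queues, roughly 75 of them would be high, 20 medium, and 5 low priority messages (assuming enough messages of each priority are waiting).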

Based on this approach, you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon → Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help: navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one Communication Channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or a remote system with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Because of this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in the NWA in service XPI Service: Messaging System. With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads for each server node) and increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it will be possible for four interfaces to get resources in parallel before all threads are blocked. For more information, see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.

In the screenshot below you see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though there are more free SOAP threads available, they are not consumed. Hence, the free SOAP threads could be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue where message backlogs occur. As mentioned earlier, a backlog should usually occur on the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers, you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog, there will always be free consumer threads, and therefore the dispatcher queue can dispatch the message to the adapter-specific queue. Hence, the backlog will appear in the adapter-specific queue, so that the message prioritization would no longer work properly.


By default, the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and therefore additional restrictions in the number of available threads can be very critical. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. In case you have many high-volume synchronous scenarios with different priorities that run in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv, ICoAll as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the specification of the maximum parallelization on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value will be considered. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead Based on the Interface Pattern Being Used

The interface pattern configured can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and therefore the overhead is avoided. Therefore, for interfaces with good data quality and high message throughput, this interface pattern should be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in module processing, then you must analyze the runtime of the modules used in the Communication Channel. Adapter modules can be combined so that one Communication Channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes, for example to transform an EDI message before sending it to the partner, or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is also no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that inserts the data into the messaging system queues.

In the audit log you get a first impression of the duration of the module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is very easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case, the module prints out additional information in the audit log so that such steps can be detected. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.
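
To make the time spent in a specific step visible directly in the audit log, a custom module can write its own audit log entries with timestamps around the expensive call. The following is a minimal sketch only, assuming the public adapter module and audit log APIs available as of PI 7.1; the class TimingLookupModule, its package, and the method doRemoteLookup are hypothetical examples (not taken from this guide), and the EJB deployment descriptors required for a real module are omitted:

package com.example.pi.modules; // hypothetical package name

import com.sap.aii.af.lib.mp.module.Module;
import com.sap.aii.af.lib.mp.module.ModuleContext;
import com.sap.aii.af.lib.mp.module.ModuleData;
import com.sap.aii.af.lib.mp.module.ModuleException;
import com.sap.engine.interfaces.messaging.api.Message;
import com.sap.engine.interfaces.messaging.api.MessageKey;
import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

public class TimingLookupModule implements Module {

    // Called by the Module Processor for every message passing through the channel
    public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData)
            throws ModuleException {
        try {
            Message msg = (Message) inputModuleData.getPrincipalData();
            MessageKey key = msg.getMessageKey();
            AuditAccess audit = PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();

            long start = System.currentTimeMillis();
            audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                    "TimingLookupModule: starting remote lookup");

            doRemoteLookup(msg); // hypothetical expensive step, e.g. a JCo or JDBC lookup

            long duration = System.currentTimeMillis() - start;
            audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                    "TimingLookupModule: remote lookup took " + duration + " ms");

            inputModuleData.setPrincipalData(msg);
            return inputModuleData;
        } catch (Exception e) {
            throw new ModuleException("TimingLookupModule failed", e);
        }
    }

    private void doRemoteLookup(Message msg) {
        // placeholder for the actual lookup logic
    }
}

With entries like these, the audit log of a message directly shows how long the expensive step took, without having to resort to the Wily transaction trace.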

6.4 Java Only Scenarios: Integrated Configuration Objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps executed so far in the ABAP pipeline (Receiver Determination, Interface Determination, and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General Performance Gain When Using Java Only Scenarios

The major advantage of AAE processing is the reduced overhead due to the elimination of context switches between ABAP and Java. Thus, the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. Therefore, the best tuning option is to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario

2) This is valid for all adapter types; huge mapping runtimes, slow networks, and slow (receiver) applications reduce the throughput and, rated over the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time

3) The greatest benefit is seen for small payloads (comparison 10k, 50k, 500k) and asynchronous messages

6.4.2 Message Flow of Java Only Scenarios

All the Java-based tuning options mentioned in chapters Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the JMS send queue of the messaging system

4) Message is taken by JMS send consumer thread

a. No message split used

In this case the JMS consumer thread will do all the necessary steps previously done in the ABAP pipeline (like receiver determination, interface determination, and mapping) and will then also transfer the message to the receiving backend system (the mail server in our example). Thus, all the steps are executed by one thread only.

b. Message split used (1:n message relation)

In this case there is a context switch. The thread taking the message out of the queue will process the message up to the message split step. It will create the new messages and put them into the Send queue again, from where they will be taken by different threads (which will then map the child messages and finally send them to the receiving system).

As we can see in this example, for an Integrated Configuration only one thread performs all the different steps of a message. The consumer thread will not be available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configurations than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, then you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario. In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid Blocking of Java Only Scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since the Java-only interfaces only use send queues, the restriction of consumer threads on the receiver queue, as described in chapter Avoid Blocking Caused by Single Slow/Hanging Receiver Interface, is no solution here.

Because of this, an additional property messaging.system.queueParallelism.queueTypes of service SAP XI AF Messaging was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06, as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. Usually a restriction of the parallelization can be highly critical for synchronous interfaces. Therefore, we generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to Recv, IcoAsync only.
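
As a hedged illustration, combining only values mentioned in this guide (the concrete values have to be chosen per system and release), the relevant properties could then look like this:

messaging.system.queueParallelism.perInterface = true
messaging.system.queueParallelism.queueTypes = Recv, IcoAsync
messaging.system.queueParallelism.maxReceivers = 5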

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI you will be able to do the configuration on interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish between staging (versioning) and logging. An overview is given below.

For details about the configuration, please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and can lead to a decrease in performance.

The persistence steps can be seen directly in the audit log of a message.

The Message Monitor also shows the persisted versions.


While this additional monitoring information is extremely helpful for troubleshooting, it should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. Therefore, you have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP Load Balancing

With NetWeaver version 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past, an unequal load distribution across Java server nodes could be observed because of this. In case of high backlogs, this can cause a delay in the overall message processing of the interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard in order to execute this task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP loadbalancing to initial setup).


For the example given above, we could see a much better load balancing after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course, the CPU is also a limiting factor, but this will be discussed in Chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore, the analysis of all platforms can use the same tools.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern; that is, the memory usage increases over time but then goes down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references. If there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in to be evaluated. This leads to very long runtimes for GCs; GCs of more than 15 minutes have been observed on swapping systems.


Different tools exist for the garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section. In the Common Task list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior on the J2EE Engine. The dashboard can be found via J2EE Overview -> J2EE GC Overview and shows important KPIs for the GC, such as the count of GCs or the duration for each interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GC is not available. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data in Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own memory report of the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or it can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time, new Zero Admin templates might be released which change important parameters for the SAP VM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 71-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to, for example, 5000 (the value is interpreted in KB) to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in the chapter Large message queues on PI Adapter Engine.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed on the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options.

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer processing times for GCs, which in turn can affect the PI application negatively. Increasing the maximum heap size on the J2EE Engine will automatically adapt the new (young) generation area of the heap due to a dynamic configuration (in newer J2EE versions this is by default set to one sixth of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication, it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are several options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage on a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for the memory mentioned above. This monitor can be found by navigating to Availability and Performance Management -> Java System Reports, choosing the report System Health and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found at Availability and Performance Management -> Resource Monitoring -> History Reports. An example for the Application Thread Usage is shown below.


Check if the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool to analyze different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java -> Threads and check for any threads in red status (> 20 seconds running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user doing the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java -> Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC)

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA Server Threads. The FCA Server Threads are responsible for receiving HTTP calls at the Java side (after the Java dispatcher is no longer available). FCA Threads also use a thread pool. Fifteen FCA Threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA Server Threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP Adapter servlet (shared by all channels, as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads will be spawned that are available for parallel incoming HTTP requests using different entry points only. This ensures that, in case of problems with one application, not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount threads may be used in the J2EE Engine. With the recommended value of 50 threads per entry point, this corresponds to at most 1000 FCA threads in total.

There is currently no standard monitor available for FCA Threads except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA Server Threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default for an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, the VMC can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput at the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel queues by default) and a receiver proxy XBTR queues (20 parallel queues by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which normally does not take much time and is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course, such a long-running message will block the queue, and all messages behind it will face a higher latency. Since this step is purely application-related, it is only possible to perform tuning at the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).
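For illustration only (the values are hypothetical examples and must be sized according to the qRFC resources of your proxy system; the parameter is typically maintained under category TUNING), raising the receiver parallelization from the default of 20 to 40 queues could look like this in SXMB_ADM -> Integration Engine Configuration -> Specific Configuration:

    Category TUNING   Parameter EO_INBOUND_PARALLEL   Subparameter RECEIVER   Value 40
    Category TUNING   Parameter EO_INBOUND_PARALLEL   Subparameter SENDER     Value 10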

Also, as described in the chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues will be used, and tuning of these queues on interface level will be possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be further improved; messages of more business-critical interfaces may be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general, the processing time there should be fast, but in case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues, the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter SENDER/SENDER_BACK: should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub-parameter: determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services.

For the receiver queues, the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER: should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub-parameter: determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services.
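As an illustration only (the interface identifier and all values are hypothetical examples, not recommendations): after setting EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER to 1, you could maintain EO_INBOUND_PARALLEL_RECEIVER without sub-parameter with the value 5, so that every receiver interface gets five queues by default, and add a second entry EO_INBOUND_PARALLEL_RECEIVER with the sub-parameter identifying one known slow interface and the value 2. The slow interface is then throttled to two queues, while the remaining, more business-critical interfaces keep their five queues.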

This new feature also replaces the previously existing prioritization since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules using the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are necessary for only a small payload. The larger the message payload, the smaller the overhead caused by the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI and not the size of the file or IDoc being sent to PI. You can use the runtime header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), the MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below, the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping, for synchronous messages.
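To put the header overhead into perspective (simple arithmetic based on the example values above): with a payload of 433 bytes and a total size of roughly 14 KB, the PI header accounts for about 13.6 KB, that is, more than 95% of the transferred data. For a payload of 1 MB, the same header of roughly 14 KB would amount to less than 2% overhead, which is one reason why mid-sized messages of 1 to 5 MB usually give the best throughput.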


Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface, we therefore recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected at the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to, for example, 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3 to avoid overloading the Java memory with parallel large message requests.
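For illustration only (the threshold and the degree of parallelization are example values that depend on your message sizes and the available Java heap), a configuration in SXMB_ADM -> Integration Engine Configuration -> Specific Configuration could look like this:

    Category TUNING   Parameter EO_MSG_SIZE_LIMIT            Value 5000   (messages above ~5 MB go to the large message queues)
    Category TUNING   Parameter EO_MSG_SIZE_LIMIT_PARALLEL   Value 2      (at most two large messages are mapped in parallel)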

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. By default, the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To show this, let us look at an example using the default values. Let us assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB and message F 40 MB. Message A is not considered large, since its size is smaller than the permit size, and can be processed immediately. Message B requires 1 permit and message C requires 5; since enough permits are available, processing starts (status DLNG). Message D, however, would require all available 10 permits. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV) since it exceeds the maximum number of defined permits; in that case the message would have to be restarted manually. Message E requires 5 permits and can also not be scheduled, but since there are 4 permits left, message F is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5


permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.
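The following short Java sketch models the permit arithmetic of the example above. It is a simplified illustration and not SAP code: it assumes one permit per full 10 MB of message size and only replays the first scheduling pass over the queue (messages A-F), so completion and restart handling are left out.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Simplified model of the permit-based scheduling on one server node.
    public class LargeMessagePermits {
        static final int PERMIT_SIZE_MB = 10;   // default permit size
        static final int MAX_PERMITS    = 10;   // default number of permits

        public static void main(String[] args) {
            Map<String, Integer> queueMb = new LinkedHashMap<>();
            queueMb.put("A", 5);
            queueMb.put("B", 10);
            queueMb.put("C", 50);
            queueMb.put("D", 150);
            queueMb.put("E", 50);
            queueMb.put("F", 40);

            int free = MAX_PERMITS;
            for (Map.Entry<String, Integer> msg : queueMb.entrySet()) {
                int permits = msg.getValue() / PERMIT_SIZE_MB;  // 0 = not a large message
                String state;
                if (permits == 0) {
                    state = "not large, delivered immediately";
                } else if (permits > MAX_PERMITS) {
                    state = "exceeds all permits -> NDLV if blacklisting is enabled";
                } else if (permits <= free) {
                    free -= permits;
                    state = "scheduled (DLNG)";
                } else {
                    state = "waits for free permits (To Be Delivered)";
                }
                System.out.printf("Message %s (%3d MB, %2d permits): %s%n",
                        msg.getKey(), msg.getValue(), permits, state);
            }
            System.out.println("Permits still free after the first pass: " + free);
        }
    }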

The example above shows the potential delay a large message can face due to the waiting time for permits. The assumption, however, is that large messages are not time-critical and that an additional delay is therefore less critical than a potential overload of the system.

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this is only done after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (for example, the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have a very complex extended receiver determination or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).
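As an illustration only (the value is a hypothetical example, not a recommendation; check the parameter documentation of your release for the default and exact semantics): setting icm/HTTP/max_request_size_KB = 204800 in the instance profile would cause the ICM to reject incoming HTTP requests larger than roughly 200 MB before they reach PI.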

The number of permits consumed can be monitored in PIMON -> Monitoring -> Adapter Engine Status. The number of threads corresponds to the number of consumed permits.

In newer Wily versions, there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus, the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity, it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

Also, from NetWeaver 7.3 on, the NWA offers a view on the CPU activity via Availability and Performance Management -> Resource Monitoring -> History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to the "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process for each available CPU, the CPU resources are sufficient. Once there is an average of around three processes for each available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View -> Top CPU: Which thread uses up the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer on the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about reducing the load in this component. For example, if the J2EE Engine is the largest consumer, it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: the BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapter to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it influences the Java GC behavior directly. Therefore, paging should be avoided in every case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times as described in Chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed


analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6, respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section. The official documentation can be found in the Online Help. Monitoring the database via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average, and maximum processing time of the individual SQL


statements, and can therefore identify the expensive statements on your system.

By default, the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (for example, during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04


Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads: the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on 15 million total reads; this number of reads ensures that the database is in an equilibrated state (see the example calculation after this list).

Ratio of user calls to recursive calls: good performance is indicated by ratio values greater than 2. Otherwise, the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: if this value exceeds 30 blocks for each user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.
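Example calculation (illustrative figures only, using the relationship described in the first item above): with 600,000 physical reads out of 15,000,000 total reads, the data buffer quality is (1 - 600,000 / 15,000,000) * 100 = 96%, which is above the 94% threshold; with 1,200,000 physical reads it would drop to 92% and warrant a closer look.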

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools -> Administration -> Monitor -> Performance -> Database -> Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database


workload, we recommend a minimum of 500 CPU busy seconds. Note: the default values displayed in the section Server Engine are relative values; to display the absolute values, press the button Absolute values. Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu -> Performance -> Database. A snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) <> max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage catalog and package cache information go to

transaction ST04 choose Performance Database section Buffer Pool (or section Cache respectively)



Buffer Pools Number: number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance -> Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.


Catalog cache size: maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: number of times that an insert in the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: maximum size of the package cache that is used to maintain the most frequently accessed sections of the package.

Package cache quality: ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: number of times that an insert in the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types call transaction ST04 to display the most important performance

parameters


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 system allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying that suitable indexes are missing.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (versions lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out to the data cache rather than to the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so as to be able to write several log entries together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries have to be written from the LOG_IO_QUEUE onto the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. If LOG_QUEUE_OVERFLOWS occur (transaction DB50 -> Current Status -> Activities Overview -> LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses -> Performance -> Database Analyzer -> Bottlenecks. You can use this function to activate the corresponding analysis tool, dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR, SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQL*Plus for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are:

PI message tables BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.
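If you prefer to collect these figures with a small program instead of a database client, a minimal JDBC sketch could look like the following. This is only an illustration: the JDBC URL, user, password and table list are placeholders that have to be adapted to your installation, and the matching JDBC driver (here Oracle) must be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Illustrative sketch: count the rows of the PI tables in the J2EE schema.
    public class PiTableCounts {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:oracle:thin:@//dbhost:1521/PI1";          // placeholder connection data
            String[] tables = {"BC_MSG", "BC_MSG_AUDIT", "BC_MSG_LOG_VERSION"};
            try (Connection con = DriverManager.getConnection(url, "SAPPI1DB", "secret");
                 Statement stmt = con.createStatement()) {
                for (String table : tables) {
                    try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
                        rs.next();
                        System.out.println(table + ": " + rs.getLong(1) + " rows");
                    }
                }
            }
        }
    }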


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration -> Specific Configuration and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace Level -> Set.

Start transaction SMGW, navigate to Goto -> Parameters -> Display and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto -> Trace -> Gateway -> Reduce Level (or Increase Level, respectively).

Start transaction SM59, check the following RFC destinations for the flag "Trace" on the tab Special Options, and remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used for sending out IDocs and the like.


10.2 Business Process Engine

You only have to check the Event Trace in the Business Process Engine

Procedure

Call transaction SWELS and check whether the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator, navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) -> Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting -> Logviewer (direct link nwa/links). Open the view "Developer Trace" and check whether you have very frequent, reoccurring exceptions that fill up the trace. Analyze the exception and check what is causing it (for example, a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis -> Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.1 and higher

With SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (depending on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" of the service XPI Service: Messaging System to false. Details can be found in SAP Note 1314974 - PI 7.1 AF: Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid any performance problems caused by the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of the SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto -> Trace File -> Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto -> Trace -> Gateway -> Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC trace

In the PI work directory, check all files that begin with dev_rfc or dev_w for errors.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. This is available via Problem Management -> Logs and Traces -> Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace allows you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or caused by a lookup to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example, for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum; it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below, around 48 seconds. From top to bottom, we can see the call stack of the thread. In general, we are interested in long-running threads at the bottom of the trace view. A long-running block at the bottom means that this is the lowest-level coding that was instrumented and which is consuming all the time.

In the example below, we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case, you have to review the coding of the mapping to see whether the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup using JDBC to a remote database or RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case, there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A2 XPI inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting, but the tool can also be used for troubleshooting performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement, it allows you to take multiple thread dumps in specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling by either doing JVM performance tracing or JVM memory allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output, the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE Engine and are therefore not recommended for use in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0, March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.



Looking at the different involved components, we get a first impression of where a performance problem might be located: in the Integration Engine itself, in the Business Process Engine, or in one of the Advanced Adapter Engines (central, non-central, plain PCK). Of course, a performance problem could also occur anywhere in between, for example in the network or around a firewall. Note that there is a separation between a performance issue "in PI" and a performance issue in any attached systems. If viewed "through the eyes" of a message (that is, from the point of view of the message flow), the PI system technically starts as soon as the message enters an adapter or (if HTTP communication is used) as soon as the message enters the pipeline of the Integration Engine. Any delay prior to this must be handled by the sending system. The PI system technically ends as soon as the message reaches the target system, that is, as soon as a given receiver adapter (or, in case of HTTP communication, the pipeline of the Integration Server) has received the success message of the receiver. Any delay after this point in time must be analyzed in the receiving system.

There are generally two types of performance problems. The first is a more general statement that the PI system is slow, has a low performance level, or does not reach the expected throughput. The second is typically connected to a specific interface failing to meet the business expectation with regard to the processing time. The layout of this check is based on the latter: first you should try to determine the component that is responsible for the long processing time, or the component that needs the highest absolute time for processing. Once this has been clarified, there is a set of transactions that will help you to analyze the origin of the performance problem.

If the recommendations given in this guide are not sufficient, SAP can help you to optimize the service via SAP Consulting or SAP MaxAttention services. SAP might also handle smaller problems restricted to a specific interface if you describe your problem in an SAP customer incident. Support offerings like the "SAP Going Live Analysis (GA)" and the "SAP Going Live Verification (GV)" can be used to ensure that the main system parameters are configured according to SAP best practice. You can order any type of service using the SAP Service Marketplace or your local SAP contacts.

If you have already worked with the PI Admin Check (SAP Note 884865) you will recognize some of the

transactions This is because SAP considers the regular checking of the performance to be an important

administrational task However this check tries to show the methodology to approach performance problems

to its reader Also it offers the most common reasons for performance problems and links to possible follow-

up actions but does not refer to any regular administrational transactions

Wily Introscope is a prerequisite for analyzing performance problems on the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack, like the mapping runtime, messaging system, or module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information, see http://service.sap.com/diagnostics or SAP Note 797147.

2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3, Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine, and Adapter Engine or connected proxy system. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter. This is not always easy to do, because a long processing time is not always an indication of a bottleneck. A long processing time can be justified if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason you need to compare the value that you retrieve from Chapter 3 with values that you have received previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6, and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework, and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This leads to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases, based on where the problem occurs:

A) with regard to the interface itself, that is, it occurs even if a single message of this interface is processed;

B) with regard to the volume of this interface, that is, many messages of this interface are processed at the same time;

C) with regard to the overall message volume processed on PI.

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical amount of messages of this interface, not simply one message as before. If the values of measurements A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements – A) processing time of a single message, B) processing time of a typical amount of messages of a single interface, and C) processing time of a typical amount of messages of all interfaces – should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.

Chapters 8, 9, 10, and 11 deal with more general reasons for bad performance, such as heavy tracing/logging, error situations, and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily, or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI, to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.

3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), and the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP, or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness, we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine, as shown below (details about its activation can be found in Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure), and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out the part of the PI system where the most time is lost. Chapter 4.4 will work with the individual steps.

If you do not yet know which interface is affected, you first have to get an overview. Instead of navigating to Detailed Data Aggregated in the Runtime Workbench, choose Overview Aggregated. Use the button Display Options and check the options for sender component, receiver component, sender interface, and receiver interface. Check the processing times as described above.

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Message Monitoring, choose Adapter Engine <host> and "from Database" from the drop-down lists, and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamp indicated on the screen.

Do the above calculation for the outbound (AFW to IS) as well as the inbound (IS to AFW) messages.

The audit log of successful messages is no longer persisted in SAP NetWeaver PI 7.1 by default, in order to minimize the load on the database. The audit log has a high impact on the database load, since it usually consists of many rows that are inserted individually into the database. In newer releases a cache was therefore implemented that keeps the audit log information for a limited period of time only. Thus, no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted on the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter "messaging.auditLog.memoryCache" from true to false for the service "XPI Service: Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.

Due to the limitations mentioned above, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps, like queuing, module processing, and average response time, in historical dashboards (aggregated data per interval, for all server nodes of a system or all systems, with filters possible, etc.).

Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping, and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java Profiler, as outlined in the appendix section A2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher

Starting from PI 7.31 SP4, a new performance monitor is available for the Adapter Engine. More information on the activation of the performance monitor can be found in Note 1636215 - Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home", go to "Adapter Engine", and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum, and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like time spent in the Messaging System queues or in adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 7.31 SP10 (7.40 SP05), this data can be exported to Excel in two ways: Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection so that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound, if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the entrance/selection screen of SXMB_MONI.

To judge the performance of an integration process, it is essential to understand the steps that are executed. A long-running integration process is not critical in itself, because it can wait for a trigger event or can include a synchronous call to a backend. Therefore, it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore, it is essential to monitor a queue backlog in the ccBPM queues, as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".

Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. In the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.

The results here allow you to compare the average performance of a process. They show you the processing time based on barriers: the 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time with, for example, the day before. By adjusting the selection criteria of the transaction you can therefore get a good overview of the normal processing time of the integration process and can judge whether there is a general performance problem or just a temporary one.

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the integration processes and to judge whether your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3, a new monitor for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE, Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new, browser-based view that allows a simplified and aggregated view of the PI integration processes.

On the initial screen you get an overview of all the integration processes executed in the selected time interval. Therefore, you can immediately see the volume of each integration process.

From there you can navigate to the integration process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.

Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait Step.

4 ANALYZING THE INTEGRATION ENGINE

If the chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps (PLSRV_XML_VALIDATION_RS_INB, PLSRV_MAPPING_RESPONSE, and PLSRV_XML_VALIDATION_RS_OUT) are only executed for synchronous messages and reflect the time spent in the mapping of the synchronous response message.

The XML validation steps are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).
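To get a feeling for what such a validation step adds per message, the following sketch uses the generic JAXP validation API. It is not the PI validation step itself; the schema and payload file names are placeholders.

    import java.io.File;
    import javax.xml.XMLConstants;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;
    import javax.xml.validation.Validator;

    // Generic JAXP schema validation: roughly the kind of work an XML validation
    // step performs for every message (schema lookup plus a full parse of the payload).
    public class PayloadValidationSketch {
        public static void main(String[] args) throws Exception {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new File("OrderInterface.xsd")); // placeholder schema
            Validator validator = schema.newValidator();
            long start = System.currentTimeMillis();
            validator.validate(new StreamSource(new File("payload.xml")));    // throws SAXException on violation
            System.out.println("Validation took " + (System.currentTimeMillis() - start) + " ms");
        }
    }

The cost grows with the payload size, which is one reason why optional validation steps should only be activated where the business scenario really requires them.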

With PI 7.31, an additional option of using an external virus scan during PI message processing can be activated, as described in the Online Help. The virus scan can be configured at multiple steps in the pipeline, for example after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed based on qRFC Logical Units of Work (LUWs) using dialog work processes.

With 7.11 and higher versions, you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in the chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50/SM66)

The work process overview is the central transaction for getting an overview of the processes currently running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WPs). It is therefore essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs show a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? This depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which starts the message processing in DIA WPs. The user shown in SM66 will be the one that triggered the QIN scheduler. This can, for example, be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (by default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at Chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPU cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters).

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as CPU usage, the number of free dialog work processes and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds, and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past and allows analysis after the problem has occurred in the system.

4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

If you have a high volume of (usually runtime-critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime-critical synchronous interfaces using a Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max. Resources" close to your overall number of dialog work processes? Max. Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows:

Max. no. of logons = 90

Max. disp. of own logons = 90

Max. no. of WPs used = 90

Max. wait time = 5

Min. no. of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situation, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available in the system and the above requirements.

Note: You have to set the parameters in the SAP instance profile. Otherwise, the changes are lost after the server is restarted.

The field "Resources" shows the number of DIA WPs available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages, because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation:

1) Depending on the CPU usage (see chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage is already very high, however, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server, because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

4.3 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas of PI tuning. Therefore, it is essential to understand the PI queuing concept.

PI generally uses two types of queues for message processing: PI inbound and PI outbound queues. Both types are technical qRFC inbound queues and can therefore be monitored using SMQ2. PI inbound queues are named XBTI (EO) or XBQI (EOIO) and are by default shared between all interfaces running on PI. The PI outbound queues are named XBTO (EO) and XBQO (EOIO). The queue suffix (for example, the 0___0004 in XBTO0___0004) specifies the receiver business system. This way, PI uses dedicated outbound queues for each receiver system. All interfaces belonging to the same receiver business system are contained in the same outbound queue. Furthermore, there are dedicated queues for prioritization and for the separation of large messages. To get an overview of the available queues, use SXMB_ADM -> Manage Queues.

PI inbound and outbound queues execute different pipeline steps:

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter IDoc HTTP Proxy AFW

Before the steps above are executed, the messages are placed in a qRFC queue (SMQ2). The messages wait until they are first in the queue and a free DIA work process is assigned to the queue. The wait time in the queue is recorded in the performance header as DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see long processing times there, as discussed in the chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above, it is clear that tuning the queues will have a direct impact on the connected PI components and also on the backend systems. For example, by increasing the number of parallel outbound queues, more mappings will be executed in parallel, which will in turn put a greater load on the Java stack, or more messages will be forwarded in parallel to the backend system in the case of a proxy call. Thus, when tuning PI queues, you must always keep the implications for the connected systems in mind.

The tuning of the PI ABAP queue parameters is done in transaction SXMB_ADM -> Integration Engine Configuration, by selecting the category TUNING.

For productive usage we always recommend using inbound and outbound queues (parameter EO_INBOUND_TO_OUTBOUND = 1). Otherwise, only inbound queues will be used, which are shared across all interfaces; hence a problem with one single backend system will affect all interfaces running on the system.

The principle of "less is sometimes more" also applies to tuning the number of parallel PI queues. Increasing the number of parallel queues will result in more parallel queues with fewer entries per queue. In theory, this should result in a lower latency if enough DIA WPs are available.

In practice, however, this is not true for high-volume systems. The main reason for this is the overhead involved in the reloading of queues (see details below). Furthermore, important tuning measures like PI packaging (see chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in the queue. Thus, from a throughput perspective, it is definitely advisable to configure fewer queues.

If you have very high runtime requirements, you should prioritize these interfaces and assign a different parallelism for high-priority queues only. This can be done using the parameters EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO, as described in section "Message Prioritization on the ABAP Stack".

To tune the parallelism of the inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2. You can see the PI inbound queues and outbound queues. ccBPM queues (XBQO$PE) are also displayed and will be discussed in section 5.

Procedure

Log on to your Integration Server, call transaction SMQ2, and execute. If you are running ABAP proxies on a separate client of the same system, enter '*' for the client. Transaction SMQ2 provides snapshots only and must therefore be refreshed several times to get viable information.

o The first value to check is the number of queues concurrently active over a period of time. Since each queue needs a dialog work process to be worked on, the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see chapter qRFC Resources (SARFC)). The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system; see transaction ST06).

If many of the queues are EOIO queues (for example, because the serialization is done on the material number), try to reduce the number of queues by following the chapter EOIO Tuning.

o The second value of importance is the number of entries per queue. Hit refresh a few times to check whether the numbers increase, decrease, or remain the same. An increase of entries for all queues or for a specific queue points to a bottleneck or a general problem on the system. The conclusion that can be drawn from this is not simple; possible reasons have been found to include:

1) A slow step in a specific interface

A bad processing time of a single message or of a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step, as shown in "Analyzing the runtime of pipeline steps".

2) Backlog in queues

Check whether inbound or outbound queues face a backlog. A backlog in a queue is generally nothing critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time, one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available). If one interface is causing a backlog in the outbound queues, you could, for example, specify EO_OUTBOUND_PARALLEL with a sub-parameter specifying your interface to increase the parallelism for this interface.

In general, a backlog can also be caused by a long processing time in one of the pipeline steps, as discussed in 1), or by a performance problem with one of the connected components, like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter), it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if content-based routing or an extended receiver determination is used. To understand which step takes long, follow once more the chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing, while the number of inbound queues does not increase at the same time. Wily also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish a blocking situation from a general resource problem after the problem is no longer present in the system.

3) Queues stay in status READY in SMQ2

To see the status of the queues, use the filter button in SMQ2 as shown below.

It is a normal situation to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see the second point below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources. Confirm this by following section qRFC Resources (SARFC).

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfc/inb_sched_resource_threshold to 3, as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables. If you are using Oracle, ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied.

Additional overhead during the pipeline execution occurs due to PI-internal mechanisms. By default, a lock is set during the processing of each message. This is no longer necessary and should be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0, as described in SAP Note 1058915. In addition, during the processing of an individual message, the message repeatedly calls an enqueue, which leads to a deterioration of the throughput. This can be avoided by setting the parameter CACHE_ENQUEUE to 0, as described in SAP Note 1366904.

Sometimes a queue stays in READY status for a very long time while other queues are processed fine. After manual unlocking, the queue is processed but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually occur because one individual step takes long (as discussed in step 1) or because of a problem in the infrastructure (for example, memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes such a queue from the scheduling, and it therefore remains in READY. To solve such cases, the root cause of the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (for example, due to a problem with a mapping or due to a very slow backend), it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence, the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured for this queue. Note 1500048 - queues stay in ready (schedule_monitor) and Note 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check whether a queue stays in READY status for a long time while others are processed without any issue. Ensure that Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps), or increase schedule_monitor in exceptional cases only (for example, if the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the alarm bell (Change View) pushbutton once to see only queues with an error status. Errors delay the queue processing within the Integration Server and may decrease the throughput if, for example, multiple retries occur. Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI.

Often you see EO queues in SYSFAIL or RETRY due to problems during the processing of an individual message. This can be prevented by following the description in chapter Prevent blocking of EO queues.

4.4 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine, since it describes exactly how much time was spent at which point. The recommended way to retrieve the duration of the pipeline steps is the RWB and is described below. Advanced users may use the Performance Header of the SOAP message in transaction SXMB_MONI, but the timestamps are not easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec)(microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore, a conversion to system time must be done when analyzing them.
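To illustrate the conversion, the following small sketch decodes the example timestamp above and shifts it from UTC to the local system time zone. The digit layout is taken from the explanation above; the class name is only an example.

    import java.time.LocalDateTime;
    import java.time.ZoneId;
    import java.time.ZoneOffset;
    import java.time.ZonedDateTime;

    // Decodes a Performance Header timestamp such as 20110409092656165
    // (year, month, day, hour, minute, second, followed by the sub-second digits)
    // and converts it from UTC to the local system time zone.
    public class PerfHeaderTimestamp {
        public static void main(String[] args) {
            String raw = "20110409092656165";
            LocalDateTime utc = LocalDateTime.of(
                    Integer.parseInt(raw.substring(0, 4)),    // year   2011
                    Integer.parseInt(raw.substring(4, 6)),    // month  04
                    Integer.parseInt(raw.substring(6, 8)),    // day    09
                    Integer.parseInt(raw.substring(8, 10)),   // hour   09
                    Integer.parseInt(raw.substring(10, 12)),  // minute 26
                    Integer.parseInt(raw.substring(12, 14))); // second 56
            String subSecond = raw.substring(14);             // remaining digits: 165
            ZonedDateTime local = utc.atOffset(ZoneOffset.UTC).atZoneSameInstant(ZoneId.systemDefault());
            System.out.println("UTC:         " + utc + "." + subSecond);
            System.out.println("System time: " + local);
        }
    }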

If PI message packaging is configured, the performance header always reflects the processing time per package. Hence, a duration of 50 seconds does not mean that a single message took 50 seconds; the package may have contained 100 messages, so that every message took 0.5 seconds on average. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Performance Monitoring. Change the display to Detailed Data Aggregated and choose an appropriate time interval, for example the last day. For this selection you have to enter the details of the specific interface you want to monitor.

You will now see, in the lower part of the screen, a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, the DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).

Compare the processing times for the single steps for different measurements, as outlined in chapter Pipeline Steps (SXMB_MONI or RWB). For example, is a single step only long if many messages are processed, or also if a single message is processed? This helps you to decide whether the problem is a general design problem (a single message has a long processing step) or whether it is related to the message volume (only for a high number of messages does this process step show large values).

Each step has different follow-up actions, which are described next.

4.4.1 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the receiver and interface determination. In these steps, the receiver system and interface are calculated. Normally this is very fast, but PI offers the possibility of enhanced receiver determinations. In these cases the calculation is based on the payload of a message. There are different implementation options:

o Content-Based Routing (CBR)

CBR allows you to define XPath expressions that are evaluated at runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields. The processing time of such a request correlates directly with the number and complexity of the conditions defined (a small illustration of such a condition follows at the end of this section).

There are no tuning options for the system with regard to CBR. The performance of this step can only be changed by reducing the number of rules, that is, by changing the design of the interfaces. Another option is to use an extended receiver determination (executing a mapping program), since this is faster than CBR with many rules.

o Mapping to Determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section, since the same mapping runtime is used.
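As an illustration of what a CBR condition evaluates at runtime, the following standalone sketch checks two XPath conditions combined with a logical operator against a small payload. It uses the generic JAXP XPath API, not the PI routing engine; the payload structure, element names, and receiver name are made up for the example.

    import java.io.ByteArrayInputStream;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;

    public class CbrConditionDemo {
        public static void main(String[] args) throws Exception {
            // Hypothetical order payload; in PI the actual message payload is evaluated instead.
            String payload = "<Order><Country>DE</Country><Amount>12000</Amount></Order>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(payload.getBytes("UTF-8")));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Two conditions combined with AND, as they could appear in a CBR rule;
            // every additional or more complex condition adds to the evaluation time per message.
            boolean matches = (Boolean) xpath.evaluate(
                    "/Order/Country = 'DE' and /Order/Amount > 10000",
                    doc, XPathConstants.BOOLEAN);
            System.out.println("Route to receiver EU_WAREHOUSE: " + matches);
        }
    }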

4.4.2 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping, you must understand which runtime is used. Mappings can be implemented in ABAP, as graphical mappings in the Enterprise Service Builder, as self-developed Java mappings, or as XSLT mappings. One interface can also be configured to use a sequence of mappings that are executed sequentially. In such a case the analysis is more difficult, because it is not clear which mapping in the sequence is taking a long time.

The normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP. Any type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST. Knowing the sender/receiver party, service, interface, and namespace as well as the source message payload, it is possible to check the target message (after mapping execution) and the detailed trace output (similar to the contents of the trace of the message in SXMB_MONI with TRACE_LEVEL = 3). It can also be used for debugging at runtime by using the standard debugging functionality.

For ABAP-based XSLT mappings, it is also possible to test, trace, and debug XSLT transformations on the ABAP stack via transaction XSLT_TOOL.

In general, the mapping response time is heavily influenced by the complexity of the mapping and the message size. Therefore, to analyze a performance problem in the mapping environment, you should compare the mapping runtime during the time of the problem with values reported several days earlier to get a better understanding.

If no AAE is used, the starting point of a mapping is the Integration Engine, which sends the request via the RFC destination AI_RUNTIME_JCOSERVER to the gateway. There it is picked up by a registered server program. The registered server program belongs to the J2EE Engine. The request is forwarded to the J2EE Engine by a JCo call and then executed by the Java runtime. When the mapping has been executed, the result is sent back to the ABAP pipeline. There are therefore multiple places to check when trying to determine why the mapping step took so long.

Before analyzing the mapping runtime of the PI system, check whether only one interface is affected or whether you face a long mapping runtime for different interfaces. To do so, check the mapping runtime of messages being processed at the same time in the system.

The best tool for such an analysis is Wily Introscope, which offers a dashboard for all mappings being executed at a given time. Each line in the dashboard represents one mapping and shows the average response time and the number of invocations.

In the screenshot below you can see that many different mapping steps have required around 500 seconds for processing. Comparing the data during the incident with the data from the day before will allow you to judge whether this might be a problem of the underlying J2EE engine, as described in section J2EE Engine Bottleneck.

If there is only one mapping that faces performance problems, there would be just one line sticking out in the Wily graphs. If you face a general problem that affects different interfaces, you can choose a longer timeframe that allows you to compare the processing times in a different time period and verify whether it is only a "temporary" issue; this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected, then it is very unlikely to be a system problem, but rather a problem in the implementation of the mapping of that specific interface.

Check the message size of the mapping in the runtime header using SXMB_MONI. Verify whether the message size is larger than usual (which would explain the longer runtime).

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping. Together with the application team, you then have to check whether the connection to the backend is working properly. Such a mapping with lookups can be analyzed using the Wily Transaction Trace, as explained in the appendix section Wily Transaction Trace. A schematic example of such a lookup is shown below.
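The following sketch shows, in plain JDBC, the kind of work a value lookup inside a mapping causes and why caching repeated keys matters. It is not the PI mapping lookup API; the connection URL, table, and column names are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.HashMap;
    import java.util.Map;

    // Schematic value lookup as it might be called from a mapping user-defined function.
    public class MappingLookupSketch {
        private static final Map<String, String> cache = new HashMap<>();

        static String lookupCustomerName(String customerId) throws Exception {
            // Cache repeated keys: without this, a message with 1,000 line items
            // causes 1,000 round trips and the backend latency dominates the mapping time.
            if (cache.containsKey(customerId)) {
                return cache.get(customerId);
            }
            long start = System.currentTimeMillis();
            try (Connection con = DriverManager.getConnection("jdbc:example://backend/crm"); // placeholder URL
                 PreparedStatement ps = con.prepareStatement("SELECT name FROM customers WHERE id = ?")) {
                ps.setString(1, customerId);
                try (ResultSet rs = ps.executeQuery()) {
                    String name = rs.next() ? rs.getString(1) : null;
                    cache.put(customerId, name);
                    return name;
                }
            } finally {
                System.out.println("Lookup for " + customerId + " took "
                        + (System.currentTimeMillis() - start) + " ms");
            }
        }
    }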

If not one but several interfaces are affected, a potential system bottleneck exists; this is described in the following.

o Not enough resources (registered server programs) available. That can either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs.

To check whether there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of concurrently active outbound queues can be monitored with transaction SMQ2 by counting the queues with the names XBTO & XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition, you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued; the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW -> Goto -> Logged On Clients and filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general, we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past, it is more difficult to determine the number of concurrently active queues. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: a wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard, as described in qRFC Queue Monitoring (SMQ2).

To check whether the J2EE Engine is not available or whether the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW -> Goto -> Trace -> Gateway -> Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.

o Depending on the checks above, there are two options to resolve the bottleneck. Of course, if the J2EE Engine was not available, the reason for this has to be found and prevented in the future.

The two options for tuning are:

1) the number of outbound queues that are concurrently active, and

2) the number of mapping connections from the J2EE server to the ABAP gateway.

Increasing the number of mapping connections is only recommended for strong J2EE servers, since each mapping thread needs resources like CPU and heap memory. Too many mapping connections mean too many mappings being executed on the J2EE Engine. In turn, this might reduce the performance of other parts of the PI system, for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW, or it can even cause out-of-memory errors on the J2EE Engine. Add additional J2EE server nodes to balance the parallel requests; each J2EE server node will register new destinations at the gateway and will therefore take a part of the mapping load.

The other option to resolve the bottleneck is to reduce the number of outbound queues that are concurrently active. This will certainly result in higher backlogs in the queues and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so, you slow down one interface to ensure that other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option), as described in the guide How To Scale PI.

4.4.3 Long Processing Times for "PLSRV_CALL_ADAPTER"

The call adapter step is the last step executed in the PI pipeline and forwards the message to the next component along the message flow. In most cases (local Adapter Engine, IDoc, or BPE) this is a local call. The IDoc adapter only puts the message on the tRFC layer (SM58) of the PI system, so that the actual transfer of the IDoc is not included in the performance header. For ABAP proxies, plain HTTP, or calls to a central or decentral Adapter Engine, an HTTP call is made, so that network time can have an influence here.

Looking at the processing time, we have to distinguish between asynchronous (EO and EOIO) and synchronous (BE) interfaces.

For asynchronous interfaces, the call adapter step includes the transfer of the message over the network and the initial steps on the receiver side (in most cases this just means persisting the message in the receiver's database). A long duration can therefore have two reasons:

o Network latency. For large messages of several MB in particular, the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent, for example). The network latency has to be checked in the case of a long call adapter step. If HTTP load balancers are used, they are considered part of the network.

o Insufficient resources on the receiver side. Enough resources must be available on the receiver side to ensure quick processing of a message. For instance, enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system. Therefore, the analysis of long call adapter steps always has to include the relevant backend system.

For synchronous messages (request/response behavior), the call adapter step also includes the processing time on the backend to generate the response message. The call adapter step for synchronous messages therefore includes the time for the transfer of the request message, the calculation of the corresponding response message on the receiver side, and the transfer back to PI. Consequently, for synchronous messages, the processing time of the request at the receiving target system must always be analyzed to find the most costly processing steps.

4.4.4 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound queue before its processing started. In the case of errors, the time also includes the wait time until the restart of the LUW in the queue. The inbound queues (XBTI, XBT1 to XBT9, and XBTL for EO messages; XBQI/XB2I and XBQ1/XB21 to XBQ9/XB29 for EOIO messages) process the pipeline steps for the receiver determination, the interface determination, and the message split (and optionally the XML inbound validation).

Thus, if the step DB_ENTRY_QUEUEING has a high value, the inbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in the chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2):

o Not enough resources (DIA work processes for RFC communication) available.

o A backlog in the queues, caused by one of the following reasons:

The number of parallel inbound queues is too low to handle the incoming messages in parallel. Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution, as enough work processes also have to be available to process the queues in parallel. The number of work processes is in turn restricted by the number of CPUs that are available for your system. If you would like to separate critical from non-critical sender systems, define a dedicated set of inbound queues by specifying a sender ID as a sub-parameter of EO_INBOUND_PARALLEL. By doing so you can, for example, separate master data interfaces from business-critical interfaces. Please note that the separation into inbound queues only works on business system level and not on interface level.

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check whether that is still the case. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue; it might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in the case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

If the runtime of the messages in the queues differs (for example, due to complex extended receiver determinations), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUWs by the QIN scheduler, as described in qRFC Queues (SMQ2).

Note: If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0, you must also read chapter 4.4.5 for analyzing the reasons. EO_INBOUND_TO_OUTBOUND only determines whether inbound queues only (value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM -> Integration Engine Configuration -> Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the recommended behavior).

4.4.5 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI outbound queue until a work process was assigned. In case of errors the time also includes the wait time for the restart of the LUW in the queue.

The outbound queues (XBTO, XBTA to XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A to XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, outbound binding, and call adapter.


Thus, if the step DB_SPLITTER_QUEUEING has a high value, the outbound queues have to be monitored using transactions SMQ2 and SARFC. The reasons are similar to those outlined in chapters qRFC Resources (SARFC) and qRFC Queues (SMQ2), or described in the section above for PI inbound queues:

o Not enough resources (DIA work processes for RFC communication) available

o A backlog in the queues caused by one of the following reasons

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL, because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers, whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in case of an ABAP proxy receiver a high value for DB_SPLITTER_QUEUEING does not necessarily indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify whether such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue; it might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queue in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this can be ignored for the moment), the third has to wait about 2 seconds, and so on. The 100th message has to wait roughly 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUEING is as high as 100 seconds (a small calculation sketch follows after this list). For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUEING value over time. If you experience this situation, proceed with chapter 4.4.2.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the number of messages per queue might be unevenly distributed between the queues. In PI 7.3 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)
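The backlog arithmetic from the example above can be illustrated with a minimal sketch (assuming a constant per-message processing time, which real queues rarely have):

   // Rough estimate of the queue wait time for the n-th message in a single
   // qRFC queue, assuming a constant processing time per message (in seconds).
   public class QueueWaitEstimate {
       static double waitSeconds(int positionInQueue, double secondsPerMessage) {
           // all messages ahead of this one have to be processed first
           return (positionInQueue - 1) * secondsPerMessage;
       }
       public static void main(String[] args) {
           System.out.println(waitSeconds(100, 1.0)); // ~99 s for the 100th message
       }
   }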

Note: The number of parallel outbound queues is also connected with the ability of the receiving system to process a specific number of messages per time unit.

In section Adapter Parallelism, the default restrictions of the specific adapters of the J2EE Engine are explained. Some adapters (JDBC, File, Mail) can process only one message per server node at a time (this can be changed). It does not make sense for such adapters to have too many parallel qRFC queues, since this will only move the message backlog from ABAP to Java.

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS; the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems the number of outbound queues directly determines the number of messages sent in parallel to the receiving system. Thus you have to ensure that the resources


available on PI are aligned with those on the sending/receiving ABAP proxy system.

4.4.6 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search (LMS) can be configured for newer PI releases as described in the Online Help. After applying Note 1761133 - PI runtime: Enhancement of performance measurement, an additional header for LMS will be written to the performance header.

The header could look like this, indicating that around 25 seconds were spent in the LMS analysis.

When a higher trace level is used, two additional timestamps are written to provide details about this overall runtime:

LMS_EXTRACTION_GET_VALUES
This timestamp describes the evaluation of the message according to the user-defined filter criteria.

LMS_EXTRACTION_ADJUST_VALUES
This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 7.30).

The runtime of the LMS heavily depends on the number of payload attributes to be extracted. The payload size and the complexity of the XPath expressions also have a direct impact on the performance of the LMS. In general, the number of elements to be indexed should be kept to a minimum, and very deep and complex XPath expressions should be avoided.
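The effect of XPath complexity can be pictured with a small, self-contained sketch (the payload and element names are invented for illustration and are not taken from any real PI interface):

   import java.io.ByteArrayInputStream;
   import javax.xml.parsers.DocumentBuilderFactory;
   import javax.xml.xpath.XPath;
   import javax.xml.xpath.XPathFactory;
   import org.w3c.dom.Document;

   public class LmsXPathSketch {
       public static void main(String[] args) throws Exception {
           String payload = "<Order><Header><OrderNumber>4711</OrderNumber></Header></Order>";
           Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                   .parse(new ByteArrayInputStream(payload.getBytes("UTF-8")));
           XPath xp = XPathFactory.newInstance().newXPath();
           // Cheap: a shallow, absolute path with one navigation step per level
           String cheap = xp.evaluate("/Order/Header/OrderNumber", doc);
           // Expensive: the descendant search ('//') has to visit every node of the payload
           String costly = xp.evaluate("//*[local-name()='OrderNumber']", doc);
           System.out.println(cheap + " / " + costly);
       }
   }

The same principle applies to the extraction XPaths configured for the Lean Message Search: shallow, anchored expressions on few attributes keep the LMS_EXTRACTION time low.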

If you want to minimize the impact of LMS on the PI message processing time, you can define the extraction method to use an external job. In that case the messages will be indexed only after processing, and LMS will therefore have no performance impact during runtime. Of course, this method imposes a delay in the indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES), so that newly processed messages cannot be searched using LMS immediately. If this delay is acceptable for the people responsible for monitoring, the messages should be indexed using an external job.

4.4.7 Other steps performed in the ABAP pipeline

As discussed earlier, there are additional steps in the ABAP pipeline that can be executed based on the configuration of your scenario. By default these steps are not activated and should therefore not consume any time. All these steps are traced in the performance header of the message. Below you can see the details for:

- XML validation


- Virus Scan

In case one of these steps takes long, you have to check the configuration of your scenario. In the example of the virus scan, the problem might be related to the external virus scanner used, and tuning has to happen on that side.

4.5 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 7.0 SP13 and is activated by default in PI 7.1 and higher. Message packaging in BPE is independent of message packaging in PI (see chapter 5.6), but they can be used together.

To improve performance and message throughput, asynchronous messages can be assembled into packages and processed as one entity. For this, the qRFC scheduler was enabled to process a set of messages (a package) in one LUW. Thus, instead of sending one message to the different pipeline steps (for example mapping/routing), a package of messages is sent, which reduces the number of context switches required. Furthermore, access to databases is more efficient, since requests can be bundled in one database operation. Depending on the number and size of messages in the queue, this procedure improves performance considerably. In return, message packaging can increase the runtime for individual messages (latency) due to the delay in the packaging process.

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO). Due to the potential latency of individual messages, packaging is not suitable for interfaces with very strict runtime requirements. In general, packaging has the highest benefit for interfaces with high volume and small message size.

The performance improvement achieved directly relates to the number of messages bundled in each package. Message packaging should not be used in the PI system alone: tests have shown that the performance improvement increases significantly if message packaging is configured end-to-end, that is, from the sending system via PI to the receiving system. Message packaging is mainly applicable for application systems connected to PI by ABAP proxy.

From the runtime perspective, no major changes are introduced when activating packaging. Messages remain individual entities with regard to persistence and monitoring. Additional transactions are introduced that allow the monitoring of the packaging process.

Messages are also treated individually for error handling. If an error occurs in a message processed in a package, the package will be disassembled and all messages will be processed as single messages. Of course, in a case where many errors occur (for example due to interface design), this will reduce the benefits of message packaging. SAP systems connected to PI via ABAP proxy can also make use of message packaging to reduce the number of HTTP calls necessary to transmit messages between the systems. Other sending and receiving applications will not see any changes, because they send and receive individual messages and receive individual commits.

PI message packaging can be adapted to meet your specific needs. In general, the packaging is determined by three parameters:


1) Message count: Maximum number of messages in a package (default 100)

2) Maximum package size: Sum of all message sizes in kilobytes (default 1 MB)

3) Delay time: Time to wait before the queue is processed if the number of messages does not reach the message count (default 0, meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM. To make better use of the performance improvements offered by packaging, you could, for example, define a specific packaging for interfaces with very small messages to allow up to 1000 messages per package. Another option could be to increase the waiting time (only if latency is not critical) to create bigger packages.
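The interplay of the three packaging parameters can be pictured with a small sketch (illustrative only; the real package assembly is performed by the qRFC scheduler and is not exposed as an API, and the constants simply mirror the defaults listed above):

   // Illustrative decision: is the waiting set of messages dispatched as one package?
   public class PackagingSketch {
       static final int MAX_COUNT = 100;      // message count (default 100)
       static final long MAX_SIZE_KB = 1024;  // maximum package size (default 1 MB)
       static final long DELAY_MS = 0;        // delay time (default 0 = no waiting)

       static boolean dispatchNow(int waitingMessages, long waitingSizeKb, long waitedMs) {
           return waitingMessages >= MAX_COUNT
                   || waitingSizeKb >= MAX_SIZE_KB
                   || waitedMs >= DELAY_MS;   // with delay 0 the package is sent immediately
       }

       public static void main(String[] args) {
           System.out.println(dispatchNow(40, 512, 0)); // true - the default delay 0 sends at once
       }
   }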

See SAP Note 1037176 - XI Runtime: Message Packaging for more details about the necessary prerequisites and configuration of message packaging. More information is also available at http://help.sap.com → SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Message Packaging.

4.6 Prevent blocking of EO queues

In general, messages for interfaces using Quality of Service Exactly Once (EO) are independent of each other. In case of an error in one message there is no business reason to stop the processing of other messages. But exactly this happens when an EO queue goes into error due to an error in the processing of a single message: the queue is then retried automatically in configurable intervals, and this retry delays all other messages in the queue, which cannot be processed due to the error of the first message in the queue.

To avoid this, a new error handling was introduced with SAP Note 1298448 - XI runtime: No automatic retry for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule the restart job RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime: Dump stops XI queue. After applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for all PI systems, and the behavior is active by default on PI 7.3 systems. By specifying a receiver ID as a sub-parameter, this behavior can also be configured for individual interfaces only.

In PI 7.3 you can also configure the number of messages that may go into error before the whole queue is stopped. For permanent errors (e.g. due to a problem with the receiving backend) the queue then goes into SYSFAIL, so that not too many messages go into error; this also reduces the number of alerts for Message Based Alerting. To define the threshold, the parameter EO_RETRY_AUT_COUNT can be set. The default value is 0 and indicates that the number of messages is not restricted.

4.7 Avoid uneven backlogs with queue balancing

From PI 7.3 on, a new mechanism is available to address the distribution of messages across queues. By default, in all PI versions the messages are assigned to the different queues randomly. In case of different runtimes of LUWs, caused e.g. by different message sizes or different mapping runtimes, this can lead to an uneven distribution of messages within the different queues, which can increase the latency of the messages waiting at the end of a long queue.

To avoid such an unequal distribution, the system checks the queue length before putting a new message into a queue and thus tries to achieve an equal balancing during inbound processing. A queue with a higher backlog gets fewer new messages assigned; queues with fewer entries get more messages assigned. It is therefore different from the old BALANCING parameter (of category TUNING), which was used


to rebalance messages already assigned to a queue. The assignment of messages to queues is shown in the diagram below.

To activate the new queue balancing, you have to set the parameters EO_QUEUE_BALANCING_READ and EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM. If the value of the parameter EO_QUEUE_BALANCING_READ is 0 (default value), the messages are distributed randomly across the available queues. If, however, EO_QUEUE_BALANCING_READ is set to a value n greater than zero, then on average the current fill level of the queues is determined after every n-th message and stored in the shared memory of the application server. This data is used as the basis for determining the queues relevant for balancing (see the description of parameter EO_QUEUE_BALANCING_SELECT).

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill level of the queues in percent. Relative here means in relation to the maximally filled queue. If there are queues with a lower fill level than defined here, then only these are taken into consideration for distribution. If all queues have a higher fill level, then all queues are taken into consideration.

Please note that determining the fill level requires database accesses and therefore impacts system performance. The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the message throughput and the specific requirements for an even distribution. For higher-volume systems a higher value should be chosen.

Let's look at an example. If you set the parameter EO_QUEUE_BALANCING_READ to 1000, the queue distribution is checked after every 1000th incoming message. This requires a database access and can therefore cause a performance impact. The fill level of each queue is then written to shared memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time there are three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B contains 150, and XBTO__C contains 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.
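A minimal sketch of this selection logic (illustrative only; the queue names and counts are the ones from the example above, and the real balancing runs inside the Integration Engine, not in Java):

   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   import java.util.Map;

   public class QueueBalancingSketch {
       // Queues whose relative fill level (vs. the fullest queue) is below the
       // EO_QUEUE_BALANCING_SELECT threshold (in percent) get the new messages.
       static List<String> eligibleQueues(Map<String, Integer> fillLevels, int selectPercent) {
           int max = Collections.max(fillLevels.values());
           List<String> eligible = new ArrayList<>();
           for (Map.Entry<String, Integer> e : fillLevels.entrySet()) {
               if (e.getValue() * 100 < max * selectPercent) {
                   eligible.add(e.getKey());
               }
           }
           // if no queue is below the threshold, all queues remain candidates
           return eligible.isEmpty() ? new ArrayList<>(fillLevels.keySet()) : eligible;
       }

       public static void main(String[] args) {
           Map<String, Integer> fill = Map.of("XBTO__A", 500, "XBTO__B", 150, "XBTO__C", 50);
           System.out.println(eligibleQueues(fill, 20)); // [XBTO__C]
       }
   }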

Based on this example you can see that it is important to find the correct values. As a general guideline, to minimize the performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value. The correct value of EO_QUEUE_BALANCING_SELECT depends on how critical a delay caused by a backlog is.


4.8 Reduce the number of parallel EOIO queues

As discussed in chapter Parallelization of PI qRFC Queues, it is important to keep the number of queues limited. As a rule of thumb, the number of active queues should be equal to the number of available work processes in the system. It is especially crucial not to have a lot of queues containing only one message, since this causes a high overhead on the QIN scheduler due to very frequent reloading of the queues.

Especially for EOIO queues we often see a high number of parallel queues containing only one message. The reason for this is, for example, that the serialization has to be done on document number, e.g. to ensure that updates to the same material are not transferred out of sequence. Typically, during a batch-triggered load of master data this can result in hundreds of EOIO queues in SMQ2 containing only one or two messages. This is very bad from a performance point of view and causes significant performance degradation for all other interfaces running at the same time. To overcome this situation, the overall number of EOIO queues (for all interfaces) can be limited as of PI 7.1, as described in Change Number of EOIO Queues.

The maximum number of queues can be set in SXMB_ADM → Configure Filter for Queue Prioritization → Number of EOIO queues.

During runtime a new set of queues with the name XB2 will be used, as can be seen below for the outbound queues.

These queues are shared between all EOIO interfaces of the given interface priority, and by setting the number of queues the parallelization for all EOIO interfaces is limited. Thus more messages use the same EOIO queue, so PI message packaging works better and the reloading of the queues by the QIN scheduler also shows much better performance.

In case of errors, the affected messages are removed from the XB2 queues and moved to the standard XBQ queues. All other messages for the same serialization context are moved to the XBQ queue directly to ensure that the serialization is maintained. This means that in case of an error the shared EOIO queues are not blocked and messages for other serialization contexts are not delayed.


4.9 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volume, and tuning it is therefore essential.

4.9.1 ABAP basis tuning

As stated above, the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system. The IDoc adapter only puts the message into SM58, and from there the standard tRFC layer forwards the LUW. You therefore have to ensure that sufficient resources (mainly DIA work processes) are available for processing tRFC requests.

Wily Introscope also offers a dashboard (shown below) for the tRFC layer, which shows the tRFC entries and their different statuses. With these dashboards you are also able to identify historic backlogs on tRFC.

In order to control the resources used when sending the IDocs from the sender system to PI, or from PI to the receiver backend, you can also consider registering the RFC destinations in SMQS in PI, as known from standard ALE tuning. By changing the Max Conn or Max Runtime values in SMQS you can limit or increase the number of DIA work processes used by the tRFC layer to send the IDocs for a given destination. This mitigates the risk of system overload on the sender and receiver side.


4.9.2 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc packaging.

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already offered an option for packaging IDocs in previous releases. For this, messages processed in the IDoc adapter are bundled together before being sent out to the receiving system; thus only one tRFC call is required to send multiple IDocs. The message packaging discussed in section PI Message Packaging uses a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter. We therefore highly recommend configuring Message Packaging, since it helps transferring data for the IDoc adapter as well as for the ABAP Proxy.

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of the sending system, in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in report RSEOUT00, multiple IDocs are bundled into one tRFC call. At the PI side these packages are disassembled by the IDoc adapter and the messages are processed individually. Therefore this mainly helps to reduce the tRFC resources involved in transferring the data between the systems, but not within PI.

This behavior was changed with SAP NetWeaver PI 7.11 and higher. If the sender backend sends an IDoc package to PI, a new option allows the processing of the IDocs as a package on PI as well. You can specify an IDoc package size on the sender IDoc communication channel. The necessary configuration is described in detail in the SDN blog "IDoc Package Processing Using Sender IDoc Adapter in PI EhP1". A significant increase in throughput was recorded for high-volume interfaces with small IDocs.

4.9.3 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter: long runtime XI -> IDoc -> ALE, the way the IDocs are posted on the receiver ERP backend also directly influences the processing of the LUW on the PI side. In general, two options exist for IDoc posting:

Trigger immediately: The IDoc is posted directly when it is received. For this a free dialog work process is required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process rolls out, you will see the LUW in status "Transaction Executing" during this time. For the IDOC_AAE adapter, the RFC connection and the Java threads are kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE adapter.

The coding for posting the application data can be very complex, and therefore this operation can take a long time, consuming unnecessary resources also on the sender side. With background processing of the IDocs, the sender and receiver systems are decoupled from each other.

Due to this, we generally recommend using "trigger immediately" only in exceptional cases where the data has to be posted without any delay. In all other cases, such as high-volume replication of master data, background processing of IDocs should be chosen.

With Notes 1793313 and 1855518, a third option for posting IDocs on the ERP side was introduced. This option uses bgRFC on the ERP side to queue the IDocs. As soon as the configured resources for the bgRFC destination are available, the IDocs are posted. By doing so, the IDocs are posted based on resource availability on the receiver system, and no additional background jobs are required. Furthermore, storing the LUWs in bgRFC ensures that the sender and receiver systems are decoupled. This option therefore guarantees that the IDocs are posted as soon as possible, based on the available resources on the receiver system, without the requirement to schedule many background jobs. This new option is therefore the recommended way of IDoc posting in systems that fulfill the necessary requirements.


4.10 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system, for example due to a high-volume master data interface. If business-critical interfaces with a maximum response time are running at the same time, they might face delays due to a lack of resources. Furthermore, as described in qRFC Queues (SMQ2), messages of different interfaces use the same inbound queue by default, which means that critical and less critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical interfaces (for Java, see Interface Prioritization in the Messaging System).

For more information about PI prioritization, see http://help.sap.com and navigate to SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Prioritized Message Processing.

To further balance the available resources between the configured interfaces, you can also configure a different parallelization level for queues with different priorities. Details can be found in SAP Note 1333028 - Different Parallelization for HP Queues.


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than one might expect, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check if there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user WF-BATCH and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other; rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> with transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very easy way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter; use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate for example to a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE: Composite Note Regarding Performance acts as a central note for all performance notes and can be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well; check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes; use chapter 5.4 to check this possibility.

o If it is a specific process step that sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu, or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 seconds per CPU, the WF-BATCH user used up 36 CPU seconds (see the small calculation sketch at the end of this section).

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparing it with the other users (as well as with the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and the wait time and compare these values.
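As a small worked example of this calculation (a sketch only; the one-hour interval and the 36 CPU seconds are the values from above, while the CPU count is an assumption):

   // Share of the available CPU capacity consumed by WF-BATCH in the analysis interval
   public class CpuShareSketch {
       public static void main(String[] args) {
           double intervalSeconds = 3600;   // analysis time frame (1 hour)
           int cpus = 4;                    // assumed number of available CPUs
           double wfBatchCpuSeconds = 36;   // CPU time of user WF-BATCH from ST03N
           double sharePercent = wfBatchCpuSeconds / (intervalSeconds * cpus) * 100;
           System.out.printf("WF-BATCH used %.2f%% of the available CPU capacity%n", sharePercent);
       }
   }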

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, then statements may take longer than for small database tables.

Use transaction ST05 to collect a SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configure the transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check if the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

In many scenarios, inbound processing takes up the largest share of the processing time within BPE.

Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can also increase the runtime for individual messages (latency) due to the delay in the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following prerequisites are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests have shown a high potential for throughput improvements, up to a factor of 4.7 in the tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE: Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but they can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example a File adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example JMS_http://sap.com/xi/XI/SystemReceive for the JMS receiver asynchronous queue).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel (see the configuration sketch after this list). But, as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.
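The consumer threads per queue are typically maintained via the messaging connection definition of the Messaging System service (in NWA or the offline configuration editor). The sketch below shows the commonly documented value format with its defaults; treat the exact service and property names as assumptions and verify them for your release before changing anything:

   messaging.connectionDefinition =
     (name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener,
      pollInterval=60000, pollAttempts=60,
      Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)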

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for example File) and leaves the PI system using a J2EE Engine adapter (for example JDBC) asynchronously, the steps are as follows:

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis is based on the information provided in the audit log. With SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem/bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not solve such a performance problem/bottleneck. There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below:

o Polling Adapters (JDBC, Mail, File)

At the sender side these adapters use the Adapter Framework scheduler, which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling, and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Parallel polling would not help here anyway: since the channels would execute the same SELECT statement on the database or pick up files with the same file name, parallel processing would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data waiting to be polled. If the volume is still too high, you should think about creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side the adapters work sequentially on each server node by default. For example, for JDBC only one UPDATE statement can be executed per communication channel (independent of the number of consumer threads configured in the Messaging System), and all the other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, then two messages are processed in parallel on one J2EE server node. The parallel execution of these statements at database level of course depends on the nature of the statements and the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter uses a push mechanism on the PI sender side by default. This means the data is pushed by the sending MQ provider. By default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so, you can specify a polling interval in the PI communication channel and PI will be the initiator of the communication. Here, too, the Adapter Framework scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts of the message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitations in the number of requests it can execute in parallel; the limiting factor here is the number of FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning the SOAP sender adapter. On the receiver side, the parallelism depends on the number of threads defined in the messaging system and the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated from the application thread pool directly and are therefore not available for any other tasks; the number of initial threads should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be used. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception: (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

This should also be done carefully, since these threads are taken from the J2EE application thread pool. A very high value can therefore cause a bottleneck on the J2EE Engine and major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side, the parallelization depends on the configuration mode chosen. In Manual Mode the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound resource adapter (RA): via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode. Hence this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, it is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work; this is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log marks the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning the SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself.

The incoming SOAP requests are limited by the FCA threads available for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/MessageServlet. In case one interface is facing a very high load or slow backend connections, this can block the available FCA threads and heavily impact the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point with its own set of FCA threads. In case of high load on such an interface, the other SOAP sender interfaces are then not affected by a shortage of FCA threads.


To use this new feature, you can specify one of the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time either the request or the receive queue: the receive queue for asynchronous messages and the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of the adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc_AAE adapter is given in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Like all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP will be processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments per IDoc would consume roughly 50 MB during processing. In such cases it is important to e.g. lower the package size and/or use large message queues, as outlined in chapter Large message queues on PI Adapter Engine.
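A back-of-the-envelope sketch of this rule of thumb (the 5 KB per segment figure is the one quoted above; actual consumption varies with release and IDoc content):

   // Rough memory estimate for processing an IDoc package in the IDoc_AAE adapter
   public class IdocMemoryEstimate {
       static final int KB_PER_SEGMENT = 5; // rule of thumb from the text above

       static long estimateKb(int idocsInPackage, int segmentsPerIdoc) {
           return (long) idocsInPackage * segmentsPerIdoc * KB_PER_SEGMENT;
       }

       public static void main(String[] args) {
           long kb = estimateKb(5, 2000);
           System.out.println(kb + " KB ~= " + (kb / 1024) + " MB"); // roughly 50 MB
       }
   }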

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users operating on the system.


Important:

On the ABAP stack, packaging is active by default (in 7.1 and higher), as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the dialog work process usage on the ERP receiver side. Especially for migration projects from dual-stack PI to AEX/PO it is important to evaluate packaging, to avoid any negative impact on the receiving ECC due to missing packaging.

Packaging on Java generally works differently than on ABAP. While in ABAP the aggregation is done on the level of the individual qRFC queue (which always contains messages for one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore packages are built outside of the Adapter Engine queues: the messages to be packaged are forwarded to a bulk handler, which waits until either the time for the package is exceeded or the number of messages or the data size is reached. After this, the package is sent by a bulk thread to the receiving backend system. Packaging on Java is always done per server node individually; if you have a high number of server nodes, packaging works less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message status stays "Delivering" throughout all steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message was waiting almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters in Messaging System service as

described in Note 1913972

o messaging.system.msgcollector.enabled: Enables or disables packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk (default: 60 seconds)

o messaging.system.msgcollector.maxMemTotal: The maximum memory that can be used by the message collector

o messaging.system.msgcollector.poolSize: The number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, and the configuration of the IDoc posting on the ERP receiver side (described in chapter IDoc posting on receiver side). Tuning of these threads might therefore be very important. These threads cannot yet be monitored with any PI tool or Wily Introscope.


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory. While the SOAP adapter in XI 3.0 protocol lets you define three different packaging KPIs, this is currently not possible for the IDoc adapter; there you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Because of this, and because of the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler, you have to set the parameter scheduler.relocMode of the Adapter Framework


service property to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow for better balancing of the incoming load across the available server nodes if the files are coming in at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second), even lower values (e.g. -50) can be configured.

If many files are placed in the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence, proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon under Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node a communication channel is polling (Status = "Active"). You can also see when the channel last polled and when it will poll next.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler: http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs on only one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in brackets [60000] is the polling interval in ms - in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": Not scheduled any longer (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System is usually closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, then of course the messages get queued in the Messaging System and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the Messaging System is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the Messaging System by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is if only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows easy graphical analysis of the queues and thread usage of the Messaging System across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, which shows all the backlogs in the queues. In the screenshot below, a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it forwards messages to the adapter queue as soon as free adapter-specific consumer threads are available. The analysis must start in the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size, you can directly jump to a more detailed view, where you can see that the file adapter was causing the backlog.


To see the consumer thread usage you can then follow the link to the file adapter. In the screenshot below you can see that all consumer threads were exhausted during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the Messaging System does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the Messaging System have to be configured in the NWA using the service "XPI Service: AF Core" and the property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, using the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue is based on the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the property above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and the property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging / Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the Messaging System is highly critical for synchronous scenarios due to the timeouts that might occur. Therefore, the number of synchronous threads always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name would be "Call Queue" if the message was synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck of consumer threads on the sender queues in the Messaging System.

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the Messaging System. Multiple threads are used for synchronous messages: the adapter thread puts the message into the Messaging System queue and waits until the Messaging System delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread of the Call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the Send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response back to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time the message waited in the Messaging System for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the Messaging System (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5. For example, out of every 100 messages reloaded from the dispatcher queue, roughly 75 belong to high-priority, 20 to medium-priority, and 5 to low-priority interfaces, provided messages of all priorities are waiting.

Based on this approach, you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization, available in pimon under Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily, as shown below.

You can find more details on configuring prioritization within the AAE in the SAP Online Help; navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the Messaging System. For example, even though you may have configured 20 threads for the JDBC receiver, one communication channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the Messaging System but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or for a remote system with a small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


Because of this, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in the NWA in the service XPI Service: Messaging System. With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP06) this is a global parameter that affects all adapters; it should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads per server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it is possible for four interfaces to get resources in parallel before all threads are blocked. For more information see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
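
A possible combination of the two settings discussed above is sketched below. The property and service names are those referenced in the SAP Notes mentioned in this chapter; the values are only an illustration and have to be adapted to your load and hardware:

  Service "XPI Service: Messaging System":
    messaging.system.queueParallelism.maxReceivers = 5

  Service "XPI Service: AF Core", property messaging.connectionDefinition (JDBC-specific entry added after the global entry):
    (name=JDBC_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=20, Call.maxConsumers=5, Rqst.maxConsumers=5)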

In the screenshot below you can see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though there are more free SOAP threads available, they are not consumed. Hence, the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue in which message backlogs occur. As mentioned earlier, a backlog should usually occur in the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface is facing a high backlog, there will consequently always be free consumer threads, and therefore the dispatcher queue can dispatch the messages to the adapter-specific queue. Hence, the backlog will appear in the adapter-specific queue, so that the message prioritization no longer works properly.


By default, the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, and additional restrictions on the number of available threads can therefore be very critical. It is thus usually advisable not to limit the threads per interface but to increase the overall number of available threads. If you have many high-volume synchronous scenarios with different priorities that run in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to Recv, ICoAll as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP06

With Note 1916598 - NF: Receiver Parallelism per Interface, an enhancement was introduced that allows the maximum parallelization to be specified at a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can then specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule for a given interface is specified, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. It also works across protocols, so that it could, for example, be used to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA → SOA → Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue was also changed so that it is aware of the maxReceivers settings. This means that the backlog will now again build up in the dispatcher queue and the prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The configured interface pattern can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received. When choosing "Stateless (XI30-Compatible)", no check of the data type is performed, and the overhead is therefore avoided. For interfaces with good data quality and high message throughput, this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in the module processing, then you must analyze the runtime of the modules used in the communication channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes, for example to transform an EDI message before sending it to the partner, or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One module is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that inserts the data into the Messaging System queues.

In the audit log you get a first impression of the duration of each module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.
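
If the time is spent in a custom module, it helps a lot when the module itself writes audit log entries around its expensive steps, so that the duration becomes visible exactly as in the example above. The sketch below shows the idea for a hypothetical look-up module; the package, class, and method names are taken from the standard PI 7.1x module and messaging APIs and should be verified against your release, and the EJB session bean boilerplate and deployment descriptors are omitted:

  import com.sap.aii.af.lib.mp.module.Module;
  import com.sap.aii.af.lib.mp.module.ModuleContext;
  import com.sap.aii.af.lib.mp.module.ModuleData;
  import com.sap.aii.af.lib.mp.module.ModuleException;
  import com.sap.engine.interfaces.messaging.api.Message;
  import com.sap.engine.interfaces.messaging.api.MessageKey;
  import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;
  import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;
  import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

  public class SimpleLookupModule implements Module {  // EJB session bean boilerplate omitted

      public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData) throws ModuleException {
          Message msg = (Message) inputModuleData.getPrincipalData();
          MessageKey key = new MessageKey(msg.getMessageId(), msg.getMessageDirection());

          AuditAccess audit = null;
          try {
              audit = PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();
          } catch (Exception e) {
              // auditing is optional for the module logic - continue without it
          }

          long start = System.currentTimeMillis();
          if (audit != null) {
              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS, "SimpleLookupModule: remote look-up started");
          }

          // ... the expensive step, e.g. a JCo or JDBC look-up, would be executed here ...

          if (audit != null) {
              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                      "SimpleLookupModule: remote look-up took " + (System.currentTimeMillis() - start) + " ms");
          }
          return inputModuleData;
      }
  }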

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of different modules. If there is one that has been running for a very long time, it is very easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it is taking that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module writes additional information to the audit log so that such steps can be detected. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.

6.4 Java-only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps that were so far executed in the ABAP pipeline (receiver determination, interface determination, and mapping) are also executed by the services of the Adapter Engine.

6.4.1 General performance gain when using Java-only scenarios

The major advantage of AAE processing is the reduced overhead, because the context switches between ABAP and Java are avoided. Thus, the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use the Integrated Configuration to achieve the best performance. The best tuning option is therefore to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP-internal measurements performed on 7.1 releases, the throughput as well as the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a given scenario.

2) This is valid for all adapter types; huge mapping runtimes, slow networks, or slow (receiver) applications reduce the throughput and therefore, rated over the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is seen for small payloads (comparison of 10k, 50k, and 500k) and for asynchronous messages.

6.4.2 Message Flow of Java-only scenarios

All the Java-based tuning options mentioned in chapters Analyzing the (Advanced) Adapter Engine and Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; just the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail.

1) Enter the JMS sender adapter

2) Put into the dispatcher queue of the Messaging System

3) Forwarded to the JMS Send queue of the Messaging System

4) Message is taken by a JMS Send consumer thread

a. No message split used

In this case the JMS consumer thread performs all the necessary steps previously done in the ABAP pipeline (such as receiver determination, interface determination, and mapping) and then also transfers the message to the backend system (the remote database in our example). Thus, all the steps are executed by one thread only.

b. Message split used (1:n message relation)

In this case there is a context switch. The thread taking the message out of the queue processes the message up to the message split step. It creates the new messages and puts them into the Send queue again, from where they are taken by different threads (which then map the child messages and finally send them to the receiving system).

As you can see in this example, for an Integrated Configuration one thread performs all the different steps for a message. The consumer thread is not available for other messages during the execution of these steps. Tuning of the Send queue (the Call queue for synchronous messages) is therefore much more important for scenarios using an Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, then you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP-to-SOAP scenario. In the highlighted areas you can see that all the steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST" have to be followed.


6.4.3 Avoid blocking of Java-only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in the case of a slow or hanging receiver backend. Since Java-only interfaces use only Send queues, restricting the consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface is no solution.

Because of this, an additional property messaging.system.queueParallelism.queueTypes of the service SAP XI AF MESSAGING was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06 as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization is usually highly critical for synchronous interfaces. We therefore generally recommend setting messaging.system.queueParallelism.queueTypes to Recv, IcoAsync only in most cases.

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting from PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to the LOGGING or LOGGING_SYNC configuration on the ABAP stack. In later versions of PI you are able to do the configuration at interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish between staging (versioning) and logging. An overview is given below.

For details about the configuration please refer to the SAP online help: Saving Message Versions.

Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and can cause a decrease in performance.

The persistence steps can be seen directly in the audit log of a message.

The Message Monitor also shows the persisted versions.


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. You therefore have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with stateful applications in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In the case of high backlogs this can delay the overall message processing of the interface, because one server node has more messages assigned to it than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure that the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates described in Notes 1756963 and 1757922. This has also been included in the PI initial setup wizard so that the task is executed automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP load balancing to initial setup).


For the example given above, we could see a much better load distribution after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: The load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course, the CPU is also a limiting factor, but this will be discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP JVM is used on all platforms. Therefore, the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc. Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an out-of-memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern, that is, the memory usage increases over time but goes back down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low-volume times (night-time).

Pay attention to GCs with a long duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long GC runtimes; GCs of more than 15 minutes have been observed on swapping systems.
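
If no GC output is written yet, the logging options have to be added to the Java VM parameters of each server node (typically maintained with the ConfigTool). The following is a minimal, commonly used set of HotSpot options; treat it as an illustration and check SAP Note 1248926 and the current Zero Admin template before changing any VM parameters:

  -verbose:gc
  -XX:+PrintGCDetails
  -XX:+PrintGCTimeStamps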


Different tools exist for the Garbage Collection Analysis

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start it, call the Root Cause Analysis section; in the Common Tasks list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots, with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low-volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found via J2EE Overview → J2EE GC Overview and shows important KPIs for the GC, such as the GC count or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage; however, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data under Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the used memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gc.prf files of the server node folders, or it can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP JVM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP JVM. You should therefore check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory:

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on PI Adapter Engine.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options:

1) Increase the Java heap of your server nodes. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP JVM can handle these heap sizes without major impact on performance; larger heap sizes, however, can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size of the J2EE Engine automatically adapts the new area of the heap due to a dynamic configuration (in newer J2EE versions this is by default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication, it is generally recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on scaling PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o The following options exist for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage of a PI engine. Different views exist for application and system threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for memory mentioned above. This monitor can be found by navigating to Availability and Performance Management → Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found at Availability and Performance Management → Resource Monitoring → History Reports. An example for the application thread usage is shown below.


Check whether the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all the threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java → Threads and check for any threads in red status (> 20 secs running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user making the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java → Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread that is responsible for HTTP traffic was introduced with SAP NetWeaver PI 7.1: FCA server threads. The FCA server threads are responsible for receiving HTTP calls on the Java side (now that the Java Dispatcher is no longer available). FCA threads also use a thread pool. Fifteen FCA threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of the service HTTP Provider.

FCA server threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP adapter servlet (shared by all channels, as described in Tuning the SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads are spawned, but they are only available for parallel incoming HTTP requests using different entry points. This ensures that in case of problems with one application not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount threads may be used in the J2EE Engine; with the recommended value of 50, this corresponds to at most 1,000 FCA threads.

There is currently no standard monitor available for FCA threads, except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA server threads in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default for an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS 6.20 or higher includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely that you will face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course, such a long-running message blocks the queue, and all messages behind it face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with the parameter EO_INBOUND_PARALLEL and the sub-parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).
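
For example, the following entries in the Integration Engine configuration (transaction SXMB_ADM) would raise the proxy queue parallelization; the values are illustrative, and the prerequisite on qRFC resources mentioned above applies:

  Category: TUNING   Parameter: EO_INBOUND_PARALLEL   Sub-parameter: SENDER     Value: 20   (number of XBTS queues)
  Category: TUNING   Parameter: EO_INBOUND_PARALLEL   Sub-parameter: RECEIVER   Value: 40   (number of XBTR queues)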

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (receiver) and Note 1831889 (sender). After implementing these two Notes, interface-specific queues are used, and tuning of these queues at interface level becomes possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be improved further; messages of other, more business-critical interfaces would otherwise be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general, the processing time there should be fast, but in the case of high-volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names for the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter SENDER / SENDER_BACK: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub-parameter: Determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub-parameter RECEIVER: Should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub-parameter: Determines the general number of queues per interface.

Using a sub-parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name, or the same interface name but different services. An illustrative example is given after this list.
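
The entries below sketch how such a configuration could look for the receiver side. The values are purely illustrative, and <interface-specific ID> is a placeholder for the sender/interface identifier whose exact syntax is described in the Notes mentioned above:

  EO_QUEUE_PREFIX_INTERFACE      Sub-parameter: RECEIVER                   Value: 1    (activate interface-specific queues)
  EO_INBOUND_PARALLEL_RECEIVER   (no sub-parameter)                        Value: 5    (default number of queues per interface)
  EO_INBOUND_PARALLEL_RECEIVER   Sub-parameter: <interface-specific ID>    Value: 15   (more queues for a high-priority interface)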

This new feature also replaces the previously existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules using the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message depends on two elements: the PI header elements, which have a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface; furthermore, many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the overhead caused by the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system, for example by causing an out-of-memory exception. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI, and not to the size of the file or IDoc being sent to PI. You can use the runtime header of the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead caused by the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below, the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.


Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface, we recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In the case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to, for example, 5000 to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via the parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3, to avoid overloading the Java memory with parallel large message requests.
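
A possible configuration in SXMB_ADM, assuming that messages above 5 MB should be treated as large and that two parallel large message queues are acceptable (the values are illustrative; EO_MSG_SIZE_LIMIT is interpreted in KB, so 5000 corresponds to the 5 MB mentioned above):

  Category: TUNING   Parameter: EO_MSG_SIZE_LIMIT            Value: 5000
  Category: TUNING   Parameter: EO_MSG_SIZE_LIMIT_PARALLEL   Value: 2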

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited, to avoid overloading the Java heap. This is based on so-called permits, which define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well, to determine the degree of parallelization. By default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To illustrate this, let us look at an example using the default values. Let us assume we have six messages waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB, and message F 40 MB. Message A is smaller than the permit size, is therefore not considered large, and can be processed immediately. Message B requires 1 permit and message C requires 5; since enough permits are available, their processing starts (status DLNG). For message D, however, all available 10 permits would be required. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV), since it exceeds the maximum number of defined permits; in that case the message would have to be restarted manually. Message E requires 5 permits and also cannot be scheduled. But since there are 4 permits left, message F is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

89

permits Only after message E and C have finished message D can be scheduled consuming all available

permits

The example above shows the potential delay a large message could face due to the waiting time for permits. But the assumption is that large messages are not time critical and therefore an additional delay is less critical than a potential overload of the system.
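The permit accounting can also be illustrated with a small sketch. The following Python model only mimics the scheduling logic described above; the permit size of 10 MB, the 10 available permits, and the one-permit-per-started-10-MB rule are assumptions derived from the defaults and the example in this chapter, not the actual Messaging System implementation.

# Simplified model of the permit logic for large messages in the Messaging System.
# Assumptions: permit size 10 MB, 10 permits in total, blacklisting enabled.
import math

PERMIT_SIZE_MB = 10
TOTAL_PERMITS = 10

def permits_needed(size_mb):
    # Messages below the permit size need no permit; larger ones need one permit per started 10 MB.
    return 0 if size_mb < PERMIT_SIZE_MB else math.ceil(size_mb / PERMIT_SIZE_MB)

def schedule(messages, blacklisting=True):
    free = TOTAL_PERMITS
    for name, size_mb in messages:
        need = permits_needed(size_mb)
        if blacklisting and need > TOTAL_PERMITS:
            print(f"{name} ({size_mb} MB): NDLV - exceeds {TOTAL_PERMITS} permits")
        elif need <= free:
            free -= need
            print(f"{name} ({size_mb} MB): DLNG - consumes {need} permits, {free} left")
        else:
            print(f"{name} ({size_mb} MB): waits - needs {need} permits, only {free} free")

schedule([("A", 5), ("B", 10), ("C", 50), ("D", 150), ("E", 50), ("F", 40)])

Run with the six messages from the example, the sketch reproduces the initial scheduling decision for each message: A, B, C, and F start processing, E has to wait, and D is blacklisted.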

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. Per default, this is only done after the Receiver Determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have a very complex extended receiver determination or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON → Monitoring → Adapter Engine Status. The number of threads corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limitation of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus, the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity, it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if using hardware virtualization).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, where one is facing a temporary CPU overload and the other a permanent one.

Also, the NWA offers a view on the CPU activity from NetWeaver 7.3 on via Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own report based on the "CPU utilization" data as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below.

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are in a wait queue before they are assigned to a free CPU. As long as the average remains at one process for each available CPU, the CPU resources are sufficient. Once there is an average of around three processes for each available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis View → TOP CPU: Which thread uses up the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is of course to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer in the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, then it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out if it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapter to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack since it influences the Java GC behavior directly. Therefore, paging should be avoided in every case for a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times as described in Chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6, respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter will be mainly based on ABAP transactions to do a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section. The official documentation can be found in the Online Help. Monitoring the database via the DBACOCKPIT transaction (ABAP based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here; instead, only a few key capabilities will be demonstrated. If you want to e.g. see the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows you which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average, and maximum processing time of the individual SQL statements, and you can therefore identify the expensive statements on your system.

Per default, the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04.


Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads; the lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on at least 15 million total reads. This number of reads ensures that the database is in an equilibrated state (a calculation example is given after this list).

Ratio of user calls to recursive calls: A good performance is indicated by ratio values greater than 2. Otherwise, the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines because more and more SQL statements get parsed in the meantime.

Number of reads for each user call: If this value exceeds 30 blocks for each user call, this indicates an expensive SQL statement.

Check the value of Time/User call: Values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: A ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.
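As a purely hypothetical calculation example for the data buffer quality: with 15,000,000 total reads and 600,000 physical reads, the quality would be (1 - 600,000 / 15,000,000) x 100 = 96%, which is above the recommended 94%; with 1,200,000 physical reads it would drop to 92% and would warrant a closer look.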

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools → Administration → Monitor → Performance → Database → Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: The default values displayed in the section Server Engine are relative values; to display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu → Performance → Database. A snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, which is set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB), with min server memory (MB) <> max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, which is set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, which is set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and the catalog and package cache information, go to transaction ST04 and choose Performance → Database, section Buffer Pool (or section Cache, respectively).



Buffer Pools Number: Number of buffer pools configured in this system.

Buffer Pools Total Size: The total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance → Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: This represents the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: In addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100 (see the calculation example after this list).

Index hit ratio: In addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: The total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: The total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: Read or write requests performed by db2agents.


Catalog cache size: Maximum size of the catalog cache that is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: Ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: Number of times that an insert in the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: Maximum size of the package cache that is used to maintain the most frequently accessed sections of the package.

Package cache quality: Ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: Number of times that an insert in the package cache failed because the package cache was full (increase the package cache size).
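As a purely hypothetical calculation example for the hit ratios listed above: with 2,000,000 data logical reads and 100,000 data physical reads, the data hit ratio would be (2,000,000 - 100,000) / 2,000,000 * 100 = 95%.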

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 System allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful in identifying that suitable indexes are missing.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (releases lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out into the data cache rather than onto the hard disk.

o Log entries are not written directly to the log volume but first into a buffer (LOG_IO_QUEUE), so as to be able to write several log entries together asynchronously with one I/O to the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which you need to write log entries from the LOG_IO_QUEUE onto the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, then all transactions that want to write log entries must wait until free memory becomes available again in the LOG_IO_QUEUE. In the case of LOG_QUEUE_OVERFLOWS (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.


o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses → Performance → Database Analyzer → Bottlenecks. You can use this function to activate the corresponding analysis tool, dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR, SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQLPLUS for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are:

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, then this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration → Specific Configuration, and search for the following entries:

Category Parameter Subparameter Current Value Default

RUNTIME TRACE_LEVEL <none> <your value> 1

RUNTIME LOGGING <none> <your value> 0

RUNTIME LOGGING_SYNC <none> <your value> 0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the Trace Level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace Level → Set.

Start transaction SMGW, navigate to Goto → Parameters → Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace → Gateway → Reduce Level (or Increase Level, respectively).

Start transaction SM59, check the following RFC destinations for the flag "Trace" on the tab Special Options, and remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used, for example for sending out IDocs; check these as well.


10.2 Business Process Engine

You only have to check the Event Trace in the Business Process Engine.

Procedure

Call transaction SWELS and check if the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve the performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) → Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This will set the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting → Logviewer (direct link nwa/links). Open the view "Developer Trace" and check if you have very frequent recurring exceptions that fill up the trace. Analyze the exceptions and check what is causing them (e.g. a wrong configuration of a Communication Channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

With SAP NetWeaver PI 7.1, the audit log is not persisted for successful messages in the database by default, to avoid performance overhead. Therefore, the audit log is only available in the cache for a limited period of time (based on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" to false in the service XPI Service Messaging System. Details can be found in SAP Note 1314974 - PI 7.1 AF: Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid any performance problems from the additional persistence.

After implementing Note 1611347 - New data columns in Message Monitoring, additional information like the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM, open the log file (Shift + F5 or Goto → Trace File → Show All) and check for errors.

Gateway

Start transaction SMGW, open the log file (Ctrl + Shift + F10 or Goto → Trace → Gateway → Display File) and check for errors.

System Log

Start transaction SM21, choose an appropriate time interval, and search the log entries for errors. Repeat the procedure for remote systems if you are using dialog instances.

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval, and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system and therefore all occurring dumps should be analyzed.

Alerts (CCMS)

Start transaction RZ20 and search for recent alerts.

Work Process and RFC Trace

In the PI work directory, check all files which begin with dev_rfc or dev_w for errors.

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the LogViewer service of the NWA. This is available via Problem Management → Logs and Traces → Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.


APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during the mapping or module processing. The transaction trace will allow you to drill down further into Java performance problems and to distinguish whether it is a pure coding problem or whether it is caused by a look-up to a remote system or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum, and it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right - in the example below around 48 seconds. From top to bottom we can see the call stack of the thread. In general, we are interested in long-running threads at the bottom of the trace view. A long-running block at the bottom means that this is the lowest level coding that was instrumented and which is consuming all the time.

In the example below we see a mapping call that is performing many individual database statements - this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see if the high number of database calls can be summarized in one call.


Another case that is often seen is that a lookup using JDBC to a remote database or RFC to an ABAP system takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the transaction trace that also gives you some details about the statement that was executed.

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting. But the tool can also be used for troubleshooting of performance issues.

General information about the tool can be found in Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 - Performance Problem" is used. As a basic measurement it allows you to take multiple thread dumps in specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).


In addition, the tool allows you to do JVM profiling by either doing JVM Performance tracing or JVM Memory Allocation tracing. This can help to understand in detail which steps of the processing are taking a long time. As output, the tool provides prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) cause a high overhead on the J2EE engine and are therefore not recommended to be used in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

www.sap.com


Wily Introscope is a prerequisite for analyzing performance problems at the Java side. It allows you to monitor the resource usage of multiple J2EE server nodes and provides information about all the important components on the Java stack like the mapping runtime, the messaging system, or the module processor. Furthermore, the data collected by Wily is stored for several days, so it is still possible to analyze a performance problem at a later date. Wily Introscope is provided free of charge for SAP customers. It is delivered as part of Solution Manager Diagnostics but can also be installed separately. For more information, see http://service.sap.com/diagnostics or SAP Note 797147.


2 WORKING WITH THIS DOCUMENT

If you are experiencing performance problems on your Process Integration system, start with Chapter 3 Determining the Bottleneck. It helps you to determine the processing time for the different runtimes within PI: Integration Engine, Business Process Engine, and Adapter Engine or connected Proxy System. Once you have identified the area in which the bottleneck is most probably located, continue with the relevant chapter.

This is not always easy to do because a long processing time is not always an indication of a bottleneck. A long processing time can make sense if a complicated step is involved (Business Process Engine), if an extensive mapping is to be executed (IS), or if the payload is quite large (all runtimes). For this reason, you need to compare the value that you retrieve from Chapter 3 with values that you have received previously, for example. The history data provided for the Java components by Wily Introscope is a big help here. If this is not possible, you will have to work with a hypothesis.

Once the area of concern has been identified (or your first assumption leads you there), Chapters 4, 5, 6, and 7 will help you to analyze the Integration Engine, the Business Process Engine, the PI Adapter Framework, and the ABAP Proxy runtime.

After (or preferably during) the analysis of the different process components, it is important to keep in mind that the bottlenecks you have observed could also be caused by other interfaces processing at the same time. This will lead you to slightly different conclusions and tuning measures. You therefore have to distinguish the following cases based on where the problem occurs:

A: with regard to the interface itself, that is, it occurs even if a single message of this interface is processed

B: with regard to the volume of this interface, that is, many messages of this interface are processed at the same time

C: with regard to the overall message volume processed on PI

This can be analyzed by repeating the procedure of Chapter 3 for A) a single message that is processed while no other messages of this interface are processed and while no other interfaces are running. Then compare this value with B) the processing time for a typical amount of messages of this interface, not simply one message as before. If the values of measurements A) and B) are similar, repeat the procedure with C) a typical volume of all interfaces, that is, during a representative timeframe on your productive PI system or with the help of a tailored volume test.

These three measurements - A) processing time of a single message, B) processing time of a typical amount of messages of a single interface, and C) processing time of a typical amount of messages of all interfaces - should enable you to distinguish between the three possible situations:

o A specific interface has long processing steps that need to be identified and improved. The tuning options are usually limited and a re-design of the interface might be required.

o The mass processing of an interface leads to high processing times. This situation typically calls for tuning measures.

o The long processing time is a result of the overall load on the system. This situation can be solved by tuning measures and by taking advantage of PI features, for example to establish a separation of interfaces. If the bottleneck is hardware-related, it could also require a re-sizing of the hardware that is used.


Chapters 8, 9, 10, and 11 deal with more general reasons for bad performance such as heavy tracing/logging, error situations, and general hardware problems. They should be taken into account if the reason for slow processing cannot be found easily or if situation C from above applies (long processing times due to a high overall load).

Important

Chapter 9 provides the basic checks for the hardware of the PI server. It should not only be used when analyzing a hardware bottleneck due to a high overall load and/or an insufficient sizing, but should also be used after every change made to the configuration of PI to ensure that the hardware is able to handle the new situation. This is important because a classical PI system uses three runtime engines (IS, BPE, AFW). Tuning one engine for high throughput might have a direct impact on the others. With every tuning action applied, you have to be aware of the consequences for the other runtimes and the available hardware resources.


3 DETERMINING THE BOTTLENECK

The total processing time in PI consists of the processing time in the Integration Server, the processing time in the Business Process Engine (if ccBPM is used), as well as the processing time in the Adapter Framework (if adapters other than IDoc, plain HTTP, or ABAP proxies are used). In the subsequent chapters you will find a detailed description of how to obtain these processing times.

For reasons of completeness, we will also have a look at the ABAP Proxy runtime and the available performance evaluations.

3.1 Integration Engine Processing Time

You can use an aggregated view of the performance data for the Integration Engine as shown below (details about activation in Note 820622 - Standard jobs for XI performance monitoring).

Procedure

Log on to your Integration Engine and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click "Performance Monitoring". Change the display to "Detailed Data Aggregated", select 'IS' as the Data Source (or 'PMI' if you have configured the Process Monitoring Infrastructure) and "XIIntegrationServer<your_host>" as the component. Then choose an appropriate time interval, for example the last day. You have to enter the details of the specific interface you want to monitor here.

The interesting value is the "Processing Time [s]" in the fifth column. It is the sum of the processing times of the single steps in the Integration Server. For now you are not interested in the single steps that are listed on the right side, since you are still trying to find out the part of the PI system where the most time is lost. Chapter 4.4 will work with the individual steps.


If you do not know which interface is affected yet, you first have to get an overview. Instead of navigating to Detailed Data Aggregated in the Runtime Workbench, choose Overview Aggregated. Use the button Display Options and check the options for sender component, receiver component, sender interface, and receiver interface. Check the processing times as described above.

3.2 Adapter Engine Processing Time

Procedure

Log on to your Integration Server and call transaction SXMB_IFR. In the browser window, follow the link to the Runtime Workbench. In there, click Message Monitoring, choose Adapter Engine <host> and from Database from the drop-down lists, and press Display. Enter your interface details and "Start" the selection. Select one message using the radio button and press Details.

Calculate the difference between the start and end timestamp indicated on the screen.

Do the above calculation for the outbound (AFW → IS) as well as the inbound (IS → AFW) messages.

The audit log of successful messages is no longer persisted by default in SAP NetWeaver PI 7.1, in order to minimize the load on the database. The audit log has a high impact on the database load since it usually consists of many rows that are inserted individually into the database. In newer releases, a cache was therefore implemented that keeps the audit log information for a period of time only. Thus, no detailed information is available any longer from a historical point of view. Instead, you should use Wily Introscope for historical analyses of the performance of successful messages.

The audit log can be persisted in the database to allow historical analysis of performance problems on the Java stack. To do so, change the parameter "messaging.auditLog.memoryCache" from true to false for the service "XPI Service Messaging System" using the NetWeaver Administrator (NWA), as described in SAP Note 1314974. Note: Only do this temporarily if you have identified the bottleneck on the AFW. First try using Wily to determine the root cause of the problem.


Due to the above-mentioned limitations, Wily Introscope is the right tool for performance analysis of the Adapter Engine. Wily persists the most important steps like queuing, module processing, and average response time in historical dashboards (aggregated data per interval for all server nodes of a system or all systems, with filters possible, etc.).


Especially for the Java-only runtime using Integrated Configuration, Wily shows the complete processing time after the initial sender adapter steps (receiver determination, mapping, and time in the backend call), as can be seen in the Message Processing Time dashboard below.

Using the "Show Minimum and Maximum" functionality, deviations from the average processing time can be seen. In the example below, a message processing step takes up to 22 seconds. The reason for this could either be a slow mapping or a delay in the call to the receiver system (in the example below, the Java IDoc adapter is used to transfer the data to ERP).

In case of high processing times in one of your steps, further analysis might be required using thread dumps or the Java Profiler, as outlined in the appendix section A.2 XPI Inspector for troubleshooting and performance analysis.

3.2.1 Adapter Engine Performance Monitor in PI 7.31 and higher

Starting from PI 7.31 SP4, there is a new performance monitor available for the Adapter Engine. More information on the activation of the performance monitor can be found in Note 1636215 - Performance Monitoring for the Adapter Engine. Similar to the ABAP monitor in the RWB, the data is displayed in an aggregated format. On the PI start page, use the link "Configuration and Monitoring Home". Go to "Adapter Engine" and select "Performance Monitor". The data is persisted on an hourly basis for the last 24 hours and on a daily basis for the last 7 days. The data is further grouped on interface level and per Java server node. With the information provided you can therefore see the minimum, maximum, and average response time of an individual interface on a specific Java server node. All individual steps of the message processing, like time spent in the Messaging System queues or in Adapter modules, are listed. In the example below you can see that most of the time (5 seconds) is spent in a module called SimpleWaitModule. This would be the entry point for further analysis.

Starting with 7.31 SP10 (7.40 SP05), this data can be exported to Excel in two ways, Overview or Detailed (details in SAP Note 1886761).

3.3 Processing Time in the Business Process Engine

Procedure

Log on to your Integration Server and call transaction SXMB_MONI. Adjust your selection in a way that you select one or more messages of the respective interface. Once the messages are listed, navigate to the Outbound column (or Inbound if your integration process is the sender) and click on PE. Alternatively, you can select the radio button for "Process View" on the entrance/selection screen of SXMB_MONI.

To judge the performance of an integration process, it is essential to understand the steps that are executed. A long-running integration process in itself is not critical because it can wait for a trigger event or can include a synchronous call to a backend. Therefore, it is important to understand the steps that are included before analyzing an individual workflow log. For example, the screenshot below shows a long waiting time after the mapping; this is perhaps due to an integrated wait step.

Calculate the time difference between the first and the last entry. Note that this time for the workflow does not include the duration of RFC calls, for example. To see this processing time, navigate to the "List with Technical Details" (second button from the left on the screenshot below, or Shift + F9). Repeat this step for several messages to get an overview of the most time-consuming steps.

Please note that the processing time mentioned above does not include the waiting time of the messages in the ccBPM queues. Therefore, it is essential to monitor a queue backlog in ccBPM queues as discussed in section "Queuing and ccBPM (or increasing Parallelism for ccBPM Processes)".


Via transaction SWI2_DURA you can display an aggregated performance overview across all ccBPM processes running on your PI system. On the initial screen you have to choose "(Sub-)Workflow" and the time range you would like to look at.


The results here allow you to compare the average performance of a process. It shows you the processing time based on barriers. The 50% barrier means that 50% of the messages were processed faster than the value specified. Furthermore, you also see a comparison of the processing time to e.g. the day before. By adjusting the selection criteria of the transaction, you can therefore get a good overview of the normal processing time of the integration process and can judge whether there is a general performance problem or just a temporary one.

The number of process instances per integration process can easily be checked via transaction SWI2_FREQ. The data shown allows you to check the volume of the Integration Processes and to judge whether your performance problem is caused by an increase in volume, which could cause a higher latency in the ccBPM queues.

New in PI 7.3 and higher

Starting from PI 7.3, a new monitoring for ccBPM processes is available. This monitor can be started from transaction SXMB_MONI_BPE → Integration Process Monitoring (also available in "Configuration and Monitoring Home" on the PI start page). This is a new browser-based view that allows a simplified and aggregated view of the PI Integration Processes.

On the initial screen you get an overview of all the Integration Processes executed in the selected time interval. Therefore, you can immediately see the volume of each Integration Process.

From there you can navigate to the Integration Process facing the performance issues and look at the individual process instances and their start and end times. Furthermore, there is a direct entry point to see the PI messages that are assigned to this process.


Choosing one specific process instance ID, you will see the time spent in each individual processing step within the process. In the example below you can see that most of the time is spent in the Wait Step.


4 ANALYZING THE INTEGRATION ENGINE

If chapter Determining the Bottleneck has shown that the bottleneck is clearly in the Integration Engine, there are several transactions that can help you to analyze the reason for this. To understand why this selection of transactions helps to analyze the problem, it is important to know that the processing within the Integration Engine is done within the pipeline. The central pipeline of the Integration Server executes the following main steps:

Central Pipeline Steps Description of Central Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

PLSRV_XML_VALIDATION_RS_INB XML Validation Inbound Channel Response

PLSRV_MAPPING_RESPONSE Response Message Mapping

PLSRV_XML_VALIDATION_RS_OUT XML Validation Outbound Channel Response

The last three steps highlighted in bold are available in the case of synchronous messages only and reflect the time spent in the mapping of the synchronous response message.

The steps indicated in red are new in SAP NetWeaver PI 7.1 and higher and can be used to validate the payload of a PI message against an XML schema. These steps are optional and can be executed at different points in the PI pipeline processing, for example before and after a mapping (as shown above).

With PI 7.31, an additional option of using an external virus scan during PI message processing can be activated as described in the Online Help. The virus scan can be configured at multiple steps in the pipeline, e.g. after the mapping (Virus Scan Outbound Channel Request), when receiving a synchronous response (Virus Scan Inbound Channel Response), and after the response mapping (Virus Scan Outbound Channel Response).

It is important to know that these pipeline steps are executed based on qRFC Logical Units of Work (LUWs) using dialog work processes.


With 7.11 and higher versions, you can activate the Lean Message Search (LMS) to be able to search for payload content. This is described in more detail in chapter Long Processing Times for "LMS_EXTRACTION".

4.1 Work Process Overview (SM50 / SM66)

The work process overview is the central transaction to get an overview of the current processes running in the Integration Engine. For message processing, PI only uses dialog work processes (DIA WP). Therefore, it is essential to ensure that enough DIA WPs are available.

The process overview is only a snapshot analysis. You therefore have to refresh the data multiple times to get a feeling for the dynamics of the processes. The most important questions to be asked are as follows:

How many work processes are being used on average? Use the CPU Time (clock symbol) to check that not all DIA WPs are used. If all DIA WPs have a high CPU time, this indicates a WP bottleneck.

Which users are using these processes? It depends on the usage of your PI system. All asynchronous messages are processed by the QIN scheduler, which will start the message processing in DIA WPs. The user that is shown in SM66 will be the one that triggered the QIN scheduler. This can e.g. be a communication user for a connected backend (usually a copy of PIAPPLUSER) or the user used to send data from the Adapter Engine (per default PIAFUSER).

Activities related to the Business Process Engine (ccBPM) are indicated by user WF-BATCH. If you see high WF-BATCH activity, take a look at chapter 5.1.

Are there any long-running processes, and if so, what are these processes doing with regard to the reports used and the tables accessed?

o As a rule of thumb for the initial configuration, the number of DIA work processes should be around six times the number of CPUs in your PI system (rdisp/wp_no_dia = 6 to 8 times the number of CPUs / cores, based on SAP Note 1375656 - SAP NetWeaver PI System Parameters). A short calculation example follows below.
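As a purely illustrative sizing calculation: a PI server with 8 CPU cores would, following this rule of thumb, be configured with roughly 8 x 6 = 48 up to 8 x 8 = 64 dialog work processes; the exact value must still be validated against the available memory and the recommendations of SAP Note 1375656.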

If you would like to get an overview for an extended period of time without actually refreshing the transaction at regular intervals, use report /SDF/MON and schedule the Daily Monitoring. It allows you to collect metrics such as the CPU usage, the number of free dialog work processes and, most importantly, a snapshot of the work process activity as provided in transactions SM50 and SM66. The frequency of the data collection can be as low as 10 seconds and the data can be viewed after the configurable timeframe of the analysis. Note: /SDF/MON is only intended for the analysis of performance issues and should only be scheduled for a limited period of time.

If you have Solution Manager Diagnostics installed and all the agents connected to PI, you can also monitor the ABAP resource usage using Wily Introscope. As you can see in the dashboard below, you can use it to monitor the number of free work processes (the prerequisite is that the SMD agent is running on the PI system). The major advantage of Wily Introscope is that this information is also available for the past, which allows an analysis after the problem has occurred in the system.


4.2 qRFC Resources (SARFC)

Depending on the configuration of the RFC server group, not all dialog work processes can be used for qRFC processing in the Integration Server. As stated earlier, all (ABAP-based) asynchronous messages are processed using qRFC. Therefore, tuning the RFC layer is one of the most important tasks to achieve good PI performance.

In case you have a high volume of (usually runtime critical) synchronous scenarios, you have to ensure that enough DIA WPs are kept free by the asynchronous interfaces running at the same time. Since this is a very difficult tuning exercise, we usually recommend implementing runtime critical synchronous interfaces using Java-only configuration (ICO) whenever possible. If this is not possible, you have to ensure via SARFC tuning that enough work processes are kept free to process the synchronous messages.

The current resource situation in the system can be monitored using transaction SARFC.

Procedure

First check which application servers can be used for qRFC inbound processing by checking the AS Group assigned in transaction SMQR.

Call transaction SARFC and refresh several times, since the values are only snapshot values.

Is the value for "Max Resources" close to your overall number of dialog work processes? Max Resources is a fixed value and describes how many DIA work processes may be used for RFC communication. If the value is too low, your RFC parameters need tuning to provide the Integration Server with enough resources for the pipeline processing. The RFC parameters can be displayed by double-clicking on "Server Name".

o A good starting point is to set the values as follows (see the profile parameter example below):

Max. no. of logons = 90%

Max. disp. of own logons = 90%

Max. no. of WPs used = 90%

Max. wait time = 5

Min. no. of free WP = 3-10 (depending on the size of the application server)

These values are specified in SAP Note 1375656 - SAP NetWeaver PI System Parameters (this parameter must be increased if you plan to have synchronous interfaces). To avoid any bottleneck and blocking situations, free DIA WPs should always be available. The value should be adjusted based on the overall number of DIA WPs available on the system and the above requirements.


Note: You have to set the parameters using the SAP instance profile. Otherwise, the changes are lost after the server is restarted.
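A minimal instance profile sketch corresponding to the values above could look as follows. The parameter names listed here are the usual qRFC resource quota parameters behind the SARFC settings, but they are given as an assumption for illustration only; always verify the exact parameter names and values against SAP Note 1375656 for your release before changing the profile:

rdisp/rfc_max_login = 90
rdisp/rfc_max_own_login = 90
rdisp/rfc_max_own_used_wp = 90
rdisp/rfc_max_wait_time = 5
rdisp/rfc_min_wait_dia_wp = 5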

The field "Resources" shows the number of DIA WPs available for RFC processing. Is this value reasonably high, for example above zero all the time? If the "Resources" value is zero, the Integration Server cannot process XML messages because no dialog work process is free to execute the necessary qRFC. There are several conclusions that can be drawn from this observation:

1) Depending on the CPU usage (see Chapter "Monitoring CPU Capacity"), it might be necessary to increase the number of dialog work processes in your system. If the CPU usage, however, is already very high, increasing the number of DIA WPs will not solve the bottleneck.

2) Depending on the number of concurrent PI queues (see Chapter qRFC Queues (SMQ2)), it might be necessary to decrease the degree of parallelism in your Integration Server because the resources are obviously not sufficient to handle the load. The next section describes how to tune PI inbound and outbound queues.

The data provided in SARFC is also collected and shown in a graphical and historical way using Solution Manager Diagnostics and Wily Introscope (as can be seen in the screenshot below). This allows easy monitoring of the RFC resources across all available PI application servers.

43 Parallelization of PI qRFC Queues

The qRFC inbound queues on the ABAP stack are one of the most important areas in PI tuning Therefore it

is essential to understand the PI queuing concept

PI generally uses two types of queues for message processing - PI inbound and PI outbound queues. Both

types are technical qRFC inbound queues and can therefore be monitored using SMQ2 PI inbound queues

are named XBTI (EO) or XBQI (EOIO) and are shared between all interfaces running on PI by default The

PI outbound queues are named XBTO (EO) and XBQO (EOIO) The queue suffix (in red XBTO0___0004)

specifies the receiver business system This way PI is using dedicated outbound queues for each receiver


system All interfaces belonging to the same receiver business system are contained in the same outbound

queue. Furthermore, there are dedicated queues for prioritization or separation of large messages. To get an overview of the available queues, use SXMB_ADM → Manage Queues.

PI inbound and outbound queues execute different pipeline steps

PI Inbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_XML_VALIDATION_RQ_INB XML Validation Inbound Channel Request

PLSRV_RECEIVER_DETERMINATION Receiver Determination

PLSRV_INTERFACE_DETERMINATION Interface Determination

PLSRV_RECEIVER_MESSAGE_SPLIT Branching of Messages

PI Outbound Queues

Pipeline Steps Description of Pipeline Steps

PLSRV_MAPPING_REQUEST Mapping

PLSRV_OUTBOUND_BINDING Technical Routing

PLSRV_XML_VALIDATION_RQ_OUT XML Validation Outbound Channel Request

PLSRV_CALL_ADAPTER Transfer to Respective Adapter (IDoc, HTTP, Proxy, AFW)

Before the steps above are executed the messages are placed in a qRFC queue (SMQ2) The messages

will wait until they are first in the queue and until a free DIA work process is assigned to the queue. The wait time in the queue is recorded in the performance header in DB_ENTRY_QUEUING (PI inbound queue) and DB_SPLITTER_QUEUING (PI outbound queue). Most often you will see long processing times there, as discussed in the chapters Long Processing Times for "DB_ENTRY_QUEUEING" and Long Processing Times for "DB_SPLITTER_QUEUEING".

Looking at the steps above it is clear that tuning the queues will have a direct impact on the connected PI

components and also backend systems For example by increasing the number of parallel outbound queues

more mappings will be executed in parallel which will in turn put a greater load on the Java stack or more

messages will be forwarded in parallel to the backend system in case of a Proxy call Thus when tuning PI

queues you must always keep the implications for the connected systems in mind

The tuning of PI ABAP queue parameters is done in transaction SXMB_ADM Integration Engine

Configuration and by selecting the category TUNING

For productive usage we always recommend to use inbound and outbound queues (parameter

EO_INBOUND_TO_OUTBOUND=1) Otherwise only inbound queues will be used which are shared across


all interfaces Hence a problem with one single backend system will affect all interfaces running on the

system

The principle of less is sometimes more also applies for tuning the number of parallel PI queues Increasing

the number of parallel queues will result in more parallel queues with fewer entries per queue In theory this

should result in a lower latency if enough DIA WPs are available

But practically this is not true for high volume systems The main reason for this is the overhead involved in

the reloading of queues (see details below) Furthermore important tuning measures like PI packaging (see

Chapter "PI Message Packaging") aim to increase the throughput based on a higher number of messages in

the queue Thus from a throughput perspective it is definitely advisable to configure fewer queues

If you have very high runtime requirements you should prioritize these interfaces and assign a different

parallelism for high priority queues only This can be done using parameters

EO_INBOUND_PARALLEL_HIGH_PRIO and EO_OUTBOUND_PARALLEL_HIGH_PRIO as described in

section "Message Prioritization on the ABAP Stack".

To tune the parallelism of inbound and outbound queues, the parameters EO_INBOUND_PARALLEL and EO_OUTBOUND_PARALLEL have to be used.

Below you can see a screenshot of SMQ2 You can see the PI inbound queues and outbound queues Also

ccBPM queues (XBQO$PE) are displayed and will be discussed in section 5

Procedure

Log on to your Integration Server call transaction SMQ2 and execute If you are running ABAP proxies on a

separate client on the same system, enter the wildcard '*' for the client. Transaction SMQ2 provides snapshots only and

must therefore be refreshed several times to get viable information

o The first value to check is the number of queues concurrently active over a period of time Since each queue needs a dialog work process to be worked on the number of concurrent queues must not be too high compared to the number of dialog work processes usable for RFC communication (see Chapter qRFC Resources (SARFC)) The optimum throughput can be achieved if the number of active queues is equal to or slightly higher than the number of dialog work processes (provided that the number of DIA work processes can be handled by the CPU of the system see transaction ST06)

In case many of the queues are EOIO queues (eg because the serialization is done on the material number) try to reduce the queues by following Chapter EOIO tuning

o The second value of importance is the number of entries per queue. Hit refresh a few times to check if the numbers increase, decrease, or remain the same. An increase of entries for all queues or a specific


queue points to a bottleneck or a general problem on the system The conclusion that can be drawn from this is not simple Possible reasons have been found to include

1) A slow step in a specific interface

A bad processing time of a single message or a whole interface can be caused by expensive processing steps, for example the mapping step or the receiver determination. This can be confirmed by looking at the processing time for each step as shown in "Analyzing the runtime of pipeline steps".

2) Backlog in Queues

Check if inbound or outbound queues face a backlog. A backlog in a queue is generally nothing critical, since the queues ensure that the PI components as well as the backend systems are not overloaded. For instance, batch-triggered interfaces usually cause high backlogs that get processed over time. Only if the backlog prevents you from meeting the business requirements for an interface should it be analyzed.

If the backlog is caused by a high volume of messages arriving in a short period of time one solution to minimize the backlog is to increase the number of queues available for this interface (if sufficient hardware is available) If one interface is having a backlog in the outbound queues you could eg specify EO_OUTBOUND_PARALLEL with a sub parameter specifying your interface to increase the parallelism for this interface

But in general a backlog can also be caused by a long processing time in one of the pipeline steps as discussed in 1), or by a performance problem with one of the connected components like the PI Java stack or connected backend systems. Due to the steps performed in the outbound queues (mapping or call adapter) it is more likely that a backlog will be seen in the outbound queues. Inbound queue processing is only expensive if Content-Based Routing or Extended Receiver Determination is used. To understand which step is taking a long time, follow once more the chapter "Analyzing the runtime of pipeline steps".

In addition to tuning the number of inbound and outbound queues, "Message Prioritization on the ABAP Stack" and "PI Message Packaging" can also help.

The screenshot below shows such a backlog situation in Wily Introscope. Please note that the information about the queues is only collected by Wily in 5-minute intervals, which explains the gaps between the measurement points. On the dashboard "Total qRFC Inbound Queue Entries" you can see that the number of messages in SMQ2 is increasing. The number of inbound queues does not increase at the same time. It also offers a dashboard for the number of entries by queue name and the time entries remain in SMQ2. With the information provided here it is also possible to distinguish between a blocking situation and a general resource problem after the problem is no longer present in the system.


3) Queues stay in status READY in SMQ2

To see the status of the queues use the filter button in SMQ2 as shown below

It is normal to see many queues in READY status for a limited amount of time. The reason for this is the way the QIN scheduler reloads LUWs (see point 2 below). A waiting time of up to 5 seconds per queuing step is considered normal during a normal load situation (even if there is no backlog in the queue and enough resources are available). If you observe a situation where many queues are in READY status for a longer period of time, the following situations can apply:

A resource bottleneck with regard to RFC resources Confirm by following section qRFC Resources (SARFC)

To avoid overloading a system with RFC load, the QIN scheduler only reloads new queues after a certain number of queues have finished processing. This can lead to long waiting times in the queues, as explained in SAP Note 1115861 - Behaviour of Inbound Scheduler after Resource Bottleneck. Since PI is a non-user system, we recommend setting rfcinb_sched_resource_threshold to 3 as described in SAP Note 1375656 - SAP NetWeaver PI System Parameters.

Long DB loading times for RFC tables If using Oracle ensure that SAP Note 1020260 - Delivery of Oracle statistics is applied

Additional overhead during the pipeline executions occurs due to PI internal mechanisms Per default a lock is set during processing of each message This is no longer necessary and should


be switched off by setting the parameter LOCK_MESSAGE of category RUNTIME to 0 as described in SAP Note 1058915 In addition during the processing of an individual message the message repeatedly calls an enqueue which leads to a deterioration of the throughput This can be avoided by setting parameter CACHE_ENQUEUE to 0 as described in SAP Note 1366904

Sometimes a queue stays in READY status for a very long time while other queues are getting processed fine. After manual unlocking the queue is processed, but then stops again after some time. A potential reason is described in Note 1500048 - SMQ2 inbound Queue remains in READY state. In a PI environment, queues in status RUNNING for more than 30 minutes usually happen due to one individual step taking a long time (as discussed in step 1) or a problem on the infrastructure (e.g. memory problems on Java or an HTTP 200 lost on the network). After 30 minutes the QIN scheduler removes this queue from the scheduling and therefore it remains in READY. To solve such cases the root cause for the blocking has to be understood.

4) Queue in READY status due to queue blacklisting

If a queue is in status RUNNING for a very long time (e.g. due to a problem with a mapping or due to a very slow backend) it is possible that the QIN scheduler excludes this queue from queue scheduling. Hence the queue stays in status READY even though enough work processes are available. The "blacklisting" of queues takes place when the runtime of a queue exceeds the "Monitoring threshold" configured on this queue. Note 1500048 - queues stay in ready (schedule_monitor) and Note 1745298 - SMQR Inbound queue scheduler identify excluded entries provide more input on this topic. The screenshot below shows a setting of 2400 seconds (40 minutes) for this parameter.

Check if a queue stays in READY status for a long time while others are processing without any issue. Ensure Note 1745298 is implemented and check the system log for the following exception: "QINEXCLUDE <queue name> <time/date at which the queue scheduler started>". If this is the case, check why the long runtime occurs (see section Analyzing the runtime of PI pipeline steps) or increase the schedule_monitor in exceptional cases only (if, e.g., the runtime of an individual processing step cannot be improved further).

5) An error situation is blocking one or more queues

Click the Alarm Bell (Change View) push button once to see only queues with error status Errors delay the queue processing within the Integration Server and may decrease the throughput if for example multiple retries occur Navigate to the first entry of the queue and analyze the problem by forward navigation to the relevant message in SXMB_MONI

Often you see EO queues in SYSFAIL or RETRY due to the problems during the processing of an individual message This can be prevented by following the description of Chapter Prevent blocking of EO queues


44 Analyzing the runtime of PI pipeline steps

The duration of the pipeline steps is the ultimate answer to long processing times in the Integration Engine

since it describes exactly how much time was spent at which point The recommended way to retrieve the

duration of the pipeline steps is the RWB and will be described below Advanced users may use the

Performance Header of the SOAP message using transaction SXMB_MONI but the timestamps are not

easy to read. If you still prefer the latter method, here is the explanation: 20110409092656165 must be read as yyyymmddhh(min)(min)(sec)(sec)(microseconds), that is, the timestamp corresponds to April 9, 2011 at 09:26:56 and 165 microseconds. Please note that these timestamps in the Performance Header of the SXI_MONITOR transaction are stored and displayed in UTC time; therefore a conversion to system time must be done when analyzing them.
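As a small illustration of this format, the following sketch (a hypothetical helper, not part of PI) parses such a performance header timestamp with standard Java means, assuming the first 14 digits are the UTC date and time and the remaining digits are the sub-second part:

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class PerfHeaderTimestamp {
    public static void main(String[] args) {
        // Example value taken from the explanation above (UTC)
        String raw = "20110409092656165";
        // First 14 digits: yyyyMMddHHmmss; the rest is the microseconds part
        String dateTimePart = raw.substring(0, 14);
        String micros = raw.substring(14);
        LocalDateTime utc = LocalDateTime.parse(dateTimePart,
                DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        System.out.println("UTC timestamp: " + utc + " (+" + micros + " microseconds)");
    }
}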

In case PI message packaging is configured, the performance header will always reflect the processing time per package. Hence a duration of 50 seconds does not mean a single message took 50 seconds; possibly the package contained 100 messages, so that every message took 0.5 seconds. More details about this can be found in section PI Message Packaging.

Procedure

Log on to your Integration Server and call transaction SXMB_IFR In the browser window follow the link to

the Runtime Workbench In there click Performance Monitoring Change the display to Detailed Data

Aggregated and choose an appropriate time interval for example the last day For this selection you have to

enter the details of the specific interface you want to monitor

You will now see in the lower part of the screen a split of the processing time into single steps. Check the time difference between the steps. Does any step take longer than expected? In the example in the screenshot below, the DB_ENTRY_QUEUEING starts after 0.032 seconds and ends after 0.243 seconds, which means it took 0.211 seconds (211 ms).


Compare the processing times for the single steps for different measurements as outlined in Chapter

Pipeline Steps (SXMB_MONI or RWB) For example is a single step only long if many messages

are processed or if a single message is processed This helps you to decide if the problem is a

general design problem (single message has long processing step) or if it is related to the message

volume (only for a high number of messages this process step has large values)

Each step has different follow-up actions that are described next

441 Long Processing Times for "PLSRV_RECEIVER_DETERMINATION" / "PLSRV_INTERFACE_DETERMINATION"

Steps that can take a long time in inbound processing are the Receiver and Interface Determination In these

steps the receiver system and interface are determined. Normally this is very fast, but PI offers the possibility of

enhanced receiver determinations In these cases the calculation is based on the payload of a message

There are different implementation options

o Content-Based Routing (CBR)

CBR allows defining XPath expressions that are evaluated during runtime. These expressions can be combined by logical operators, for example to check the values of multiple fields (see the short sketch after this list). The processing time of such a request directly correlates to the number and complexity of the conditions defined.

No tuning options exist for the system in regard to CBR The performance of this step can only be changed by reducing the number of rules by changing the design of the interfaces Another option is to use an extended receiver determination (executing a mapping program) since this is faster than CBR using many rules

o Mapping to determine Receivers

A standard PI mapping (ABAP, Java, XSLT) can also be used to determine the receiver. If you observe high runtimes in such a receiver determination, follow the steps outlined in the next section since the same mapping runtime is used.
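For illustration only, the following minimal sketch shows the kind of XPath condition that CBR evaluates per message; the payload structure, field name, and threshold are purely hypothetical, and in a real system such conditions are maintained in the receiver determination, not in code:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;

public class CbrConditionSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload; real conditions are defined in the Integration Directory
        String payload = "<Order><Value>1500</Value></Order>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(payload.getBytes("UTF-8")));
        // Each condition of this kind is evaluated at runtime for every message;
        // many or complex conditions increase the receiver determination time
        boolean matches = (Boolean) XPathFactory.newInstance().newXPath()
                .evaluate("/Order/Value > 1000", doc, XPathConstants.BOOLEAN);
        System.out.println("Route to this receiver: " + matches);
    }
}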

442 Long Processing Times for "PLSRV_MAPPING_REQUEST"

Before analyzing the mapping you must understand which runtime is used Mappings can be implemented

in ABAP as graphical mappings in the Enterprise Service Builder as self-developed Java mappings or

XSLT mappings One interface can also be configured to use a sequence of mappings executed

sequentially In such a case analysis is more difficult because it is not clear which mapping in the sequence

is taking a long time

Normal debugging and tracing tools (transaction ST12) can be used for mappings executed in ABAP Any

type of mapping can be tested from ABAP using transaction SXI_MAPPING_TEST Knowing

sender/receiver party/service/interface/namespace and the source message payload, it is possible to check the

target message (after mapping execution) and detailed trace output (similar to contents of trace of the

message in SXMB_MONI with TRACE_LEVEL = 3) It can also be used for debugging at runtime by using

the standard debugging functionality

For ABAP based XSLT mappings it is also possible via transaction XSLT_TOOL to test trace and debug

XSLT transformations on the ABAP stack

In general the mapping response time is heavily influenced by the complexity of the mapping and the

message size Therefore to analyze a performance problem in the mapping environment you should

compare the mapping runtime during the time of the problem with values reported several days earlier to get

a better understanding


If no AAE is used the starting point of a mapping is the Integration Engine that will send the request by RFC

destination AI_RUNTIME_JCOSERVER to the gateway There it will be picked up by a registered server

program The registered server program belongs to the J2EE Engine The request will be forwarded to the

J2EE Engine by a JCo call and then executed by the Java runtime When the mapping has been executed

the result is sent back to the ABAP pipeline There are therefore multiple places to check when trying to

determine why the mapping step took so long

Before analyzing the mapping runtime of the PI system check if only one interface is affected or if you face a

long mapping runtime for different interfaces To do so check the mapping runtime of messages being

processed at the same time in the system

The best tool for such an analysis is Wily Introscope which offers a dashboard for all mappings being

executed at a given time Each line in the dashboard represents one mapping and shows the average

response time and the number of invocations

In the screenshot below you can see that many different mapping steps have required around 500 seconds

for processing Comparing the data during the incident with the data from the day before will allow you to

judge if this might be a problem of the underlying J2EE engine as described in section J2EE Engine

Bottleneck

If there is only one mapping that faces performance problems there would be just one line sticking out in the

Wily graphs If you face a general problem that affects different interfaces you can choose a longer

timeframe that allows you to compare the processing times in a different time period and verify if it is only a

"temporary" issue - this would, for example, indicate an overload of the mapping runtime.

If you have found out that only one interface is affected then it is very unlikely to be a system problem but

rather a problem in the implementation of the mapping of that specific interface


Check the message size of the mapping in the runtime header using SXMB_MONI Verify if the message size is larger than usual (which would explain the longer runtime)

There can also be one or many lookups to a remote system (using RFC or JDBC) causing the long processing time of a mapping Together with the application you then have to check if the connection to the backend is working properly Such a mapping with lookups can be analyzed using Wily Transaction Trace as explained in the appendix section Wily Transaction Trace

If not one but several interfaces are affected a potential system bottleneck occurs and this is described in

the following

o Not enough resources (registered server programs) available That could either be the case if too many mapping requests are issued at the same time or if one J2EE server node is down and has not registered any server programs

To check if there were too many mapping requests for the available registered server programs, compare the number of outbound queues that are concurrently active with the number of registered server programs. The number of outbound queues that are concurrently active can be monitored with transaction SMQ2 by counting queues with the names XBTO and XBQO/XB2O, XBTA and XBQA/XB2A (high priority), XBTZ and XBQZ/XB2Z (low priority), and XBTM (large messages). In addition to this you have to take into account mapping steps that are executed by synchronous messages or in a ccBPM process. Mappings executed by a ccBPM process are not queued, but the ccBPM process calls the mapping program directly using tRFC. The number of registered server programs can be determined with transaction SMGW → Goto → Logged On Clients and filtering by the program ("TP Name") AI_RUNTIME_<SID>. In general we recommend configuring 20 parallel mapping connections per server node.

If the problem occurred in the past it is more difficult to determine the number of queues that are concurrently active. One way is to use transaction SXMB_MONI and specify every possible outbound queue (field "Queue ID") in the advanced selection criteria (note: wildcard search is not available). The timeframe can be restricted in the upper part of the advanced selection criteria. Advanced users could use transaction SE16 to search directly in table SXMSPMAST using the field "QUEUEINT".

The other option is to use the Wily "qRFC Inbound Queue Detail" dashboard as described in qRFC Queue Monitoring (SMQ2).

To check if the J2EE Engine is not available or if the server program is not registered for a different reason, use transaction SM59 to test the RFC destination AI_RUNTIME_JCOSERVER. If the problem occurred in the past, search the gateway trace (transaction SMGW → GoTo → Trace → Gateway → Display File) and the RFC developer traces (dev_rfc files available in the work directory) for the string "AI_RUNTIME". In addition, check the std_server<n>.out files in the work directory for restarts of the J2EE Engine.


o Depending on the checks above there are two options to resolve the bottleneck Of course if the J2EE was not available the reason for this has to be found and prevented in the future

The two options for tuning are

1) the number of outbound queues that are concurrently active and

2) the number of mapping connections from the J2EE server to the ABAP gateway

The increase of mapping connections is only recommended for strong J2EE servers since each mapping thread needs resources like CPU and Heap memory Too many mapping connections mean too many mappings being executed on the J2EE Engine In turn this might reduce the performance of other parts of the PI system for example the pipeline processing in the Integration Engine or the processing within the adapters of the AFW or can even cause out-of-memory errors on the J2EE engine Add additional J2EE server nodes to balance the parallel requests Each J2EE server node will

register new destinations at the gateway and will therefore take a part of the mapping load

Another option to solve the bottleneck is to reduce the number of outbound queues that are concurrently active. This will certainly result in higher backlogs in the queue and is therefore only applicable for less runtime-critical interfaces. This option can be used if the backlog is caused by a master data interface with high volume but no critical processing time requirements. In such a case you can lower the number of parallel queues for this interface only, by using a sub-parameter of EO_OUTBOUND_PARALLEL in transaction SXMB_ADM. By doing so you slow down one interface to ensure other (more critical) interfaces have sufficient resources available.

o If multiple application servers are configured on the PI system, ensure that all the server nodes are connected to their local gateway (local bundle option) as described in the guide How To Scale PI.

443 Long Processing Times for "PLSRV_CALL_ADAPTER"

Call adapter is the last step executed in the PI pipeline and forwards the message to the next component

along the message flow In most cases (local Adapter Engine IDoc or BPE) this is a local call The IDoc

adapter only puts the message on the tRFC layer (SM58) of the PI system so that the actual transfer of the

IDoc is not included in the performance header For ABAP Proxies plain HTTP or calls to a central or

decentral Adapter Engine an HTTP call is made so that network time can have an influence here

Looking at the processing time we have to distinguish asynchronous (EO and EOIO) and synchronous (BE)

interfaces

In asynchronous interfaces the call adapter step includes the transfer of the message by the network and

the initial steps on the receiver side (in most cases this just means persisting the message on the receiver's

database) A long duration can therefore have two reasons

o Network latency For large messages of several MB in particular the network latency is an important factor (especially if the receiving ABAP proxy system is located on another continent for example) Network latency has to be checked in case of a long call adapter step In case HTTP load balancers are used they are considered as part of the network

o Insufficient resources on receiver side Enough resources must be available at the receiver side to ensure quick processing of a message For instance enough ICM threads and dialog work processes must be available in the case of an ABAP proxy system Therefore the analysis of long call adapter steps always has to include the relevant backend system

For synchronous messages (requestresponse behavior) the call adapter step also includes the processing

time on the backend to generate the response message Therefore the call adapter for synchronous

messages includes the time of the transfer of the request message the calculation of the corresponding

response message at the receiver side and the transfer back to PI Therefore the processing time to

process a request at the receiving target system for synchronous messages must always be analyzed to find

the most costly processing steps


444 Long Processing Times for "DB_ENTRY_QUEUEING"

The value for DB_ENTRY_QUEUEING describes the time that a message has spent waiting in a PI inbound

queue before processing started In case of errors the time also includes the wait time for restart of the LUW

in the queue The inbound queues (XBTI XBT1 XBT9 XBTL for EO messages and XBQIXB2I

XBQ1XB21 XBQ9XB29 for EOIO messages) process the pipeline steps for the Receiver

Determination the Interface Determination and the Message Split (and optionally XML inbound validation)

Thus if the step DB_ENTRY_QUEUEING has a high value the inbound queues have to be monitored using

transactions SMQ2 and SARFC The reasons are similar as those outlined in chapter qRFC Resources

(SARFC) and qRFC Queues (SMQ2)

o Not enough resources (DIA work processes for RFC communication) available

o A backlog in the queues caused by one of the following reasons

The number of parallel inbound queues is too low to handle the incoming messages in parallel Note that a simple increase of the parameter EO_INBOUND_PARALLEL might not always be the solution (as enough work processes also have to be available to process the queues in parallel) The number of work processes is in turn restricted by the number of CPUs that are available for your system If you would like to separate critical from non-critical sender systems define a dedicated set of inbound queues by specifying a sender ID as a sub parameter of EO_INBOUND_PARALLEL For example by doing so you can separate master data interfaces from business critical interfaces Please note that the separation on inbound queues only works on Business System level and not on interface level

The inbound queues were blocked by the first LUW. Use transaction SMQ2 to check if that is still the case. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". It is generally not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

In case the messages in the queues have different runtimes (e.g. due to complex extended Receiver Determinations), the messages might be unevenly distributed between the queues. In PI 73 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)

Note If your system uses the parameter EO_INBOUND_TO_OUTBOUND = 0 you must also read chapter

445 for analyzing the reasons EO_INBOUND_TO_OUTBOUND only determines whether inbound queues

(value '0') or inbound and outbound queues (value '1') are used for the pipeline processing in the Integration Server. Check the value with transaction SXMB_ADM → Integration Engine Configuration → Specific Configuration. The default value is '1', meaning the usage of inbound and outbound queues (the

recommended behavior)

445 Long Processing Times for "DB_SPLITTER_QUEUEING"

The value for DB_SPLITTER_QUEUEING describes the time that a message has spent waiting in a PI

outbound queue until a work process was assigned In case of errors the time also includes the wait time for

the restart of the LUW in the queue

The outbound queues (XBTO, XBTA, XBTZ, XBTM for EO messages and XBQO/XB2O, XBQA/XB2A, XBQZ/XB2Z for EOIO messages) process the pipeline steps for the mapping, outbound

binding and call adapter


Thus if the step DB_SPLITTER_QUEUEING has a high value the outbound queues have to be monitored

using transactions SMQ2 and SARFC The reasons are similar as outlined in chapters qRFC Resources

(SARFC) and qRFC Queues (SMQ2) or described in the section above for PI inbound queues

o Not enough resources (DIA work processes for RFC communication) available

o A backlog in the queues caused by one of the following reasons

The number of parallel outbound queues is too low to handle the outgoing messages in parallel. Note that simply increasing the parameter EO_OUTBOUND_PARALLEL might not always be the solution, as enough work processes have to be available to process the queues in parallel. The number of work processes in turn is limited by the number of CPUs available for the system. Also, the parameter EO_OUTBOUND_PARALLEL is used differently than the parameter EO_INBOUND_PARALLEL because it determines the degree of parallelism for each receiver system. Thus it is possible to increase the number of parallel outbound queues for specific receivers whereas other receivers are handled with a lower degree of parallelism. Note that not every receiver backend is able to handle a high degree of parallelism, so in the case of an ABAP Proxy system a high value for DB_SPLITTER_QUEUING might not indicate a problem on PI.

The outbound queues were blocked by the first LUW. To identify if such a situation occurred in the past, use the Wily Introscope dashboard "ABAP System qRFC Inbound Detail". In general it is not easy to find the one message that is blocking the queue. It might be achieved by checking RFC traces or by searching for a single message with a long pipeline processing step. For EO queues there is no business reason to block the queues in case of a single message error. To prevent this, follow the description in section Prevent blocking of EO queues.

Since the outbound queues include processing steps that are bound to take longer than the other steps (that is, the mapping and the call adapter step), this might mean a long waiting time for all subsequent messages. As an example, assume that a mapping takes 1 second and that 100 messages are waiting in a queue. The first message gets executed at once, the second message has to wait about 1 second (a bit more, since more steps are executed than just the mapping, but this should be ignored at the moment), the third takes 3 seconds, and so on. The 100th message has to wait 100 seconds until it is processed, that is, the value for DB_SPLITTER_QUEUING is as high as 100 seconds. For a given outbound queue you would therefore see an increase of the DB_SPLITTER_QUEUING value over time. If you experience this situation, proceed with Chapter 442.

In case the messages in the queues have different runtimes (e.g. due to different message sizes), the messages might be unevenly distributed between the queues. In PI 73 a new balancing option is available, as discussed in Avoid uneven backlogs with queue balancing.

o Problems when loading the LUW by the QIN scheduler as described in qRFC Queues (SMQ2)

Note The number of parallel outbound queues is also connected with the ability of the receiving system to

process a specific amount of messages for each time unit

In section Adapter Parallelism default restrictions of the specific adapters of the J2EE Engine are explained Some adapters (JDBC File Mail) can process only one message for each server node at a time (this can be changed) It does not make sense for such adapters to have too many parallel qRFC queues since this will only move the message backlog from ABAP to Java

For messages that are directed to the IDoc outbound adapter, the value for EO_OUTBOUND_PARALLEL is connected to the MAXCONN value of the corresponding outbound destination. You can define the maximum number of connections using transaction SMQS - the default value is 10 parallel connections. If applicable, SAP highly recommends that you use IDoc packaging in the outbound IDoc adapter to reduce the load during the transmission of tRFC calls from the PI system to the receiving system.

For ABAP proxy systems the number of outbound queues directly determines the number of messages sent in parallel to the receiving system Thus you have to ensure that the resources


available on PI are aligned with those on the sending/receiving ABAP proxy system.

446 Long Processing Times for "LMS_EXTRACTION"

Lean Message Search can be configured for newer PI releases as described in the Online Help After

applying Note 1761133 - PI runtime Enhancement of performance measurement an additional header for

LMS will be written to the performance header

The header could look like this indicating that around 25 seconds were spent in the LMS analysis

When using trace level two additional timestamps are written to provide details about this overall runtime

LMS_EXTRACTION_GET_VALUES

This timestamp describes the evaluation of the message according to user-defined filter criteria

LMS_EXTRACTION_ADJUST_VALUES

This timestamp describes the post-processing of the previously found values in the BAdI (as of PI 730)

The runtime of the LMS heavily depends on the number of payload attributes to be extracted Also the

payload size and the complexity of the XPath expression have a direct impact on the performance of the LMS.

In general the number of elements to be indexed should be kept at a minimum and very deep and complex

XPath expressions should be avoided

If you want to minimize the impact of LMS on the PI message processing time you can define the extraction

method to use an external job In that case the messages will be indexed after processing only and it will

therefore have no performance impact during runtime Of course this method imposes a delay in the

indexing (based on the frequency of job SXMS_EXTRACT_MESSAGES) so that newly processed

messages cannot be searched using LMS. If this delay is acceptable for those responsible for monitoring, the

messages should be indexed using an external job

447 Other steps performed in the ABAP pipeline

As discussed earlier there are additional steps in the ABAP pipeline that can be executed based on the

configuration of your scenario Per default these steps will not be activated and should therefore not

consume any time All these steps are traced in the performance header of the message Below you can see

the details for

- XML validation


- Virus Scan

In case one of these steps is taking long you have to check the configuration of your scenario In the

example of the Virus scan the problem might be related to the external virus scanner used and tuning has to

happen on that side

45 PI Message Packaging for Integration Engine

PI message packaging was introduced with SAP NetWeaver PI 70 SP13 and is per default activated in PI

71 and higher Message packaging in BPE is independent of message packaging in PI (see chapter 56)

but they can be used together

To improve performance and message throughput asynchronous messages can be assembled to packages

and processed as one entity For this the qRFC scheduler was enabled to process a set of messages (a

package) in one LUW Thus instead of sending one message to the different pipeline steps (for example

mappingrouting) a package of messages will be sent that will reduce the number of context switches that is

required Furthermore access to databases is more efficient since requests can be bundled in one database

operation Depending on the number and size of messages in the queue this procedure improves

performance considerably In return message packaging can increase the runtime for individual messages

(latency) due to the delay in the packaging process

Message packaging is only applicable for asynchronous scenarios (QoS EO and EOIO) Due to the potential

latency of individual messages packaging is not suitable for interfaces with very high runtime requirements

In general packaging has the highest benefit for interfaces with high volume and small message size

The performance improvement achieved directly relates to the number of the messages bundled in each

package Message packaging must not solely be used in the PI system Tests have shown that the

performance improvement significantly increases if message packaging is configured end-to-end - that is

from the sending system using PI to the receiving system Message packaging is mainly applicable for

application systems connected to PI by ABAP proxy

From the runtime perspective no major changes are introduced when activating packaging Messages

remain individual entities in regards to persistence and monitoring Additional transactions are introduced

allowing the monitoring of the packaging process

Messages are also treated individually for error handling If an error occurs in a message processed in a

package then the package will be disassembled and all messages will be processed as single messages Of

course in a case where many errors occur (for example due to interface design) this will reduce the benefits

of message packaging SAP systems connected to PI via ABAP Proxy can also make use of the message

packaging to reduce the HTTP calls necessary to transmit messages between the systems Other sending

and receiving applications in particular will not see any changes because they send and receive individual

messages and receive individual commits

The PI message packaging can be adapted to meet your specific needs In general the packaging is

determined by three parameters


1) Message count Maximum number of messages in a package (default 100)

2) Maximum package size Sum of all messages in kilobytes (default 1 MB)

3) Delay time Time to wait before the queue is processed if the number of messages does not reach

the message count (default 0 meaning no waiting time)

These values can be adjusted using transactions SXMS_BCONF and SXMS_BCM To better use the

performance improvements offered by packaging you could for example define a specific packaging for

interfaces with very small messages to allow up to 1000 messages for each package Another option could

be to increase the waiting time (only if latency is not critical) to create bigger packages
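The following minimal sketch (illustrative only, not the actual Integration Engine implementation) shows how the message count and package size limits interact when messages are bundled into packages; with the default limits, 250 small messages of 8 KB each would end up in three packages because the count limit is reached before the size limit:

import java.util.ArrayList;
import java.util.List;

public class PackagingSketch {
    static final int MAX_COUNT = 100;      // message count (default 100)
    static final int MAX_SIZE_KB = 1024;   // maximum package size (default 1 MB)

    // Bundle messages (given by their size in KB) into packages, closing a package
    // as soon as adding the next message would exceed one of the two limits
    static List<List<Integer>> buildPackages(List<Integer> messageSizesKb) {
        List<List<Integer>> packages = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentSize = 0;
        for (int sizeKb : messageSizesKb) {
            if (!current.isEmpty()
                    && (current.size() >= MAX_COUNT || currentSize + sizeKb > MAX_SIZE_KB)) {
                packages.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(sizeKb);
            currentSize += sizeKb;
        }
        if (!current.isEmpty()) {
            packages.add(current);
        }
        return packages;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < 250; i++) sizes.add(8);   // 250 messages of 8 KB each
        System.out.println("Packages built: " + buildPackages(sizes).size()); // 3
    }
}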

See SAP Note 1037176 - XI Runtime Message Packaging for more details about the necessary

prerequisites and configuration of message packaging More information is also available at

http://help.sap.com → SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Message Packaging.

46 Prevent blocking of EO queues

In general messages for interfaces using Quality of Service Exactly Once are independent of each other In

case of an error in one message there is no business reason to stop the processing of other messages But

exactly this happens in case an EO queue goes into error due to an error in the processing of a single

message The queue will then be automatically retried in configurable intervals This retry will cause a delay

of all other messages in the queue which cannot be processed due to the error of the first message in the

queue

To avoid this a new error handling was introduced in SAP Notes 1298448 - XI runtime No automatic retry

for EO message processing (set EO_RETRY_AUTOMATIC = 0 and schedule restart job

RSXMB_RESTART_MESSAGES) and SAP Note 1393039 - XI runtime Dump stops XI queue After

applying these Notes, EO queues will no longer go into SYSFAIL status. These Notes are recommended for

all PI systems and are per default activated on PI 73 systems By specifying a receiver ID as sub parameter

this behavior can be configured for individual interfaces only

In PI 73 you can also configure the number of messages that would go into error before the whole queue

would be stopped For permanent errors (eg due to a problem with the receiving backend) the queue would

go into SYSFAIL so that not too many messages are going into error This also reduces the number of alerts

for Message Based Alerting To define the threshold the parameter EO_RETRY_AUT_COUNT can be set

The default value is 0 and indicates that the number of messages is not restricted

47 Avoid uneven backlogs with queue balancing

From PI 73 on a new mechanism is available to address the distribution of messages in queues Per default

in all PI versions the messages are getting assigned to the different queues randomly In case of different

runtimes of LUWs caused by eg different message sizes or different runtimes in mappings this can cause

an uneven distribution of messages within the different queues This can increase the latency of the

messages waiting at the end of that queue

To avoid such an unequal distribution the system checks the queue length before putting a new message in

the queue Hence it tries to achieve an equal balancing during inbound processing A queue which has a

higher backlog will get fewer new messages assigned. Queues with fewer entries will get more messages assigned. It is therefore different from the old BALANCING parameter (of category TUNING), which was used


to rebalance messages already assigned to a queue The assignment of messages to queues is shown in

the diagram below

To activate the new queue balancing you have to set the parameters EO_QUEUE_BALANCING_READ and

EO_QUEUE_BALANCING_SELECT of category TUNING in SXMB_ADM If the value of the parameter

EO_QUEUE_BALANCING_READ is 0 (default value) then the messages are distributed randomly across

the queues that are available If however EO_QUEUE_BALANCING_READ is set to a value n greater than

zero then on average the current fill level of the queue is determined after every nth message and stored in

the shared memory of the application server This data is used as the basis for determining the queue (see

description of parameter EO_QUEUE_BALANCING_SELECT) relevant for balancing

The parameter EO_QUEUE_BALANCING_SELECT specifies the relative fill levels of the queues in percent

Relative here means in relation to the maximum filled queue If there are queues with a lower fill level than

defined here then only these are taken into consideration for distribution If all queues have a higher fill level

then all queues are taken into consideration

Please note that determining the fill level takes place using database access and therefore impacts system

performance The value of EO_QUEUE_BALANCING_READ should for this reason be oriented towards the

message throughput and the specific requirements regarding even distribution. For higher volume systems a higher value

should be chosen

Let's look at an example. In case you set the parameter EO_QUEUE_BALANCING_READ to 1000, after

every 1000th incoming message the queue distribution will be checked This requires a database access

and can therefore cause a performance impact The fill level for each queue will then be written to shared

memory. In our example we configured EO_QUEUE_BALANCING_SELECT to 20. At a given point in time you have three outbound queues on the system: XBTO__A contains 500 entries, XBTO__B 150, and XBTO__C contains 50 messages. Therefore XBTO__B has a relative fill level of 30% and XBTO__C of 10%. That means the only queue relevant for balancing is XBTO__C, which will get more messages assigned.
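The selection logic of this example can be illustrated with the following minimal sketch; the queue names and numbers are taken from the example above and are not read from an actual system:

import java.util.LinkedHashMap;
import java.util.Map;

public class QueueBalancingSketch {
    public static void main(String[] args) {
        // Fill levels from the example above (entries currently waiting per queue)
        Map<String, Integer> fillLevels = new LinkedHashMap<>();
        fillLevels.put("XBTO__A", 500);
        fillLevels.put("XBTO__B", 150);
        fillLevels.put("XBTO__C", 50);
        int selectThresholdPercent = 20;   // EO_QUEUE_BALANCING_SELECT

        // Relative fill level is measured against the fullest queue; only queues
        // below the threshold are preferred when new messages are assigned
        int max = fillLevels.values().stream().max(Integer::compare).orElse(1);
        for (Map.Entry<String, Integer> q : fillLevels.entrySet()) {
            int relativePercent = q.getValue() * 100 / max;
            boolean preferred = relativePercent < selectThresholdPercent;
            System.out.printf("%s: %d entries (%d%% of max) -> %s%n",
                    q.getKey(), q.getValue(), relativePercent,
                    preferred ? "preferred for new messages" : "not preferred");
        }
    }
}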

Based on this example you can see that it is important to find the correct values As a general guideline to

minimize performance impact you should set EO_QUEUE_BALANCING_READ to a rather large value The

correct value of EO_QUEUE_BALANCING_SELECT depends on the criticality of a delay caused by a

backlog


48 Reduce the number of parallel EOIO queues

As discussed in Chapter Parallelization of PI qRFC Queues it is important to keep the number of queues

limited As a rule of thumb the number of active queues should be equal to the number of available work

processes in the system It is especially crucial to not have a lot of queues containing only one message

since this will cause a high overhead on the QIN scheduler due to very frequent reloading of the queues

Especially for EOIO queues we often see a high number of parallel queues containing only one message

The reason for this is, for example, that the serialization has to be done on the document number, e.g. to ensure

that updates to the same material are not transferred out of sequence Typically during a batch triggered

load of master data this can result in hundreds of EOIO queues in SMQ2 just containing only one or two

messages This is very bad from a performance point of view and will cause significant performance

degradation for all other interfaces running at the same time To overcome this situation the overall number

of EOIO queues (for all interfaces) can be limited as of PI 71 as described in Change Number of EOIO

Queues

The maximum number of queues can be set in SXMB_ADM → Configure Filter for Queue Prioritization → Number of EOIO queues.

During runtime a new set of queues will be used with the name XB2 as can be seen below for the outbound

queues

These queues will be shared between all EOIO interfaces for the given interface priority and by setting the

number of queues the parallelization for all EOIO interfaces is limited Thus more messages will be using the

same EOIO queue and therefore PI message packaging will work better and also the reloading of the

queues by the QIN scheduler will show much better performance

In case of errors the messages will be removed from the XB2 queues and will be moved to the standard

XBQ queues All other messages for the same serialization context will be moved to the XBQ queue

directly to ensure the serialization is maintained This means that in case of an error the shared EOIO

queues will not be blocked and messages for other serialization contexts will not be delayed.


49 Tuning the ABAP IDoc Adapter

Very often the IDoc adapter deals with very high message volume, and tuning it is therefore essential.

491 ABAP basis tuning

As stated above the IDoc adapter uses the tRFC layer to transfer messages from PI to the receiver system

The IDoc adapter will only put the message in SM58 and from there the standard tRFC layer forwards the

LUW You therefore have to ensure that sufficient resources (mainly DIA WPs) are available for processing

tRFC requests

Wily Introscope also offers a dashboard (shown below) for the tRFC layer which shows the tRFC entries and

their different statuses. With these dashboards you are also able to identify historical backlogs on the tRFC layer.

In order to control the resources used when sending the IDocs from sender system to PI or from PI to the

receiver backend you can also think about registering the RFC destinations in SMQS in PI as known from

standard ALE tuning By changing the Max Conn or the Max Runtime values in SMQS you can limit or

increase the number of DIA WPs used by the tRFC layer to send the IDocs for a given destination This will

mitigate the risk of system overload on sender and receiver side


492 Packaging on sender and receiver side

One option for improving the performance of messages processed with the IDoc adapter is to use IDoc

packaging

The receiver IDoc adapter in PI (for sending messages from PI to the receiving application system) already

had the option for packaging IDocs in previous releases For this messages processed in the IDoc adapter

are bundled together before being sent out to the receiving system Thus only one tRFC call is required to

send multiple IDocs The message packaging that is being discussed in section PI Message Packaging uses

a similar approach and therefore replaces the "old" packaging of IDocs in the receiver IDoc adapter.

Therefore we highly recommend configuring Message Packaging since this helps transferring data for the

IDoc adapter as well as the ABAP Proxy

For the sender IDoc adapter there was previously the option to activate packaging in the partner profile of

the sending system in case IDocs are not sent immediately but in batch. By specifying the "Maximum Number of IDocs" in the report RSEOUT00, multiple IDocs are bundled in one tRFC call. At the PI side

these packages will be disassembled by the IDoc adapter and the messages will be processed individually

Therefore this will help mainly to reduce the tRFC resources involved in transferring the data between the

systems but not within PI

This behavior was changed with SAP NetWeaver PI 711 and higher If the sender backend sends an IDoc

package to PI a new option has been introduced which allows the processing of IDocs as a package on PI

as well You can specify an IDoc package size on the sender IDoc Communication Channel The necessary

configuration is described in detail in the following SDN blog - IDoc Package Processing Using Sender IDoc

Adapter in PI EhP1 A significant increase in throughput was recorded for high volume interfaces with small

IDocs

493 Configuration of IDoc posting on receiver side

As described in SAP Note 1828282 - IDoc adapter long runtime XI -> IDoc -> ALE, also the way the IDocs

are posted on the receiver ERP backend will directly influence the processing of the LUW on the PI side In

general two options exist for IDoc posting

Trigger immediately: The IDoc will be posted directly when it is received. For this a free dialog work process will be required.

Trigger by background program: If this option is chosen, the IDoc is persisted on the ERP side and the posting of the application data happens asynchronously using report RBDAPP01 via a background job.


When "trigger immediately" is configured, the sender system (PI) has to wait until the IDoc is posted. While the work process will roll out, you will see the LUW in "Transaction Executing" during this time. For the IDOC_AAE adapter the RFC connection and the Java threads will be kept busy until the IDoc is posted in the backend, therefore leading to backlogs on the IDOC_AAE.

The coding for posting the application data can be very complex and therefore this operation can take a long

time, consuming unnecessary resources also on the sender side. With background processing of the IDocs

the sender and receiver systems are decoupled from each other

Due to this we generally recommend using "trigger immediately" only in exceptional cases where the data has to

be posted without any delay In all other cases like high volume replication of master data background

processing of IDocs should be chosen

With Notes 1793313 and 1855518 a third option for posting IDocs on the ERP side was introduced. This

option is using bgRFC on the ERP side to queue the IDocs As soon as the configured resources for the

bgRFC destination are available the IDocs will be posted By doing so the IDocs will be posted based on

resource availability on the receiver system and no additional background jobs will be required Furthermore

storing the LUWs in bgRFC will ensure that the sender and receiver systems are decoupled This option

therefore guarantees that the IDocs will be posted as soon as possible based on the available resources on

the receiver system without the requirement to schedule many background jobs This new option is therefore

the recommended way of IDoc posting in systems that fulfill the necessary requirements


410 Message Prioritization on the ABAP Stack

A performance problem is often caused by an overload of the system for example due to a high volume

master data interface. If business-critical interfaces with a maximum response time are running at the same time,

then they might face delays due to a lack of resources Furthermore as described in qRFC Queues (SMQ2)

messages of different interfaces use the same inbound queue by default, which means that critical and less-critical messages are mixed up in the same queue.

We highly recommend using message prioritization on the ABAP stack to prevent such delays for critical

interfaces (for Java see Interface Prioritization in the Messaging System )

For more information about PI prioritization see http://help.sap.com and navigate to SAP NetWeaver → SAP NetWeaver PI/Mobile/IdM 7.1 → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Integration Engine → Prioritized Message Processing.

To balance the available resources further between the configured interfaces you can also configure a

different parallelization level for queues with different priorities Details for this can be found in SAP Note

1333028 ndash Different Parallelization for HP Queues


5 ANALYZING THE BUSINESS PROCESS ENGINE (BPE)

This chapter helps to analyze performance problems if chapter Determining the Bottleneck has shown that the BPE is the slowest step for a message during its time in the PI system. Of course, this type of runtime engine is very susceptible to inefficient design of the implemented integration process. Information about best practices for designing BPE processes can be found in the documentation Making Correct Use of Integration Processes. In general, the memory and resource consumption is higher than for simple pipeline processing in the Integration Engine. As outlined in the document linked above, every message that is sent to BPE and every message that is sent from BPE is duplicated. In addition, work items are created for the integration process itself as well as for every step. More database space is therefore required than might be expected at first, and more CPU time is needed for the additional steps.

5.1 Work Process Overview

The Business Process Engine executes the steps (work items) of an integration process using tRFC calls to the RFC destination WORKFLOW_LOCAL_<client>. To check whether there are enough resources available to process the work items, call transaction SM50 (on each application server) while one of the integration processes is running:

o Look for the user "WF-BATCH" and follow its activities by refreshing the transaction.

o Are there always dialog work processes available?

The most important point to keep in mind here is the concurrent activity of the Business Process Engine (for ccBPM processes) and the Integration Engine (for pipeline processing). Both runtime environments need dialog work processes. Thus, performance problems in one of the engines cannot be solved by restricting the other. Rather, an appropriate balance between the two runtimes has to be found.

The activity of the WF-BATCH user is not restricted by default. It is highly recommended that you register the RFC destination WORKFLOW_LOCAL_<client> in transaction SMQS and assign a suitable number of maximum connections. Otherwise, ccBPM activity may block the pipeline processing of the Integration Server and reduce the throughput of PI drastically. SAP Note 888279 - Regulating/distributing the Workflow Load gives more details.
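For illustration (the values are assumptions, not a recommendation): on a system with client 100 you would register the destination WORKFLOW_LOCAL_100 in SMQS and set the maximum connections to, say, 10, so that enough dialog work processes remain free for the Integration Engine pipeline. The suitable value depends on the total number of dialog work processes on the instance and on the share of load that ccBPM is allowed to consume.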

In addition, it might help to use specific servers (dialog instances) to carry out a given workflow, or to use load balancing. Of course, both options only apply to larger PI systems that consist of at least a central instance and one dialog instance.

5.2 Duration of Integration Process Steps

As described in section Processing Time in the Business Process Engine, with PI 7.3 a new monitor is available in "Configuration and Monitoring Home" on the PI start page. It shows the duration of the individual steps of an integration process in a very accessible way and is now the tool of choice to analyze performance-related issues in ccBPM.


For releases prior to PI 7.3, check the workflow log for long-running steps as described in the previous chapter. Use the "List with Technical Details" to do so. Since the design of integration processes varies significantly, there is no general rule for the duration of a step or for specific sequences of steps. Instead, you have to get a feeling for your implemented integration processes:

Did a specific process step decrease in performance over a period of time?

Does one specific process step stick out with regard to the other steps of the same integration process?

Do you observe long durations for a transformation step ("mapping")?

Do you observe long durations of synchronous send/receive steps in the integration process, which correlate, for example, with a slow response from a connected remote system?

Use the SAP Notes database to learn about recommendations and hints. SAP Note 857530 - BPE Composite Note Regarding Performance acts as a central note for all performance notes and can be used as the entry point.

Once you are able to answer the above questions, there are several paths to follow:

o Two common reasons should be taken into consideration for a performance decrease over time. Firstly, the load on PI might have increased over time as well. Check the hardware resources (chapter 9) and carefully monitor the work process availability (chapter 5.1). Alternatively, the underlying database might slow down the integration processes. Use chapter 5.4 to check this possibility.

o If a specific process step sticks out, the process design itself has to be reviewed. The SAP Notes mentioned above might help here.

o If the mapping step takes a long time, it is worth having a look at chapter 4.4.2, since the BPE and the IE use the same resources (mapping connections from the ABAP gateway to the J2EE server) to execute their mappings.

o If there is a long processing time for a synchronous send/receive step, check the connected backend system for any performance issues.


5.3 Advanced Analysis: Load Created by the Business Process Engine (ST03N)

Since the Business Process Engine runs on the ABAP part of the PI Integration Server, you can use the statistical data written by the dialog work processes to analyze a performance problem.

Procedure

Log on to your Integration Server and call transaction ST03N. Switch to "Expert Mode" and choose the appropriate timeframe (this could be the "Last Minute's Load" from the Detailed Analysis menu or a specific day or week from the Workload menu). In the Analysis View, navigate to User and Settlement Statistics and then to User Profile. The user you are interested in is the WF-BATCH user, who does all ccBPM-related work.

Compare the total CPU time (in seconds) to the time frame (in seconds) of the analysis. In the above example an analysis time frame of 1 hour has been chosen, which corresponds to 3600 seconds. To be exact, these 3600 seconds have to be multiplied by the number of available CPUs. Out of these 3600 CPU seconds, the WF-BATCH user used up 36 seconds.
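For illustration (the CPU count is an assumption): on a host with 4 CPUs the available capacity in that hour is 4 x 3600 = 14400 CPU seconds, so the 36 CPU seconds consumed by WF-BATCH correspond to only about 0.25% of the total CPU capacity.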

o The above value gives you a good idea of how much load the Business Process Engine creates on your PI Integration Server. By comparing it with the other users (as well as with the CPU utilization from transaction ST06), you can determine whether the Integration Server is able to handle this load with the available number of CPUs. This does not help you to optimize the load, but it does provide support when deciding how to distribute the workload of the BPE and the workload of the Integration Server (see chapter 5.1).

Experienced ABAP administrators should also analyze the database time and the wait time and compare these values.

5.4 Database Reorganization

The database is accessed many times during the processing of an integration process. This is not usually connected with performance problems, but if a specific database table is large, statements may take longer than for small database tables.

Use transaction ST05 to collect an SQL trace while the integration process in question is being executed. Restrict the trace to user WF-BATCH. When the integration process is finished, stop the trace and view it. Look for high values in the duration column, especially for SELECT statements.

Use transaction SE16 to determine the number of entries in the typical workflow tables SWW_WI2OBJ and SWFRXI. There are many more database tables used by BPE, but if the number of entries is high for the above tables, it is high for the rest as well. See SAP Note 872388 - Troubleshooting Archiving and Deletion in XI and the links given in this note if you find a high number of entries.


5.5 Queuing and ccBPM (or Increasing Parallelism for ccBPM Processes)

Until recently, ccBPM processes accepted messages from only one queue (one queue XBQO$PEWS for each workflow). If there was a high message throughput for this workflow, a high backlog occurred for these queues. A couple of enhancements have therefore been implemented to improve the scalability of the ccBPM runtime:

o Inbound processing without buffering

o Use of one configurable queue that can be prioritized or assigned to a dedicated server

o Use of multiple inbound queues (queue name will be XBPE_WS)

o Configurable transactional behavior of ccBPM process steps to reduce the number of persistence steps

Details about the configuration and tuning of the ccBPM runtime are described at https://www.sdn.sap.com/irj/sdn/howtoguides → SAP NetWeaver 7.0 → End-to-End Process Integration:

o How to Configure Inbound Processing in ccBPM Part I: Delivery Mode

o How to Configure Inbound Processing in ccBPM Part II: Queue Assignment

o How to Configure ccBPM Runtime Part III: Transactional Behavior of an Integration Process

Procedure

Use transaction SMQ2 to check whether the queues of type XBQO$PE show a high backlog.

o Based on this information, verify whether the ccBPM process is suitable to run with multiple queues. This is especially the case for collect patterns realized in BPE. If you can use multiple queues, configure the workflow based on the above guides.

5.6 Message Packaging in BPE

Inbound processing takes up the largest share of processing time in many scenarios within BPE.

Message packaging in BPE helps to improve performance by delivering multiple messages to BPE process instances in one transaction. This can lead to an increased throughput, which means that the number of messages that can be processed in a specific amount of time can increase significantly. Message packaging can, however, also increase the runtime of individual messages (latency) due to the delay introduced by the packaging process.

The sending of packages can be triggered when the packages exceed a certain number of messages, a specific package size (in kB), or a maximum waiting time. The extent of the performance improvement that can be obtained depends on the type of scenario. Scenarios with the following characteristics are generally most suitable:

o Many messages are received for each process instance

o Messages that are sent to a process instance arrive together in a short period of time

o Generally high load on the process type

o Messages do not have to be delivered immediately

For example, collect scenarios are particularly suitable for message packaging. The higher the number of messages in a package, the higher the potential performance improvements will be.

Tests have shown a high potential for throughput improvements, up to a factor of 4.7 in the tested collect scenarios, depending on the packaging size that has been configured. For more details about the functionality of message packaging in BPE and its configuration, see SAP Note 1058623 - BPE: Message Packaging.


Message packaging in BPE is independent of message packaging in PI (see chapter 4.5), but the two can be used together.


6 ANALYZING THE (ADVANCED) ADAPTER ENGINE

This chapter describes further checks and possible reasons for bottlenecks in the (Advanced) Adapter Engine. Although this is not the place to describe the architecture of the Adapter Framework in detail, the following aspects are important to know for the analysis of a performance problem on the AFW:

o The AFW contains the Messaging System, which is responsible for the persistence and queuing of messages. Within the flow of a message it is placed between the sender adapter (for example, a File adapter that polls from a folder) and the Integration Server, as well as between the Integration Server and the receiver adapter (for example, a JMS adapter that sends a message out to an external system).

o The Messaging System uses 4 queues for each adapter type, and these are responsible for receiving and sending messages. To separate the message transfer, each adapter (for example JMS, SOAP, JDBC, and so on) has its own set of queues: one queue for receiving synchronous messages (Request queue), one queue for sending synchronous messages (Call queue), one queue for receiving asynchronous messages (Receive queue), and one queue for sending asynchronous messages (Send queue). The name of the queue always states the adapter type and the queue name (for example, JMS_http://sap.com/xi/XI/SystemReceive for the asynchronous receive queue of the JMS adapter).

o A configurable number of consumer threads is assigned to each queue. These threads work on the queue in parallel, meaning that the Messaging System queues are not strictly first-in, first-out. Five threads are assigned by default to each queue, and thus five messages can be processed in parallel. But, as will be discussed later, the parallel execution of requests to the backend depends heavily on the adapter being used, since not every adapter can work in parallel by default.

o SAP NetWeaver PI 7.1 introduced a new queue: the dispatcher queue. The dispatcher queue is used to realize prioritization on the AAE (see chapter Prioritization in the Messaging System). Every message in the PI system is first assigned to the dispatcher queue before it is assigned to the adapter-specific queues.

Viewed from the perspective of a message that enters the PI system using a J2EE Engine adapter (for

example File) and leaves the PI system using a J2EE Engine adapter (for example JDBC) asynchronously

the steps are as follows

1) Enter the sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the File send queue of the messaging system (based on Interface priority)

4) Retrieved from the send queue by a consumer thread

5) Sent from the messaging system to the pipeline of the ABAP Integration Engine (if no Integrated Configuration (AAE) is used)

6) Processed by the pipeline steps of the Integration Engine

7) Sent to the messaging system by the Integration Engine

8) Put into the dispatcher queue of the messaging system

9) Forwarded to the JDBC receive queue of the messaging system (based on Interface priority)

10) Retrieved from the receive queue (based on the maxReceivers)

11) Sent to the receiver adapter

12) Sent to the final receiver


All the analysis below is based on the information provided in the audit log. As of SAP NetWeaver PI 7.1, the audit log is not persisted in the database for successful messages by default, to avoid performance overhead. The audit log is therefore only available in the cache for a limited period of time (depending on the overall message volume). More details can be found in chapter Persistence of Audit Log information in PI 7.10 and higher.

As described in chapter Adapter Engine Performance monitor in PI 7.31 and higher, a new performance monitor is available from PI 7.31 SP4. With this monitor you can display the processing time per interface in a given interval and identify the processing steps in which a lot of time is spent.

With 7.31 SP10 and 7.4 SP5, a download functionality for the performance monitor is provided, as described in SAP Note 1886761.

6.1 Adapter Performance Problem

Compared to a bottleneck in the Messaging System, it is relatively easy to detect a performance problem or bottleneck in the adapter. Since the timestamps for inbound and outbound messages are written at different points in time, the procedure is described separately for sender and receiver adapters.

Note: It is not generally possible for all sender and receiver adapters to work in parallel (discussed in section Adapter Parallelism).

6.1.1 Adapter Parallelism

As mentioned before, not all adapters are able to handle requests in parallel. Thus, increasing the number of threads working on a queue in the messaging system will not always solve a performance problem or bottleneck.

There are 3 strategies to work around these restrictions:

1) Create additional communication channels with a different name and adjust the respective sender/receiver agreements to use them in parallel.

2) Add a second server node, which will automatically have the same adapters and communication channels running as the first server node. This does not work for polling sender adapters (File, JDBC, or Mail), since the Adapter Framework scheduler assigns only one server node to a polling communication channel.

3) Install and use a non-central Adapter Framework for performance-critical interfaces to achieve a better separation of interfaces.


Some of the most frequently used adapters and the possible options are discussed below.

o Polling Adapters (JDBC, Mail, File)

At the sender side these adapters use the Adapter Framework Scheduler, which assigns a server node that does the polling at the specified interval. Thus only one server node in the J2EE cluster does the polling and no parallelization can be achieved; scaling via additional server nodes is therefore not possible. More details are presented in section Adapter Framework Scheduler. Parallel processing would not help here anyway: since the channels execute the same SELECT statement on the database or pick up files with the same file name, it would only result in locking problems. To increase the throughput for such interfaces, the polling interval has to be reduced to avoid a big backlog of data to be polled. If the volume is still too high, you should consider creating a second interface; for example, the new interface would poll the data from a different directory or database table to avoid locking.

At the receiver side the adapters work sequentially on each server node by default. For example, only one UPDATE statement can be executed by the JDBC adapter per communication channel (independent of the number of consumer threads configured in the Messaging System), and all the other messages for the same communication channel will wait until the first one is finished. This is done to avoid blocking situations on the remote database. On the other hand, this can cause blocking situations for whole adapters, as discussed in section Avoid Blocking Caused by Single Slow/Hanging Receiver Interface.

To allow better throughput for these adapters, you can configure the degree of parallelism at the receiver side. In the Processing tab of the communication channel, in the field "Maximum Concurrency", enter the number of messages to be processed in parallel by the receiver channel. For example, if you enter the value 2, two messages are processed in parallel on one J2EE server node. Whether these statements actually execute in parallel at database level of course depends on the nature of the statements and on the isolation level defined on the database. If all statements update the same database record, database locking will occur and no parallelization can be achieved.

o JMS Adapter

The JMS adapter by default uses a push mechanism on the PI sender side. This means the data is pushed by the sending MQ provider. By default, every communication channel has one JMS connection established per J2EE server node. On each connection it processes one message after the other, so the processing is sequential. Since there is one connection per server node, scaling via additional server nodes is an option.

With Note 1502046 - JMS Adapter message pull mechanism, the JMS protocol can be changed to use a pull mechanism. By doing so you can specify a polling interval in the PI communication channel, and PI will be the initiator of the communication. Here, too, the Adapter Framework Scheduler is used, which implies sequential processing. But in contrast to JDBC or File sender channels, the JMS polling sender channel allows parallel processing on all server nodes of the J2EE cluster. Therefore scaling via additional J2EE server nodes is an option.

The JMS receiver side supports parallel operation out of the box. Only some small parts of message processing (the pure sending of the message to the JMS provider) are synchronized. Therefore, no actions are necessary to enable parallel processing on the JMS receiver side.

o SOAP Adapter

The SOAP adapter is able to process requests in parallel. The SOAP sender side has in general no limitations in the number of requests it can execute in parallel. The limiting factor here is the FCA threads available to process the incoming HTTP calls. This is discussed in more detail in chapter Tuning the SOAP sender adapter. On the receiver side the parallelism depends on the number of threads defined in the messaging system and on the ability of the receiving system to cope with the load.


o RFC Adapter

The RFC adapter offers parameters to adjust the degree of parallelism by defining the number of initial and maximum connections to be used. The initial threads are allocated from the application thread pool directly and are therefore not available for any other tasks; the number of initial threads should therefore be kept minimal. To avoid bottlenecks during peak times, the maximum connections can be increased. A bottleneck is indicated by the following exception in the audit log:

com.sap.aii.af.ra.ms.api.DeliveryException: error while processing message to remote system: com.sap.aii.af.rfc.core.client.RfcClientException: resource error: could not get a client from JCO.Pool: com.sap.mw.jco.JCO$Exception (106) JCO_ERROR_RESOURCE: Connection pool RfcClient ... is exhausted. The current pool size limit (max connections) is 1 connections.

This should be done carefully, however, since these threads are taken from the J2EE application thread pool. A very high value can therefore cause a bottleneck on the J2EE Engine and thus a major instability of the system. As per the RFC adapter online help, the maximum number of connections is restricted to 50.

o IDoc_AAE Adapter

The IDoc adapter on Java (IDoc_AAE) was introduced with PI 7.3. On the sender side the parallelization depends on the configuration mode chosen. In Manual Mode the adapter works sequentially per server node. For channels in "Default Mode" it depends on the configuration of the inbound Resource Adapter (RA). Via the parameter MaxReaderThreadCount of the inbound RA you can configure how many threads are globally available for all IDoc adapters running in Default Mode; this determines the overall parallelization of the IDoc sender adapter per Java server node. Currently the recommended maximum number of threads is 10.

The receiver side of the Java IDoc adapter works in parallel by default.

The table below gives a summary of the parallelism for the different adapter types.


6.1.2 Sender Adapter

The first step(s) to be taken for a sender adapter depend on the type of adapter itself. Afterwards, however, the message flow is always the same: first the message is processed in the Module Processor and afterwards put into a queue of the Messaging System. If the message is synchronous, this is the call queue; if the message is asynchronous, it is the send queue. The final action of the Adapter Framework is then to send the message from the messaging system to the Integration Server (for non-ICO interfaces).

Only the first entries of the audit log are of interest to establish whether the sender adapter has a performance problem: from the first entry up to the entry "Message successfully put into the queue". These steps logically belong to the sender adapter and the Module Processor, while the subsequent steps belong to the Messaging System. The sender adapter uses one J2EE Engine application thread to execute its work. This is generally the same thread for all steps involved.

Select the "Details" of a message in the AFW. The first entry of the audit log marks the beginning of the activity of the sender adapter. The end of the adapter's activity is marked by the entry "The application sent the message asynchronously using connection ... Returning to application".

6.1.2.1 Tuning the SOAP sender adapter

As described above, in general there is no limitation in the parallelization of the SOAP sender adapter itself. The incoming SOAP requests are limited by the available FCA threads for HTTP processing. The FCA threads are discussed in more detail in FCA Server Threads.

By default, all interfaces using the SOAP adapter use the same set of FCA threads, since they all use the same URL http://<host>:<port>/MessageServlet. In case one interface faces a very high load or slow backend connections, this can block the available FCA threads and have a heavy impact on the processing time of other interfaces.

To avoid this, SAP Note 1903431 allows the usage of different URLs for specific interfaces. This way you can isolate interfaces with a very high volume on a dedicated entry point with its own set of FCA threads. In case of high load for such an interface, the other SOAP sender interfaces would then not be affected by a shortage of FCA threads.


To use this new feature, you can specify the following two entry points on the sender system:

o MessageServletInternal

o MessageServletExternal

More information about the URL of the SOAP adapter can be found here.

6.1.3 Receiver Adapter

For a receiver adapter it is just the other way around compared to the sender adapter. First the message is received from the Integration Server and then put into a queue of the Messaging System, this time the receive queue for asynchronous messages or the request queue for synchronous messages. The last step is then the activity of the receiver adapter itself.

Select the "Details" of a message in the AFW. The adapter activity starts after the entry "The message status set to DLNG". The message is then forwarded to the entry point of the AFW ("Delivering to channel <channel_name>"), enters the Module Processor ("MP: Entering Module Processor"), and the activity ends with "The message was successfully delivered to the application using connection ...".

Depending on the type of adapter, you now have to investigate why the receiver adapter needs a high amount of processing time. This could be a complicated operating system command, a bad network connection, or a slow backend system, among many other reasons.

Here, too, custom modules can be the reason for a prolonged processing time. Check Performance of Module Processing for more details.

Note: From 7.1 on, the audit log is no longer persisted by default for performance reasons, as described in Persistence of Audit Log information in PI 7.10 and higher.


6.1.4 IDoc_AAE adapter tuning

A comprehensive and up-to-date description of the tuning of the IDoc_AAE adapter is given in Note 1641541 - IDoc_AAE Adapter performance tuning. Please refer to this Note for details.

Like all other adapters running on the Adapter Engine, the IDOC_AAE adapter uses the Messaging System queues, and the explanations in the next chapter, Messaging System Bottleneck, are also valid for the IDOC_AAE adapter. Note that the IDOC_AAE adapter only supports asynchronous scenarios, so it only uses the "Send" queue in Java-only scenarios and the "Send" and "Receive" queues in dual-stack scenarios.

For the IDoc_AAE sender you can additionally configure bulk support, as indicated in the screenshot below. By doing so, all messages received by PI in one RFC call from ERP are processed as one message. This is similar to the mechanism of the ABAP IDoc adapter described in the chapter Packaging on sender and receiver side, with the difference that the number of messages in one bulk cannot be further limited.

It is also essential to ensure that the size of the IDocs or IDoc packages for sender or receiver IDocs does not get too large, to avoid any negative impact on the Java heap memory and garbage collection. In the case of the IDoc_AAE adapter the XML size of the IDoc is less critical; it is the number of segments in an IDoc, or the overall number of segments in all IDocs of an IDoc package, that influences the memory allocation during message processing. Based on the current implementation, around 5 KB of memory is required per IDoc segment during processing. A package of 5 IDocs with 2000 segments each (10,000 segments in total) would therefore consume roughly 50 MB during processing. In such cases it is important to, for example, lower the package size and/or use large message queues as outlined in chapter Large message queues on PI Adapter Engine.

6.1.5 Packaging for Proxy (SOAP in XI 3.0 protocol) and Java IDoc adapter

Due to message packaging in the Integration Engine (see PI Message Packaging for Integration Engine), the ABAP-based IDoc and Proxy adapters have always used packaging when sending asynchronous messages to the receiver system. With PI 7.31, packaging was also introduced for the Java-based IDoc and Proxy (SOAP in XI 3.0 protocol) adapters. The aim of packaging is to achieve benefits similar to the packaging already existing in the ABAP pipeline:

o Fewer calls to the backend, consuming fewer parallel dialog work processes on the receiver

o Less overhead due to context switches and authentication when calling the backend

Especially for high-volume interfaces, packaging should be enabled to avoid overloading the backend systems and impacting users working on those systems.


Important:

In the ABAP stack, packaging is active by default (in 7.1 and higher) as described in chapter PI Message Packaging for Integration Engine. Experience shows that packaging can have a very high impact on the dialog work process usage on the ERP receiver side. Especially for migration projects from a dual-stack PI to AEX/PO it is important to evaluate packaging, to avoid any negative impact on the receiving ECC due to missing packaging.

Packaging on Java works in general differently than on ABAP. While in ABAP the aggregation is done at the level of the individual qRFC queue (which always contains messages to one receiver system only), this is not possible on the Adapter Engine, since messages for all receivers are located in the same queue. Therefore, packages are built outside of the Adapter Engine queues. The messages to be packaged are forwarded to a bulk handler, which waits until either the time limit for the package is exceeded or the configured number of messages or data size is reached. After this, the package is sent by a bulk thread to the receiving backend system.

Packaging on Java is always done per server node individually. If you have a high number of server nodes, packaging will work less efficiently due to the load balancing of messages across the available server nodes.

When message packaging is enabled, the message stays in status "Delivering" throughout all the steps described above. In the audit log you can see the time the message spends in packaging. The audit log shown below shows, for example, that the message waited almost one minute before the package was built and 9 IDocs were sent with this package.

Packaging can be enabled globally by setting the following parameters of the Messaging System service, as described in Note 1913972:

o messaging.system.msgcollector.enabled: Enable or disable packaging globally

o messaging.system.msgcollector.maxMemPerBulk: The maximum size of a bulk message

o messaging.system.msgcollector.bulkTimeout: The wait time per bulk - default 60 seconds

o messaging.system.msgcollector.maxMemTotal: The maximum memory that can be used by the message collector

o messaging.system.msgcollector.poolSize: The number of parallel threads used to send out packages to the backend. This value has to be tuned based on the general performance of the receiver system, the network latency between the systems, or the configuration of the IDoc posting on the ERP receiver side (described in chapter IDoc posting on receiver side). Tuning of these threads can therefore be very important. Unfortunately, these threads cannot yet be monitored with any PI tool or Wily Introscope.
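A hedged example of such a global packaging configuration is sketched below. The values are illustrative only and the exact units and recommended defaults have to be taken from SAP Note 1913972; placeholders in angle brackets stand for values that depend on the receiver system and the available Java heap:

messaging.system.msgcollector.enabled = true
messaging.system.msgcollector.bulkTimeout = 60
messaging.system.msgcollector.maxMemPerBulk = <maximum size of one bulk message>
messaging.system.msgcollector.maxMemTotal = <maximum memory for all collected messages>
messaging.system.msgcollector.poolSize = 5

With poolSize = 5 in this sketch, at most five BULK_EXECUTOR threads send packages to the backend at the same time, so a slow receiver cannot occupy an unlimited number of sender threads.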


To verify that not all packaging threads are in use, you can use the thread monitoring in the SAP Management Console and check for threads named "BULK_EXECUTOR", as shown below.

If you would like to adapt the packaging for a specific communication channel, this can also be done using the configuration options in the Integration Directory.

While for the SOAP adapter with XI 3.0 protocol you can define three different packaging KPIs, this is currently not possible for the IDoc adapter. There you can only specify the package size based on the number of messages.

6.1.6 Adapter Framework Scheduler

For polling adapters like File, JDBC, and JMS, the Adapter Framework Scheduler determines on which server node the polling takes place. By default, a File or JDBC sender only works on one server node, and the Adapter Framework (AFW) Scheduler determines on which one the job is scheduled.

The original Adapter Framework scheduler had several performance and reliability issues in a cluster environment, especially in systems with many Java server nodes and many polling communication channels. Based on this and on the symptoms described in Note 1355715 - AF Scheduler to avoid using cluster communication, a new scheduler was released. We highly recommend using the new version of the AFW scheduler. To activate the new scheduler you have to set the parameter scheduler.relocMode of the Adapter Framework


service property to a negative value. For instance, by setting the value to -15, a rebalancing of the channel within the J2EE cluster might happen after every 15th polling interval. This can allow for better balancing of the incoming load across the available server nodes if the files arrive at regular intervals. The balancing is achieved by doing a servlet call; based on the HTTP load balancing, the channel might then be dispatched to another server node. To avoid the balancing overhead, this value should not be set too low. A value between -10 and -15 should be acceptable in most cases. For very short polling intervals (e.g. every second) even lower values (e.g. -50) can be configured.
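For illustration (values assumed): with a polling interval of 60 seconds and scheduler.relocMode = -15, a channel can be relocated to another server node at most roughly every 15 minutes (every 15th polling interval), which keeps the rebalancing overhead low while still allowing the load to spread over the cluster over time.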

If many files are placed in the directory at one given time (e.g. by a batch process), all these files will be processed by one polling process only. Hence a proper load balancing cannot be achieved by the AFW scheduler. In such a case the only option is to write the files with different names or to different directories, so that you can configure multiple sender channels to pick up the files.

Starting from PI 7.31 you can monitor the Adapter Framework Scheduler in pimon under Monitoring → Background Job Processing Monitor → Adapter Framework Scheduler Jobs. There you can check on which server node a communication channel is polling (Status = "Active"). You can also see when the channel polled the last time and when it will poll the next time.

You cannot influence the server node on which the channel is polling; this is determined by the AFW scheduler. You can only influence the frequency after which a channel can potentially move to another server node, by tuning the parameter relocMode as outlined above.

Prior to PI 7.31 you have to use the following URL to monitor the Adapter Framework Scheduler:

http://<server>:<port>/AdapterFramework/scheduler/scheduler.jsp

Above you see an example of a File sender channel. The Type value determines whether the channel runs only on one server node (Type = L (local)) or on all server nodes (Type = G (global)). The time in brackets [60000] specifies the polling interval in ms - in this case 60 seconds. The Status column shows the status of the channel:

o "ON": Currently polling

o "on": Currently waiting for the next polling


o "off": No longer scheduled (e.g. channel deactivated or scheduled on another server node)

6.2 Messaging System Bottleneck

As described earlier, the Messaging System is responsible for persisting and queuing messages that come from any sender adapter and go to the Integration Server pipeline or to Java-only interfaces (ICO), or that come from the Integration Server pipeline and go to any receiver adapter. The time difference between receiving the message from the one side and delivering it to the other side is the value that can be used to analyze bottlenecks in the Messaging System. The following chapters describe how to determine this time difference for both directions.

A bottleneck (or rather a "fake" bottleneck) in the Messaging System is usually closely connected with a performance problem in the receiver adapter. You must therefore make sure you execute the check described in chapter 6.1.3. If the receiver adapter needs a lot of time to process messages, the messages of course get queued in the messaging system and remain there for a long time, since the adapter is not ready to process the next one yet. It looks as if the messaging system is not fast enough, but actually the receiver adapter is the limiting factor.

Execute the checks described in chapters 6.2.1 and 6.2.2 to get an overview of the situation; you should do this for multiple messages.

Decide whether a specific receiver adapter is causing long waiting times in the messaging system by executing the check described in chapter 6.1.3. An additional hint at a receiver adapter problem is when only messages to specific receivers show a long wait time in the queue.

Use the queue monitoring in the Runtime Workbench (RWB) to analyze how many threads are active for each of the queue types and how large the queue size currently is. To do so, choose Component Monitoring → Adapter Engine, press the Engine Status button, and choose the tab "Additional Data". This opens the page below, showing the number of messages in a queue, the threads currently working, and the maximum available threads. Note: This view only shows the Messaging System of one J2EE server node. To get the whole picture you have to check all the server nodes that are available in your system via the drop-down list box.


o Since checking all the server nodes with this browser frontend can be very tedious, it is better to use Wily Introscope, since it allows an easy graphical analysis of the queues and the thread usage of the messaging system across Java server nodes.

The starting point in Wily Introscope is usually the PI Triage dashboard, showing all the backlogs in the queues. In the screenshot below, a backlog in the dispatcher queue (for 7.1 and higher only) can be seen.

The dispatcher queue is not the problem here, since it forwards messages to the adapter queue as soon as free threads are available among the adapter-specific consumer threads. The analysis must start at the PI inbound queues. By using the navigation button in the upper right corner of the inbound queue size graph, you can jump directly to a more detailed view, where you can see that the File adapter was causing the backlog.


To see the consumer thread usage, you can then follow the link to the File adapter. In the screenshot below you can see that all consumer threads were exhausted during the time of the backlog. Thus, increasing the number of consumer threads could be a solution if the resulting delay is not acceptable for the business.

o If it is obvious that the messaging system does not have enough resources, as in the case above, increase the number of consumers for the queue in question. Use the results of the previous check to decide which adapter queues need more consumers.

The adapter-specific queues in the messaging system are configured in the NWA using service "XPI Service: AF Core" and property messaging.connectionDefinition. The default values for the sending and receiving consumer threads are set as follows:

(name=global, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=5, Call.maxConsumers=5, Rqst.maxConsumers=5)

To set individual values for a specific adapter type, you have to add a new property set to the default set, with the name of the respective adapter type, for example:

(name=JMS_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=7, Recv.maxConsumers=7, Call.maxConsumers=7, Rqst.maxConsumers=7)

The name of the adapter-specific queue follows the pattern <adapter_name>_<namespace>, for example RFC_http://sap.com/xi/XI/System.

Note that you must not change parameters such as pollInterval and pollAttempts. For more details see SAP Note 791655 - Documentation of the XI Messaging System Service Properties.

o Not all adapters use the parameter above. Some special adapters like CIDX, RNIF, or Java Proxy can be changed by using the service "XPI Service: Messaging System" and property messaging.connectionParams, adding the relevant lines for the adapter in question as described above.

o A performance problem in the Messaging System can also be caused by extensive logging, as described in chapter Logging Staging on the AAE (PI 7.3 and higher).

o Important: Make sure that your system is able to handle the additional threads, that is, monitor the CPU usage and application thread availability as well as the memory of the J2EE Engine after you have applied the change.


o Important: Keep in mind that not all adapters are able to handle requests in parallel, even if you assign more threads to the Messaging System queues. In such a case, adding additional threads will not speed up processing significantly. See chapter Adapter Parallelism for details.

o A backlog of messages in the messaging system is highly critical for synchronous scenarios due to the timeouts that might occur. The number of synchronous threads therefore always has to be tuned so that no bottleneck occurs. Often a backlog in synchronous queues is caused by bad performance of the receiving backend system.

o More information is provided in the blog Tuning the PI Messaging System Queues.

6.2.1 Messaging System Between AFW Sender Adapter and Integration Server (Outbound)

Open the audit log of a message and look for the following entry: "The message was successfully retrieved from the send queue". Note that the queue name is "Call queue" if the message is synchronous. To determine how long the message waited to be picked up from the queue, compare the timestamp of the above entry with the timestamp of the step "Message successfully put into the queue". A large time difference between those two timestamps indicates a bottleneck for consumer threads on the sender queues of the Messaging System.
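A simple illustration (the timestamps are assumed): if "Message successfully put into the queue" is logged at 10:15:02.100 and "The message was successfully retrieved from the send queue" at 10:15:07.400, the message waited about 5.3 seconds for a free consumer thread, which points to a backlog on that send queue rather than to the adapter itself.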

Instead of checking the latency of single messages, we recommend using Wily Introscope to monitor the queue behavior as shown above.

Asynchronous messages only use one thread for processing in the messaging system. Multiple threads are used for synchronous messages: the adapter thread puts the message into the messaging system queue and waits until the messaging system delivers the response. The adapter thread is therefore not available for other tasks until the response is returned. A consumer thread on the call queue sends the message to the Integration Engine. The response is received by a third thread (a consumer thread of the send queue), which correlates the response with the original request. After this, the initiating adapter thread is notified to send the response to the original sender system. This correlation can be seen in the audit log of the synchronous message below.


6.2.2 Messaging System Between Integration Server and AFW Receiver Adapter (Inbound)

Open the audit log of a message and look for the following entry: "Message successfully put into the queue". Compare this timestamp with the timestamp of the step "The message was successfully retrieved from the receive queue". The time difference between these two timestamps is the time the message waited in the messaging system for a free consumer thread. A large time difference between those two timestamps indicates a bottleneck in the adapter-specific inbound queues of the Messaging System.

6.2.3 Interface Prioritization in the Messaging System

SAP NetWeaver PI 7.1 and higher offers the possibility to prioritize interfaces, similar to the ABAP stack. This feature is especially helpful if high-volume interfaces run at the same time as business-critical interfaces. To minimize the delay for critical messages, the Messaging System makes it possible for you to define high, medium, and low priority processing at interface level. Based on the priority, the dispatcher queue of the messaging system (which is the first entry point for all messages) forwards the messages to the standard adapter-specific queues. The priority assigned to an interface determines the number of messages that are forwarded once the adapter-specific queues have free consumer threads available. This is done based on a weighting of the messages to be reloaded. The weights for the different priorities are as follows: High 75, Medium 20, Low 5.
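For illustration of these weights: if messages of all three priorities are waiting and free consumer threads become available, then out of 100 forwarded messages roughly 75 belong to high-priority interfaces, 20 to medium-priority interfaces, and 5 to low-priority interfaces.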

Based on this approach you can ensure that more resources are used for high-priority interfaces. The screenshot below shows the UI for message prioritization available in pimon under Configuration and Administration → Message Prioritization.


The number of messages per priority can be seen in a dashboard in Wily Introscope, as shown below.

You can find more details on configuring and using prioritization within the AAE at SAP Online Help; navigate to SAP NetWeaver → SAP NetWeaver PI → SAP NetWeaver Process Integration 7.1 Including Enhancement Package 1 → SAP NetWeaver Process Integration Library → Function-Oriented View → Process Integration → Process Integration Monitoring → Component Monitoring → Prioritizing the Processing of Messages.

6.2.4 Avoid Blocking Caused by a Single Slow/Hanging Receiver Interface

As already discussed above, some receiver adapters process messages sequentially on one server node by default. This is independent of the number of consumer threads defined for the corresponding receiver queue in the messaging system. For example, even though you have configured 20 threads for the JDBC receiver, one communication channel will only be able to send one request to the remote database at a given time. If there are many messages for the same interface, all of them will get a thread from the messaging system but will be blocked when performing the adapter call.

In the case of a slow message transmission, for example for EDI interfaces using ISDN technology or a remote system with small network bandwidth, this behavior can cause a "hanging" situation for a specific adapter, since one interface can block all messaging threads. As a result, all the other interfaces will not get any resources and will be blocked.


For this reason, the parameter messaging.system.queueParallelism.maxReceivers was introduced, which can be set in NWA in service "XPI Service: Messaging System". With this parameter you can restrict the maximum number of receiver consumer threads that can be used by one interface. In older releases (lower than 7.31 SP11 and 7.4 SP6) this is a global parameter that affects all adapters. It should therefore not be set too restrictively. If you need high parallel throughput, one option is to set the maxReceivers parameter to 5 (so that each interface can use 5 consumer threads per server node) and to increase the overall number of threads on the receive queue for the adapter in question (for example JDBC) to 20. With this configuration it is possible for four interfaces to get resources in parallel before all threads are blocked; see the sketch below. For more information see Note 1136790 - Blocking Receiver Channel May Affect the Whole Adapter Type and the SDN blog Tuning the PI Messaging System Queues.
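A hedged configuration sketch for the example above (values are illustrative only and follow the property and queue naming used in this chapter):

XPI Service: Messaging System
messaging.system.queueParallelism.maxReceivers = 5

XPI Service: AF Core, property messaging.connectionDefinition (additional set for the JDBC adapter):
(name=JDBC_http://sap.com/xi/XI/System, messageListener=localejbs/AFWListener, exceptionListener=localejbs/AFWListener, pollInterval=60000, pollAttempts=60, Send.maxConsumers=5, Recv.maxConsumers=20, Call.maxConsumers=5, Rqst.maxConsumers=5)

With this setup, a single hanging JDBC interface blocks at most 5 of the 20 receive consumer threads per server node, so up to four interfaces can still be served in parallel.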

In the screenshot below you see the impact of the maxReceivers parameter when a backlog for one interface occurs. Even though more free SOAP threads are available, they are not consumed. Hence the free SOAP threads can be assigned to other interfaces running in parallel.

This parameter has a direct impact on the queue in which message backlogs occur. As mentioned earlier, a backlog should usually occur on the dispatcher queue only, since it dispatches messages only if there are free consumer threads in the adapter-specific queue. When setting maxReceivers you restrict the threads per interface, which means that fewer consumer threads are blocked. When one interface faces a high backlog, there will always be free consumer threads, and therefore the dispatcher queue can dispatch the messages to the adapter-specific queue. Hence the backlog appears in the adapter-specific queue, so that message prioritization no longer works properly.


By default, the maxReceivers parameter is only relevant for asynchronous message processing (ICO and classical dual-stack scenarios). The reason is that a wait time for synchronous scenarios should be avoided by all means, so additional restrictions on the number of available threads can be very critical there. It is therefore usually advisable not to limit the threads per interface but to increase the overall number of available threads. If you have many high-volume synchronous scenarios with different priorities running in parallel, it might be advisable to limit the threads each interface can use. To do so, you have to set the parameter messaging.system.queueParallelism.queueTypes to "Recv, ICoAll" as described in SAP Note 1493502 - Max Receiver Parameter for Integrated Configurations.

Enhancement with 7.31 SP11 and 7.4 SP6

With Note 1916598 - NF: Receiver Parallelism per Interface an enhancement was introduced that allows the specification of the maximum parallelization on a more granular level. This new feature has to be activated by setting the parameter messaging.system.queueParallelism.perInterface to true. Using a configuration UI, you can specify rules to determine the parallelization for all interfaces of a given receiver service. If no rule is specified for a given interface, the global maxReceivers value is used. If the receiver service corresponds to a technical business system, this configuration helps to restrict the parallel requests from PI to that system. This also works across protocols, so it could be used, for example, to restrict the number of parallel requests using Proxy or IDoc to the same ERP receiver system. Below you can find a screenshot of the configuration UI in NWA SOA Monitoring.

With the improvement mentioned above, the dispatching mechanism in the dispatcher queue has also been changed so that it is aware of the maxReceivers settings. This means that the backlog will now again be placed in the dispatcher queue and prioritization will work properly.

6.2.5 Overhead based on the interface pattern being used

The configured interface pattern can also cause some overhead. When choosing the interface pattern "Stateless Operation", each message is parsed against its data type during inbound processing. This can cause performance and memory overhead, but provides a syntactical check of the data received.

When choosing "Stateless (XI30-Compatible)", no check of the data type is performed and the overhead is therefore avoided. For interfaces with good data quality and high message throughput this interface pattern should therefore be chosen.


6.3 Performance of Module Processing

If your analysis in chapter Adapter Performance Problem pointed to a problem in module processing, you must analyze the runtime of the modules used in the communication channel. Adapter modules can be combined so that one communication channel calls multiple modules in a defined sequence. Adapter modules can be custom-developed or SAP standard and can be used for many different purposes - for example, to transform an EDI message before sending it to the partner or to change header attributes of a PI message before forwarding it to a legacy application. Due to the flexible usage of adapter modules, there is also no standard way to approach performance problems.

In the audit log shown below you can see two adapter modules. One is a customer-developed module called SimpleWaitModule. The next module, CallSAPAdapter, is a standard module that puts the data into the messaging system queues.

In the audit log you get a first impression of the duration of a module. In the example above you can see that the SimpleWaitModule requires 4 seconds of processing time.

Once again, Wily Introscope offers a better overview of the processing time of modules. In the screenshot below you can see a dashboard showing the cumulative/average response times and the number of invocations of the different modules. If one module runs for a very long time, it is very easy to identify, since there will be a line indicating a much higher average response time. The tooltip displays the name of the module.



Once you have identified the module with the long runtime, you have to talk to the responsible developer to understand why it takes that long. Possibly the module executes a look-up to a remote system using JCo or JDBC, which could be responsible for the delay. In the best case the module writes additional information to the audit log so that such steps can be identified. If not, use the Wily Introscope transaction trace as explained in appendix Wily Transaction Trace.
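To make such steps visible, a custom module can write its own audit log entries around the expensive call. The following is a minimal sketch only, assuming the standard PI adapter module API (com.sap.aii.af.lib.mp.module) and the public messaging API; the class name SimpleWaitModule and the lookup method are purely illustrative, and a real module is additionally packaged and deployed as a stateless session bean:

import com.sap.aii.af.lib.mp.module.Module;
import com.sap.aii.af.lib.mp.module.ModuleContext;
import com.sap.aii.af.lib.mp.module.ModuleData;
import com.sap.aii.af.lib.mp.module.ModuleException;
import com.sap.engine.interfaces.messaging.api.Message;
import com.sap.engine.interfaces.messaging.api.MessageKey;
import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;
import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

public class SimpleWaitModule implements Module {

    public ModuleData process(ModuleContext moduleContext, ModuleData inputModuleData)
            throws ModuleException {
        try {
            // The principal data of the module chain is the PI message itself
            Message msg = (Message) inputModuleData.getPrincipalData();
            MessageKey key = msg.getMessageKey();
            AuditAccess audit = PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();

            long start = System.currentTimeMillis();
            audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                    "SimpleWaitModule: starting remote lookup");

            doRemoteLookup(msg); // illustrative placeholder for the expensive JCo/JDBC call

            // The second entry makes the duration of the lookup visible in the audit log
            audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,
                    "SimpleWaitModule: lookup finished after "
                            + (System.currentTimeMillis() - start) + " ms");

            return inputModuleData;
        } catch (Exception e) {
            throw new ModuleException("SimpleWaitModule failed: " + e.getMessage(), e);
        }
    }

    private void doRemoteLookup(Message msg) {
        // illustrative only: e.g. a JCo or JDBC look-up to a remote system
    }
}

With such entries, the time spent in the remote call can be read directly from the audit log of a single message, without having to attach a Wily Introscope transaction trace first.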

6.4 Java-only scenarios: Integrated Configuration objects

From SAP NetWeaver PI 7.1 on, you can configure scenarios that run only on the Java stack using the Advanced Adapter Engine (AAE), provided that only Java-based adapters are used. A new configuration object called Integrated Configuration is used for this. When using it, the steps so far executed in the ABAP pipeline (Receiver Determination, Interface Determination, and Mapping) are also executed by the services in the Adapter Engine.

6.4.1 General performance gain when using Java-only scenarios

The major advantage of AAE processing is the reduced overhead, because the context switches between ABAP and Java are avoided. Thus the overall throughput can be increased significantly and the overall latency of a PI message can be reduced greatly. If possible, interfaces should always use Integrated Configuration to achieve the best performance. Therefore the best tuning option is to change a scenario that uses Java-based sender and receiver adapters from a classical ABAP-based scenario to an AAE-based one.


Based on SAP-internal measurements performed on 7.1 releases, both the throughput and the response time could be improved significantly, as shown in the diagrams below.


Based on these measurements, the following statements can be made:

1) Significant performance improvements can always be achieved with local processing (if available) for a certain scenario.

2) This is valid for all adapter types. Huge mapping runtimes, slow networks, or slow (receiver) applications reduce the throughput and therefore, rated for the overall scenario, also reduce the benefit of local processing in terms of overall throughput and response time.

3) The greatest benefit is seen for small payloads (comparison 10k, 50k, 500k) and asynchronous messages.

6.4.2 Message Flow of Java-only scenarios

All the Java-based tuning options mentioned in chapter Analyzing the (Advanced) Adapter Engine and in Long Processing Times for "PLSRV_MAPPING_REQUEST" are still valid; only the calling sequence is different. The following describes the steps used in an Integrated Configuration to help you better understand the message flow of an interface. The example is JMS to Mail:

1) Enter the JMS sender adapter

2) Put into the dispatcher queue of the messaging system

3) Forwarded to the JMS send queue of the messaging system

4) Message is taken by a JMS send consumer thread

a. No message split used

In this case the JMS consumer thread will do all the necessary steps previously done in the ABAP pipeline (like receiver determination interface determination and mapping) and will then also transfer the message to the backend system (remote database in our example) Thus all the steps are executed by one thread only

b Message split used (1n message relation)

In this case there is a context switch The thread taking the message out of the queue will process the message up to the message split step It will create the new message and put it in the Send queue again from where it will be taken by different threads (which will then map the child message and finally send it to the receiving system)

As we can see in this example, for an Integrated Configuration one thread performs all the different steps of a message. The consumer thread is not available for other messages during the execution of these steps. The tuning of the Send queue (Call queue for synchronous messages) is therefore much more important for scenarios using Integrated Configuration than for ABAP-based scenarios.

The different steps of the message processing can be seen in the audit log of a message. If, for example, a long-running mapping or adapter module is indicated, you can use the relevant chapter of this guide. There is no difference in the analysis, except that for mappings no JCo connection is required, since the mapping call is done directly from the Java stack.


The example audit logs below are based on a Java-only synchronous SOAP to SOAP scenario. In the highlighted areas you can see that all steps are very fast except the mapping call, which lasts around 7 seconds. To analyze the long duration of the mapping, follow the same steps as discussed in chapter Long Processing Times for "PLSRV_MAPPING_REQUEST".


6.4.3 Avoid blocking of Java-only scenarios

Like classical ABAP-based scenarios, Java-only scenarios can face backlog situations in case of a slow or hanging receiver backend. Since Java-only interfaces use send queues only, restricting the consumer threads on the receiver queue as described in chapter Avoid Blocking Caused by Single Slow/Hanging Receiver Interface is no solution.

Therefore an additional property, messaging.system.queueParallelism.queueTypes of service SAP XI AF MESSAGING, was introduced via Note 1493502 - Max Receiver Parameter for Integrated Configurations. The parameter can be set for synchronous and asynchronous interfaces. Please keep in mind that messaging.system.queueParallelism.maxReceivers is a global parameter for all adapters and for all configured Qualities of Service. This global restriction can be avoided starting with 7.31 SP11 / 7.40 SP06 as per SAP Note 1916598 - NF: Receiver Parallelism per Interface. A restriction of the parallelization can be highly critical for synchronous interfaces. Therefore we generally recommend setting messaging.system.queueParallelism.queueTypes in most cases to the asynchronous queue types (Recv, IcoAsync) only.
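A minimal configuration sketch for the NWA service SAP XI AF MESSAGING (the property names are taken from the SAP Notes referenced above; the value syntax and the numbers shown are illustrative only and must be checked against the Notes):

    messaging.system.queueParallelism.maxReceivers = 5              (global limit of consumer threads per receiver)
    messaging.system.queueParallelism.queueTypes   = Recv,IcoAsync  (apply the limit to asynchronous queues only)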

6.4.4 Logging / Staging on the AAE (PI 7.3 and higher)

Up to PI 7.11, only one message version was persisted on the PI Adapter Engine. Especially for Java-only scenarios this was often not sufficient for troubleshooting, since, for instance, the result of a mapping could not be verified.

Starting with PI 7.3, the persistence steps can be configured globally in the Adapter Engine for synchronous and asynchronous interfaces. This is similar to LOGGING or LOGGING_SYNC on the ABAP stack. In later versions of PI the configuration can also be done on interface level.

The versions that can be persisted are shown in the diagram below (the abbreviations in green are the values for the configuration).


In the Messaging System we generally distinguish Staging (versioning) and Logging. An overview is given below.

For details about the configuration, please refer to the SAP online help, Saving Message Versions. Similar to LOGGING on the ABAP Integration Server, persisting several versions of the Java message can cause a high overhead on the database and a decrease in performance.

The persistence steps can be seen directly in the audit log of a message. The Message Monitor also shows the persisted versions.


While these additional monitoring attributes are extremely helpful for troubleshooting, they should be used carefully from a performance perspective. Especially the number of message versions and the logging/staging mode you choose can have a severe impact on performance. Therefore you have to find the balance between business requirements and performance overhead. Some guidelines on how to use staging and logging are summarized in SAP Note 1760915 - FAQ: Staging and Logging in PI 7.3 and higher.

6.5 J2EE HTTP load balancing

With NetWeaver version 7.1 and higher, the Java HTTP load balancing is no longer done by the Java Dispatcher but by the ICM. The default load balancing rules of the ICM are designed with a stateful application in mind, giving priority to the balancing of HTTP sessions. These default rules are not optimal for stateless applications such as SAP PI.

In the past we could observe an unequal load distribution across Java server nodes caused by this. In case of high backlogs this can delay the overall message processing of an interface, because one server node has more messages assigned than others and therefore not all available resources are used. A typical example can be seen in the Wily screenshot below.

SAP Note 1596109 - Uneven distribution of HTTP requests on Java server nodes introduces new load balancing rules for stateless applications like PI. Please follow the description in the Note to ensure the messages are distributed equally across the available server nodes. In the meantime, these load balancing rules can also be applied using the CTC templates as described in Notes 1756963 and 1757922. This has also been included in the PI Initial Setup wizard in order to execute this task automatically as a post-installation step (see Note 1760700 - PI CTC: Add HTTP loadbalancing to initial setup).


For the example given above, we could see a much better load balancing after the new load balancing rules were implemented. This can be seen in the following screenshot.

Please note: the load balancing rules mentioned above are only responsible for balancing the messages across the available server nodes of one instance. HTTP load balancing across application servers is done by the SAP Web Dispatcher. More information about this is provided in the guide How To Scale SAP PI 7.1.

6.6 J2EE Engine Bottleneck

All tuning that can be done in the AFW is limited by the ability of the J2EE Engine to handle a given number of threads and to provide enough memory to process all requests. Of course the CPU is also a limiting factor, but this is discussed in chapter 9.1.

From SAP NetWeaver PI 7.1 onwards, the SAP VM is used on all platforms. Therefore the analysis can use the same tools on all platforms.

6.6.1 Java Memory

The first analysis of the J2EE memory is usually done using the garbage collection (GC) behavior of the Java Virtual Machine (JVM). To get the GC information written to the standard log file, the JVM has to be started with the parameter -verbose:gc.

Analyze the file std_server<n>.out for the frequency of GCs, their duration, and the allocated memory.

Look for unusual patterns indicating an Out-of-Memory error or an unintentional restart of your J2EE Engine. This would be visible if the allocated memory drops to 0 at a given point in time. A healthy garbage collection output displays a saw-tooth pattern, that is, the memory usage increases over time but goes down to its original value as soon as a full garbage collection is triggered. You should see the memory usage go down to initial values during low volume times (night-time).

Pay attention to GCs with a high duration. This is important because no PI message can be mapped or processed by the J2EE Adapter Framework during a GC in a Java VM. One of the many reasons for long GCs that should be mentioned here is paging (see chapter 9.2). If the heap space of the J2EE Engine is exhausted, all objects in the heap are evaluated for their references: if there is still a reference, the object is kept; if there is no reference, the object is deleted. If some or all of the objects have been swapped out due to insufficient RAM, they have to be swapped in again to be evaluated. This leads to very long runtimes for GCs; GCs of more than 15 minutes have been observed on swapping systems.
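For orientation, a single GC entry in std_server<n>.out typically looks similar to the line below (the exact format differs between JVM versions and GC settings; the numbers are made up). The relevant values are the heap occupancy before and after the collection, the total heap size in brackets, and the pause time:

    [GC 1572864K->524288K(2097152K), 0.3456789 secs]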


Different tools exist for the garbage collection analysis:

1) Solution Manager Diagnostics

Solution Manager Diagnostics (SMD) offers a memory analysis application that is able to read the GC output in the std_server<n>.out files. To start, call the Root Cause Analysis section. In the Common Task list you find the entry Java Memory Analysis, which gives you the chance to upload your std_server<n>.out file. It shows the memory allocated after each GC and also prints the duration of each GC (black dots with the duration scale on the right axis). A normal GC output should show a saw-tooth pattern where the memory always goes down to an initial value (especially during low volume times). This is shown in the example screenshot below.

2) Wily Introscope

Again, Wily Introscope offers different dashboards that can be used to check the garbage collection behavior of the J2EE Engine. The dashboard can be found via J2EE Overview → J2EE GC Overview and shows important KPIs for the GC, such as the count of GCs or the duration per interval. The GC Time dashboard shows the ratio of time spent in GC. An example is shown in the screenshot below.


3) NetWeaver Administrator

The NWA also offers a view to monitor the Java heap usage. However, the important information about the duration of the GCs is not available there. This monitor can be found by navigating to Availability and Performance Management → Java System Reports and choosing the report System Health.

In NetWeaver 7.3 you find this data in Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own memory report from the monitoring data provided. The screenshot below, for example, shows the Used Memory per server node.


4) SAP JVM Profiler

The SAP JVM Profiler allows the analysis of the extensive GC information stored in the sapjvm_gcprf files of the server node folders, or you can connect directly to the server node to retrieve this information. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling.


Procedure

o Search for restarts of the J2EE Engine by searching for the string "is starting" in the file std_server<n>.out. Apart from the first of these entries (which simply marks the initial start of the J2EE Engine), the second and any later entries mark a fatal situation after which the J2EE server node had to be restarted. This could, for example, be an out-of-memory situation. Search the log above those entries for a possible reason.

o Search for exceptions and fatal entries. Was an out-of-memory error reported (search for the pattern OutOfMemory)? Was a thread shortage reported?

o This document does not deal with tuning the J2EE Engine. If at this point it becomes clear that the J2EE Engine is the limiting factor, proceed with SAP Notes or open a customer incident. Only some very basic recommendations are given here.

o The parameters for the SAP VM are defined by Zero Admin templates. From time to time new Zero Admin templates might be released which change important parameters for the SAP VM. Thus you should check SAP Note 1248926 - AS Java VM Parameters for NetWeaver 7.1-based Products regularly and apply the changes accordingly.

o The problem might be caused by too high parallelization when processing large messages. This should be restricted to avoid overloading the Java heap memory.

1) In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT to e.g. 5000 to direct all large messages to the dedicated XBTL or XBTM queues.

2) To restrict the parallelization on the Java Messaging System, you can use the large message queue mechanism described in chapter Large message queues on the Messaging System.

o If it is already obvious that the memory of your J2EE Engine is getting short and nothing can be changed in the scenarios themselves, you have several options for scaling your PI landscape. Especially for large PI installations with high message throughput or large messages, SAP Note 1502676 - Scaling up large PI installations describes potential options.

1) Increase the Java heap of your server node. If sufficient physical memory is available, increase the default heap size of 2 GB to a higher value such as 3 or 4 GB. Experience shows that the SAP VM can handle these heap sizes without major impact on performance. Larger heap sizes can cause longer GC processing times, which in turn can affect the PI application negatively. Increasing the maximum heap size of the J2EE Engine will automatically adapt the new area of the heap due to a dynamic configuration (in newer J2EE versions this is by default set to 1/6 of the overall heap). After increasing the heap size, monitor the duration of the GC execution.

2) Add an additional server node to distribute the load over more processes. SAP recommends that productive systems have at least 2 Java server nodes per instance. The prerequisite is that your physical host has enough RAM and CPU. To minimize the overhead of J2EE cluster communication, it is in general recommended to configure larger server nodes (as described above) rather than a high number of server nodes.

3) Add an additional application server consisting of a double-stack Web Application Server (ABAP and J2EE Engine) to distribute the load to additional hardware. Find details on the scaling of PI with multiple instances in the guide How To Scale up SAP NetWeaver Process Integration.

4) Use a non-central Adapter Engine to separate large messages that cause the memory problem from business-critical scenarios.

6.6.2 Java System and Application Threads

A system or application thread is required for every task that the J2EE Engine executes. Each of these thread types is maintained using a thread pool. If the pool of available threads is exhausted, no more requests can be processed. It is therefore important that you have enough threads available at all times.


o In general, SAP recommends that you increase the system and application thread count for a PI system. As described in SAP Note 937159 - XI Adapter Engine is stuck, the application threads should be increased to 350 and the system threads to 120 for all J2EE server nodes.

o There are several options for checking the thread usage:

1) Wily Introscope

Wily Introscope provides a dashboard in the J2EE Overview section called J2EE CPU and Memory Detail that can also be used to monitor the thread usage of a PI engine. Different views exist for Application and System Threads, as shown in the screenshot below.

2) NetWeaver Administrator (NWA)

The NWA also offers a monitor similar to the one available for memory mentioned above. This monitor can be found by navigating to Availability and Performance Management → Java System Reports, choosing the report System Health, and looking at the windows shown in the screenshot below.

Again, in NetWeaver 7.3 and higher the monitor can be found at Availability and Performance Management → Resource Monitoring → History Reports. An example for the Application Thread Usage is shown below.


Check if the ThreadPoolUsageRate is below 80% for both the application threads and the system threads. Also check the maximum value by choosing a longer history in Wily Introscope.

Has the thread usage increased over the last days/weeks?

Is there one server node that shows a higher thread usage than others?

3) SAP Management Console

The SAP Management Console is an essential tool for analyzing different aspects of the J2EE Engine. It also offers a thread dump view, similar to SM50 in ABAP, showing the activity of all threads of a given Java application server. More information about the SAP MC can be found in Note 1014480 - SAP Management Console (SAP-MC) and in the online help. The SAP Management Console can be started as a Java Web Start application using http://<server>:5<instance_number>13.

Navigate to AS Java → Threads and check for any threads in red status (> 20 seconds running time). This indicates long-running threads, and you should check the related thread stacks. The example below shows a long-running thread related to the PI Directory. You can see the user doing the request and that the thread is waiting for an HTTP response from the CPACache.


If you are not sure what is causing the problem, you should take multiple thread dumps (at least 3, every 30 seconds) for all server nodes. This can be done in the Management Console by navigating to AS Java → Process Table.

The Thread Dump Viewer tool (TDV) from Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).

6.6.3 FCA Server Threads

An additional type of thread, the FCA Server Threads, was introduced with SAP NetWeaver PI 7.1 and is responsible for HTTP traffic. The FCA Server Threads receive HTTP calls on the Java side (now that the Java Dispatcher is no longer available). FCA Threads also use a thread pool. Fifteen FCA Threads are configured by default, but based on SAP Note 1375656 - SAP NetWeaver PI System Parameters we recommend that you increase this to 50. This can be done in the NWA by changing the parameter FCAServerThreadCount of service HTTP Provider.

FCA Server Threads are particularly crucial for synchronous message transfer and for HTTP-based scenarios like Web Services or calls from the ABAP Engine to the Adapter Engine. If an HTTP call takes a long time due to a slow backend system, the thread is blocked for the whole time and is not available to serve other HTTP requests (other PI messages).


The configured FCAServerThreadCount represents the maximum number of threads working on a single entry point. An entry point in the PI sense could, for example, be the SOAP Adapter servlet (shared by all channels, as described in Tuning SOAP sender adapter).

o If response times for a specific entry point are high (> 15 seconds), additional FCA threads will be spawned, but they are only available for parallel incoming HTTP requests using different entry points. This ensures that in case of problems with one application not all other applications are blocked constantly.

o An overall maximum of 20 x FCAServerThreadCount may be used in the J2EE Engine.

There is currently no standard monitor available for FCA Threads, except the thread view in the SAP Management Console. A new dashboard will be integrated into Wily Introscope soon and will also show the FCA Server Threads that are in use.

6.6.4 Switch Off VMC

The Virtual Machine Container (VMC) is enabled by default on an ABAP WebAS 7.1. Since PI does not use the VMC at runtime, it can be switched off on a PI system to avoid resource overhead. This recommendation is based on SAP Note 1375656 - SAP NetWeaver PI System Parameters.

You can deactivate the VM Container by setting the profile parameter vmcj/enable = off, which can be changed using profile maintenance (transaction RZ10). Once you have made the change, you need to restart the SAP Web Application Server instance.


7 ABAP PROXY SYSTEM TUNING

Every ABAP WebAS > 6.20 includes an ABAP Proxy runtime. This enables an SAP system to talk the native PI protocol (XI-SOAP), so that no costly transformation is necessary.

In general, the ABAP Proxy is just a framework that is triggered by (sender) or triggers (receiver) application-specific coding. It is not possible to give a general tuning recommendation, because the applications and use cases of the ABAP Proxy can differ greatly.

In this section we would like to highlight the system tuning options that can be applied to improve the throughput on the ABAP Proxy backend side.

In general, you can increase the throughput of EO interfaces by changing the queue parallelization. Two different parameters exist to change the number of qRFC queues on the ABAP Proxy backend. As in PI, ABAP Proxy processing is based on qRFC inbound queues (SMQ2) with specific names: a sender proxy uses XBTS queues (10 parallel by default) and a receiver proxy XBTR queues (20 parallel by default).

The sender proxy only forwards the message to PI by executing an HTTP call, which should not take much time and which is not resource-critical. The receiver proxy, on the other hand, executes the inbound processing of the messages based on the application context, which can be very time-consuming. It is therefore more likely to face a backlog situation in the receiver proxy queues. In the screenshot below you can see the performance header of a receiver proxy message that required around 20 minutes in the step PLSRV_CALL_INBOUND_PROXY (which corresponds to the posting of the application data).

Of course, such a long-running message blocks the queue, and all messages behind it face a higher latency. Since this step is purely application-related, tuning is only possible on the application side.

The number of sender and receiver ABAP Proxy queues can be changed in transaction SXMB_ADM on the proxy backend with parameter EO_INBOUND_PARALLEL and sub parameters SENDER (XBTS queues) and RECEIVER (XBTR queues). The prerequisite for increasing the number of queues is that enough qRFC resources are available on the proxy system (as discussed in section qRFC Resources (SARFC)).
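For illustration, the corresponding settings in transaction SXMB_ADM → Integration Engine Configuration on the proxy backend could look as follows (the category and the values are examples only and have to be adjusted to the available qRFC resources):

    Category   Parameter             Subparameter   Value
    TUNING     EO_INBOUND_PARALLEL   SENDER         20      (number of XBTS* queues)
    TUNING     EO_INBOUND_PARALLEL   RECEIVER       40      (number of XBTR* queues)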

Also, as described in chapter Message Prioritization on the ABAP Stack, prioritization can be configured in the ABAP Proxy system. This can be used to separate runtime-critical interfaces from interfaces with a long application processing time (as shown above).


7.1 New enhancements in Proxy queuing

The queuing mechanism was enhanced for sender and receiver proxy systems with Note 1802294 (Receiver) and Note 1831889 (Sender). After implementing these two Notes, interface-specific queues are used and tuning of these queues on interface level becomes possible. This can be very helpful in cases where one receiver interface shows a very long posting time in the application coding that cannot be further improved. Messages for more business-critical interfaces might otherwise be blocked by such a message due to the common usage of the XBTR queues. On the sender side of a proxy system, the processing of the queues calls the central PI hub. In general the processing time there should be fast, but in case of high volume interfaces you might want to slow down less business-critical interfaces to avoid overloading your central PI hub. The prioritization mechanisms available prior to these Notes did not provide such options.

This is achieved by adding an interface-specific identifier to the queue name. In the screenshot below you see a comparison of the queue names in the old framework (red) and the new framework (blue).

For the sender queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter SENDER/SENDER_BACK: should be set to '1' to activate interface-specific queues. This is a global switch affecting all sender interfaces.

o Parameter EO_INBOUND_PARALLEL_SENDER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

For the receiver queues the following new parameters are introduced:

o Parameter EO_QUEUE_PREFIX_INTERFACE with sub parameter RECEIVER: should be set to '1' to activate interface-specific queues. This is a global switch affecting all receiver interfaces.

o Parameter EO_INBOUND_PARALLEL_RECEIVER without sub parameter: determines the general number of queues per interface.

Using a sub parameter <sender ID>, the parallelization of individual interfaces can be changed. This can, for example, be used to assign more queues to high-priority interfaces.

Wildcards can be used in the <sender ID> to group interfaces sharing, for example, the same service name or the same interface name but different services.

A configuration sketch illustrating these parameters is shown below.
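An illustrative parameter set for the sender side (the values and the <sender ID> placeholder are examples only; the same pattern applies to the receiver parameters):

    Parameter                    Subparameter   Value
    EO_QUEUE_PREFIX_INTERFACE    SENDER         1      (activate interface-specific sender queues)
    EO_QUEUE_PREFIX_INTERFACE    SENDER_BACK    1
    EO_INBOUND_PARALLEL_SENDER   (none)         5      (default number of queues per interface)
    EO_INBOUND_PARALLEL_SENDER   <sender ID>    10     (more queues for one high-priority interface)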

This new feature also replaces the previously existing prioritization, since it is in general more flexible and powerful. We therefore recommend adjusting existing prioritization rules using the new queuing possibilities described above.


8 MESSAGE SIZE AS SOURCE OF PERFORMANCE PROBLEMS

The message size directly influences the performance of an interface. The size of a PI message consists of two elements: the PI header elements, with a rather static size, and the payload, which can vary greatly between interfaces or over time for one interface (for example, larger messages during year-end closing).

The size of the PI message header can cause a major overhead for small messages of only a few kB and can decrease the overall throughput of the interface. Furthermore, many system operations (like context switches or database operations) are then necessary for only a small payload. The larger the message payload, the smaller the relative overhead of the PI message header. On the other hand, large messages require a lot of memory on the Java stack, which can cause heavy memory usage on ABAP or excessive garbage collection activity (see section Java Memory) that will also reduce the overall system performance. Very large messages can even crash the PI system by causing an Out-of-Memory exception, for example. You therefore have to find a compromise for the PI message size.

Below you see throughput measurements performed by SAP. In general, the best throughput was identified for message sizes of 1 to 5 MB.

The message size in these measurements corresponds to the XML message size processed in PI, not the size of the file or IDoc being sent to PI. You can use the runtime header on the ABAP stack to check the message size. Below you can see an example of a very small message. While the MessageSizePayload field describes the size of the payload in bytes (here 433 bytes), the MessageSizeTotal describes the total message size (header + payload). In the example this is around 14 KB, demonstrating the overhead of the PI header for small messages. The next two lines describe the payload size before and after the mapping; in the example below the mapping reduces the payload size. The last two lines determine the size of the response message that is sent back to PI before and after the response mapping for synchronous messages.
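For orientation, the two size-related entries in the runtime header of that small example message look roughly as follows (the field names are those quoted above; the rendering as XML tags is illustrative and the total value is left open here):

    <SAP:MessageSizePayload>433</SAP:MessageSizePayload>   (payload only, in bytes)
    <SAP:MessageSizeTotal>...</SAP:MessageSizeTotal>        (header + payload, around 14 KB in this example)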


Based on the above observations, we highly recommend that you use a reasonable message size for your interfaces. During the design and implementation of an interface we therefore recommend using a message size of 1 to 5 MB if possible. To achieve this, messages can be collected on the sender side or by using IDoc packaging as described in Tuning the IDoc Adapter. In case of large messages, a split has to be performed by changing the sender processing or by using the split functions available in the structure conversion of the File adapter.

8.1 Large message queues on PI ABAP

In case the interfaces are using the ABAP Integration Server, use the large message queue filters to restrict the parallelization of mapping calls from ABAP queues. To do so, set the parameter EO_MSG_SIZE_LIMIT of category TUNING to e.g. 5000 (KB) to direct all messages larger than 5 MB to the dedicated XBTL or XBTM queues. The value of the parameter depends on the number of large messages and the acceptable delay that might be caused by a backlog in the large message queue.

To reduce the backlog, the number of large message queues can also be configured via parameter EO_MSG_SIZE_LIMIT_PARALLEL of category TUNING. The default value is 1, so that all messages larger than the defined threshold are processed in one single queue. Naturally, the parallelization should not be set higher than 2 or 3, to avoid overloading the Java memory with parallel large message requests.
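Illustrative settings in SXMB_ADM → Integration Engine Configuration (both parameters and the category TUNING are described above; the values are examples):

    Category   Parameter                    Value
    TUNING     EO_MSG_SIZE_LIMIT            5000   (messages larger than 5 MB go to the large message queues)
    TUNING     EO_MSG_SIZE_LIMIT_PARALLEL   2      (two parallel large message queues)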

8.2 Large message queues on PI Adapter Engine

SAP Note 1727870 - Handling of large messages in the Messaging System introduces large message queues (virtual queues, with no adapter behind them) also for the Java-based Adapter Engine. Contrary to the Integration Engine, it is not only the size of a single large message that determines the parallelization. Instead, the sum of the sizes of the large messages across all adapters on a given Java server node is limited, to avoid overloading the Java heap. This is based on so-called permits that define a threshold for the message size. Each message larger than the permit threshold is considered a large message. The number of permits can be configured as well to determine the degree of parallelization. By default the permit size is 10 MB and 10 permits are available. This means that large messages will be processed in parallel as long as 100 MB are not exceeded.

To show this, let us look at an example using the default values. Let us assume six messages are waiting to be processed (status "To Be Delivered") on one server node: message A has 5 MB, message B 10 MB, message C 50 MB, message D 150 MB, message E 50 MB, and message F 40 MB. Message A is smaller than the permit size, is therefore not considered large, and can be processed immediately. Message B requires 1 permit, message C requires 5. Since enough permits are available, processing starts (status DLNG). Message D, however, would require all available 10 permits. Since the permits are currently not available, it cannot be scheduled. If blacklisting is enabled, the message is put into error status (NDLV), since it exceeds the maximum number of defined permits; in that case the message has to be restarted manually. Message E requires 5 permits and can also not be scheduled. But since there are 4 permits left, message F is put to DLNG. Due to their smaller size, message B and message F finish first, releasing 5 permits. This is sufficient to schedule message E, which requires 5 permits. Only after messages E and C have finished can message D be scheduled, consuming all available permits.
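The permit accounting can be pictured roughly as in the following sketch. This is an illustrative model inferred from the worked example above, not the actual Messaging System implementation; the class and method names are invented for the illustration:

    // Rough illustration of the permit-based admission for large messages (simplified)
    class LargeMessagePermits {
        static final int PERMIT_SIZE_MB = 10;  // default permit size
        static final int MAX_PERMITS = 10;     // default number of permits
        private int freePermits = MAX_PERMITS;

        // Messages below the permit size are not "large" and need no permit;
        // otherwise roughly one permit per 10 MB (message B: 1, C/E: 5, F: 4)
        static int permitsNeeded(int sizeMB) {
            return sizeMB < PERMIT_SIZE_MB ? 0 : sizeMB / PERMIT_SIZE_MB;
        }

        // true = start processing (DLNG), false = keep waiting (To Be Delivered);
        // with blacklisting enabled, a message needing more than MAX_PERMITS is
        // failed (NDLV) and must be restarted manually
        synchronized boolean tryAcquire(int sizeMB, boolean blacklisting) {
            int needed = permitsNeeded(sizeMB);
            if (needed > MAX_PERMITS && blacklisting)
                throw new IllegalStateException("NDLV: message exceeds available permits");
            if (needed > freePermits) return false;
            freePermits -= needed;
            return true;
        }

        // return the permits once the message has been processed
        synchronized void release(int sizeMB) {
            freePermits = Math.min(MAX_PERMITS, freePermits + permitsNeeded(sizeMB));
        }
    }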

The example above shows the potential delay a large message can face due to the waiting time for permits. But the assumption is that large messages are not time-critical, and therefore an additional delay is less critical than a potential overload of the system.

The large message queue handling is based on the Messaging System queues. This means that restricting the parallelization is only possible after the initial persistence of the message in the Messaging System queues. By default this is only done after the receiver determination. Therefore, if you have a very high parallel load of incoming large requests, this feature will not help. Instead, you would have to restrict the size of incoming requests on the sender channel (e.g. the file size limit in the File adapter or the icm/HTTP/max_request_size_KB limit in the ICM for incoming HTTP requests). If you have very complex extended receiver determinations or complex content-based routing, it might be useful to configure staging in the first processing step of the Messaging System (BI=3) as described in Logging / Staging on the AAE (PI 7.3 and higher).

The number of permits consumed can be monitored in PIMON → Monitoring → Adapter Engine Status. The number of threads shown there corresponds to the number of consumed permits.

In newer Wily versions there is also a dashboard showing the number of consumed permits. The Worker Threads on the right-hand side of the screenshot correspond to the permits.


9 GENERAL HARDWARE BOTTLENECK

During all tuning actions discussed in the previous chapters, you must keep in mind that the limit of all activities is set by the underlying CPU and memory capacity. The physical server and its hardware have to provide resources for three PI runtimes: the Adapter Engine, the Integration Engine, and the Business Process Engine. Tuning one of the engines for high throughput leaves fewer resources for the remaining engines. Thus the hardware capacity has to be monitored closely.

9.1 Monitoring CPU Capacity

When monitoring the CPU capacity, it is not enough to use the average that is displayed in the "Previous Hours" section of transaction ST06. Instead, you should start your interface and monitor the CPU while it is running. The best procedure is to use operating system tools to monitor the CPU usage (particularly if hardware virtualization is used).

The SMD Host Agent also reports CPU data to Wily Introscope, which can be viewed using dashboards as shown below. This example shows two systems, one facing a temporary CPU overload and the other a permanent one.

From NetWeaver 7.3 on, the NWA also offers a view of the CPU activity via Availability and Performance Management → Resource Monitoring → History Reports. There you can build your own report based on the "CPU utilization" data, as shown in the screenshot.


You can also monitor the CPU usage in ST06 as described below

Procedure

Log on to your Integration Server and call transaction ST06. Take a look at the snapshot that is provided on the entry screen, then navigate to "Detailed Analysis Menu".

Snapshot view: Is the idle time (upper left corner) at a reasonable value at all times, for example around 10% or higher?

Snapshot view: The load average is a good indicator of the dimension of the bottleneck. It describes the number of processes per CPU that are waiting in a queue before they are assigned to a free CPU. As long as the average remains at one process per available CPU, the CPU resources are sufficient. Once there is an average of around three processes per available CPU, there is a bottleneck in the CPU resources. In connection with a high CPU usage, a high value here can indicate that too many processes are active on the server. In connection with a low CPU usage, a high value here can indicate that the main memory is too small; the processes are then waiting due to excessive paging.

Detailed Analysis view: TOP CPU. Which thread uses up the highest amount of CPU? Is it the J2EE Engine, the work processes, or the database?

o There are different follow-up actions to be taken depending on the findings of the second check. The first option is, of course, to simply increase the hardware. This should be accompanied by a thorough sizing (for which SAP provides the Quick Sizer on the Service Marketplace: http://service.sap.com/quicksizer).

o The second option is to identify the CPU-consuming threads and to think about a reduction of the load in this component. For example, if the J2EE Engine is the largest consumer, then it is worth re-evaluating the number of concurrent mapping threads and/or consumer queues as described in the previous chapters.

o SAP Note 742395 - Analyzing High CPU Usage by the J2EE Engine provides a good entry point if the J2EE Engine consumes too much CPU.

o If it is the work processes that consume a lot of CPU, use transaction ST03N to figure out whether it is the Integration Server or the Business Process Engine. You can do that by looking at the CPU usage per user: the BPE uses the user WF-BATCH, while the Integration Server uses the last user who activated a qRFC queue (typically PIAFUSER or PIAPPLUSER). Use the previous chapters to restrict these activities.

9.2 Monitoring Memory and Paging Activity

As stated above, paging is very critical for the Java stack, since it directly influences the Java GC behavior. Therefore paging should be avoided in every case on a Java-based system.

The OS tools are the most reliable for monitoring the paging activity on your system. Transaction ST06 can also be used to perform a snapshot analysis. Take a look at the snapshot that is provided on the entry screen.

Snapshot view: Is there enough physical memory available?

Snapshot view: Is there considerable paging? Paging can have a negative influence on your J2EE Engine. If it does, this can be seen in long GC times, as described in chapter 6.6.1.

9.3 Monitoring the Database

Monitoring database performance is a rather complex task and requires information to be gathered from a variety of sources. Only a basic set of indicators is given for the purpose of this guide. A more detailed analysis is required if these indicators point to a major performance problem in the database. If assistance is needed, open a support message under the component BC-DB-ORA (or MSS, SDB, DB6, respectively).

9.3.1 Generic J2EE database monitoring in NWA

The next sections of this chapter are mainly based on ABAP transactions for a detailed analysis of the database performance. The NWA, however, offers possibilities to monitor the database performance for the Java-based tables. This is especially helpful for Java-only AEX/PO systems or non-central AAEs. In the NWA you can use the "Open SQL Monitor" in the Troubleshooting section; the official documentation can be found in the online help. Monitoring the database via the DBACOCKPIT transaction (ABAP-based) is also possible for Java-only systems from SAP Solution Manager (the DBA Cockpit must be configured in the Managed System Setup).

It would be too much to outline all the available functionality here; instead, only a few key capabilities are demonstrated. If you want to see, for example, the number of select, update, or insert statements in your system, you can use the "Table Statistics Monitor". It clearly shows which tables in your system are accessed most frequently.

To see the processing time of the different statements on your system, you can use the "Open SQL Statistics" monitor. There you can see the count and the total, average, and maximum processing time of the individual SQL statements, and can therefore identify the expensive statements on your system.

By default the recorded period is always from the last restart of the system. If you would like to look at the statistics for a specific time period only (e.g. during a test), you have the option of resetting the statistics in the individual monitor prior to the test.

9.3.2 Monitoring Database (Oracle)

Procedure

Log on to your Integration Server and call transaction ST04


Many of the numbers in the screenshot above depend on each other. The following checklist names a few key performance figures of your Oracle database:

The data buffer quality, for instance, is based on the ratio of physical reads versus the total number of reads. The lower the ratio, the better the buffer quality. The data buffer quality should be better than 94%, and the statistics should be based on at least 15 million total reads. This number of reads ensures that the database is in an equilibrated state.

Ratio of user calls to recursive calls: good performance is indicated by ratio values greater than 2. Otherwise the number of recursive calls compared to the number of user calls is too high. Over time this ratio always declines, because more and more SQL statements get parsed in the meantime.

Number of reads per user call: if this value exceeds 30 blocks per user call, this indicates an expensive SQL statement.

Check the value of Time/User call: values larger than 15 ms often indicate an optimization issue.

Compare busy wait time versus CPU time: a ratio of 60/40 generally indicates a well-tuned system. Significantly higher values (for example 80/20) indicate room for improvement.

The DD-cache quality should be better than 80%.

9.3.3 Monitoring Database (MS SQL)

Procedure

To display the most important performance parameters of the database, call transaction ST04 or choose Tools → Administration → Monitor → Performance → Database → Activity. An analysis is only meaningful if the database has been running for several hours with a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: the default values displayed in section Server Engine are relative values; to display the absolute values, press the button Absolute values.

Check the values in (1).

The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98% (even during heavy workload). If it is significantly below 98%, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail Analysis Menu → Performance Database. A snapshot is collected every 2 hours.

Memory setting (5) shows the memory allocation strategy used:

FIXED: SQL Server has a constant amount of memory allocated, set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).

RANGE: SQL Server dynamically allocates memory between min server memory (MB) < max server memory (MB).

AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.

FIXED-AWE: SQL Server has a constant amount of memory allocated, set by min server memory (MB) = max server memory (MB). In addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.

9.3.4 Monitoring Database (DB2)

Procedure

To get an overview of the overall buffer pool usage and the catalog and package cache information, go to transaction ST04 and choose Performance → Database, section Buffer Pool (or section Cache, respectively).


Buffer Pools Number: the number of buffer pools configured in this system.

Buffer Pools Total Size: the total size of all configured buffer pools in KB. If more than one buffer pool is used, choose Performance → Buffer Pools to get the size and buffer quality for every single buffer pool.

Overall buffer quality: the ratio of physical reads to logical reads of all buffer pools.

Data hit ratio: in addition to the overall buffer quality, you can use the data hit ratio to monitor the database: (data logical reads - data physical reads) / (data logical reads) * 100.

Index hit ratio: in addition to the overall buffer quality, you can use the index hit ratio to monitor the database: (index logical reads - index physical reads) / (index logical reads) * 100.

Data or Index logical reads: the total number of read requests for data or index pages that went through the buffer pool.

Data or Index physical reads: the total number of read requests that required I/O to place data or index pages in the buffer pool.

Data synchronous reads or writes: read or write requests performed by db2agents.


Catalog cache size: the maximum size of the catalog cache, which is used to maintain the most frequently accessed sections of the catalog.

Catalog cache quality: the ratio of catalog entries (inserts) to reused catalog entries (lookups).

Catalog cache overflows: the number of times that an insert into the catalog cache failed because the catalog cache was full (increase the catalog cache size).

Package cache size: the maximum size of the package cache, which is used to maintain the most frequently accessed sections of the packages.

Package cache quality: the ratio of package entries (inserts) to reused package entries (lookups).

Package cache overflows: the number of times that an insert into the package cache failed because the package cache was full (increase the package cache size).

9.3.5 Monitoring Database (MaxDB / SAP DB)

Procedure

As with the other database types, call transaction ST04 to display the most important performance parameters.


With the Database Performance Monitor (transaction ST04) and its submonitors, the CCMS in the SAP R/3 System allows you to view all of the information that can be used to identify bottleneck situations.

The SQL Statements section provides information about the number of SQL statements executed and related sizes.

The I/O Activity section lists physical and logical read and write accesses to the database.

The Lock Activity section (in transaction ST04) allows you to identify possible bottlenecks in the assignment of locks by the database.

The Logging Activity section combines information about the log area.

The Scan and Sort Activity section can be helpful for identifying that suitable indexes are missing.

The Cache Activity section provides information about the usage of the caches and the associated hit rates.

o The data cache hit rate should be 99% in a balanced system. If this is not the case, you need to check whether the low hit rate is due to the size of the data cache being too small or due to an unsuitable SQL statement. If necessary, you can increase the data cache by increasing the MaxDB parameter Data_Cache_Size (releases lower than 7.4) or Cache_Size (7.4 or higher).

o The catalog cache hit rate should be 85% in a balanced system. It is not a cause for concern if the catalog hit rate is temporarily lower, since accesses to the system tables and an active command monitor can temporarily impair the hit rate. If the catalog cache is busy, the pages are paged out into the data cache rather than onto the hard disk.

o Log entries are not written directly to the log volume, but first into a buffer (LOG_IO_QUEUE), so that several log entries can be written together asynchronously with one I/O on the hard disk. Only once a LOG_IO_QUEUE page is full is it written to the hard disk. However, there are situations in which LOG entries must be written from the LOG_QUEUE onto the hard disk as quickly as possible, for example when transactions are completed (COMMIT/ROLLBACK). The transactions wait until the log writer reports the OK informing them that the log entry is on the hard disk. Firstly, this means that it is important to use the quickest possible hard disks for the log volume(s); secondly, you must ensure that no LOG_QUEUE_OVERFLOWS occur in production operation. If the LOG_IO_QUEUE is full, all transactions that want to write LOG entries must wait until free memory becomes available again in the LOG_IO_QUEUE. If LOG_QUEUE_OVERFLOWS occur (transaction DB50 → Current Status → Activities Overview → LOG_IO_QUEUE Overflow), you need to increase the LOG_IO_QUEUE parameter.

SAP NETWEAVER 71 AND HIGHER PERFORMANCE CHECK GUIDE

99

o In accordance with Note 819324 - FAQ: MaxDB SQL optimization, check which SQL statements are responsible for the most disk accesses and whether they can be optimized. You should also optimize SQL statements that have a poor runtime and poor selectivity.

Bottleneck Analysis

The Bottleneck Analysis function is available in transaction ST04 under Problem Analyses → Performance → Database Analyzer → Bottlenecks. You can use this function to activate the corresponding analysis tool, the dbanalyzer.

The dbanalyzer then collects important performance data every 15 minutes and evaluates it for possible bottlenecks.

The result of the bottleneck analysis is output in text format to provide a quick overview of possible causes of performance problems. For a more detailed description of the bottleneck messages, see the online documentation (http://help.sap.com) in the SAP Web Application Server area and search for SAP DB bottleneck analysis messages.

The dbanalyzer tool can also be started at operating system level. It logs its analysis results in files with date stamps in the subdirectory <rundirectory>/analyzer of the relevant database instance (you can find the actual directory by double-clicking on 'Properties' in DB50).

9.4 Monitoring Database Tables

Some of the tables of SAP PI grow very quickly and can cause severe performance problems if archiving or deletion does not take place frequently. For troubleshooting, see SAP Note 872388 - Troubleshooting Archiving and Deletion in PI.

Procedure

Log on to your Integration Server and call transaction SE16. Enter the table names as listed below and execute. In the following screen, simply press the button "Number of Entries". The most important tables are:

SXMSPMAST (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPMAST2 as well.

SXMSCLUR, SXMSCLUP (cleaned up by XML message archiving/deletion)

If you use the switch procedure, you have to check SXMSPCLUR2 and SXMSCLUP2 as well.


SXMSPHIST (cleaned up by the deletion of history entries)

If you use the switch procedure, you have to check SXMSPHIST2 as well.

SXMSPFRAWH and SXMSPFRAWD (cleaned up by the performance jobs, see SAP Note 820622 - Standard Jobs for XI Performance Monitoring)

SWFRXI (cleaned up by specific jobs, see SAP Note 874708 - BPE HT: Deleting Message Persistence Data in SWFRXI)

SWWWIHEAD (cleaned up by work item archiving/deletion)

Use an appropriate database tool, for example SQLPLUS for Oracle, for the database tables of the J2EE schema. The main tables that could be affected by growth are:

PI message tables: BC_MSG and BC_MSG_AUDIT (when audit log persistence is enabled)

PI message logging information when staging/logging is used: BC_MSG_LOG_VERSION

XI IDoc tables (if IDoc persistence is activated): XI_IDOC_IN_MSG and XI_IDOC_OUT_MSG

Check for all tables whether the number of entries is reasonably small or remains roughly constant over a period of time. If that is not the case, check your archiving/deletion setup.


10 TRACES, LOGS AND MONITORING DECREASING PERFORMANCE

10.1 Integration Engine

There are three important locations for the configuration of tracing and logging in the Integration Engine: the pipeline settings of PI itself, the Internet Communication Manager (ICM), and the gateway. All three are heavily involved in message processing, and therefore their tracing or logging settings can have an impact on performance. On top of this, you might have changed the SM59 RFC destinations and switched on the trace in order to analyze a problem. If so, this needs to be reset as well.

Procedure

Start transaction SXMB_ADM, navigate to Integration Engine Configuration → Specific Configuration, and search for the following entries:

Category   Parameter      Subparameter   Current Value   Default
RUNTIME    TRACE_LEVEL    <none>         <your value>    1
RUNTIME    LOGGING        <none>         <your value>    0
RUNTIME    LOGGING_SYNC   <none>         <your value>    0

Set the above parameters back to the default value, which is the value recommended by SAP.

Start transaction SMICM and check the value of the trace level that is displayed in the overview (third line). Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace Level → Set.

Start transaction SMGW, navigate to Goto → Parameters → Display, and check the parameter Trace Level. Set the trace level to '1' if this is not already the case. You can do so by navigating to Goto → Trace → Gateway → Reduce Level (or Increase Level, respectively).

Start transaction SM59 and check the following RFC destinations for the flag "Trace" on the tab Special Options; remove the flag "Trace" if it has been set:

AI_RUNTIME_JCOSERVER

AI_DIRECTORY_JCOSERVER

LCRSAPRFC

SAPSLDAPI

It is possible that other RFC destinations are used as well, for example for sending out IDocs.


10.2 Business Process Engine

You only have to check the Event Trace in the Business Process Engine.

Procedure

Call transaction SWELS and check whether the Event Trace is switched on. The recommended setting is 'off' (as shown in the screenshot below).

10.3 Adapter Framework

Since the Adapter Framework runs on the J2EE Engine, the tracing and logging can be conveniently checked and controlled using the NetWeaver Administrator. To improve performance, SAP recommends that you set all trace levels to default. Higher trace levels are only acceptable in productive usage during problem analysis. Below you will find a description of how to set the default trace level for all locations at once. It is of course possible to do it for every location separately, but this way you might forget to reset one of the locations.

Procedure

Start the NetWeaver Administrator and navigate to Problem Management (or Troubleshooting in NW 7.3 and higher) → Logs and Traces and choose Log Configuration. In the Show drop-down list box, choose Tracing Locations. Select the Root Location in the tree and press the button Default Configuration. This sets the default configuration for all locations.

Start the NetWeaver Administrator and navigate to Troubleshooting → Logviewer (direct link nwa/links). Open the view "Developer Trace" and check whether you have very frequent recurring exceptions that fill up the trace. Analyze the exception and check what is causing it (e.g. a wrong configuration of a communication channel causing repeated traces). If you see very frequent errors for which you cannot find the root cause, open a customer incident at SAP.


An aggregated view of Java exceptions (by number, date, instance, server node, etc.) is reported and available in the SAP Solution Manager Root Cause Analysis → Exception Analysis functionality.

10.3.1 Persistence of Audit Log information in PI 7.10 and higher

With SAP NetWeaver PI 7.1, the audit log is by default not persisted in the database for successful messages, to avoid performance overhead. Therefore the audit log is only available in the cache for a limited period of time (depending on the overall message volume).

Audit log persistence can be activated temporarily for performance troubleshooting where audit log information is required. To do so, use the NWA and set the parameter "messaging.auditLog.memoryCache" of service XPI Service: Messaging System to false. Details can be found in SAP Note 1314974 - PI 7.1 AF Messaging System audit log persistence. Only do so temporarily during the time of troubleshooting, to avoid performance problems caused by the additional persistence.
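Illustrative property change (NWA, service XPI Service: Messaging System; see the SAP Note above for the exact procedure and for switching it back afterwards):

    messaging.auditLog.memoryCache = false    (default true: audit log is kept in the memory cache only)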

After implementing Note 1611347 - New data columns in Message Monitoring, additional information such as the message processing time in ms and the server node is visible in the message monitor.


11 ERRORS AS SOURCE OF PERFORMANCE PROBLEMS

If errors occur within messages or within technical components of SAP Process Integration, this can have a severe impact on the overall performance. Thus, to solve performance problems it is sometimes necessary to analyze the log files of the technical components and to search for messages with errors. To search for messages with errors, use the PI Admin Check, which is available via SAP Note 884865 - PI/XI Admin Check.

Procedure

ICM (Internet Communication Manager)

Start transaction SMICM open the log file (Shift + F5 or GoTo Trace File Show All) and check for errors

Gateway

Start transaction SMGW open the Log File (CTRL + Shift + F10 or Goto Trace Gateway Display File) and check for errors

System Log

Start transaction SM21 choose an appropriate time interval and search the log entries for errors Repeat the procedure for remote systems if you are using dialog instances

ABAP Runtime Errors

Start transaction ST22, choose an appropriate time interval, and search for PI-related dumps. In general, the number of ABAP dumps should be very small on a PI system, and therefore all occurring dumps should be analyzed.

Alerts / CCMS

Start transaction RZ20 and search for recent alerts.

Work Process and RFC trace

In the PI work directory, check all files that begin with dev_rfc or dev_w for errors.
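When many trace files have accumulated, a small scan over the work directory can list the error lines from all dev_rfc* and dev_w* files in one pass. The following is a minimal sketch and not an SAP tool; the default directory path and the error marker it searches for are assumptions that you should adjust to your installation.

// Scans dev_rfc* and dev_w* files in the given work directory and prints lines
// containing the marker "ERROR". Path and marker are assumptions.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class WorkDirErrorScan {
    public static void main(String[] args) throws IOException {
        // Pass the real work directory of your instance as the first argument.
        Path workDir = Paths.get(args.length > 0 ? args[0] : "/usr/sap/SID/DVEBMGS00/work");
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(workDir, "{dev_rfc*,dev_w*}")) {
            for (Path file : files) {
                for (String line : Files.readAllLines(file, StandardCharsets.ISO_8859_1)) {
                    if (line.contains("ERROR")) {   // marker is an assumption; refine as needed
                        System.out.println(file.getFileName() + ": " + line);
                    }
                }
            }
        }
    }
}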

J2EE Engine

In the PI work directory, search for errors in the file dev_server0.out. If you are using more than one J2EE server node, check dev_server<n>.out as well.

Applications running on the J2EE Engine

The traces for application-related errors are the default.trc files located in the j2ee/cluster/server<n>/log directory. To monitor exceptions across multiple server nodes, the best tool to use is the Log Viewer service of the NWA. It is available via Problem Management → Logs and Traces → Log Viewer. Search for exceptions in the area of the performance problem, for example the PI Messaging System.

APPENDIX A

A.1 Wily Introscope Transaction Trace

As mentioned above, the Wily Transaction Trace can be used to trace expensive steps that were noticed during mapping or module processing. The Transaction Trace allows you to drill down further into Java performance problems and to determine whether the cause is a pure coding problem, a lookup to a remote system, or a slow connection to the local database.

The Wily Transaction Trace can be used to trace all steps that exceed a specific duration on the Java stack. If a user is known (for example, for SAP Enterprise Portal), the tracing can be restricted to a specific user. In PI, if you encounter a module that lasts several seconds, you can restrict the tracing as shown below.

Note: Starting the Transaction Trace increases the data collection on the satellite system (PI) and is therefore only recommended in a productive environment for troubleshooting purposes.

With the selection above, the trace will run for 10 minutes (this is the maximum, and it can be canceled earlier) and will trace all steps exceeding 3 seconds for the agents of the PI system. If such long-running steps are found, a new window will be displayed listing these steps.

In the screenshot below you can see the result in the Trace View. The Trace View shows the elapsed time from left to right; in the example below, around 48 seconds. From top to bottom we can see the call stack of the thread. In general, we are interested in long-running blocks at the bottom of the Trace View: a long-running block at the bottom means that this is the lowest-level coding that was instrumented and that is consuming all the time.

In the example below, we see a mapping call that is performing many individual database statements; this becomes visible by highlighting the lowest level. In such a case you have to review the coding of the mapping to see whether the large number of database calls can be combined into one call.
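As an illustration of what "combining calls" can look like in mapping or lookup coding, the following generic JDBC sketch contrasts a per-row statement with a batched one. The table and column names are invented for this example and do not refer to any PI-delivered API.

// Generic JDBC sketch: instead of executing one statement per record, collect the
// values and send them to the database as a single batch.
import java.sql.*;
import java.util.List;

public class BatchedLookupUpdate {
    // Inefficient pattern often visible in traces: one round trip per key.
    static void writeOneByOne(Connection con, List<String> keys) throws SQLException {
        try (PreparedStatement ps =
                 con.prepareStatement("UPDATE lookup_table SET hits = hits + 1 WHERE id = ?")) {
            for (String key : keys) {
                ps.setString(1, key);
                ps.executeUpdate();          // one round trip per key
            }
        }
    }

    // Batched variant: same statements, sent to the database in one call.
    static void writeBatched(Connection con, List<String> keys) throws SQLException {
        try (PreparedStatement ps =
                 con.prepareStatement("UPDATE lookup_table SET hits = hits + 1 WHERE id = ?")) {
            for (String key : keys) {
                ps.setString(1, key);
                ps.addBatch();
            }
            ps.executeBatch();               // single round trip for all keys
        }
    }
}

Batching keeps the SQL identical but reduces the number of network round trips to one per batch, which is typically the dominant cost when the trace shows hundreds of short, identical statements.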

Another case that is often seen is that a lookup using JDBC to a remote database, or RFC to an ABAP system, takes a long time in the mapping or adapter module. In such a case there will be one long block at the bottom of the Transaction Trace that also gives you some details about the statement that was executed.
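If Wily is not available, a simple timer around the lookup call can confirm where the time is spent before you dig deeper. The sketch below is generic JDBC code with invented statement and table names; the same pattern applies to an RFC lookup.

// Wraps a lookup in a timer and writes the elapsed time to the log you already monitor.
import java.sql.*;

public class TimedLookup {
    static String lookup(Connection con, String key) throws SQLException {
        long start = System.nanoTime();
        try (PreparedStatement ps =
                 con.prepareStatement("SELECT target_value FROM mapping_lookup WHERE source_key = ?")) {
            ps.setString(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("lookup(" + key + ") took " + elapsedMs + " ms");
        }
    }
}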

A.2 XPI Inspector for troubleshooting and performance analysis

The XPI Inspector is an additional tool that provides enhanced logging and tracing capabilities with a main focus on troubleshooting, but the tool can also be used for troubleshooting performance issues. General information about the tool can be found in SAP Note 1514898 - XPI Inspector for troubleshooting PI. The tool can be called using the following URL on your system: http://<host>:<java-port>/xpi_inspector.

To analyze performance problems, typically the example "51 (Performance Problem)" is used. As a basic measurement, it allows you to take multiple thread dumps at specific time intervals. These thread dumps can be analyzed later by your administrators or SAP support to identify what is causing the problem. The Thread Dump Viewer tool (TDV) from SAP Note 1020246 - Thread Dump Viewer for SAP Java Engine can be used to analyze the produced thread dumps (inside the results archive that can be downloaded to your local PC).
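For a quick look outside of the XPI Inspector, the standard JVM management API can produce comparable periodic dumps of all threads. The sketch below is a generic Java example, not the mechanism the XPI Inspector uses; the number of dumps and the interval are arbitrary values.

// Takes a few thread dumps of the local JVM at fixed intervals via java.lang.management.
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class PeriodicThreadDumps {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int dumps = 3;
        long intervalMs = 10_000;               // 10 seconds between dumps
        for (int i = 0; i < dumps; i++) {
            System.out.println("=== Thread dump " + (i + 1) + " ===");
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                System.out.print(info);          // name, state and the top stack frames
            }
            if (i < dumps - 1) {
                Thread.sleep(intervalMs);
            }
        }
    }
}

Comparing consecutive dumps shows which threads remain in the same code path; those are the candidates that consume the processing time.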

In addition, the tool allows you to do JVM profiling, either by doing JVM Performance tracing or JVM Memory Allocation tracing. This can help you understand in detail which steps of the processing are taking a long time. As output, the tool provides .prf files that can be loaded into the JVM Profiler. More information can be found at http://wiki.scn.sap.com/wiki/display/ASJAVA/Java+Profiling. Please note that these traces (especially the memory profiling) will cause a high overhead on the J2EE Engine and are therefore not recommended for use in production. Instead, it is advisable to reproduce the problem on your QA system and use the tool there.

Document Version 3.0 March 2014

www.sap.com

© 2014 SAP AG. All rights reserved.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company.

Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.