BPM system performance evaluation basics


Performance tuning is a broad topic in BPM.  We often compare it to peeling back the layers of an onion: each layer gives way to more settings we need to examine.  While this can be a time-consuming process, it's a good idea to repeat the evaluation from time to time on older systems and to check these settings on any new environments you set up.

This article is designed to walk you through the basics of performance tuning.  It is not a comprehensive list, nor is it designed to address architecture issues; it covers only base-level BPM settings.

THE BASIC AREAS:

  • HARDWARE, VM, CLUSTER SETTING BASICS
  • NETWORKING
  • JVM HEAP SETTINGS
  • DATASOURCE CONNECTION POOL PROPERTIES
  • SYSTEM TIMEOUTS
  • TWCR (TeamworksConfiguration.running.xml) SETTINGS
  • MONITORING FOR PERFORMANCE
  • MORE INFORMATION OR ASSISTANCE

HARDWARE, VM, CLUSTER SETTING BASICS:

There are a number of considerations for the base machine installation.  Generally speaking, the requirements are listed here: IBM BPM Advanced detailed system requirements.

Consider the following initially:

  • CPU - Whether real or virtual, you should size the system appropriately.
  • RAM - Make sure you have at least enough to cover the base OS operations and the JVM sizes you will be using.
  • OS - Please ensure it is a supported version of the OS, or be aware that non-supported versions may encounter issues.

NETWORKING:

Networking with BPM can get very tricky.  Please be aware that restricting network access via VLANs, firewalls, port forwarding, proxy configurations, or other network-related operations can negatively impact BPM performance.  This is because BPM can communicate with a number of systems (data sources/SOR databases, LDAP, reporting servers, etc.).  While many of these configurations are supported, each one should be examined and addressed with regard to its own operations.

For example, let's say your database server and runtime server are on separate VLANs.  You should ensure that all relevant ports are open between the runtime BPM server and the database server, and likewise for any other server on another VLAN that BPM communicates with.
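As a rough illustration of the port check described above, the following Python sketch tests whether a TCP port is reachable from the machine it runs on (for example, the BPM runtime server).  The function names are our own, and any host names or port numbers you pass in are your own values; nothing here is specific to BPM.  A successful connection only shows that the port is open, not that the service behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def check_endpoints(endpoints):
    """Map each label in {label: (host, port)} to whether its port is reachable."""
    return {label: port_is_open(host, port) for label, (host, port) in endpoints.items()}
```

For example, calling check_endpoints({"database": ("bpmdb.example.com", 50000)}) with a hypothetical host and port returns a dictionary with True or False for that label.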

There is little to examine here other than being aware of the network architecture.

JVM HEAP SETTINGS:

Depending on the type of BPM installation, you can have a single WebSphere application server or multiple ones.  Usually you will see AppTarget, Messaging, Support, WebApp or some combination of these, or, if it is a single cluster install, you may just see something like SingleClusterMember1.  In each of these WebSphere application servers, you need to check the min and max heap settings and the verbose garbage collection setting.

Settings can be accessed by:

  • Log in to the WebSphere Admin Console (or WAS console)
  • Expand Servers -> Server Types -> WebSphere application servers
  • For each server you have listed, click the link to open its Configuration page
  • Under "Server Infrastructure", expand Java and Process Management -> Process Definition
  • Click Java Virtual Machine on the right-hand side

You will then see the settings we need to examine.  Each is described below [1]:

Initial heap size

Specifies, in megabytes, the initial heap size available to the JVM code. If this field is left blank, the default value (50 MB) is used.  For BPM, this value is almost certainly too low and should be increased.  The value should be lower than the maximum heap size.  Increasing this setting can improve start-up; however, if the heap is too large to reside in physical memory (heap size is greater than the RAM available), paging can occur.  This can result in a noticeable decrease in performance.

Maximum heap size

Specifies, in megabytes, the maximum heap size that is available to the JVM code. If this field is left blank, the default value (256 MB) is used. This default applies to both 32-bit and 64-bit configurations.

Increasing the maximum heap size setting can improve startup. When you increase the maximum heap size, you reduce the number of garbage collection occurrences with a 10 percent gain in performance.

Increasing this setting usually improves throughput until the heap becomes too large to reside in physical memory. If the heap size exceeds the available physical memory, and paging occurs, there is a noticeable decrease in performance. Therefore, it is important that the value you specify for this property allows the heap to be contained within physical memory.
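To make the "heap must fit in physical memory" rule concrete, here is a small Python sketch that estimates how much RAM is left once every JVM on a machine has grown to its maximum heap.  The OS reserve and the native overhead ratio are assumptions we picked for illustration (a JVM uses memory beyond its heap, but how much varies); substitute figures measured on your own system.

```python
def heap_headroom_mb(physical_ram_mb, max_heap_mb_per_jvm,
                     os_reserve_mb=2048, native_overhead_ratio=0.25):
    """Estimate RAM (MB) left when every JVM is at its maximum heap.

    os_reserve_mb and native_overhead_ratio are illustrative assumptions.
    A negative result suggests the heaps cannot all stay in physical
    memory, so paging becomes likely.
    """
    jvm_total = sum(h * (1 + native_overhead_ratio) for h in max_heap_mb_per_jvm)
    return physical_ram_mb - os_reserve_mb - jvm_total


if __name__ == "__main__":
    # Hypothetical 16 GB machine running three cluster members.
    headroom = heap_headroom_mb(16384, [2048, 1024, 1024])
    print(f"Estimated headroom: {headroom:.0f} MB")
```

If the result is negative or close to zero, either lower the maximum heap sizes or add RAM before raising any heap setting.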

Verbose garbage collection

Specifies whether to use verbose debug output for garbage collection. Verbose garbage collection is not enabled by default. We suggest keeping verbose GC enabled, especially on Production systems.

In theory, the performance impact of GC logging should be very low. The data is always generated internally by the garbage collector, because it is used to enable self-tuning. There should be no significant overhead for logging the values. The cost of enabling -verbose:gc should be the same as adding any output markup, then writing the output to either stderr or a log file.

For a running Java application, we would expect the GC overhead (the percentage of time spent running GC versus the time spent running Java code) to be around 3% or less. Some very basic quantitative tests that we carried out locally have shown that GC logging causes GC to take about 1% longer, which would increase that GC overhead from 3% to 3.03%. In other words, the overall performance cost of enabling GC logging is as low as 0.03 percentage points.
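The arithmetic behind those figures can be reproduced with a few lines of Python.  The 3% overhead and 1% slowdown are the numbers quoted above; the function name is ours, and you can plug in values measured on your own system.

```python
def gc_logging_cost(gc_overhead_pct, logging_slowdown_pct):
    """Return (new GC overhead %, added cost in percentage points).

    gc_overhead_pct: share of run time spent in GC without logging.
    logging_slowdown_pct: how much longer GC takes with logging enabled.
    """
    new_overhead = gc_overhead_pct * (1 + logging_slowdown_pct / 100)
    return new_overhead, new_overhead - gc_overhead_pct


if __name__ == "__main__":
    new_overhead, added = gc_logging_cost(3.0, 1.0)
    print(f"GC overhead with logging: {new_overhead:.2f}%")  # 3.03%
    print(f"Added cost: {added:.2f} percentage points")      # 0.03
```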

The log information is extremely useful for monitoring IBM BPM Java applications, and for helping you diagnose performance or memory problems. Often, you want to diagnose a problem when it first happens. To do this, you need to enable logging so that you already have the monitoring information.

DATASOURCE CONNECTION POOL PROPERTIES:

Datasources are the database connections your WAS instance is using.  Generally, for BPM installations, these are either the default databases or a custom SOR (system of record) for a customer process application. Most of these values do not need to be changed, but a review of these values may be useful.

Settings can be accessed by:

  • Log in to the WebSphere Admin Console (or WAS console)
  • Expand Resources -> JDBC -> Data sources
  • Select the data source
  • On the right-hand side of the Configuration page, click Connection pool properties

There are 6 main settings we need to examine [2]:

Connection timeout

Specifies the number of seconds that a connection request waits when there are no connections available in the free pool and no new connections can be created. This usually occurs because the maximum number of connections in the particular connection pool has been reached.  A value of 0 indicates an infinite wait (not recommended).  The default is 180 seconds.

Max connections

These are the physical connections to the backend resource. When this number is reached, no new physical connections are created. The requester waits until a physical connection that is currently in use returns to the pool, or until a ConnectionWaitTimeoutException error displays.

If you're running BPM 8.0.1 or earlier, consider evaluating and changing the values for the IBM BPM data sources as instructed in the following technote:

http://www-01.ibm.com/support/docview.wss?uid=swg21573982

Tip: For better performance, set the value for the connection pool lower than the value for the maximum thread pool connections of the web container. To configure this setting click Servers > Server types > WebSphere application servers > server > Thread Pools, and modify the web container property. Lower settings, such as 10-30 connections, perform better than higher settings, such as 100.

Min connections

Specifies the minimum number of physical connections to maintain.

If the size of the connection pool is at or below the minimum connection pool size, the Unused timeout thread does not discard physical connections. However, the pool does not create connections solely to ensure that the minimum connection pool size is maintained. Also, if you set a value for Aged timeout, connections with an expired age are discarded, regardless of the minimum pool size setting.

Reap time

Specifies the interval, in seconds, between runs of the pool maintenance thread.  The Reap time interval also affects performance: smaller intervals mean that the pool maintenance thread runs more often, which degrades performance.  The default is 180 seconds.

Unused timeout

Specifies the interval in seconds after which an unused or idle connection is discarded.

Set the Unused timeout value higher than the Reap timeout value for optimal performance. Unused physical connections are only discarded if the current number of connections exceeds the Minimum Connections setting.

Aged timeout

Specifies the interval in seconds before a physical connection is discarded.

Setting Aged timeout to 0 supports active physical connections remaining in the pool indefinitely. Set the Aged timeout value higher than the Reap timeout value for optimal performance.
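The relationships between these six settings can be summarized as a simple review routine.  The Python sketch below is only an illustration: the class, its field names, and the default values in it are our own choices (apart from the 180-second defaults quoted above), and the values have to be copied in by hand from the WAS console.

```python
from dataclasses import dataclass


@dataclass
class PoolSettings:
    # Illustrative values only; copy the real ones from the WAS console.
    connection_timeout_s: int = 180
    max_connections: int = 20
    min_connections: int = 1
    reap_time_s: int = 180
    unused_timeout_s: int = 1800
    aged_timeout_s: int = 0


def review_pool(pool, web_container_max_threads):
    """Return a list of findings based on the guidance in this section."""
    findings = []
    if pool.connection_timeout_s == 0:
        findings.append("Connection timeout is 0 (infinite wait); not recommended.")
    if pool.min_connections > pool.max_connections:
        findings.append("Min connections exceeds Max connections.")
    if pool.max_connections >= web_container_max_threads:
        findings.append("Max connections is not lower than the web container thread pool maximum.")
    if pool.unused_timeout_s <= pool.reap_time_s:
        findings.append("Unused timeout should be higher than Reap time.")
    # Aged timeout of 0 means connections are never aged out, so skip the check.
    if pool.aged_timeout_s != 0 and pool.aged_timeout_s <= pool.reap_time_s:
        findings.append("Aged timeout should be higher than Reap time.")
    return findings
```

An empty list means none of the rules of thumb above were violated; it does not mean the pool is optimally sized for your workload.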

SYSTEM TIMEOUTS:

There are four system level timeouts that may need to be adjusted. [3]

Total transaction timeout

Application Servers -> Support Server Name -> Container Services -> Transaction Service

The default maximum time, in seconds, allowed for a transaction that is started on this server before the transaction service initiates timeout completion. Any transaction that does not begin completion processing before this timeout occurs is rolled back.

Client inactivity timeout

Application Servers -> Support Server Name -> Container Services -> Transaction Service

Specifies the maximum duration, in seconds, between transactional requests from a remote client. Any period of client inactivity that exceeds this timeout results in the transaction being rolled back in this application server.

Maximum transaction timeout

Application Servers -> Support Server Name -> Container Services -> Transaction Service

Specifies, in seconds, the upper limit of the transaction timeout for transactions that run in this server. This value should be greater than or equal to the value specified for the total transaction timeout.

Session timeout

Servers -> WebSphere application servers -> Server Name -> Session Management

Specifies how long a session is allowed to go unused before it is considered not valid. Specify either Set timeout or No timeout. If you choose to set the timeout, the value must be at least two minutes, specified in minutes.
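Two of the constraints above lend themselves to a quick sanity check: the maximum transaction timeout should be at least the total transaction timeout, and a session timeout, when set, must be at least two minutes.  The Python sketch below encodes just those two rules.  The function and parameter names are ours, and the sketch does not treat a value of 0 specially; check the product documentation for what 0 means for each setting before relying on it.

```python
def review_timeouts(total_tx_timeout_s, max_tx_timeout_s, session_timeout_min=None):
    """Return a list of problems found in the timeout values.

    session_timeout_min=None stands for the "No timeout" option.
    """
    problems = []
    if max_tx_timeout_s < total_tx_timeout_s:
        problems.append(
            "Maximum transaction timeout is lower than the total transaction timeout."
        )
    if session_timeout_min is not None and session_timeout_min < 2:
        problems.append("Session timeout must be at least 2 minutes.")
    return problems
```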

TWCR (TeamworksConfiguration.running.xml) SETTINGS:

There are a number of settings in these files that may need to be modified.  Some of these are listed below, but for a more comprehensive list, check the IBM BPM performance tuning Redbook.

That said, below are the main settings we examine:

  • kick-on-schedule - When set to true, the Event Manager reloads its queues every time a new task is added. This is great for development, as you see a task as soon as it’s scheduled. It is terrible for production, as it causes the Event Manager to thrash, constantly rescheduling itself.
  • default-unversioned-po-cache-size - For low volume environments with relatively few process applications and coaches, the default value may be sufficient. For more complex environments with many process applications or coaches, increase this value so that the process applications and coaches are held in the cache after their initial use. This step can improve response time when accessing these process applications and coaches. For example, increase this value to 1500.
  • default-versioned-po-cache-size - For low volume environments with relatively few process applications and coaches, the default value may be sufficient. For more complex environments with many process applications or coaches, increase this value so that the process applications and coaches are held in the cache after their initial use. This step can improve response time when accessing these process applications and coaches. For example, increase this value to 1500.
  • bpd-queue-capacity - Start with a BPD Queue Size (bpd-queue-capacity) of 10 per physical processor core (for example, 40 for a four-processor core configuration), with a maximum value of 80. Tune as needed after that, based on the performance of your system.
  • max-thread-pool-size - Start with a Worker Thread Pool Size (max-thread-pool-size) of 30 + 10 per physical processor core (for example, 70 for a four-processor core configuration), with a maximum value of 110. Tune as needed after that, based on the performance of your system.
  • loader-advance-window - For BPDs with many timers, reduce the amount of Event Manager activity by reducing the number of timer events that are held in memory. To do this, lower the loader-advance-window value in the 80EventManager.xml file in the profiles directory (the full path to this file is in “Tune BPD queue size and worker thread pool size” on page 58 of the above-mentioned Redbook); the default is 60000.
  • branch-context-max-cache-size - The size of the branch cache is denoted as the number of branch cache entries that can be held in the cache.  This is directly related to the number of unique process application and toolkit snapshots that are deployed. The default is 64. The amount of additional memory used by increasing the size of the branch cache can be estimated with this formula: (# unique process app snapshots + # unique toolkit snapshots) x size_constant
  • snapshot-cache-size-per-branch - The size of the snapshot cache is denoted as the number of snapshot cache entries that can be held in the cache.  This is directly related to the number of unique process application and toolkit snapshots that are being accessed. The default value is 64.

NOTE: For more on the last two parameters, please review this article: Tuning branch and snapshot cache sizes in IBM Business Process Manager.
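The starting points for bpd-queue-capacity and max-thread-pool-size, and the branch cache memory estimate, are simple enough to compute.  The Python sketch below follows the rules stated in the list above; the function names are ours, and size_constant is left as a required argument because this article does not give its value.

```python
def bpd_queue_capacity(physical_cores):
    """Starting value: 10 per physical core, capped at 80."""
    return min(10 * physical_cores, 80)


def max_thread_pool_size(physical_cores):
    """Starting value: 30 + 10 per physical core, capped at 110."""
    return min(30 + 10 * physical_cores, 110)


def branch_cache_memory(process_app_snapshots, toolkit_snapshots, size_constant):
    """Estimate extra memory for the branch cache.

    size_constant is not defined in this article; take it from the
    referenced tuning material. The result is in the same unit.
    """
    return (process_app_snapshots + toolkit_snapshots) * size_constant


if __name__ == "__main__":
    for cores in (4, 8, 16):
        print(cores, bpd_queue_capacity(cores), max_thread_pool_size(cores))
```

These are only starting values; as the list above says, tune them afterward based on the performance of your system.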

MONITORING FOR PERFORMANCE:

BPLabs has already written several articles on monitoring that can assist with future monitoring and performance evaluation needs:

MORE INFORMATION OR ASSISTANCE:

First, you can get an even more comprehensive look at performance tuning with this: IBM Business Process Manager V8.0 Performance Tuning and Best Practices.

Need more help?  Just ask BPLabs for additional assistance.  We will be happy to help!

 

 

REFERENCES

1. "Java Virtual Machine Settings." IBM Knowledge Center. IBM, June 12, 2014. http://www-01.ibm.com/support/knowledgecenter/?lang=zh-tw#!/SSEQTP_8.0.0/com.ibm.websphere.base.iseries.doc/info/iseries/ae/urun_rconfproc_jvm.html.  Accessed July 9, 2014.

2. "Connection Pool Settings." IBM WebSphere Application Server. IBM. http://publib.boulder.ibm.com/infocenter/wsdoc400/v6r0/index.jsp?topic=/com.ibm.websphere.iseries.doc/info/ae/ae/udat_conpoolset.html. Accessed July 9, 2014.

3. "Transaction Service Settings." IBM Knowledge Center. IBM. June 16, 2014. http://www-01.ibm.com/support/knowledgecenter/#!/SSAW57_7.0.0/com.ibm.websphere.nd.multiplatform.doc/info/ae/ae/udat_contranserv.html. Accessed July 9, 2014.

 


Comments

  • Mahesh Ramani

    Hi,
    Nice article! I had a question. Do you think the following parameters are still applicable in IBM BPM v7.5 and above? I know they helped performance a lot in Lombardi v6.x:
    cached-objects-ttl
    cached-epv-ttl

  • Dave Rosen

    Excellent question! You are correct: in the TW6.x days, those settings were important. However, with IBM WLE and BPM (versions 7.2 and up), the settings were deprecated. This means those settings have no effect starting with WLE 7.x and onward. A colleague has already made a note about this still being documented in the 8.5 Performance tuning Redbook, and hopefully IBM will update that soon.

    Bottom line: IBM has completely rewritten the cache, so those settings are no longer used. They were deprecated but left in the BPM config files for historical reasons.
