Configure AppDynamics to monitor IBM BPM/BAW effectively

Overview

In this article, I'm not going to focus on how to configure AppDynamics agents to work with IBM BPM/BAW, because that's a simple task: you add the corresponding generic JVM parameters in IBM WebSphere.

Instead, I'd like to focus on the main areas you may want to configure inside AppDynamics to get the most out of it when monitoring BPM/BAW.

Problem

IBM BPM/BAW developers and admins know that when a performance problem in BPM/BAW turns out to be related to your solution code, it's often hard to pinpoint the exact process or service causing it. There are built-in tools such as Process Monitor, but once your environment is in a bad enough state you may not be able to access them at all. Another problem arises in multi-node environments: you have to open Process Monitor on each node to catch the offending process or service. Often that's not possible because the node ports are blocked (especially in PROD), so you have to use the Load Balancer URL, which means you end up on one specific node each time (don't forget about the sticky sessions required by BPM/BAW!).

Resolution

If you happen to have AppDynamics in your organization, you have a much better chance of catching the exact problem areas in your solution than with the built-in tools alone.

Also, the neat thing about AppDynamics is that once you configure a dashboard you can get historical data even after your server crashes! For instance, in the case of out-of-memory errors your server won't be responding and you won't be able to access Process Monitor, but you will still be able to pull the historical snapshot from AppDynamics and pinpoint the issue.

The question becomes: how do I configure AppDynamics to monitor BPM/BAW properly?

Luckily, AppDynamics has put together a great article that I would suggest starting with.

The article covers pretty much all you need to correlate the processing within IBM BPM/BAW and facilitate end-to-end tracing of Business Transactions.

The business processes running in BPM/BAW are critically important, so it is natural to use AppDynamics Business Transactions to monitor the system integration elements of those processes and ensure the performance and reliability of these technical integrations. You can also use the AppDynamics Business Journeys capability to monitor the end-to-end progress of the processes, including the elements of human interaction.

Often you want to look into a specific thread that you believe is causing the issue (a delay or a loop), and you have found its thread id in the logs or javacores. In AppDynamics you can drill down into a specific thread and see what it's doing, including the exact query it's executing on the BPM database side.

The typical flow for this would be:

If you have a baseline set and these slow transactions exceed it, you should be able to see a call graph of the offending transactions. There are two routes. First, expand the "Troubleshoot" section of BPM/AppCluster, click on errors, look for Transaction Snapshots with a "blue page icon", double-click one, and drill down from there. Second, click on "Business Transactions", double-click the transaction with the long response times, then click on "Slow transactions" or "Transaction Snapshots" and double-click a snapshot with the "blue page icon". If you take the second route, you can open a healthy snapshot and a bad one and compare their call graphs.
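To make the idea of "beyond the baseline" concrete, here is a minimal Python sketch. It assumes that "slow" means a response time more than a chosen number of standard deviations above the mean of a baseline window. The function names and sample numbers are made up for illustration; this is not an AppDynamics API, and AppDynamics computes its own thresholds internally according to how you configure them.

```python
from statistics import mean, pstdev

def slow_threshold(baseline_ms, num_deviations=3.0):
    """Response time (ms) above which a transaction is treated as slow.

    Assumption: threshold = baseline mean + num_deviations * std deviation.
    """
    return mean(baseline_ms) + num_deviations * pstdev(baseline_ms)

def find_slow(baseline_ms, recent_ms, num_deviations=3.0):
    """Return the recent response times that exceed the baseline threshold."""
    limit = slow_threshold(baseline_ms, num_deviations)
    return [t for t in recent_ms if t > limit]

# Hypothetical response times in milliseconds.
baseline = [100, 110, 90, 105, 95]
recent = [102, 98, 450, 120]

slow = find_slow(baseline, recent)
print(slow)
```

With a tight baseline like this one, only the clear outlier is flagged, and those are the transactions whose snapshots are worth opening first.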

I would also suggest setting one or more Policies to trigger when response times go beyond your baseline, so that you get a direct deep link to the offending transactions. The same or similar policies can be applied to heap/memory and CPU consumption, e.g. create a policy that triggers an alert when more than 70-80% of your heap is consumed, and do the same for CPU.
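The logic behind such a heap policy can be sketched in a few lines of Python. This is only an illustration of the condition, not how AppDynamics evaluates policies. The function name, the 75% default (picked from the 70-80% range above), and the "three consecutive samples" rule are all assumptions made here to avoid alerting on a single spike.

```python
def should_alert(heap_used_pct, threshold_pct=75.0, min_consecutive=3):
    """Return True if the last `min_consecutive` samples all exceed the threshold.

    heap_used_pct: heap usage samples in percent, oldest first.
    """
    if len(heap_used_pct) < min_consecutive:
        return False  # not enough data to call it sustained
    return all(s > threshold_pct for s in heap_used_pct[-min_consecutive:])

# Hypothetical heap usage samples, in percent.
print(should_alert([60, 78, 80, 82]))  # sustained above 75%
print(should_alert([80, 82, 70]))      # dropped back below 75%
```

The same shape of check works for CPU; only the metric and the threshold change.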

More information can be found in the following links -

Additional information

When troubleshooting a performance issue, my advice is to collect as much data as possible, including data from the instruments provided by IBM BPM/BAW. The most critical pieces of data during performance troubleshooting are: javacores, heap dumps (in case of OOM errors), instrumentation data (from the instrumentation monitor in the PA console), Process Monitor (on all nodes), EM Monitor, verbose GC logs, and system logs. Having these alongside AppDynamics allows you to correlate the data sets and pinpoint the exact performance culprit. You will see the same thread ids in javacores, instrumentation logs, and system logs as you see in AppDynamics.
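As a rough illustration of that correlation step, the Python sketch below pulls thread ids out of javacore lines and system log lines and reports the ones that appear in both. The line shapes it expects (a `getId:0x..` field in the javacore and an 8-digit hex thread id after the timestamp in the log) are simplified assumptions, and the sample lines are invented for the example, so check the patterns against your own files before relying on them.

```python
import re

# Assumed, simplified line shapes; adjust to match your real files.
JAVACORE_ID = re.compile(r"getId:0x([0-9A-Fa-f]+)")
SYSTEMOUT_ID = re.compile(r"^\[[^\]]+\]\s+([0-9A-Fa-f]{8})\s")

def javacore_thread_ids(lines):
    """Collect thread ids (as integers) from javacore lines."""
    return {int(m.group(1), 16) for line in lines if (m := JAVACORE_ID.search(line))}

def log_thread_ids(lines):
    """Collect thread ids (as integers) from system log lines."""
    return {int(m.group(1), 16) for line in lines if (m := SYSTEMOUT_ID.match(line))}

def common_thread_ids(javacore_lines, log_lines):
    """Thread ids present in both sources, sorted."""
    return sorted(javacore_thread_ids(javacore_lines) & log_thread_ids(log_lines))

# Invented sample lines for illustration only.
javacore = [
    "3XMJAVALTHREAD            (java/lang/Thread getId:0x9A, isDaemon:true)",
    "3XMJAVALTHREAD            (java/lang/Thread getId:0x1F, isDaemon:true)",
]
system_log = [
    "[10/12/20 14:03:22:123 EDT] 0000009a ThreadMonitor W   thread may be hung",
    "[10/12/20 14:03:23:001 EDT] 00000042 SystemOut     O   hello",
]

for tid in common_thread_ids(javacore, system_log):
    print(f"thread 0x{tid:X} appears in both the javacore and the system log")
```

Thread ids found in both places are good candidates to look up in AppDynamics next.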
