Monitoring is a key part of ensuring uptime and service for your BPM solution. Below is a starting point for discussing and thinking about how to monitor a BPM infrastructure.
For monitoring, there are several areas to examine and I will delve into more detail on each shortly.
- Remote protocol checks - Basic network services
- OS server settings
- Application server settings
- Log files
- JVM settings
- Database settings
- Miscellaneous non-BPM items
- Specific BPM parameters related to your specific processes.
Generally, not all of these checks can be implemented at once or in every case, but each area listed has specific reasons to be monitored. The thresholds for each monitor should be determined on a case-by-case basis: some applications carry a high load whereas others do not, so each threshold may need to be tuned for your environment.
There are a number of monitoring tools that can be used, and no single tool is recommended. If you have a commercial product such as HP OpenView, Uptime, SolarWinds, CA Unicenter, Tivoli, or a similar package, then you may already have access to monitoring agents that expose the internal metrics of the key items we want to monitor. Please consult your monitoring solution's documentation for details.
If you are looking to use a free or open source solution, then you may be required to develop your own scripts or monitors. Nagios, for example, has a number of free monitors and open source scripts to assist. There are also other monitoring solutions similar to this that could be leveraged.
The following abbreviations are used in the lists below:
- DB - Database server. This can be either a standalone server or possibly something like an Oracle RAC (for example).
- AS - Application server. This is the machine on which WebSphere is installed and BPM is running. This could be a node in a cluster, in which case AS1, AS2, etc. are used.
- LB - Load Balancer. Depending on the kind of Load Balancer, the monitoring options may change.
- PP - Process Portal
- PAC - Process Admin Console
- WAC - WebSphere Admin Console
Remote protocol checks - Basic network services
These monitored items are the first pass at remote monitoring. Most modern monitoring systems include these basic checks for any server, but I will address the ones specific to BPM monitoring:
- TCP/IP checks - DB, AS, LB
- HTTP/HTTPS - PP, PAC, WAC
- SSH - DB, AS
- DNS - AS, LB
- SMTP - AS
- LDAP - AS (Note: This service is used by the Application servers, not hosted by them in general).
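As a rough illustration of the TCP/IP checks listed above, here is a minimal Python sketch that tests whether a port accepts connections. The host names and ports in the usage comment are hypothetical placeholders, and a real monitoring system would normally provide this check out of the box.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def run_checks(targets: dict) -> dict:
    """Check every (host, port) pair and return a label -> status mapping."""
    return {
        label: "OK" if check_tcp(host, port) else "CRITICAL"
        for label, (host, port) in targets.items()
    }


# Example usage (hypothetical hosts, replace with your own DB, AS, and LB):
# run_checks({"AS1 SSH": ("as1.example.com", 22), "DB SSH": ("db.example.com", 22)})
```

This only proves that the port is reachable. For PP, PAC, and WAC you would still want an HTTP/HTTPS check that looks at the response status as well.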
OS server settings
These settings usually indicate system performance and, indirectly, BPM performance. Generally, they are good health checks for local system resources.
- CPU Load - DB, AS, LB
- RAM usage - DB, AS, LB
- Disk space - DB, AS, LB
- Swap space - DB, AS, LB
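To make the idea of tunable thresholds concrete, the sketch below classifies disk usage against warning and critical levels. The 80% and 90% defaults are assumptions chosen for illustration only and should be tuned for your environment. The same `classify` helper could be applied to CPU load, RAM, or swap readings.

```python
import shutil


def classify(value: float, warn: float, crit: float) -> str:
    """Map a measured value onto OK / WARNING / CRITICAL using two thresholds."""
    if value >= crit:
        return "CRITICAL"
    if value >= warn:
        return "WARNING"
    return "OK"


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check_disk(path: str = "/", warn: float = 80.0, crit: float = 90.0) -> str:
    """Return a one-line status string for the filesystem holding `path`."""
    pct = disk_used_percent(path)
    return f"{classify(pct, warn, crit)} - {path} is {pct:.1f}% full"


# Example usage: print(check_disk("/", warn=75.0, crit=85.0))
```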
Application server settings - Specific to WebSphere, including JVM
These monitored services are WebSphere specific. There are numerous ways to implement these checks, so the check method will vary depending on your monitoring system.
Generally speaking, WebSphere exposes two management interfaces, PMI and JMX, through which the following can be monitored:
- JVM Heap Memory Usage - Total Memory, Free Memory, Used Memory
- Garbage Collection (GC) statistics
- Metrics of all web applications
- Enterprise JavaBeans (EJBs)
- Thread Pools
- Java Database Connectivity (JDBC) Pools
- Java Message Service - Queue and Topic Details
- Custom Application MBeans (JMX) attributes
Log files
Log files can be monitored with numerous packages (for example, Splunk). We simply need to check the following locations for each of the servers in question:
- <WAS_ROOT>/profiles/<DMGR PROFILE>/logs/<SERVER NAME>
- <WAS_ROOT>/profiles/<DMGR PROFILE>/logs/ffdc
- <WAS_ROOT>/profiles/<NODE PROFILE>/logs/<SERVER NAME>
- <WAS_ROOT>/profiles/<NODE PROFILE>/logs/ffdc
- NODE AGENT:
- <WAS_ROOT>/profiles/<NODE PROFILE>/logs/<SERVER NAME>
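If no log monitoring package is available, a small script can scan the log locations above for new error and warning entries. The sketch below assumes the log lines follow the basic layout of a bracketed timestamp, a thread ID, a short class name, and a one-letter event type (E for error, W for warning); verify that against your own log files before relying on it.

```python
import os
import re

# Assumed entry layout: [timestamp] threadId ShortName  T  message
# where T is a one-letter event type such as E, W, I, or A.
ENTRY = re.compile(r"^\[[^\]]+\]\s+\S+\s+\S+\s+([A-Z])\s+(.*)$")


def scan_log(path: str, offset: int = 0, levels=("E", "W")):
    """Return (matching_lines, new_offset) for entries appended since offset."""
    if os.path.getsize(path) < offset:
        offset = 0  # The file was rotated or truncated, so start over.
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
        new_offset = fh.tell()
    hits = []
    for line in data.decode("utf-8", errors="replace").splitlines():
        match = ENTRY.match(line)
        if match and match.group(1) in levels:
            hits.append(line)
    return hits, new_offset
```

Storing the returned offset between runs means each run only reports entries written since the previous one.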
JVM settings
To monitor the JVM, you will need to expose its metrics through the PMI (Performance Monitoring Infrastructure) by installing PerfServletApp.ear. The general steps are below.
Install PerfServletApp.ear, the Performance Monitoring Infrastructure (PMI) application:
- In the Admin console, on the left-side tree, click:
- Applications->WebSphere enterprise applications
- The right-side table lists all the installed applications. Check if PerfServletApp is already available. If not:
- Click 'Install' to install the PerfServletApp.ear file (which is available by default under the WebSphere installation directory).
- PerfServletApp.ear comes with WebSphere and should be located in the “installableApps” directory
- Fast Path should be fine and choosing the defaults should work as well. Click: Next->Next->Next-> Finish
- Once the installation is done, you should “Save” the changes to the master configuration.
- Restart WebSphere Server.
- Login to the admin console and make sure the Enterprise Application “perfServletApp” is running.
- Now that the application is installed and running, set which statistics the PMI should monitor:
- In WAS Console->Monitoring and Tuning->Performance Monitoring Infrastructure, set the monitored statistics to "All", and confirm the setting on both the Runtime and Configuration tabs.
- Set the monitoring role to allow access:
- Applications->WebSphere Enterprise Applications->perfServletApp
- Security role to user/group mapping
- Map Special Subjects → Everyone
- Save changes.
- Now we should be able to get to the XML: http://<SERVER>:<PORT>/wasPerfTool/servlet/perfservlet where:
- WebSphere Host -> The host of the WebSphere application server on which the perf servlet application is installed
- WebSphere Port -> The HTTP transport port of the WebSphere server on which the perf servlet application is installed
- Network Deployer SOAP Port -> The SOAP port of the deployment manager (DMGR)
- Network Deployer Host -> The host on which the deployment manager is running
Once the metrics are exposed, you should be able to script the retrieval of the values for monitoring purposes.
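As a starting point for such a script, the sketch below fetches the perfservlet XML and pulls the numeric statistics out of one named group. The element and attribute names in `SAMPLE_XML` (`Stat`, `name`, `count`, `value`) are assumptions about the shape of the output, not a documented schema, so compare them against what your own server returns.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed, simplified shape of the perfservlet output (illustrative only).
SAMPLE_XML = """
<PerformanceMonitor responseStatus="success">
  <Node name="node1">
    <Server name="server1">
      <Stat name="server1">
        <Stat name="JVM Runtime">
          <BoundedRangeStatistic name="HeapSize" value="524288"/>
          <CountStatistic name="UsedMemory" count="312000"/>
        </Stat>
      </Stat>
    </Server>
  </Node>
</PerformanceMonitor>
"""


def fetch_xml(url: str, timeout: float = 10.0) -> str:
    """Download the perfservlet XML document from the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_stats(xml_text: str, group: str) -> dict:
    """Return {statistic name: numeric value} for the named Stat group."""
    root = ET.fromstring(xml_text)
    stats = {}
    for stat in root.iter("Stat"):
        if stat.get("name") != group:
            continue
        for child in stat:
            if child.tag == "Stat":
                continue  # Nested groups are not flattened in this sketch.
            raw = child.get("count", child.get("value"))
            if raw is not None and child.get("name"):
                stats[child.get("name")] = float(raw)
    return stats


# Example usage against a live server (substitute your own host and port):
# xml_text = fetch_xml("http://<SERVER>:<PORT>/wasPerfTool/servlet/perfservlet")
# jvm = extract_stats(xml_text, "JVM Runtime")
```

The resulting dictionary can then be compared against thresholds in whatever monitoring tool you use.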
Database settings
Monitoring the database settings can be tricky. There are several things to consider at multiple levels, including external connection status, internal metrics, and runtime settings.
- Check if a connection to the database can be established.
- Check if the instance is active.
- Check DB version.
- Check the database size.
- Check the HADR status.
- Check the locks for long-term lock waits, reporting which process is holding the locks.
- Check the log consumption per day.
- Check log usage, which allows you to identify how many primary and secondary logs are being used.
- Tablespace: Use and state
- Check Oracle RAC status
- Cache Hit Ratio
- Library Cache Hit Ratio
- DB Block Buffer Cache Hit Ratio
- Latch Hit Ratio
- Disk Sort Ratio
- Rollback Segment Waits
- Dispatcher Workload
- MSSQL Mirroring
- connection-time - Time to connect to the server
- full-scans - Full table scans per second
- transactions - Transactions per second
- database-free - Free space in database
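The connection and connection-time checks above can be sketched in a driver-independent way. The helper below accepts any callable that returns a Python DB-API connection and measures how long it takes to connect and run one trivial query. `sqlite3` is used here purely as a stand-in so the example is self-contained; in practice you would pass the connect call of your DB2, Oracle, or SQL Server driver.

```python
import sqlite3
import time


def timed_connect(connect, probe_sql: str = "SELECT 1"):
    """Return (ok, seconds) for opening a connection and running one query.

    `connect` is any zero-argument callable returning a DB-API connection.
    The default probe is not portable: Oracle expects "SELECT 1 FROM dual"
    and DB2 expects "SELECT 1 FROM SYSIBM.SYSDUMMY1".
    """
    start = time.perf_counter()
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute(probe_sql)
            cur.fetchone()
        finally:
            conn.close()
        ok = True
    except Exception:
        # Any failure to connect or query counts as a failed check.
        ok = False
    return ok, time.perf_counter() - start


# Stand-in usage with an in-memory SQLite database:
# ok, seconds = timed_connect(lambda: sqlite3.connect(":memory:"))
```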
Miscellaneous non-BPM items
There are some items which do not fall neatly into any category, but you may wish to monitor. Below are a few such cases:
- Service account monitoring
- Some accounts may be needed for various network services such as LDAP. Some customers may want to implement monitoring to check if said account is valid, password expired, etc.
- Shared network resources
- This can be any shared resource but an example is a shared drive (SAN, NAS, or just an NFS mount) where you may be storing some custom data or logs for your process.
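For the shared network resource case, one simple approach is to confirm that the directory exists and that a file can actually be created in it. The sketch below does that with a short-lived temporary file. The `.monitor-` prefix and the example path are arbitrary choices made for illustration.

```python
import os
import tempfile


def check_shared_dir(path: str) -> str:
    """Return a status string describing whether `path` is present and writable."""
    if not os.path.isdir(path):
        return "CRITICAL - not mounted or missing"
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".monitor-"):
            pass
    except OSError as exc:
        return f"CRITICAL - not writable: {exc}"
    return "OK"


# Example usage (hypothetical mount point): check_shared_dir("/mnt/bpm-share")
```

Writing a file catches stale NFS mounts and read-only remounts that a plain existence test would miss.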
Specific BPM parameters related to your specific processes.
For these BPM specific settings, you will likely need to examine your process, any SoR (System-of-Record), external applications (say web services your process interacts with), and possibly BPM database or SIB parameters.
Initially, we suggest thinking about how your users interact with the process to help determine key monitoring metrics.
MORE INFORMATION OR ASSISTANCE:
Just ask BPLabs for additional assistance. We will be happy to help!