Some companies must retain data for years to satisfy compliance or government regulations. As completed instance data accumulates, the database grows and runtime performance degrades for users, which puts performance in conflict with the data retention policies.
It is commonly said that BPM should not be considered a system of record (SOR). However, we still need to balance performance with compliance.
IBM BPM makes it very easy to construct large variable types and pass that data throughout a process. This ease of design leads many people to infer that the system is an SOR.
That inference holds for active instances and their associated tasks, but problems may arise once those instances are completed or terminated and are retained purely for historical purposes. As completed items accumulate, searches (including the portal inbox) must evaluate those rows in the database as well. Performance therefore degrades over time: as the number of BPM instances and tasks increases, each search has more rows to examine and takes longer.
To mitigate this performance degradation, we suggest using a second database, either an archive or an SOR.
There are several ways to store historical data.
First, recall that the Performance Data Warehouse (PDW) stores timing intervals and tracked data, so the information needed to meet compliance requirements may already be available there.
If the PDW data does not satisfy your compliance requirements, here are some alternatives:
- Architect your solution with an SOR database.
  - When designing your BPM solution, factor in an SOR database for your important data. That way:
    - the data source on the BPM system can change if the SOR has to move or change.
    - the SOR database is completely separate from the Process DB, so the Process DB needs to hold only current instances and tasks. It stays more performant because searches never have to examine old instances and tasks.
    - the SOR can still be reached for reporting or data recovery needs.
- If you already have a solution in place but lack an SOR, this is another possible path to consider. The idea is to take the instances and tasks you would otherwise delete and move them to another database. In this manner, you "archive" the instances and tasks, but keep the data in a second data source.
  - You can modify the LSW_BPD_INSTANCE_DELETE stored procedure and its associated stored procedures (LSW_ERASE_TASK, LSW_ERASE_BPD_INSTANCE, etc.) so that they move the instances to another database before deleting them.
  - Your instance data is then stored in a separate "Archive" database that is searchable, which mitigates the performance hit on the runtime Process database of active instances and tasks.
Data retention is difficult, but the main things to remember are:
- Use a separate data store for the SOR or for old instance and task data.
- Keep the runtime Process database clean so that performance remains high.
- BPM is not an SOR.
As always, should you have any questions about this or any article from BP3, please contact our labs group and we will be happy to assist.