Monitoring in ODC is currently restricted to Apps; there is no way to monitor the Platform or any of the Runtime stages in the assigned tenant.
At the time of writing, ODC also doesn't expose any API or metrics information externally. All available App information is in the ODC Portal, under MONITOR in the left-side navigation bar.
The current options are the following:
App Health
App Health is a predefined set of dashboards with different indicators, which can provide App health information for different stages and for specific time periods.
Time periods can be picked from a predefined set or customized, ranging from 5 minutes to one month.
There's a top dashboard with overall indicators, such as:
- Total apps - the total number of Apps in the selected stage;
- Critical health - the Apps in a critical state in the selected stage;
- Moderate health - the Apps with some non-critical issue in the selected stage;
- Good health - the Apps with good health in the selected stage.
In the best-case scenario, the Good health count matches the Total apps count.
Health score
This dashboard ranks all Apps in the selected stage by their health, from best to worst.
The health score is a numerical value based on the App's response time and errors during the selected time period. Scores are grouped as follows:
- Critical, from 0 to 70%
- Moderate, from 70% to 85%
- Good, from 85% to 100%
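The grouping above can be sketched in a few lines of Python. This is only an illustration, not how ODC computes or classifies the score internally. The function name `classify_health` is hypothetical, and because the ranges above share their boundary values (70% and 85%), the sketch assumes that a boundary value belongs to the higher band.

```python
def classify_health(score: float) -> str:
    """Map a health score (0-100) to the band names shown in the dashboard."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Assumption: 70 counts as Moderate and 85 counts as Good.
    if score < 70:
        return "Critical"
    if score < 85:
        return "Moderate"
    return "Good"


print(classify_health(62))  # Critical
print(classify_health(70))  # Moderate (under the boundary assumption above)
print(classify_health(91))  # Good
```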
Top apps by request
This dashboard lists all Apps in the selected stage, ranked by number of requests, and shows how many requests each one had per hour.
Errors
This dashboard shows the number of errors in the selected stage, the error rate (errors per hour) and the overall error percentage.
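As a rough illustration of how these two figures relate to raw counts, the sketch below derives both from an error count, a request count and the length of the selected time period. The function `error_metrics` is hypothetical, and the sketch assumes the error percentage is errors divided by total requests; the exact formula the ODC Portal uses isn't stated here.

```python
def error_metrics(error_count: int, request_count: int, window_hours: float) -> tuple[float, float]:
    """Return (errors per hour, error percentage) for a time window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    errors_per_hour = error_count / window_hours
    # Assumption: percentage is errors over total requests in the window.
    error_pct = 100 * error_count / request_count if request_count else 0.0
    return errors_per_hour, error_pct


# 12 errors out of 480 requests over the last 24 hours
rate, pct = error_metrics(12, 480, 24)
print(f"{rate} errors/hour, {pct}% of requests")
```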
Response time
This dashboard presents the time the server takes to handle a request. It shows two metrics:
- P90 response time - the response time that 90% of requests fall at or below;
- Max response time - the longest response time recorded for a request.
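To make the difference between the two metrics concrete, here is a minimal sketch that computes both from a list of request durations. It uses the nearest-rank percentile method, which is an assumption; the ODC Portal may use a different percentile calculation. The function name and the sample values are made up for illustration.

```python
def p90_and_max(durations_ms: list[float]) -> tuple[float, float]:
    """Return (P90, max) response time using the nearest-rank method."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    # Nearest rank: ceil(0.9 * n), computed with integers to avoid float rounding.
    rank = (90 * len(ordered) + 99) // 100
    return ordered[rank - 1], ordered[-1]


# Ten hypothetical requests, one of them a slow outlier
samples = [120, 95, 110, 3000, 130, 105, 98, 140, 115, 125]
p90, slowest = p90_and_max(samples)
print(p90, slowest)  # 140 3000
```

The single slow request dominates the max but barely affects P90, which is why the two are shown side by side.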
Requests
This dashboard shows how many requests were made and the request rate per hour for each App.
Other monitoring data - Logs and Traces
Besides the dashboards, which give an overall view of App health and help spot potential issues, the ODC Portal also provides access to Logs and Traces.
Each has its own page with filtering options, making it easier to reach specific data when validating behavior or looking for the root cause of an issue.
Logs help identify the problem, while traces help narrow down its root cause.
Logs
Every App in ODC has its own logs. Logs are generated either automatically or from code.
An example of an automatically generated log is a timer that fails to execute.
Code-generated logs are triggered by exceptions or when a developer uses the LogMessage system action.
When you select Logs in the left navigation menu, the list is ordered by time, most recent first. You can change this order using the filters in the list's header.
Each entry shows the time of occurrence, severity, App name, message and user.
Each occurrence is classified into one of three severity levels:
- Error
- Warning
- Information
Clicking a log entry opens a new page with all the log details.
This page shows the log message, the stack trace and any related logs. If the log has an associated trace, you can open the trace page by clicking the "Go to trace" button. This button is disabled if no trace is associated with the log.
Traces
A trace is generated when an App makes a request that uses a server-side element.
A trace shows an end-to-end request as a series of intervals known as spans. Each span is a logical unit of work and has a given duration.
Traces are useful in the following situations:
- Error identification and root cause analysis - identify in which span an error was thrown and what its root cause was;
- Performance evaluation - view the performance of each span in the App and in dependent Apps.
As with Logs, the Traces page opens with a list ordered by time, most recent first. This list can also be filtered.
The available filters are:
- Stage (defaults to Development, as in Logs; any other available stage can be selected);
- App name;
- Element type;
- User;
- Status (any, ok, error);
- Duration;
- Time range.
Clicking a trace entry in the list opens a new page with the trace details.
On the left side of the page, the spans are listed in execution order along with their durations. A red bar highlights the span that threw an error.
You can drill down into a span to see its attributes and any related logs.
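The two uses of a trace described above, finding the span that threw an error and finding where the time went, can be illustrated with a small data model. This is a sketch only: ODC doesn't expose traces through an API, so the `Span` class, its fields and the span names below are all hypothetical. The status values mirror the "ok" and "error" options of the Status filter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    name: str           # hypothetical label for the unit of work
    duration_ms: float  # how long the span took
    status: str         # "ok" or "error"


def first_error_span(spans: list[Span]) -> Optional[Span]:
    """Return the first span, in execution order, that ended in error."""
    return next((s for s in spans if s.status == "error"), None)


def slowest_span(spans: list[Span]) -> Span:
    """Return the span with the longest duration."""
    return max(spans, key=lambda s: s.duration_ms)


# A made-up trace, listed in execution order
trace = [
    Span("SaveOrder", 12.0, "ok"),
    Span("ValidateStock", 480.0, "ok"),
    Span("CreateInvoice", 35.0, "error"),
]

failed = first_error_span(trace)
print(failed.name if failed else "no errors")  # CreateInvoice
print(slowest_span(trace).name)                # ValidateStock
```

In this example the failing span and the slowest span are different, which is the kind of distinction the trace details page makes visible.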
Logs and traces retention
The ODC Portal shows logs and traces for up to a month. Logs and traces that are between four and seven weeks old can still be retrieved by opening a support ticket. All logs and traces are automatically deleted by the system after seven weeks.
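The retention rules can be summarized as a small decision function. This is a sketch for reasoning about the timeline, not an ODC feature. It assumes that "a month" in the Portal means four weeks and that the limits are inclusive; neither detail is confirmed here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

PORTAL_WINDOW = timedelta(weeks=4)  # assumption: "a month" treated as four weeks
HARD_LIMIT = timedelta(weeks=7)     # deleted by the system after this age


def retention_state(created_at: datetime, now: datetime) -> str:
    """Describe where a log or trace of a given age can still be found."""
    age = now - created_at
    if age <= PORTAL_WINDOW:
        return "visible in ODC Portal"
    if age <= HARD_LIMIT:
        return "retrievable via support ticket"
    return "deleted"


now = datetime(2024, 3, 1)
for days in (10, 35, 60):
    print(days, "days old:", retention_state(now - timedelta(days=days), now))
```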
Sharing logs and traces
Since neither logs nor traces can be downloaded, it may be useful to share them with team members.
When you open a log or trace in the ODC Portal, the URL contains a unique ID that identifies that specific log or trace. You can share this URL with another team member so they can view the details.
Only team members with access to the ODC organization and the right permissions can open the log or trace URL.