The Jeyzer Show Case
A fictional case illustrating the benefits of Jeyzer.
Readers with a strong production background will find case variants and notes in the desk side discussion.
Ticket gets created
Friday afternoon.
An insurance company with several subsidiaries.
HQ accounting department needs to issue the quarterly figures for the company board.
Figures are still missing although consolidation activities have been triggered in all the subsidiaries.
Is this normal? It’s getting late.
The internal support service is called and a ticket gets created.
Support takes over
Support service jumps on this critical ticket.
The consolidation server is a Java process, running with the Jeyzer Recorder Agent.
Support contacts IT Ops to get the JZR recording for the current day.
Support is hosting the Jeyzer Web Analyzer.
Support generates the JZR report right away, based on the received JZR recording, and attaches it to the ticket.
Jeyzer Web Analyzer displays the JZR report in red, which means a critical incident occurred.
Support wins
Support opens the JZR report in Excel.
Some sheet tabs are in warning (orange), confirming the analyzer insight.
Let’s take a look:
- Detected incidents: the top event is critical (red) and relates to a known applicative issue.
  - Remediation action is to cancel the involved activities.
  - The issue is fixed in a later version of the Consolidation server.
  - Root cause is a race condition between 2 activities. Exotic case.
- Monitoring sequence: shows the events on a Gantt timeline.
  - 2 consolidation activities are indeed running in parallel. Support takes note of the activity details (the country here).
  - The critical incident event also appears afterwards.
Support informs the end users about the situation.
End users cancel the involved activities and re-execute them sequentially.
One hour later, all is OK: people can leave the office for a well-deserved weekend.
R&D confirms and improves
The following Monday, the R&D people are informed about the issue over a coffee with the support people.
The R&D team leader checks the JZR report on the ticket. He confirms the detected issue by looking at the Task Sequence sheet (a more detailed view showing the active threads): the 2 activities are indeed looping on this non-thread-safe map.
He leaves a detailed message on the ticket, adding a note about the recommended application upgrade as indicated in the JZR report.
Looking at other JZR sheets (profiling oriented), he finds a performance issue in another part of the application: surely he will fix it this week.
Time to move on?
Desk side discussion
Notes
Consolidation activity events: where do they come from?
The Consolidation Server application creates those applicative events through the Jeyzer Publisher API.
The Jeyzer Recorder then collects those events periodically and dumps them into the JZR recording.
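The publish-then-collect flow described above can be pictured with a minimal Java sketch. This is not the Jeyzer Publisher API: the `ActivityEventBuffer` class and its `publish` and `drain` methods are hypothetical names, used only to illustrate an application pushing business events into a buffer that a recorder-like component empties periodically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration only: NOT the Jeyzer Publisher API.
class ActivityEventBuffer {
    // Thread-safe queue: activities publish while the collector drains.
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();

    // Application side: publish one applicative (business) event.
    void publish(String activity, String country) {
        pending.add(activity + " started for " + country);
    }

    // Recorder side: called periodically; removes and returns pending events.
    List<String> drain() {
        List<String> collected = new ArrayList<>();
        String event;
        while ((event = pending.poll()) != null) {
            collected.add(event);
        }
        return collected;
    }
}
```

The point of the pattern is that the application only states what it is doing in business terms, while the collection and storage of those events happen elsewhere, on a schedule.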
Consolidation Jeyzer profiles
The Jeyzer Analyzer uses an applicative master profile dedicated to the Consolidation application.
Its analysis profile is automatically generated when the Consolidation application is built with Maven (or Gradle).
The monitoring profile contains one rule that detects this map looping issue and tells what to do.
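The faulty Consolidation code is not shown in this case, so the following is only a sketch of the general remedy for this class of issue: when two activities update a shared map in parallel, a plain `HashMap` is not safe, whereas `ConcurrentHashMap` is. The class, method, key, and thread names below are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared state updated by parallel consolidation activities.
class SharedTotals {
    // ConcurrentHashMap tolerates concurrent writers; a plain HashMap does not.
    private final Map<String, Long> totalsByAccount = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write atomically for the given key.
    void add(String account, long amount) {
        totalsByAccount.merge(account, amount, Long::sum);
    }

    long total(String account) {
        return totalsByAccount.getOrDefault(account, 0L);
    }

    // Runs two activities in parallel on the same map and returns the final total.
    static long runTwoActivities(int entriesPerActivity) {
        SharedTotals totals = new SharedTotals();
        Runnable activity = () -> {
            for (int i = 0; i < entriesPerActivity; i++) {
                totals.add("premiums", 1L);
            }
        };
        Thread first = new Thread(activity, "consolidation-GER");
        Thread second = new Thread(activity, "consolidation-UK");
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for activities", e);
        }
        return totals.total("premiums");
    }
}
```

With this structure, running the two activities in parallel always yields a total of twice `entriesPerActivity`, so the workaround of executing them sequentially is no longer needed for correctness.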
More incident examples
The Jeyzer Labors demo is a good place to look at more incident cases.
The Labors demo randomly generates 3 incidents out of about 80 distinct recorded cases.
It is then up to the Jeyzer Analyzer to detect them and list them in the JZR report.
Variants
Jeyzer Monitor in play
Jeyzer Monitor would alert the IT Ops team up front about this issue.
No need to wait for the end users to create the ticket: Monitor will do it automatically in JIRA.
Remember that Jeyzer Monitor requires a commercial license.
A trial version of Jeyzer Monitor is available in every Jeyzer release for non-production usage.
Without Jeyzer
This is the worst-case scenario, where the end result depends on human talent.
Under stress and pressure, support and IT Ops could restart the server.
That is quite dangerous if the application is not 100% transactional and failover-safe.
This could typically end up with corrupted data in any part of the application, to be repaired by R&D over the weekend.
Stepping back, support could look at the options. R&D would be immediately involved for guidance.
Possibly, someone would come up with the cancellation approach, missing however the sequential aspect.
The problem could then reappear shortly afterwards, making the weekend shorter, or a few weeks later, in which case end users and management would be upset.
R&D would propose an emergency patch to add more logs. Someone might think of taking a jstack thread dump from time to time, which is good but only gives a one-shot view.
Finally, someone could propose putting the Jeyzer Recorder in place to get the complete view of the issue with a JZR report.
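To make the difference between a one-shot dump and a periodic recording concrete, here is a minimal sketch that takes several thread snapshots at a fixed period from inside the JVM, using the standard `java.lang.management` API. It is not how the Jeyzer Recorder is implemented; the `ThreadDumpSampler` class and its method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sampler: repeated in-process thread snapshots.
class ThreadDumpSampler {
    // One snapshot: a "name [STATE] top frame" line per live thread.
    static List<String> snapshot() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // false, false: skip lock details to keep the snapshot cheap.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "(no Java frames)";
            lines.add(info.getThreadName() + " [" + info.getThreadState() + "] " + top);
        }
        return lines;
    }

    // Takes `count` snapshots, waiting `periodMillis` between two of them.
    static List<List<String>> sample(int count, long periodMillis) {
        List<List<String>> snapshots = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            snapshots.add(snapshot());
            if (i < count - 1) {
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return snapshots;
    }
}
```

A thread that shows the same top frame across many consecutive snapshots is a candidate for a loop or a blocked call, which a single dump cannot reveal.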
No applicative master profile
Let’s suppose that the R&D team did not invest in a Jeyzer profile for the Consolidation application.
The analysis would rely on a generic master profile which is linked with Java and 3rd party profiles included in the Jeyzer distribution.
The Jeyzer monitoring rules would have detected a long CPU peak on those 2 activity threads indicating an issue.
2 CPU task events would have been raised (skipped in the above presentation to make it simple) and linked to the CPU consuming threads.
From these 2 thread lines, in the Task Sequence sheet, the support team would have identified the activities.
In the worst case, R&D could have been involved at that stage.
Note that the next time the issue happens (this Consolidation application may well be deployed in other insurance companies), the absence of a Consolidation profile would mean the same investigation effort, probably with different people.
To avoid this waste of time, remember to translate issues into monitoring rules, and go for an applicative master profile.
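The CPU peak mentioned in this variant rests on per-thread CPU measurement. As a rough illustration, and not as a description of the Jeyzer monitoring rules, the sketch below measures how much CPU time each live thread consumed over a time window with the standard `ThreadMXBean` API. The `ThreadCpuWindow` class and its method name are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: CPU time consumed per thread over a time window.
class ThreadCpuWindow {
    // Returns "threadName#id" -> CPU milliseconds used during the window.
    // Returns an empty map when the JVM cannot measure thread CPU time.
    static Map<String, Long> cpuMillisOver(long windowMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new TreeMap<>();
        if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return result;
        }
        long[] ids = bean.getAllThreadIds();
        Map<Long, Long> before = new HashMap<>();
        for (long id : ids) {
            before.put(id, bean.getThreadCpuTime(id)); // nanoseconds, or -1
        }
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return result;
        }
        for (long id : ids) {
            long start = before.get(id);
            long end = bean.getThreadCpuTime(id);
            ThreadInfo info = bean.getThreadInfo(id);
            // Skip threads that died or could not be measured.
            if (start < 0 || end < 0 || info == null) {
                continue;
            }
            result.put(info.getThreadName() + "#" + id, (end - start) / 1_000_000L);
        }
        return result;
    }
}
```

Two threads whose CPU time stays close to the window length over several consecutive windows would match the kind of long CPU peak described above.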
No Publisher usage
Let’s suppose that the R&D team did not invest in the Jeyzer Publisher.
The consolidation activity events would then of course be missing from the JZR report.
What would be annoying is the lack of activity information (the countries, GER and UK).
Support would have to discuss with the end users to get the business background.
This highlights the real benefit of the Jeyzer Publisher events: you combine a business view with a technical one. Very powerful.
No local Jeyzer Analyzer
This point is similar to the “No applicative master profile” one.
Support would fall back on the Jeyzer Online Analyzer, which works with a generic profile.