
 

After I had finished working for a client on a Meeting-Place related issue, this particular case took centre stage in our closing conversation. The client complained that every now and then the Cisco Tomcat service would crash, resulting in the problems listed below:

1)      Admins and end users cannot log into the Call-Manager web pages when the server fails. The client noted that even after a Cisco Tomcat service restart, the time required to load a Call-Manager page would steadily increase as time elapsed.

2)      The client also informed me that their corporate directory fails whenever the Tomcat service crashes.

I then decided to run a quick diagnostic test from the server’s CLI by issuing ‘utils diagnose test’. The command produced the output below:

admin: utils diagnose test
Log file: platform/log/diag2.log
Starting diagnostic test(s)
===========================
test – disk_space : Passed (available: 24897 MB, used: 11727 MB)
skip – disk_files : This module must be run directly and off hours
test – service_manager : Passed
test – tomcat : Passed
test – tomcat_deadlocks : Passed
test – tomcat_keystore : Passed
test – tomcat_connectors : Passed
test – tomcat_threads : Passed
test – tomcat_memory : Passed
test – tomcat_sessions : Failed
The following web applications have an unusually large number of active sessions: axl. Please collect all of the Tomcat logs for root cause analysis: file get activelog tomcat/logs/*
test – validate_network : Passed
test – validate_network_adv: Passed
test – raid : Passed
test – system_info : Passed (Collected system information in diagnostic log)
test – ntp_reachability : Passed
test – ntp_clock_drift : Passed
test – ntp_stratum : Passed
skip – sdl_fragmentation : This module must be run directly and off hours
skip – sdi_fragmentation : This module must be run directly and off hours
test – ipv6_networking : Passed
Diagnostics Completed
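With this many lines of output it is easy to miss the one failure, so here is a minimal Python sketch that pulls out only the failed modules from saved output. The function names are my own, and the sample is an abbreviated excerpt of the output above; the pattern accepts either a hyphen or an en dash, because pasted output often has its hyphens converted.

```python
import re

# Matches lines such as "test – tomcat : Passed" or "skip – disk_files : ...".
# A plain hyphen or an en dash (U+2013) is accepted after the keyword.
LINE_RE = re.compile(r"^(test|skip)\s+[-\u2013]\s+(\w+)\s*:\s*(.*)$")


def parse_diagnose_output(text):
    """Return a list of (kind, module, result) tuples from the CLI output."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append(match.groups())
    return rows


def failed_modules(text):
    """Return the names of tested modules whose result starts with 'Failed'."""
    return [module
            for kind, module, result in parse_diagnose_output(text)
            if kind == "test" and result.startswith("Failed")]


# Abbreviated excerpt of the output shown above.
sample = """\
test – tomcat_memory : Passed
test – tomcat_sessions : Failed
skip – sdl_fragmentation : This module must be run directly and off hours
test – validate_network_adv: Passed
"""

print(failed_modules(sample))
```

Skipped modules are deliberately left out of the failure list, since a skip only means the module has to be run directly and off hours.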

At this stage, I was quite happy that the diagnostic test had not only located the problem, but had also pointed me in the right direction with regard to collecting the appropriate logs. However, I had one problem with the output: if I followed its directions and issued the command ‘file get activelog tomcat/logs/*’, I would essentially be pulling all the logs in that directory when I only really needed the ‘.hprof’ file.

I therefore decided that the best thing to do was to collect only the specific ‘.hprof’ file for the relevant timestamp, as can be seen in the screenshot below.

[Screenshot: RTMT]
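If the whole Tomcat log directory has already been downloaded, the same selection can be made afterwards on the local copy. The following is only a sketch under two assumptions of mine: the heap dumps end in ‘.hprof’, and the downloaded files have kept their original modification times. The function name and the idea of passing the crash time as a Unix timestamp are hypothetical, not something RTMT or CUCM provides.

```python
from pathlib import Path


def closest_heap_dump(log_dir, crash_epoch):
    """Return the .hprof file in log_dir whose modification time is
    nearest to crash_epoch (seconds since the Unix epoch), or None
    if the directory holds no .hprof files."""
    dumps = list(Path(log_dir).glob("*.hprof"))
    if not dumps:
        return None
    return min(dumps, key=lambda p: abs(p.stat().st_mtime - crash_epoch))
```

Calling closest_heap_dump with the local log folder and the time of the crash returns a single path to open, instead of every log in the directory.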

The next phase was to pass the ‘.hprof’ file through Eclipse Memory Analyzer, as can be seen in the screenshot below.

The steps followed were:

i)      Click the File menu

ii)     Select Open Heap Dump

iii)    Select the ‘.hprof’ file extracted from CUCM using RTMT

iv)     Click Finish

[Screenshot: eclipse]

[Screenshot: leak1]

[Screenshot: leak2]

As the screenshots above show, when the heap dump was analysed, com.rsa.sslj.x.cu was seen to be consuming over 75.81% of the memory heap. I had already been combing the internet and doing bug scrubs on Cisco’s Bug Toolkit, so I was sure I had hit bug CSCty36110.

 

I double-checked the bug’s details (screenshot below) on Cisco’s website and they confirmed my conclusion. The bug states that if ‘com.rsa.sslj.x.cu’ shows up as the major cause of the Tomcat crash when analysing the trace, you are hitting bug CSCty36110.

It was at this point that I recommended upgrading CUCM to version 8.6.2.22900-9, as it is a version that contains the fix.
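To check whether a given server is already at or beyond that release, the version strings can be compared numerically. This is a rough sketch: the function names are my own, ‘8.6.2.20000-2’ is just an example input, and it assumes that later builds in the same 8.6.2 train keep the fix. A higher number in a different release train is no guarantee, so the bug’s fixed-in list remains the authority. Both ‘.’ and ‘-’ are accepted before the final build number.

```python
import re

FIXED_VERSION = "8.6.2.22900-9"  # release recommended in the text above


def version_key(version):
    """Split a version such as '8.6.2.22900-9' into a tuple of ints.
    Both '.' and '-' are accepted as separators."""
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def has_fix(running, fixed=FIXED_VERSION):
    """True if the running version sorts at or above the fixed version."""
    return version_key(running) >= version_key(fixed)


print(has_fix("8.6.2.20000-2"))  # False: older than the fixed release
print(has_fix("8.6.2.22900-9"))  # True: exactly the fixed release
```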

Another case had been seen to its end and I was relaxed.

 

[Screenshot: toolkit]