Monday, September 15, 2014

PatrolCli - Part 4: Event reporting to help troubleshoot missing events

This happens to all of us: you are expecting a critical event from a PATROL agent, but for some reason you don't see it in the BPPM/BEM GUI. Now you need to troubleshoot.

First, you need to determine whether the PATROL agent never sent the event or whether a cell rule/policy dropped it. To check whether the agent sent the event, you can open the PATROL console and look through the events in its event manager GUI. However, this method is not very reliable, especially when your PATROL agent generates a lot of events. If you cannot find the event in the PATROL console, it doesn't necessarily mean that the agent didn't send it.

A quicker and more reliable way to check whether the PATROL agent sent the event is to run an event report using PatrolCli. Not only does it save you the effort of bringing up and logging into the PATROL console, it also saves all events that meet your criteria to a text file, so you can search through it over and over again for multiple events.

Here is an example of generating an event report. Assume that you are expecting the PATROL agent running on server1 to send an ALARM/Critical event from parameter /NT_LOGICAL_DISKS/C:/LDldDiskTimePercent, but for some reason you don't see it in the BPPM/BEM GUI. You would like to find out whether the PATROL agent ever sent that event to the BPPM cell.

To do so, you can start PatrolCli from the server you are currently working on, as long as it has a PATROL agent running and has permission to connect to the PATROL agent running on server1. Then you set up an event filter and run an event dump command to save your event report to a file on your current server.

Myserver> PatrolCli
PCli% open server1 3181
Username: patrol
Password:
PCli% event setfilter 091520002014 "" "" A "" "" "" "" ""
OK
PCli% event dump C:\tmp\events.txt W
OK
 
In this example, the setfilter command sets the event filter to all ALARM events from Sep 15 20:00:00 2014 until now. The event dump command specifies the location of the event report output file. Mode 'W' writes the file, while mode 'A' appends to it.
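If you run these reports often, you may want to generate the setfilter line rather than type the timestamp by hand. The following is a minimal Python sketch, not part of PatrolCli. It assumes the start time is laid out as MMDDhhmmYYYY, which is inferred only from the example above (091520002014 for Sep 15 20:00 2014), and it copies the empty "" arguments from that example without interpreting them. The function names are hypothetical.

```python
from datetime import datetime


def patrolcli_filter_time(start: datetime) -> str:
    # Assumption: PatrolCli expects MMDDhhmmYYYY, as inferred from the
    # example value 091520002014 (Sep 15 20:00 2014) in this post.
    return start.strftime("%m%d%H%M%Y")


def setfilter_command(start: datetime, event_type: str = "A") -> str:
    # Reproduces the argument layout of the example command verbatim.
    # Only the start time and the event type ("A" for ALARM) vary here;
    # the remaining arguments are left as empty strings.
    stamp = patrolcli_filter_time(start)
    return f'event setfilter {stamp} "" "" {event_type} "" "" "" "" ""'


if __name__ == "__main__":
    # Print the line so it can be pasted at the PCli% prompt.
    print(setfilter_command(datetime(2014, 9, 15, 20, 0)))
```

Run as a script, this prints a setfilter line that you can paste at the PCli% prompt after opening the connection to the agent.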

Once the above command has finished, you can open the event report file in a text editor such as WordPad and search for the events generated from parameter /NT_LOGICAL_DISKS/C:/LDldDiskTimePercent. For example, you may see your event in the report like this:

Id          : 2118531
Status      : OPEN
Type        : ALARM
Severity    : 4
Time        : Mon Sep 15 22:46:40 2014
Node        : server1
Origin      : NT_LOGICAL_DISKS.C:.LDldDiskTimePercent
Catalog     : 0
Class       : 11
Description : Alarm #2 of global parameter 'LDldDiskTimePercent' triggered on 'NT_LOGICAL_DISKS.C:'.  69 <= 100.00 <= 100
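For a large report, searching by hand gets tedious. The sketch below shows one way to pull out the events for a single parameter with Python. It assumes that every record in the dump looks like the sample above, meaning "Name : value" lines with each record starting at an "Id" line. Note that the Origin field uses dots rather than the slashes of the parameter path. The function names are hypothetical, and a description that wraps onto extra lines containing a colon would need more careful handling.

```python
def parse_event_report(text):
    """Split an event dump into a list of dicts, one per event."""
    events = []
    current = None
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "NT_LOGICAL_DISKS.C:.LDldDiskTimePercent" stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a field separator
        key = key.strip()
        if key == "Id":
            # Assumption: an "Id" line marks the start of a new record.
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value.strip()
    return events


def events_from_origin(events, origin):
    """Return only the events whose Origin field matches exactly."""
    return [event for event in events if event.get("Origin") == origin]
```

To use it, read the report file (for example C:\tmp\events.txt) into a string, pass it to parse_event_report, and then call events_from_origin with "NT_LOGICAL_DISKS.C:.LDldDiskTimePercent". An empty result suggests that the agent did not generate the event in the filtered time window.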
