Event blackout rules and event blackout policies in the BPPM cell are something we all rely on to suppress alerts during regularly scheduled maintenance windows. When a blackout period ends, what should you do if a PATROL parameter alert (e.g. a process-down alert) is still present?
If you choose to ignore it and the process is still down, no one will be notified. The PATROL agent generates an alarm event only once, when the process goes down. If that happened during the blackout period, the BPPM cell sent no notification, and the PATROL agent will not generate another alarm event for as long as the process remains down after the blackout ends.
If you choose instead to send a notification for every suppressed alert in the BPPM cell upon exiting blackout, you may send out many false alarms. During the maintenance window, many PATROL agents may be restarted as the result of a server reboot or a PATROL configuration change, and a process that was previously down may be brought back up by that same restart. However, a newly started PATROL agent will not generate an OK event, because from its point of view the PATROL parameter has not changed state.
Either way, we have a problem. The best solution is for the BPPM cell to re-check the PATROL parameter status for each outstanding alert upon exiting blackout. Among the PATROL users I have talked to, this is one of the most-wanted features for event blackout. Although this feature is not available out of the box, you can write your own code using PatrolCli.
For example, you can use the following PatrolCli command to check 'mcell' process status:
PCli% execpsl get("/NT_PROCESS/mcell/PROCStatus/status");
OK
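The decision that follows such a status check can be sketched in a few lines. This is only an illustration in Python, not MRL and not our actual extension: the function names, the "close"/"notify"/"review" action labels, and the run_query callable that stands in for however you invoke PatrolCli are all hypothetical, and treating WARN as an alerting state is an assumption (only OK and ALARM appear in this post).

```python
from typing import Callable, Dict, Iterable

# States treated as "still alerting". ALARM comes from the post;
# WARN is an assumption made for this sketch.
ALERTING_STATES = {"ALARM", "WARN"}


def psl_status_query(parameter_path: str) -> str:
    """Build the PSL statement shown in the post for one parameter path."""
    return f'execpsl get("{parameter_path}/status");'


def decide_action(status: str) -> str:
    """Map a re-checked parameter status to a (hypothetical) action label."""
    state = status.strip().upper()
    if state == "OK":
        return "close"    # condition cleared during blackout: no notification
    if state in ALERTING_STATES:
        return "notify"   # still failing after blackout: notify now
    return "review"       # empty or unexpected reply: needs a human look


def recheck_after_blackout(
    suppressed_paths: Iterable[str],
    run_query: Callable[[str], str],
) -> Dict[str, str]:
    """Re-check every suppressed alert and return the action for each path.

    run_query is a placeholder for whatever sends the statement to the
    PATROL agent and returns the text of its reply.
    """
    return {
        path: decide_action(run_query(psl_status_query(path)))
        for path in suppressed_paths
    }


if __name__ == "__main__":
    # Stub that pretends the agent still reports ALARM for every parameter.
    def stub(statement: str) -> str:
        return "ALARM"

    print(recheck_after_blackout(["/NT_PROCESS/mcell/PROCStatus"], stub))
```

Keeping the PatrolCli call behind run_query means the decision logic can be exercised with a stub before it is connected to a real agent. Whether "notify" then generates a new event or resets a slot depends on how your cell KB is built.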
Tying everything together does require some advanced MRL programming skill. If you need more help, please feel free to contact us for consulting services. We have developed a proprietary extension for the BPPM cell that addresses many out-of-the-box limitations, including event blackout.
BPPM (BMC ProactiveNet Performance Management) or TrueSight Operations Management (the rebranded name) suite is the latest solution from BMC Software for enterprise system management. It combines the data analytic engine from ProactiveNet, the event processing engine from BMC Event Manager (BEM), and the server/application monitor from PATROL into one product. This blog is intended to share information and experience on TrueSight/BPPM implementation, customization, and integration.
Hi Willa,
If the above command returns ALARM, does it mean the parameter still remains in ALARM state even after the blackout? If so, we will have to write PSL scripts to trigger an alert, right?
Thanks,
Jeevan Anne
Jeevan,
Thank you for your message. What specific action to take if the parameter is still in ALARM state after blackout depends on how your BPPM cell KB was programmed. If your cell KB requires a new event to trigger notification and ticketing actions, you need to generate a new event using MRL or PSL (I haven't tried PSL myself). If your cell KB can trigger notification and ticketing actions based on a custom action-flag slot value without receiving a new event, you can simply reset that action-flag slot. Since every cell KB is custom developed, which of these options applies depends on your customization.
Thanks!
Willa
Hi Willa,
Thanks for the prompt response. I totally understand the customization part.
For the PCli% execpsl get("/NT_PROCESS/mcell/PROCStatus/status"); command, if it returns ALARM, does it mean the parameter still remains in the same (alerting) state even after the blackout window?
Thanks,
Jeevan Anne
Hi Jeevan,
Yes, you are right. It would mean that the parameter is still in ALARM state after blackout ended.
Thanks!
Willa
Hi Willa,
Thank you for confirming, and also for sharing some insights related to PATROL, BPPM, and BEM. Great source of information.
Thanks,
Jeevan Anne
Jeevan,
Happy to know that my blog is helpful to you. Thank you so much for your encouragement and support!
Willa