Tuesday, November 25, 2014

Total Cost of Ownership of BPPM - Part 7: Best practice - Naming convention

There are many things you can do in a BPPM implementation to lower the total cost of ownership of BPPM. Some of them are big, such as architecture design. Others seem small, such as a naming convention. However small they seem, they help minimize 'Winchester House' syndrome and lower the total cost of ownership of BPPM.

I would like to start the best practice discussion with the naming convention. BPPM doesn't impose any requirement on how you name your BPPM components, including files, cells, integration services, clusters, CMA tags, etc. If there is no naming convention in place, each person ends up naming BPPM components any way he/she likes. Sooner or later, your BPPM implementation becomes a Winchester House.

Putting a naming convention in place at the beginning of a BPPM implementation is a small effort, but it pays back many times over after the implementation is completed. The naming convention needs to be enforced at all times, both during and after the implementation. Trying to rename things afterwards is a painful and error-prone process.

Pick a 2- or 3-letter prefix for your organization. It will help distinguish your custom files from BPPM out-of-box files. For example, I used prefix C1 for CapitalOne and prefix CTI for CitiMortgage. Use this prefix for your custom PATROL KM files, PSL files, cell knowledge base MRL files, BAROC files, shell scripts, perl scripts, batch files, JAVA scripts, etc.

For a custom PATROL KM, also pick a short name for your KM. I prefer to use all capitals for KM short names. Include this short name in all file names related to the KM. This makes your KM easy to package and deploy without missing any file. If the KM requires a specific pconfig file, I use the same naming convention for the pconfig file. For example, here are the file names I used for a custom CACHE KM: XYZ_CACHE_main.km, XYZ_CACHE_db.km, XYZ_CACHE_db_collector.psl, and XYZ_CACHE.cfg. XYZ here is the organization prefix.
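A convention like this is easier to enforce when it can be checked automatically before packaging. The following is only a minimal sketch in Python, assuming a rule of the form PREFIX_KMNAME[_part].extension inferred from the example file names above; the function names, the allowed extensions, and the lowercase-suffix rule are illustrative assumptions, not anything BPPM or PATROL requires.

```python
import re

ORG_PREFIX = "XYZ"       # organization prefix, as in the example above
KM_SHORT_NAME = "CACHE"  # KM short name, all capitals

# Assumption for this sketch: only these extensions belong to the KM package.
ALLOWED_EXTENSIONS = ("km", "psl", "cfg")


def build_pattern(prefix, km_name):
    """Compile a pattern for <prefix>_<km_name>[_<lowercase part>].<extension>."""
    extensions = "|".join(ALLOWED_EXTENSIONS)
    return re.compile(
        rf"^{re.escape(prefix)}_{re.escape(km_name)}(_[a-z0-9_]+)?\.({extensions})$"
    )


def nonconforming(file_names, prefix=ORG_PREFIX, km_name=KM_SHORT_NAME):
    """Return the file names that do not follow the convention."""
    pattern = build_pattern(prefix, km_name)
    return [name for name in file_names if not pattern.match(name)]


if __name__ == "__main__":
    package = [
        "XYZ_CACHE_main.km",
        "XYZ_CACHE_db.km",
        "XYZ_CACHE_db_collector.psl",
        "XYZ_CACHE.cfg",
        "cache_collector.psl",  # missing prefix and KM short name
    ]
    for name in nonconforming(package):
        print("Does not follow the naming convention:", name)
```

A check like this could run as part of whatever packaging step you already have, so a misnamed file is caught before it reaches a PATROL agent.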

For cell names, never reuse the same name anywhere in the organization. Since the installer prompts for cell names during BPPM installation, you need to discuss and decide on the names before you start the installation. The information that needs to be included in a cell name is: 1) Environment: dev, QA, or production. 2) Event source: PATROL, SCOM, external, correlation, server, etc. 3) Type: H/A or standalone. For example, I named the cell located on the first clustered PATROL integration service in the dev environment Dev_PATROL1_HA.
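To make the three parts of a cell name concrete, here is a small Python sketch that builds names in the Dev_PATROL1_HA style and flags reuse. The exact spellings of the environments and event sources, and the "SA" abbreviation for standalone, are assumptions made for illustration; use whatever values your team agrees on.

```python
# Assumed vocabularies for this sketch; agree on your own before installation.
ENVIRONMENTS = ("Dev", "QA", "Prod")
EVENT_SOURCES = ("PATROL", "SCOM", "External", "Correlation", "Server")
CELL_TYPES = ("HA", "SA")  # "SA" for standalone is a hypothetical abbreviation


def make_cell_name(environment, source, index, cell_type):
    """Build a cell name such as Dev_PATROL1_HA from its three parts."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    if source not in EVENT_SOURCES:
        raise ValueError(f"unknown event source: {source}")
    if cell_type not in CELL_TYPES:
        raise ValueError(f"unknown cell type: {cell_type}")
    return f"{environment}_{source}{index}_{cell_type}"


def find_duplicates(cell_names):
    """Return names that repeat an earlier name, ignoring letter case."""
    seen = set()
    duplicates = []
    for name in cell_names:
        key = name.lower()
        if key in seen:
            duplicates.append(name)
        seen.add(key)
    return duplicates


if __name__ == "__main__":
    print(make_cell_name("Dev", "PATROL", 1, "HA"))
    planned = ["Dev_PATROL1_HA", "Prod_SCOM1_SA", "dev_patrol1_ha"]
    print("Reused names:", find_duplicates(planned))
```

Comparing names without regard to case is a deliberate choice in this sketch: two names that differ only in capitalization are easy to confuse even if the software treats them as different.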

CMA tags also need to be determined in advance. Tags can usually be organized in two ways: by infrastructure (OS, DB, application) and by user environment. For example, you may have CMA tags like Windows_Base, Oracle_Canada, Peoplesoft_HR, etc. The less overlap there is between tags, the easier they are to use and maintain. Be careful when making changes to existing CMA tags after the implementation is completed, because all PATROL agents with the same tag names will have their pconfig updated automatically.

Monday, November 17, 2014

Total Cost of Ownership of BPPM - Part 6: Stop building Winchester House

Does your BPPM customization process look like building a Winchester House?

For those who are not familiar with it, Winchester House is a famous house located in San Jose, CA. This giant 160-room house was gradually built over a period of 38 years with no overall planning or consistency. Staircases often lead to the ceiling and doors open onto solid walls. Visitors are warned not to wander away from the tour group. Otherwise they can be lost for hours.

All my "rescue mission" projects involved substantial effort of reverse engineering, re-designing, and re-implementing the previous customization that suffered from "Winchester House" syndromes. "Winchester House" syndromes substantially increase the cost to maintain and extend BPPM customization.

Here are some examples of typical "Winchester House" syndromes:

1) In custom PATROL knowledge modules, some KMs read data input from pconfig, some KMs read data input from external files.  And there is no standard location for external files.

2) In custom BPPM cell knowledge base, some events use mc_object slot to determine their Remedy support group, some events parse the msg string to determine their Remedy support group.

3) In custom BPPM cell knowledge base, different events use different rules for update even though their requirements are the same.

The root cause of "Winchester House" syndromes is lack of development experience. As stated in the previous post, many BMC customers use the same resource for installation and for customization. Maintaining accurate and up-to-date documentation can help reduce "Winchester House" syndromes. However, the key to eliminating "Winchester House" syndromes is to have an experienced developer set up a well-defined framework, naming convention, and best practices at the beginning of BPPM customization.

Eliminating "Winchester House" syndromes can dramatically lower your total cost of ownership.  We will discuss best practices for BPPM customization in the next few posts.

Monday, November 10, 2014

Total Cost of Ownership of BPPM - Part 5: Have separate resources for customization and operations

I have been on several rescue missions where a customization or integration stopped working on a different type/version of input and the client had no idea how to modify it. Because there was very little accurate documentation about the customization or integration, I had to reverse engineer the whole thing from various source code, configuration files, and test cases to understand how it was done. Then I needed to make the necessary modifications to get the customization or integration working again. After that, it took me more time to write the complete documents that should have been there in the first place, including release notes, a deployment guide, and a troubleshooting guide.

All the time spent on reverse engineering and re-documentation increased the total cost of ownership for the client. Sometimes the customization or integration had to be completely rewritten because the original one was not architected right. In this case, the total cost of ownership increased even more. However, this cost, along with the headache, can be completely avoided if the customization or integration is done right in the first place.

How did this happen? A typical story goes like this: The IT organization hired an employee or long-term consultant responsible for day-to-day operations support, including installation, configuration, reporting, user support, and troubleshooting. At the same time, the same person is required to develop customized solutions, to program PSL and MRL, and to integrate 3rd-party monitoring software into BPPM.

We all know operations and development require different skill sets. A person who is good at repetitive work in day-to-day operations support often does not have enough experience to develop robust and maintainable solutions for customization and integration. A person who is good at developing customization and integration is often bored with day-to-day operations. Requiring one person to do both operations and development is often the root cause of many troubled BPPM operations.

As an example, here is the job description from a recent job ad:
- Administer and configure Enterprise Systems Management/Monitoring systems such as BMC ProactiveNet Performance Management (BPPM), BMC Event Manager (BEM) and TMART.
- Integrate end to end Infrastructure solutions to BEM for Unified Event Management
- Integrate enterprise solutions such as Netcool, Omegamon, CA Wily to BEM/BPPM

Because the hourly rate of an operations support person is lower than that of a developer, and there are many more people with operations support skills than with development skills, most IT organizations end up hiring an operations support person and also requiring him/her to develop customization and integration. On the surface, it appears that this would lower the cost due to the lower hourly rate. In reality, the opposite is true. Due to lack of experience and skill, it takes considerably longer for an operations support person to complete the development work of customization and integration. And the solution is often hard to maintain, extend, or troubleshoot. The one who developed the solution is often the only person who can understand and maintain it. The entire solution falls apart when this person leaves the organization.

To avoid this mistake, it is highly recommended to hire an experienced developer to develop a robust and well-documented solution for your customization and integration requirements. A good developer can develop a robust solution quickly and hand over his/her finished work to your operations support staff for maintenance. The hourly rate of a good developer may be higher than that of an operations support person, but the time required to complete the development is much less. Development work is a one-time cost. When the development is completed, there is no need to keep the developer around anymore. Do the math on your total budget, not just the hourly rate. In the end, hiring a good developer should cost you less in total.

We have been doing customized BPPM development for the last 12 years. To further reduce client risk, we offer fixed-price quotes on well-defined requirements that will fit your total budget. For example, if you estimate that it will take your operations support staff 6 weeks at $75/hr to finish an integration project, we can take the project to fit your total estimated budget (6 weeks x 40 hours = 240 hours; $75 x 240 = $18,000). We can afford to do this because we have developed a time-tested methodology for any integration and most of the source code has already been developed. Of course, we will be able to deliver the solution to you in far less than 6 weeks.
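The budget arithmetic above is simple enough to sketch in a few lines of Python. The $75/hr rate and the 6-week estimate come from the example; the 40-hour week is implied by the 240-hour figure, and the $150/hr developer rate is purely a hypothetical number chosen for illustration, not a quote.

```python
HOURS_PER_WEEK = 40  # implied by 6 weeks = 240 hours in the example above


def labor_cost(hourly_rate, weeks, hours_per_week=HOURS_PER_WEEK):
    """Total labor cost for a project of the given length."""
    return hourly_rate * weeks * hours_per_week


def break_even_weeks(budget, hourly_rate, hours_per_week=HOURS_PER_WEEK):
    """Number of weeks a resource at hourly_rate can work within the budget."""
    return budget / (hourly_rate * hours_per_week)


if __name__ == "__main__":
    ops_budget = labor_cost(hourly_rate=75, weeks=6)
    print(f"Operations staff estimate: ${ops_budget:,}")

    # Hypothetical developer rate, for illustration only.
    developer_rate = 150
    weeks = break_even_weeks(ops_budget, developer_rate)
    print(f"A ${developer_rate}/hr developer fits the same budget "
          f"if the work takes {weeks:g} weeks or less")
```

Under these assumed numbers, a developer charging twice the hourly rate costs the same in total if the work is finished in half the time, and less if it is finished sooner.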

Monday, November 3, 2014

Total Cost of Ownership of BPPM - Part 4: Customization in BPPM operations

In the previous posts, we discussed the first two types of BPPM customization: adding new features and integration.  In this post, we will focus on the third type of BPPM customization: automation in BPPM operations.

When you build a house, the time you will live in that house is much longer than the time you spend building it. It makes sense to add some extra features to the house during construction to make your life easier after you move in. For example, you may want to wire for surround-sound speakers and Ethernet during construction.

The same applies to a BPPM implementation. Every BPPM implementation costs more over many years of operations than in the few months of implementation. It makes sense to add the necessary customization during implementation to cut down the cost of operations. Every full-time staff member you save in operations is about $100K in savings year after year.

Although some automation for BPPM operations can be added after the implementation is completed, the majority of the automation should be planned as part of the implementation. As an example, I am going to share how our 'cell extension' product has kept large BPPM operations manageable with a small support team.

A BPPM support team spends the majority of its time on the following: 1) Gather user requirements on monitoring, email notification, and ticketing assignments; 2) Configure or customize BPPM to meet user requirements; 3) Optionally develop customization in the dev environment; 4) Test configuration and customization in the UAT environment; 5) Deploy configuration and customization in the production environment under change control; 6) Set up blackout periods for scheduled and on-demand maintenance windows; 7) Troubleshoot when things don't work as they should.

Our 'cell extension' product is a BPPM cell knowledge base applied on top of BMC's out-of-box knowledge base.  It includes all the common features in BPPM operations such as event updates, repeats, aggregation, delay, blackout, rewording, email notification, Remedy ticketing, etc.  For PATROL and Portal events, no cell policy configuration or rule programming is required.  For all 3rd-party events, you only need to provide slot mapping from the 3rd-party events by following our clear sample code as a one-time setup.  As an added bonus, it also provides plug-and-play interfaces for 2-way integration with any 3rd-party monitoring software.

Our 'cell extension' product is entirely data driven, so it works for every BMC customer. It uses a few simple forms for user input. All these forms can be viewed and updated offline without BPPM access. An end user can fill out those forms with help from the BPPM support team initially. Many end users even choose to own those forms and put them under source control. Through these forms, an end user can tell BPPM what events they are interested in, how they want the message to be formatted, who to email, which group the ticket should be assigned to, etc. The BPPM support team only needs to help end users fill out the forms or fill them out for them, check the forms for errors, and load those forms directly into BPPM. And the same forms are used for the dev, UAT, and production environments.

Since data input has been dramatically reduced, and there is no need to configure cell policies or develop cell rules, you only need a small BPPM support team to run a large BPPM operation. Troubleshooting is much easier since all events follow the same flow. Human input error is minimized when data is entered only once. BPPM cells process events faster since no event policy is used. A user requirement can be completed in minutes instead of days.

If you are interested in finding out how our 'cell extension' product can work for you, please feel free to contact us by clicking the link on top of this page.