Pink Elephant
The IT Service Management Experts

Troy's Blog

The Hitch Hiker's Guide to the IT Galaxy and Beyond
Don't Panic



Troy DuMoulin, VP, Research & Development

Troy is a leading ITIL®, IT Governance and Lean IT authority with a solid and rich background in Executive IT Management consulting. Troy holds the ITIL Expert certifications and has extensive experience in leading IT Service Management (ITSM) programs with a regional and global scope.

He is a frequent speaker at IT Management events and is a contributing author to multiple ITSM and Lean IT books, papers and official ITIL publications including ITIL’s Planning To Implement IT Service Management and Continual Service Improvement.


The Guide

"This blog is dedicated to making sense out of the shifting landscape of IT Management. Just when we thought we had a good handle on managing technology, the job we thought we knew is being threatened by strange acronyms like ITIL, CMMI, COBIT, etc. Suddenly the rules have changed and we are not sure why. The goal of this blog is to offer an element of sanity and logic to what can appear to be chaos."

Hitch Hiker's Guide to the Galaxy

"In many of the more relaxed civilizations on the Outer Eastern Rim of the Galaxy, the Hitch Hiker’s Guide has already supplanted the great Encyclopaedia Galactica as the standard repository of all knowledge and wisdom, for though it has many omissions and contains much that is apocryphal, or at least wildly inaccurate, it scores over the older, more pedestrian work in two important respects.

First, it is slightly cheaper: and secondly it has the words DON’T PANIC inscribed in large friendly letters on its cover."
~Douglas Adams



Monday, July 23, 2007

Service Improvement Tips - Knowledge Management

Canadian summers are short and sweet, so we typically take every opportunity to enjoy the fleeting weekends as they fly by. This weekend I spent some quality time with the family at the cottage and got into an interesting discussion with my son Caleb, who asked me about the saying “knowledge is power”. Sensing an opportunity to communicate with my pre-teen, I began to discuss with him the differences between knowledge and wisdom.

The question, of course, is whether collecting and managing data and information truly increases your capabilities and, ultimately, your power to apply this knowledge towards some goal. It is the word “apply” that marks the key difference between knowing something and having the wisdom and insight to know how and when to use that information to some advantage.

In ITIL V3 the Knowledge Management process is outlined in the Service Transition book with the following goal: “The goal of Knowledge Management is to ensure that the right information is delivered to the appropriate place or competent person at the right time to enable informed decision.” The key element of this goal is the delivery of information at the right time in support of an informed decision, which is what moves the discussion into the domain of wisdom.

The Service Transition book illustrates this with the Data – Information – Knowledge – Wisdom (DIKW) model, in which the goal is to move information through that lifecycle. The key for any Knowledge Management process is to implement activities and roles that translate unstructured data into useful wisdom. Here are two tips for realizing this goal:

* Remember KM Is A Process
While this may seem obvious, it takes intent and action to move data to wisdom. This means designing and implementing a process in which knowledge is:

1. Submitted as a candidate
2. Filtered and validated for accuracy
3. Expanded and added to for context and application
4. Classified to facilitate ease of look up and access
5. Approved for promotion into an official knowledge repository
6. Updated through its lifespan to reflect the most current information and context
7. Retired when the knowledge is no longer relevant

Accomplishing these activities requires a substantial investment and the ongoing application of resources (people, process and technology). Otherwise, unstructured data is dumped without qualification, assurance or validation into a database or a file store. We have all seen this before: a repository of so-called knowledge turns into a bucket for data and information that is at best out of date and at worst useless. We even accelerate this by implementing personal key performance indicators that require IT staff to dump data into this bucket on a regular basis.
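The seven lifecycle steps listed above can be sketched as a small state machine. This is only an illustration under my own assumptions: the names `Stage`, `ALLOWED` and `KnowledgeArticle` are hypothetical and do not come from ITIL or from any particular ITSM tool. The point it tries to show is that a candidate cannot skip validation, and that nothing is searchable until it has been approved.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the numbered steps above."""
    SUBMITTED = "submitted"    # step 1: candidate
    VALIDATED = "validated"    # step 2: filtered and checked for accuracy
    EXPANDED = "expanded"      # step 3: context and application added
    CLASSIFIED = "classified"  # step 4: classified for look up
    APPROVED = "approved"      # step 5: promoted to the official repository
    RETIRED = "retired"        # step 7: no longer relevant


# Allowed moves between stages. An approved article may be updated
# (step 6, it stays approved) or retired (step 7).
ALLOWED = {
    Stage.SUBMITTED: {Stage.VALIDATED},
    Stage.VALIDATED: {Stage.EXPANDED},
    Stage.EXPANDED: {Stage.CLASSIFIED},
    Stage.CLASSIFIED: {Stage.APPROVED},
    Stage.APPROVED: {Stage.APPROVED, Stage.RETIRED},
    Stage.RETIRED: set(),
}


class KnowledgeArticle:
    def __init__(self, title):
        self.title = title
        self.stage = Stage.SUBMITTED
        self.revision = 1

    def advance(self, target):
        """Move to the target stage, refusing any shortcut around the process."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"cannot move from {self.stage.value} to {target.value}"
            )
        if self.stage is Stage.APPROVED and target is Stage.APPROVED:
            self.revision += 1  # step 6: update during the article's lifespan
        self.stage = target

    @property
    def searchable(self):
        # Only approved knowledge is offered to people looking for answers.
        return self.stage is Stage.APPROVED
```

Trying to promote a freshly submitted candidate straight to `Stage.APPROVED` raises a `ValueError`, which is the code equivalent of refusing to let the repository become a dumping bucket.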

What is interesting is that the concept of Knowledge Management did not originate with the most recent ITIL version. The concepts, processes and roles required for Knowledge Management have been well documented for years, and yet very few organizations have implemented this practice for internal knowledge. Instead, we typically turn to external sources of information that can be purchased, where we know the provider has taken the time to do all of the things I have listed above. Of course, this can never replace the need to capture and manage internal IT experience and wisdom. The old axiom comes to mind: “Garbage In – Garbage Out.”
* Knowledge Does Not Occur In A Vacuum
One of the key elements for providing context around IT Service Management knowledge is the use of a common classification scheme for all processes related to delivering those services. For knowledge to be timely and useful, it must be easy to find based on the need that drives the search. For this reason it is imperative that knowledge uses the same classification scheme that you use for other processes such as Incident, Change, Procurement, Security, Release Management, etc. I have addressed this topic in detail in the following post: It’s Classified

One additional consideration around classification is that the KM tool should be an integrated module of your ITSM suite. Several tool providers support associating a knowledge record with a workflow record such as an Incident ticket, in much the same way you would relate a Configuration Item record from the CMDB. This capability makes it possible to rank knowledge by its use and effectiveness.
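As a rough sketch of what ranking by use and effectiveness could look like, assume each link between a knowledge record and an Incident ticket is exported as a tuple of article id, incident id and a flag saying whether the article helped resolve the Incident. The tuple layout and the function name `rank_articles` are my own assumptions for illustration, not the export format of any specific ITSM suite.

```python
from collections import defaultdict


def rank_articles(links):
    """Rank knowledge articles by effectiveness, then by use.

    links: iterable of (article_id, incident_id, helped) tuples, where
    helped is True when the article contributed to resolving the Incident.
    Returns a list of (article_id, uses, effectiveness) rows.
    """
    uses = defaultdict(int)
    helped_count = defaultdict(int)
    for article_id, _incident_id, helped in links:
        uses[article_id] += 1
        if helped:
            helped_count[article_id] += 1

    ranking = [
        (article_id, uses[article_id], helped_count[article_id] / uses[article_id])
        for article_id in uses
    ]
    # Most effective first; ties are broken by number of uses, then by id.
    ranking.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return ranking
```

An article that is linked often but rarely helps sinks to the bottom of this list, which makes it a natural candidate for the update or retire steps of the process.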

To summarize, I would suggest that knowing information and facts is not enough to support the goal of Knowledge Management. Perhaps the saying should be more correctly stated: “Wisdom is Power.”

Troy’s thoughts, what are yours?

The following is a relevant quote from “The Hitch Hiker’s Guide to the Galaxy”:

“‘...You hadn’t exactly gone out of your way to call attention to them had you? I mean like actually telling anyone or anything.’
‘But the plans were on display…’
‘On display? I eventually had to go down to the cellar to find them.’
‘That’s the display department.’
‘With a torch.’
‘Ah, well the lights had probably gone.’
‘So had the stairs.’
‘But look you found the notice didn’t you?’
‘Yes,’ said Arthur, ‘yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of The Leopard.”’” ~Douglas Adams

Posted by Troy DuMoulin on 07/23 at 12:45 PM
ITIL & Beyond

Wednesday, July 18, 2007

Service Improvement Tips – Problem Management

For several decades Pink has been conducting process assessments based on ITIL, and over that period we have seen consistent trends in which processes are more mature than others. As you would assume, the maturity of Configuration Management is universally depressing (a discussion for another day). However, one finding that might surprise you is that Problem Management has consistently come out as one of the least mature processes across hundreds of first-time assessments. While companies do focus on the Service Desk and the process of Incident Management, very few IT organizations focus on the process that is designed to remove errors and instability from the service environment. Perhaps this is due in part to an IT culture that rewards firefighting skills and quick resolutions over back-office analytics and proactive activity.

Historically we have been much more interested in first-call resolution rates than in problem avoidance or in deploying solid production assurance processes like Release Management. I have explored this topic in detail in the following blog post: Problem Management screws up our metrics!

Recently, however, I have observed a renewed interest in problem identification and Incident reduction thanks to the guidance provided in ITIL. The following tips address common pitfalls that I have observed in numerous organizations on the way to implementing this important process.

  • Root Cause Analysis (RCA) is not Problem Management:

    At an early stage of Problem Management implementation, many organizations develop a reactive process to analyze and report on major business-impacting Incidents. These Incidents are significant in that they caused a highly noticeable business impact and most probably introduced significant cost or risk to the business. At its best, this process is executed as soon as the service has been restored: a high-priority investigation produces a detailed report describing the business impact and the contributing factors (People, Process and Technology) that led to the outage. The report should not only identify cause and effect but also identify specific and concrete actions that will avoid a similar occurrence in the future, and the RCA process then continues to ensure that those actions are carried out. While this is a commendable and useful activity, it is only one element, and arguably not the most important one, of the Problem Management process. Many, if not most, organizations are satisfied with leaving their process at this level of maturity. They never dive deeper into trending and removing the repeat Incidents of lesser perceived impact, and never move on to proactive Problem Management. My strong advice is that to be truly effective you have to mature this process from focusing only on the large business-impacting issues to also looking for trends and removing repeat Incidents. This topic is further discussed in the following post: Problem Management vs Root Cause Analysis
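To make "looking for trends and removing repeat Incidents" a little more concrete, here is a minimal sketch that counts Incidents by service and category and reports the combinations that recur. The dictionary keys `service` and `category`, the function name and the default threshold of three are all assumptions chosen for illustration; a real report would use whatever common classification scheme your tool already enforces.

```python
from collections import Counter


def repeat_incident_candidates(incidents, threshold=3):
    """Return (service, category) pairs seen at least `threshold` times.

    incidents: iterable of dicts with hypothetical "service" and
    "category" keys taken from a common classification scheme.
    The result is ordered from most to least frequent.
    """
    counts = Counter((i["service"], i["category"]) for i in incidents)
    return [(key, n) for key, n in counts.most_common() if n >= threshold]
```

Each pair returned is a candidate Problem record: individually minor, but worth investigating because the same failure keeps coming back.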

  • Central Ownership with Distributed Coordination:

    The next two tips have to do with the roles of Problem Management. In and of itself, this is not a hard process to understand or implement relative to the other processes described by ITIL. However, a common issue I have seen is that even after the process has been defined and a central Problem Management function has been established, many companies still struggle to make the process effective. In my experience this is largely because a distributed role, which I will call a Problem Coordinator, must be identified and resourced in each IT domain. When you establish a central governance function and process without these distributed counterparts, you identify problems and produce interesting trending reports that show you where the pain is, but very little is actually done with this information. In essence you build an inventory of identified problems without a real focus on removing them from the environment. Remember that what we are talking about here are the lesser-impacting but numerous repeat issues identified through reporting and trending; the large business-impacting Problems are already getting the attention they need, as described in the first point. This problem is addressed by establishing a distributed Problem Coordinator role for each domain and then ensuring that the manager’s and the group’s KPIs reflect the importance of the process. (What gets measured gets done!) When this role is established, the central governance roles focus on reporting, trending, problem identification and prioritization, while the distributed role is responsible for resource allocation, investigation and root cause determination, as well as identifying permanent fixes.
    The challenge for a central function without these distributed coordinators is that it does not own the resources required to execute the process, so it is forever going around hat in hand, pleading for resources to move the process forward.

  • Leave the Problem Management Resources out of the Major Incidents:

    Another common challenge I see around Problem Management is that the people who have been resourced against this process are often also required to be the champions of the major Incident or crisis management processes. Many companies, in accordance with best practice, have a specialized approach to coordinating the activities around resolving a major Incident. It often seems logical that the same person who is responsible for the post-mortem or major Incident review should also chair or captain the major Incident restoration process. My advice is to resist this assumption and make sure that these two processes are handled and resourced separately. There are two primary reasons. First, time after time where I have seen this double duty applied, the individual spends most of their time in firefighting mode and has precious little time left over to step back and take in the big picture. Second, the skills required for facilitating a high-stress major Incident process are not necessarily the same as those required for the detailed, tenacious and analytic approach that Problem Management demands. In summary, keep the central Problem Management resources out of firefighting mode and let them do the incredibly important work of systematically identifying and removing service delivery issues related to people, process and technology faults.

Troy’s thoughts, what are yours?

“The major problem - one of the major problems, for there are several – one of the many major problems with governing people is that of whom you get to do it; or rather of who manages to get people to let them do it to them. To summarize: It is a well known fact, that those people who most want to rule people are, ipso facto, those least suited to do it.  To summarize the summary:  Anyone who is capable of getting themselves made President should on no account be allowed to do the job. To summarize the summary of the summary: people are a problem.” ~Douglas Adams


Posted by Troy DuMoulin on 07/18 at 11:15 AM
ITIL & Beyond

Tuesday, July 10, 2007

Service Improvement Tips – Incident Management

Over the month of June a group of us at Pink have been involved in an ITIL v3 Pink Perspective road show. The final session of the two-day event was a Question and Answer period where the attendees posed questions to a panel of senior Pinkers. At our last event in Seattle we were given the challenge of coming up with two or three tips for whatever process the attendees chose to call out. This turned into an interesting time of Q&A that I would like to continue in a series of blog posts.

For the first process I will start with Incident Management; however, feel free to shoot me a comment about which processes you would like to see covered. In my opinion, the following points represent a targeted set of improvement activities that will drastically improve the effectiveness, efficiency and benefits of this process.

  • Establish End to End Ownership of your Incident tickets:

    If you do nothing else, it is critical to establish and maintain ownership and monitoring of the Incident ticket with the initiating body. In the case of user- or customer-based tickets this is the Service Desk. However, Incidents that do not have a customer impact, such as a failed batch job that is restarted or a device failure that is covered by a planned failover strategy, can be opened and owned by an IT function other than the Service Desk. The undesired alternative is for ticket ownership to be transferred to whatever group is currently working on the Incident. The challenge with transferring ownership is that the view of the user experience is lost: every time the ticket is assigned to a new support group, the proverbial clock starts from the beginning. As we have all experienced, this can turn into a game of hot potato, and the Incident ticket can bounce several times before the right group is even working on the issue. Establishing end-to-end ownership of the Incident ticket addresses the IT black hole syndrome. By this I mean the situation where the Service Desk does a decent job of logging the ticket but then transfers it, along with its ownership, to a queue monitored by a support group, where it is lost from sight until a user or customer yells loudly enough for it to be found and prioritized. To be fair, this scenario usually applies only to Incidents that are not high profile and do not have a large business impact. High-profile Incidents typically get the attention they need through to resolution and closure. However, Incidents that are of lower impact and relate to a user who is not a VIP get lost on a frequent basis, and these of course represent most of the Incidents currently open in your workflow tool. Establishing end-to-end ownership means that the Service Desk retains the ownership and the responsibility for monitoring the life of the Incident regardless of which internal or external IT groups are currently working the issue.
    This allows the Service Desk to raise the appropriate flags when escalation timeframes are breached, in support of the established agreements.
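The difference between transferring ownership and merely reassigning work can be sketched in a few lines. The class below is a hypothetical model, not the data model of any real workflow tool: the owner and the opening time are fixed when the ticket is created, reassignment changes only the working group, and the clock is always measured from the moment the user first reported the issue.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class IncidentTicket:
    ticket_id: str
    owner: str            # set once at creation, untouched by reassignment
    opened_at: datetime   # the user's clock starts here and never resets
    assignee: str         # the group currently working the issue
    assignment_history: list = field(default_factory=list)

    def __post_init__(self):
        self.assignment_history.append((self.opened_at, self.assignee))

    def reassign(self, group, when):
        """Hand the work to another group without transferring ownership."""
        self.assignee = group
        self.assignment_history.append((when, group))

    def elapsed(self, now):
        """Time the user has been waiting, regardless of how often it bounced."""
        return now - self.opened_at

    def breached(self, now, target):
        """True when the agreed escalation timeframe has been exceeded."""
        return self.elapsed(now) > target
```

Because `elapsed` ignores the assignment history, a ticket that has bounced between three groups still shows the full wait from the user's point of view, and the owner can raise the flag as soon as `breached` returns `True`.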

  • Focus Process Activity on Service Restoration Not Technology Repair:

    Contrary to popular belief, the objective of Incident Management is not to fix things that are broken. That is the role and objective of Problem Management. Incident Management as defined by ITIL is all about Service Restoration. When the support group that needs to look at the Incident ticket is finally located and assigned the issue, it typically focuses on fixing the technology as opposed to restoring the user’s service experience. It is very possible that the service can be restored through alternative means while the failed technology remains out of commission. The effectiveness of the Incident process can be greatly enhanced if you succeed in embedding this goal into your IT support culture.

In short, first restore the service and then fix the technology.

Troy’s thoughts, what are yours?

“They rented a car in Los Angeles from one of the places that rents out cars that other people have thrown away. ‘Getting it to go round corners is a bit of a problem,’ said the guy behind the sunglasses as he handed them the keys, ‘sometimes it’s simpler just to get out and find a car that’s going in that direction.’” ~ Douglas Adams


Posted by Troy DuMoulin on 07/10 at 11:00 PM
ITIL & Beyond