Problem Management

I function as our company's ITIL Problem Manager, both managing the process and leading Root Cause Analysis (RCA) efforts.

Process

This section covers the Problem Management Process I manage, along with the flow which a typical problem traverses, an example Problem Record from the tool (Redmine) we use to track Problems, what we do during review meetings, how we report monthly and quarterly (more examples) to management, and templates for reporting Problems upstream.

I view Problem Management as an IT-specific instance of Risk Management and view its theoretical underpinnings in ways consonant with the following:
RCA Methodology

My favorite methodology for managing an RCA, with a backing checklist, comes from Advance7 and is described in detail in their Rapid Problem Resolution (RPR) book and in various white papers. I facilitate a hands-on workshop in which participants split into small groups and practice a simplified version of the RPR Methodology, along with analysis skills, by working through real-world RCAs. During RCAs, I often start long-running packet captures and later extract key frames from directories full of capture files.

Incident Analysis

See "What Takes Us Down?", published in the October 2012 issue of ;login:. My analysis of this data set suggests that timely Patching and proactive Testing can convert Unplanned incidents into Planned events, although I admit that the argument isn't compelling.

Summary of the data set: I've charted statistics extracted from the database in several ways, none of which tells a persuasive story to me. Note that the database starts in October 2010 and ends in June 2012.
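
As an illustration of the capture-file step mentioned under RCA Methodology (pulling key frames out of a directory full of capture files), here is a minimal Python sketch. It is not the tooling I actually use. It assumes the captures are in the classic libpcap file format with a `.pcap` extension, and that "key frames" means frames whose timestamps fall inside a window of interest. The function names `read_frames` and `extract_window` are hypothetical, chosen for this example.

```python
import struct
from pathlib import Path

GLOBAL_HEADER_LEN = 24   # classic pcap file header
RECORD_HEADER_LEN = 16   # per-frame header: ts_sec, ts_frac, incl_len, orig_len

# Magic bytes as they appear on disk -> (struct byte order, fractional divisor)
MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1_000_000),      # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1_000_000),      # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1_000_000_000),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1_000_000_000),  # big-endian, nanoseconds
}


def read_frames(path):
    """Yield (timestamp_seconds, raw_bytes) for each frame in one pcap file."""
    with open(path, "rb") as fh:
        header = fh.read(GLOBAL_HEADER_LEN)
        if len(header) < GLOBAL_HEADER_LEN or header[:4] not in MAGICS:
            return  # truncated or not classic pcap (e.g. pcapng): skipped here
        endian, divisor = MAGICS[header[:4]]
        while True:
            rec = fh.read(RECORD_HEADER_LEN)
            if len(rec) < RECORD_HEADER_LEN:
                break  # end of file
            ts_sec, ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
            data = fh.read(incl_len)
            if len(data) < incl_len:
                break  # capture was cut off mid-frame
            yield ts_sec + ts_frac / divisor, data


def extract_window(capture_dir, start, end):
    """Collect (file_name, timestamp, raw_bytes) for frames with start <= ts <= end.

    Files are visited in sorted name order, an assumption that suits
    ring-buffer captures whose names sort chronologically.
    """
    frames = []
    for path in sorted(Path(capture_dir).glob("*.pcap")):
        for ts, data in read_frames(path):
            if start <= ts <= end:
                frames.append((path.name, ts, data))
    return frames
```

Calling `extract_window(directory, start, end)` with Unix epoch timestamps bracketing the incident returns only the frames inside that window, which is usually a small fraction of a long-running capture.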
Prepared by: Stuart Kendrick
Last modified: 2013-June-14