Problem Management

1. Roles & Responsibilities

IT Team

  • Problem Manager
    • Owns the Problem Management process.
    • Ensures root cause analysis (RCA) is completed and permanent fixes are implemented.
  • IT Admin / Network Engineer
    • Investigates technical issues (network outages, system failures).
    • Implements corrective actions.
  • Helpdesk
    • Escalates incidents that indicate underlying problems.
    • Maintains problem records in ITSM tools.

Facilities Team

  • Facility Problem Coordinator
    • Handles recurring physical issues (HVAC failures, power outages).
    • Coordinates with vendors for permanent resolution.
  • Safety Officer
    • Ensures compliance and safety during problem resolution.
  • Vendor Managers
    • Validate vendor fixes and preventive measures.

2. How to Manage Problems (Step-by-Step)

A. Problem Identification

  • Analyze incident trends from ITSM or ticketing systems.
  • Identify recurring issues (e.g., frequent ISP downtime, HVAC breakdowns).
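
The trend analysis above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the incident list, the 30-day window, and the threshold of three incidents are all assumptions chosen for the example, and a real export from an ITSM tool will have a different shape.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident export: (date opened, category).
incidents = [
    (date(2024, 3, 1), "ISP downtime"),
    (date(2024, 3, 4), "HVAC breakdown"),
    (date(2024, 3, 9), "ISP downtime"),
    (date(2024, 3, 15), "ISP downtime"),
    (date(2024, 3, 20), "Printer jam"),
    (date(2024, 1, 2), "HVAC breakdown"),
]

def recurring_categories(incidents, as_of, window_days=30, threshold=3):
    """Return categories with at least `threshold` incidents in the window ending at `as_of`."""
    start = as_of - timedelta(days=window_days)
    counts = Counter(cat for opened, cat in incidents if start <= opened <= as_of)
    return sorted(cat for cat, n in counts.items() if n >= threshold)

# Categories that recur often enough to be considered problem candidates.
candidates = recurring_categories(incidents, as_of=date(2024, 3, 31))
print(candidates)
```

Categories returned here are only candidates; whether each one warrants a Problem Record is still a judgment call for the Problem Manager.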

B. Problem Logging

  • Create a Problem Record in the ITSM tool (ServiceNow, Jira, etc.).
  • Include details: symptoms, impact, suspected root cause.
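
For teams tracking problems outside a full ITSM tool, the record might look something like the sketch below. The field names, the `PRB-0001` ID format, and the `missing_details` helper are hypothetical and do not correspond to the ServiceNow or Jira schema; they only mirror the details listed above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class ProblemRecord:
    # All field names are illustrative assumptions, not an ITSM tool schema.
    problem_id: str
    title: str
    symptoms: List[str]
    impact: str
    suspected_root_cause: str = "Unknown"
    opened_on: date = field(default_factory=date.today)
    status: str = "Open"
    related_incident_ids: List[str] = field(default_factory=list)

    def missing_details(self):
        """List the required details that are still empty."""
        missing = []
        if not self.symptoms:
            missing.append("symptoms")
        if not self.impact.strip():
            missing.append("impact")
        if not self.suspected_root_cause.strip():
            missing.append("suspected root cause")
        return missing

record = ProblemRecord(
    problem_id="PRB-0001",
    title="Frequent ISP downtime at main office",
    symptoms=["Internet drops for 5-10 minutes", "VPN sessions disconnect"],
    impact="",
    suspected_root_cause="Faulty ISP edge router",
)
print(record.missing_details())
```

A completeness check like this can be run before a record is accepted, so that no problem is logged without its impact or symptoms.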

C. Root Cause Analysis (RCA)

  • Use techniques such as:
    • 5 Whys
    • Fishbone Diagram
    • Fault Tree Analysis
  • Document findings.
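
As one illustration of Fault Tree Analysis, the sketch below models a top-level failure as a tree of AND/OR gates over basic events and checks whether a given set of observed events is enough to cause it. The tree itself (switch, power, and ISP events) is a made-up example, and the tuple representation is just one simple way to encode it.

```python
# A gate is ("AND" | "OR", [children]); a leaf is a basic-event name (string).
# This tree is a hypothetical example for the top event "network outage".
fault_tree = ("OR", [
    "ISP link failure",
    ("AND", ["Primary switch failure", "Backup switch failure"]),
    ("AND", ["Power outage", "UPS battery depleted"]),
])

def occurs(node, observed):
    """Return True if the event at `node` occurs given the set of observed basic events."""
    if isinstance(node, str):
        return node in observed
    gate, children = node
    results = [occurs(child, observed) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"Unknown gate type: {gate}")

# A power outage alone is not enough; the UPS must also have failed.
print(occurs(fault_tree, {"Power outage"}))
print(occurs(fault_tree, {"Power outage", "UPS battery depleted"}))
```

Encoding the tree this way makes the redundancy assumptions explicit: any branch under an AND gate needs every child to fail before it contributes to the top event.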

D. Workarounds

  • Provide temporary solutions to reduce impact until a permanent fix is implemented.

E. Permanent Fix

  • Implement changes (patches, hardware replacement, vendor upgrades).
  • Validate the fix in production.

F. Problem Closure

  • Update records and communicate the resolution.
  • Conduct a Post-Implementation Review.

3. Best Practices

  • Maintain a Known Error Database (KEDB) for quick reference.
  • Link Incidents → Problems → Changes for traceability.
  • Run regular trend analysis to spot emerging problems.
  • Vendor SLAs should include problem resolution timelines.
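
The Incidents → Problems → Changes linkage can be pictured as two lookups. The sketch below is only an illustration with hypothetical IDs and plain dictionaries; in practice these links would live in the ITSM tool itself.

```python
# Hypothetical IDs and links, for illustration only.
incident_to_problem = {
    "INC-101": "PRB-0001",
    "INC-107": "PRB-0001",
    "INC-115": "PRB-0002",
}
problem_to_changes = {
    "PRB-0001": ["CHG-020"],
    "PRB-0002": [],
}

def trace_incident(incident_id):
    """Follow an incident to its problem and to the changes raised for that problem."""
    problem_id = incident_to_problem.get(incident_id)
    changes = problem_to_changes.get(problem_id, []) if problem_id else []
    return {"incident": incident_id, "problem": problem_id, "changes": changes}

def problems_without_change():
    """Return problems that have no change linked yet (no permanent fix in progress)."""
    return sorted(p for p, changes in problem_to_changes.items() if not changes)

print(trace_incident("INC-101"))
print(problems_without_change())
```

Listing problems with no linked change is a quick way to find items that have a workaround but no permanent fix planned.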

4. Tools & Templates

  • Problem Register (Excel or ITSM tool).
  • RCA Template.
  • KEDB for known issues and workarounds.
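
A KEDB kept in a spreadsheet or a simple list can still support quick lookups. The sketch below assumes a hypothetical entry layout (`error_id`, `symptoms`, `workaround`) and uses naive keyword overlap for matching; the entries and workarounds are invented for the example, and a real ITSM tool would provide its own search.

```python
# Hypothetical KEDB entries; the symptoms field holds lowercase keywords.
kedb = [
    {
        "error_id": "KE-001",
        "symptoms": "internet drops vpn disconnects",
        "workaround": "Fail over to the backup ISP link.",
    },
    {
        "error_id": "KE-002",
        "symptoms": "server room temperature high hvac alarm",
        "workaround": "Start portable cooling units and notify the HVAC vendor.",
    },
]

def find_known_errors(description):
    """Return KEDB entries sharing at least one keyword with the description, best match first."""
    words = set(description.lower().split())
    matches = []
    for entry in kedb:
        overlap = words & set(entry["symptoms"].split())
        if overlap:
            matches.append((len(overlap), entry))
    # Sort on the overlap count only, so entries themselves are never compared.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in matches]

for entry in find_known_errors("VPN disconnects after internet drops"):
    print(entry["error_id"], "->", entry["workaround"])
```

Helpdesk staff could run a lookup like this when a new incident arrives, applying the documented workaround before escalating.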