How to operate Problem Management
  • How to deal with major incidents or major problems
  • 1. Notification of a problem
  • 2. Requesting technical support
  • 3. Problem analysis
  • 4. Produce theory
  • 5. Produce resolution
  • 6. Results of resolution
  • 7. Problem closure

How to deal with major incidents or major problems
A major incident or problem can be classified as one which causes serious disruption to the computer service in the school. This can include:
  • a virus outbreak or threat
  • closure of Internet services
  • file server failure
  • partial or total network failure
  • building problems - for example, fire, smoke, flood or frost damage
  • software problem affecting over 30% of computers.
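
The 30% threshold in the last item is simple arithmetic. The following is a minimal sketch in Python, assuming the school knows how many computers it has and how many are affected; the function name and its parameters are illustrative and not part of any school system.

```python
def is_major_software_problem(affected: int, total: int, threshold_percent: int = 30) -> bool:
    """Return True when MORE than threshold_percent of the computers are affected.

    Hypothetical helper: the 30% default comes from the guidance above,
    everything else is an assumption made for illustration.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of computers")
    # Integer comparison avoids floating point rounding at the boundary.
    return affected * 100 > total * threshold_percent


# 13 of 40 computers is 32.5%, which is over 30%.
print(is_major_software_problem(13, 40))  # True
# 12 of 40 computers is exactly 30%, which is not over 30%.
print(is_major_software_problem(12, 40))  # False
```

The other items in the list are judgement calls rather than calculations, so only the percentage check is sketched here.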

Major incident process
  • Service desk to field all calls and reschedule planned incident responses
  • Technician to be notified of the major incident
  • Technician to identify extent of problem before taking any action
  • School to allocate an additional person to help the technician
  • Additional person to be responsible for communication between the technician and users and to provide ad hoc help to enable the technician to deal with the incident
  • Technician to discuss with the school leader the extent of the problem and a planned response (the school leader needs to know how to reschedule the planned work that involves the affected computers)
  • School leader to ensure that the technician has the necessary resources to deal with the problem - including time
  • Technician to decide how long to keep trying to fix the problem before calling on the school's 'disaster recovery' option

1. Notification of a problem
It is either the single point of contact at the service desk or the technician who will decide if an incident is really a problem.
The user will report an incident in the usual way using the incident form.

The service desk checks the form when it is received. With experience, the service desk will recognise whether the incident is really a problem that should be passed through Problem Management. If it is not, it is passed to the technician in the usual way.

Deciding if an incident is a Problem
When the incident sheet is checked by the service desk the following may be noticed:
  • the same type of incident reported on several other computers in the last few days
  • the same type of incident reported on this computer in the last few weeks
  • faults of any type recurring on the same computer on a regular basis.

These checks can be carried out by looking through the call log or by using a find function to search for particular words or phrases - for example 'network card', 'pc24' (the computer's unique reference) or 'printer jam'.

The single point of contact (SPOC) at the service desk may record this information on the incident sheet and inform the technician when placing the call.
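
Where the call log is kept electronically, the same checks can be sketched in a few lines of Python. This is only an illustration: the log layout (date, computer reference, description), the sample entries, the look-back periods and the function name `find_calls` are all assumptions, not a description of any particular service desk system.

```python
from datetime import date, timedelta

# Hypothetical call log: (date reported, computer reference, description).
call_log = [
    (date(2024, 3, 1), "pc24", "network card not detected"),
    (date(2024, 3, 4), "pc07", "printer jam in room 3"),
    (date(2024, 3, 11), "pc24", "screen flickers"),
    (date(2024, 3, 12), "pc24", "network card fault again"),
    (date(2024, 3, 13), "pc11", "network card not detected"),
]


def find_calls(log, phrase, since):
    """Return entries reported on or after `since` whose computer
    reference or description contains `phrase` (case-insensitive)."""
    phrase = phrase.lower()
    return [
        entry for entry in log
        if entry[0] >= since
        and (phrase in entry[1].lower() or phrase in entry[2].lower())
    ]


today = date(2024, 3, 14)

# Same type of incident on several computers in the last few days.
same_type_recent = find_calls(call_log, "network card", today - timedelta(days=7))

# Any fault on the same computer in the last few weeks.
same_computer = find_calls(call_log, "pc24", today - timedelta(weeks=4))

print(len(same_type_recent))  # 2
print(len(same_computer))     # 3
```

A result with several matches does not prove there is a problem; it is a prompt for the service desk to look more closely before passing the call on.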

2. Requesting technical support
If the service desk has decided that this is a problem, then it must be passed to the technician and no further diagnostic work is required by the service desk.
The technician is informed in the usual way about the call and the service desk will advise why this is thought to be a problem.

If the school has 'swap out spares' that can be installed by a competent person in the school, the technician may advise doing this before any further work is done to the faulty equipment. The benefits would be:
  • time saved waiting for the technician to be available to attend to the call
  • reduction in time the equipment is unavailable
  • an opportunity for the technician to investigate the problem without being under time pressure.

3. Problem analysis
Problem analysis relies on common sense and on asking lots of questions, and the final theory should not be too far-fetched.
The technician should remember that phrases such as 'power line glitch', 'infrequent reset phenomenon', 'intermittent random fluctuating memory address' and other odd-sounding jargon do not impress the user. If the technician is not sure what is happening, they should say so.
Trying to sound as if they always know the answer will cost the technician credibility with the user; honesty is the best policy. As long as the technician has an answer - which may be that the equipment must be replaced or that a purchase is required - they are doing the job they are expected to do.

See problem analysis tools in Problem Management resources

4. Produce theory
From the evidence, analysis and experience, the technician produces a theory for what is happening - or has happened.
The theory of 'what went wrong' is then used to decide on action to resolve the problem. The first theory may not be correct, and when it is not the technician should try to show why. For this reason, theories that cannot be explained should be avoided.

5. Produce resolution
Pause before taking action.
Write down exactly what was done and the outcome. Do this for all actions taken - even if it is one line of a system setup file.
It should be possible to trace the step-by-step actions to find out what the technician did to resolve the problem, which helps with future problems. This process involves a steep learning curve and, although it is time consuming, making notes is very important.
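
If the notes are kept electronically rather than on paper, a traceable record can be as simple as an ordered list of actions and outcomes. The sketch below is one possible shape for such a record, assuming Python is available; the class names `ActionRecord` and `ProblemLog`, the fields and the sample entries are hypothetical and are not taken from the incident diagnostics sheet.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ActionRecord:
    """One action taken by the technician and what happened as a result."""
    action: str
    outcome: str
    when: datetime = field(default_factory=datetime.now)


@dataclass
class ProblemLog:
    """Ordered notes for a single problem (hypothetical structure)."""
    problem: str
    actions: list = field(default_factory=list)

    def record(self, action: str, outcome: str) -> None:
        # Append in the order the actions were taken so they can be traced later.
        self.actions.append(ActionRecord(action, outcome))

    def trace(self) -> list:
        # Numbered, human-readable summary of every step.
        return [
            f"{number}. {entry.action} -> {entry.outcome}"
            for number, entry in enumerate(self.actions, start=1)
        ]


log = ProblemLog("pc24 loses network connection")
log.record("Reseated network cable", "No change")
log.record("Replaced network card", "Connection restored")

for line in log.trace():
    print(line)
```

Recording the outcome of every action, including those that made no difference, is what lets someone else follow the same path on a later problem.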

6. Results of resolution
The results of the resolution may impact on many systems in the school. If a plan is to be drawn up to replicate the actions across other systems, this must be done using the process described in Change Management.

7. Problem closure
The technician updates the incident diagnostics sheet and the incident sheet and passes them to the service desk.
The service desk performs the usual call closure operations.