Ops Mngt Lesson 8 Exercise
This post discusses Lean management: it assesses each step of a service process and decides whether that step adds value from a Lean thinking perspective. I will also analyse delays, quality checkpoints and control flow.
The diagram below depicts a simple process for resolving Wintel (Windows servers running on Intel microprocessors) infrastructure incidents. Once a problem is identified by any of the identification groups, a call is made to the Incident Manager, who attends to it and opens a ticket in the ticketing system with all the information the customer has provided.
Once the ticket is open, it appears on the technician's screen. The technician checks whether the incident belongs to his group (Wintel support in this case) and, if not, returns it to the Incident Manager for reassignment to another group.
If the technician accepts the ticket, he begins working on the incident and may call the end user or the Incident Manager for additional information.
After some investigation, the technician may discover that the problem lies in an application rather than in the infrastructure, in which case he sends the ticket back to the Incident Manager.
Finally, when the incident is resolved, he closes the ticket. The Incident Manager receives the closed ticket and checks whether the solution works for the customer. If the customer is not satisfied with the solution, the Incident Manager opens another incident and the process starts again.
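To make the flow described above easier to reason about, it can be sketched as a small state machine. This is only an illustration under my own assumptions: the state and event names are hypothetical labels I chose for the steps in the text, and they are not taken from the diagram or from any real ticketing tool.

```python
from enum import Enum, auto


class State(Enum):
    REPORTED = auto()     # problem identified, call made to the Incident Manager
    OPEN = auto()         # ticket opened in the ticketing system
    IN_PROGRESS = auto()  # technician accepted the ticket
    RETURNED = auto()     # ticket sent back to the Incident Manager
    CLOSED = auto()       # technician closed the ticket
    CONFIRMED = auto()    # customer satisfied with the solution


# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical labels for the steps described in the text.
TRANSITIONS = {
    (State.REPORTED, "open_ticket"): State.OPEN,
    (State.OPEN, "accept"): State.IN_PROGRESS,
    (State.OPEN, "wrong_group"): State.RETURNED,
    (State.IN_PROGRESS, "application_problem"): State.RETURNED,
    (State.IN_PROGRESS, "resolve"): State.CLOSED,
    (State.RETURNED, "reassign"): State.OPEN,
    (State.CLOSED, "customer_satisfied"): State.CONFIRMED,
    # An unsatisfied customer leads to a new incident, so the flow restarts.
    (State.CLOSED, "customer_not_satisfied"): State.OPEN,
}


def advance(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.name}") from None


def run(events, state=State.REPORTED):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = advance(state, event)
    return state
```

For example, run(["open_ticket", "accept", "resolve", "customer_satisfied"]) ends in State.CONFIRMED, while a ticket that is returned and reassigned simply passes through State.OPEN again, which is where the loops discussed later come from.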
Reflections and Observations:
From the Lean perspective, every step in the process adds some value, and the ticketing tool synchronises the steps, but there are some delays in the process that may be creating waste (Slack et al., p. 250). The identified delays are:
- Once the ticket is open (step 1), it waits for an available technician to open it (step 2)
- If the technician determines that the ticket is not his responsibility, he sends it back to the Incident Manager, and it has to wait to be opened again
- The same happens in steps 5 and 6
- If the technician needs additional information to resolve the ticket, he requests it from other groups (step 9), which adds further delay
- Once the ticket is resolved, an Incident Manager has to be available to check with the customer whether he is satisfied with the resolution
On the other hand, there is just one quality checkpoint, at the end of the process, where the Incident Manager tests the solution with the customer. Additional control points would probably be needed to improve the performance of the process.
The throughput efficiency of the process does not look very good. First, there are many delays, and the process allows looping. For example, if a ticket is assigned to the Wintel group, the person analysing it can send it back to the Incident Manager if he thinks it is out of his scope, but the Incident Manager can then assign the ticket to the same group again, restarting the process. The process has no mechanism to prevent this, and it can happen at two different steps (3 and 6).
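As a rough illustration of both points, the sketch below computes throughput efficiency using the usual definition (work content divided by throughput time, as a percentage) and adds a simple guard against endless reassignment. The figures, the function names and the limit of two returns are all assumptions of mine, not measurements from the process.

```python
def throughput_efficiency(work_content_hours, throughput_hours):
    """Percentage of the total throughput time actually spent working on the ticket."""
    if throughput_hours <= 0:
        raise ValueError("throughput time must be positive")
    return 100.0 * work_content_hours / throughput_hours


# Hypothetical limit: how many times a ticket may be returned to the
# Incident Manager before somebody else has to look at it.
MAX_RETURNS = 2


def needs_escalation(returns_to_incident_manager, limit=MAX_RETURNS):
    """True when a ticket has bounced back often enough to warrant escalation."""
    return returns_to_incident_manager >= limit


# Hypothetical ticket: 2 hours of real work, 16 hours from opening to closure.
print(throughput_efficiency(2.0, 16.0))  # 12.5
print(needs_escalation(1), needs_escalation(2))  # False True
```

A guard like needs_escalation is one possible mechanism for steps 3 and 6: once the counter reaches the limit, the ticket goes to a third party instead of back into the same loop.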
A deeper analysis following Lean thinking should answer the following questions:
- How much time is spent managing “false positives”?
- How much time is spent managing application incidents (out of the scope of this process)?
- How much time is spent managing repetitive incidents that all relate to the same infrastructure element?
- How much time is spent on incidents incorrectly assigned to the group?
- What is the average resolution time per incident?
- What is the open queue of each group?
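Most of these questions could be answered from an export of the ticketing system. The sketch below shows one way to do it, assuming a hypothetical record layout: the Ticket fields, the category labels and the sample data are mine and do not come from any real tool.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    group: str          # support group currently holding the ticket
    category: str       # hypothetical labels, e.g. "valid", "false_positive"
    hours_spent: float  # effort spent handling the ticket
    resolution_hours: Optional[float] = None  # None while the ticket is still open


def hours_by_category(tickets):
    """Total effort per category (false positives, application incidents, ...)."""
    totals = defaultdict(float)
    for t in tickets:
        totals[t.category] += t.hours_spent
    return dict(totals)


def average_resolution_hours(tickets):
    """Mean resolution time over closed tickets, or None if none are closed."""
    closed = [t.resolution_hours for t in tickets if t.resolution_hours is not None]
    return sum(closed) / len(closed) if closed else None


def open_queue_by_group(tickets):
    """Number of still-open tickets held by each group."""
    return dict(Counter(t.group for t in tickets if t.resolution_hours is None))


# Hypothetical sample data.
tickets = [
    Ticket("wintel", "valid", 3.0, 10.0),
    Ticket("wintel", "false_positive", 0.5, 1.0),
    Ticket("wintel", "application", 1.0),
    Ticket("network", "misassigned", 0.5),
    Ticket("wintel", "valid", 2.0, 4.0),
]

print(hours_by_category(tickets))
print(average_resolution_hours(tickets))  # 5.0
print(open_queue_by_group(tickets))
```

Tracking the same figures over time, rather than as a one-off snapshot, is what would turn them into ongoing quality control points.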
An analysis of the previous items should lead to the definition and implementation of additional quality control points that will require ongoing measurement of several parameters.
The analysis could lead to the following conclusions:
- Too many wrong assignments made by the Incident Manager (step 1)
- Excessive requests for information to the customer or the Incident Manager (step 9)
- Poor problem determination or lack of definition (steps 3 and 6), leading to loops in incident resolution
Appropriate actions can then be developed for each of these.
Finally, from the control flow point of view, it looks like the ticketing system implements a pull control system (Slack et al., p. 356), with the tickets acting as the signals needed for synchronisation. This is clear because, if there is no ticket in the system, nobody is “producing solutions” for problems that do not exist. If this is the case, the organisation must be flexible enough to make use of the available resources.
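The pull behaviour can be pictured with a minimal sketch: technicians only start work when they pull a ticket from the queue, and anyone left without a ticket is idle capacity the organisation needs to redeploy. The class and function names are hypothetical, and a real ticketing tool will of course work differently.

```python
from collections import deque


class TicketQueue:
    """First-in, first-out queue of open ticket identifiers."""

    def __init__(self):
        self._tickets = deque()

    def open_ticket(self, ticket_id):
        self._tickets.append(ticket_id)

    def pull(self):
        """Return the oldest open ticket, or None when there is nothing to do."""
        return self._tickets.popleft() if self._tickets else None


def assign_work(queue, technicians):
    """Each technician pulls one ticket; returns (assignments, idle technicians)."""
    assignments, idle = {}, []
    for tech in technicians:
        ticket = queue.pull()
        if ticket is None:
            idle.append(tech)  # no ticket, so no work is started
        else:
            assignments[tech] = ticket
    return assignments, idle


queue = TicketQueue()
queue.open_ticket("INC-1")  # hypothetical ticket identifiers
queue.open_ticket("INC-2")
print(assign_work(queue, ["ana", "ben", "carl"]))
```

With two tickets and three technicians, the third technician stays idle, which is exactly the situation where flexibility in using resources matters.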
Slack, N., Chambers, S., Johnston, R. and Betts, A. (2006) Operations and Process Management. London: FT Prentice Hall.
Warwick University, study notes.