DevOps is coming to large IT organizations, whether they are ready for it or not. This brings many problems, and I would like to talk about one of them. Perhaps this is an overly bold statement, but I believe that the current organizational structure of most IT support services is fundamentally wrong.
In such a structure, successfully implementing DevOps practices may be almost impossible.
As an alternative, I would like to propose a newer methodology called Swarming, which is ready for adoption in large businesses and is well suited to technical support in the DevOps era.
We should start with a short review of the existing structures that underlie the vast majority of technical support services in medium and large organizations.
Classical technical support, built according to the principles of IT service management, has a three-level hierarchy.
Level 1. The Service Desk, which is in direct contact with users, usually by telephone. It provides general technical support on non-specialized issues. Its main task is to maintain a stable quality of service while resolving the majority of incoming requests here, at the first level.
Level 2. Usually closely related to the first, but staffed by employees with deeper general or specialized knowledge and skills. Second-level specialists may, for example, receive additional training to support common operating systems (such as Microsoft Windows) or hardware, gaining the skills needed to solve more complex problems.
To better understand the three-tier structure, it helps to look at the business reasons that created it and keep it in place. This model is applied almost everywhere, and several advantages encourage its use.
Customers are provided with a single channel of communication with technical support regardless of the nature of the problem.
It is easy to find specialists in the labor market with the technical skills needed for the first two levels of support. This also makes it easier to outsource the work, which is done quite often.
A customer's support request can end at the first level, almost before it has begun. In fact, in many organizations some requests are handled by fully automated services, which are often called “level zero”.
However, there are many problems that cannot be solved at the first level. The process of transferring them to levels 2 and 3 is called escalation:
Second-level specialists usually handle fewer tickets than their first-level counterparts, but these are more complex tasks that, on average, take longer to solve.
Tickets that reach the third level (escalated from the second or sent directly from the first) usually make up a small share of all incoming requests. But these are the most complex tasks, which demand highly skilled specialists and a considerable amount of time.
Many attempts have been made to calculate the relative cost of solving a problem at each level of support. One 2014 study, for example, estimated the average cost of closing a ticket at $22 at the first level, $62 at the second, and $85 at the third (according to other studies, the last figure is several times higher).
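To see how such per-level figures combine, the following minimal sketch computes a blended cost per ticket. The per-level costs are the ones quoted above; the share of tickets closed at each level is an invented assumption for illustration only, not data from any study. For simplicity it also assumes that each figure is the total cost of a ticket closed at that level.

```python
# Per-level cost of closing a ticket, in USD (figures quoted in the text).
COST_PER_LEVEL = {1: 22, 2: 62, 3: 85}

# Hypothetical share of tickets closed at each level (assumption, not study data).
ASSUMED_SHARE = {1: 0.70, 2: 0.20, 3: 0.10}


def blended_cost(costs, shares):
    """Return the average cost per ticket, weighted by where tickets are closed."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(costs[level] * shares[level] for level in costs)


average = blended_cost(COST_PER_LEVEL, ASSUMED_SHARE)
print(f"Blended cost per ticket: ${average:.2f}")
```

Changing the assumed shares shows how sensitive the average is to the fraction of tickets that get escalated.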
Criticizing such a widely accepted structure is not easy. However, the Swarming movement aims to do precisely that, starting from the significant but correctable shortcomings of the multi-level model. Let's look at some of the issues that affect DevOps.
Multi-level support involves multiple queues. Since the first level seeks to solve problems as quickly as possible, everything that cannot be fixed right away is put in a queue. The problem's status effectively changes from active to deferred. Essentially, levels 2 and 3 are stockpiles of work in progress, which is considered a problem in the Lean philosophy that underlies DevOps. Successfully implementing DevOps as part of Lean requires decisive steps to reduce work in progress. This problem alone is a significant obstacle to bringing DevOps into technical support.
“Collaborative communities can overcome professional and organizational barriers by encouraging cooperation, learning, and progress.”
(Don Tapscott and Anthony D. Williams, in Wikinomics)
The concept of Swarming was proposed in the late 2000s as a new framework for organizing technical support. It explicitly rejects the conservative multi-level structure in favor of a networked model of collaboration:
Source: Consortium for Service Innovation
The key company that first introduced this system was Cisco. In 2008, in a document called Digital Swarming, it presented a “Distributed Cooperation and Decision Making Model”. The concept was subsequently adopted by the Consortium for Service Innovation and developed into Intelligent Swarming. Some of its principles are:
Support groups should not be divided into levels.
There should be no escalation from one group to another.
The task should be transferred directly to the employee who is likely to be able to solve it.
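The last principle, sending a task directly to the person most likely to solve it, can be illustrated with a simple skill match. This is only a sketch: the `Engineer` type, the skill tags, and the overlap-based scoring are assumptions introduced here, not part of Intelligent Swarming itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engineer:
    name: str
    skills: frozenset  # hypothetical skill tags, e.g. {"windows", "network"}


def best_match(task_tags, engineers):
    """Pick the engineer whose skills overlap most with the task's tags.

    Returns None when nobody shares a single tag with the task.
    """
    scored = [(len(e.skills & task_tags), e) for e in engineers]
    score, engineer = max(scored, key=lambda pair: pair[0], default=(0, None))
    return engineer if score > 0 else None


team = [
    Engineer("Ana", frozenset({"windows", "printing"})),
    Engineer("Raj", frozenset({"network", "vpn", "linux"})),
]
chosen = best_match(frozenset({"vpn", "linux"}), team)
print(chosen.name if chosen else "no match")
```

A real implementation would likely also weigh availability and past resolutions; the point here is only that the task goes to a person, not to a tier.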
Swarming does not have a single well-defined structure. This is partly due to its novelty and, accordingly, its low adoption. However, the following example (based on the swarming methods used for customer support at BMC) is typical. It significantly improved the support service (as described at the UK's Servicedesk and IT Support Show in 2015).
Swarming begins when a problem appears that cannot be solved immediately, at the moment the user's call is received. A quick primary sorting (triage) of the task ends with sending it to one of two groups (Swarms):
Primary sorting in the Swarm structure
Each group (Swarm) is a small team that handles incoming requests in near real time:
“Severity 1” Swarm (Severity 1 incident team)
This group aims to solve the most serious problems. Its members coordinate the response to difficult situations, bring in the right people, and try to resolve critical problems as quickly as possible. This process is not much different from the major incident procedure used in the traditional multi-level model. However, another group operates in parallel and handles a much larger number of requests:
Dispatch Swarm (dispatch group)
This type of group appeared as an answer to the main problem of multi-level support: many requests that could be resolved very quickly once they reach the right specialist get lost in lists of unfinished tasks. As a result, a five-minute question can take days to resolve.
Members of this group are explicitly encouraged to “cherry-pick”, ignoring problems that cannot be fixed immediately. This can greatly reduce the time spent on a significant number of task types.
There is an additional benefit. Including less experienced employees in the group lets them gain knowledge that, in a multi-level model, they would access only after moving to a highly specialized team. At the same time, highly qualified third-level specialists are brought closer to the customer.
Dispatch Swarming leads to a quick resolution of a significant share of tasks (at BMC, about 30%), and the remaining requests go to the queues of more conventional support teams dedicated to individual product lines. Here, many tasks will be familiar and understandable to regular team members, so solving them should not be difficult. At the same time, another share of the requests (perhaps about 30%) may deserve the attention of the best support specialists, regardless of where they sit in the organization.
This is where the third type of group comes in: the Backlog Swarm.
Backlog Swarm (backlog group)
To solve the most difficult problems, the Backlog Swarm brings together experienced and qualified engineers regardless of geographic or organizational boundaries. They receive tasks from product-line support specialists, who are now forbidden to approach individual experts directly. Instead, they must submit tasks to the appropriate Backlog Swarm.
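The flow between the three swarms and the product-line queues can be summarized in a short sketch. The type names, the severity scale, and the `quick_fix_possible` and `needs_experts` flags are hypothetical simplifications introduced for illustration; in practice these decisions are made by people, not by boolean fields.

```python
from dataclasses import dataclass
from enum import Enum


class Swarm(Enum):
    SEVERITY_1 = "Severity 1 Swarm"
    DISPATCH = "Dispatch Swarm"
    PRODUCT_LINE = "product-line support queue"
    BACKLOG = "Backlog Swarm"


@dataclass
class Ticket:
    summary: str
    severity: int             # 1 = most serious (assumed scale)
    quick_fix_possible: bool  # hypothetical flag: can be fixed on the spot
    needs_experts: bool       # hypothetical flag: product-line team is stuck


def initial_triage(ticket):
    """First sorting step: critical incidents go to the Severity 1 Swarm."""
    return Swarm.SEVERITY_1 if ticket.severity == 1 else Swarm.DISPATCH


def route(ticket):
    """Follow a ticket to the group that ends up handling it."""
    if initial_triage(ticket) is Swarm.SEVERITY_1:
        return Swarm.SEVERITY_1
    if ticket.quick_fix_possible:
        return Swarm.DISPATCH      # cherry-picked and solved right away
    if ticket.needs_experts:
        return Swarm.BACKLOG       # submitted by the product-line team
    return Swarm.PRODUCT_LINE


print(route(Ticket("Password reset loop", 3, True, False)).value)
```

Note that nothing in this flow is an escalation between tiers: each step hands the ticket to the group best placed to finish it.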
When classical tech support works alongside DevOps, the problems of the multi-level model are only exacerbated. Unsolved tasks (work-in-progress backlogs) accumulate, which in turn limits autonomy and flexibility. Such a system essentially isolates teams from one another. These problems run contrary to the philosophy of DevOps and are the main challenge to implementing DevOps practices in organizations with a traditional business model.
We can already highlight the following negative points.
DevOps encourages software developers to take on the support of what they build (an approach often summed up as “you build it, you run it”). However, in the mature support services typical of large organizations, the multi-level structure is the main channel through which user problems reach engineers. As we have seen, barriers between the first level of support and the DevOps team can delay the resolution of problems and lead to poor initial handling of requests.
The “throw it over the fence” style of integration between ITSM-based ticketing systems and the software lifecycle tools used by DevOps teams leaves employees without situational awareness.
On the contrary, the concept of Swarming is built largely on the same principles that are at the heart of DevOps success.
Dynamic cross-functional collaboration that allows you to put together a team of specialists with different skills and areas of expertise.
Flexible teams as opposed to inert hierarchical structures.
Autonomy versus dogmatic processes (a key example here is the freedom to “cherry-pick” while working as part of a Dispatch Swarm).
Reducing the number of pending tasks.
“Big businesses do not change slowly because they are full of stupid or technology-hating people. They just have users.”
(Luke Kanies, Founder and CEO, Puppet Labs. Configuration Management Camp, Belgium, 2015)
DevOps calls into question the very essence of established, conservative ways of working. It combines the previously isolated roles of developers and operations specialists and tries to get rid of ingrained, ineffective practices. This philosophy was largely (if not completely) formed in new-generation organizations, often without the need to maintain outdated but working systems and their users.
It is important to note that this was done very successfully:
Source: 2016 State of DevOps Report
Now DevOps has reached traditional organizations, where the (often very painful) process of implementation will bring new challenges. But for such companies it is no longer a question of improvement; it is a necessary step in the struggle for survival. Change in the form of “creative destruction” is a constant and real threat to the existence of large companies. Only 12% of the companies on the 1955 Fortune 500 list were still on it in 2014.
IT organizations should try to use fresh ideas wherever possible and constantly challenge conservative practices.
The Swarming movement has launched an attack on the multi-tier support model, but progress in the IT service management of traditional companies is slow, as it is limited to a few far-sighted organizations. However, the closeness of the core elements of Swarming and DevOps is hard to deny. They therefore face similar implementation problems, and solving those problems makes both easier to adopt.
Thus, there is a need to rethink the multi-level support model. The new methodology should take advantage of DevOps, while maintaining performance and efficiency across large companies. I think Swarming may well fit this role.
Source: https://habr.com/ru/post/318826/