Survival Guide for the In-House System Administrator, Part 2

Restore order in the IT infrastructure

The previous survival guide dealt mainly with how the system administrator interacts with company management but, as readers rightly noted, the administrator's stress level also depends on the level of chaos in the IT infrastructure. That chaos is, as a rule, the company's historical legacy, created by previous short-term sysadmins who were devotees of particular (sometimes extreme) technologies. A system administrator who joins a company with such an infrastructure faces a difficult choice: either change jobs (and so become less attractive to employers), or gather all their willpower and try to bring the IT infrastructure to a “normal” state.

Six years ago, when I opened my IT outsourcing company, we could not pick and choose our customers, and the only way was forward. But it was precisely those difficult early clients that let us develop a unified approach for bringing any company's IT infrastructure from the “as is” state to the “as it should be” state, so that even now we are not afraid to take on heavy projects. Before we talk about this approach, though, let us start with one rather important term:

What is a "normally functioning IT infrastructure"?

Everyone reads their own meaning into this definition: for some it is a sysadmin who does not run around like mad, for others it is brand-name equipment and neatly laid cables. For ourselves, we consider a normally functioning IT infrastructure to be one that solves business problems and fully meets the business's requirements. That is, “normal” does not mean new brand-name equipment and a beautiful installation; it means that the IT infrastructure is not, and, just as important, will not become for at least a year, a limiting factor in the company's work. So if your IT infrastructure suffers constant downtime, but a week without the Internet (or a server, or a computer) does not affect the company's work, you have a normal IT infrastructure. And if you have a modern IT infrastructure in which everything works, but there is a single printer in the company's warehouse (which loads 50 trucks at a time), your IT infrastructure is not normal, because when (not “if”, but “when”) this printer breaks down, the company's business will stand still until it is replaced.

Accordingly, the “norm” for the IT infrastructure is defined individually for each company, and before bringing the infrastructure to a “normal state” you need to define that norm. To do this:

Survey IT users and find out their needs.

Taken together, users' subjective assessments give a remarkably accurate, objective picture of how well the IT infrastructure serves the company's business. Ask users about their overall satisfaction, and also about specific problems with individual IT systems: work computer, mail, telephone, enterprise management system, and so on. When compiling the questionnaire, be sure to offer a choice of typical problems for each system (it hangs, it takes a long time to load, mail sometimes does not arrive, etc.), because that makes it easier for people to fill out. But be sure to leave a free-text field as well: important information often ends up there.
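As a rough illustration of processing such questionnaires, the sketch below counts how often each system and problem pair is reported. The sample answers and the function names (tally_complaints, top_complaints) are invented for this example and are not part of the original method.

```python
from collections import Counter

# Hypothetical questionnaire answers: (system, reported problem).
responses = [
    ("mail", "messages sometimes do not arrive"),
    ("work computer", "takes a long time to load"),
    ("mail", "messages sometimes do not arrive"),
    ("telephone", "calls drop"),
    ("work computer", "hangs"),
    ("work computer", "takes a long time to load"),
    ("mail", "messages sometimes do not arrive"),
]


def tally_complaints(answers):
    """Count how many users reported each (system, problem) pair."""
    return Counter(answers)


def top_complaints(answers, limit=3):
    """Return the most frequently reported pairs, most common first."""
    return tally_complaints(answers).most_common(limit)


for (system, problem), count in top_complaints(responses):
    print(f"{system}: {problem} ({count} reports)")
```

A tally like this is only a starting point: free-text answers still have to be read by hand, since they rarely fit the predefined options.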

Yes, I know that you have already heard plenty from users about “all the problems” and that “you know everything perfectly well”. But first, people tell you only about their most important IT problems and leave everything minor for later; and second, mass documentary confirmation of the real state of IT will help you later when talking to management. More on talking to management later; for now, collect the completed questionnaires, process the data, work out for yourself what users are missing for “the normal functioning of IT”, and:

Document the entire IT infrastructure.

What exactly should you document, you ask? The answer is: everything. Start with an inventory of all the hardware and software in the company: what is at the workplaces, what is in storage, what software is used, which versions, and how everything is configured. Describe the structure of the information services: which servers are set up and how they work. Analyze whether operating expenses are reasonable: Internet links, cartridges, service contracts. After that, move on to the main task:

Identify problems in the work of IT infrastructure

An incident is a deviation from “normal operation”; a problem is the root cause of repeated incidents. A slow-running database is an incident; too little processor capacity for the databases to run “normally” is a problem. A printer breakdown is an incident; a printer that in “normal mode” is asked to print more documents than it can handle is a problem. A computer infected with a virus is an incident; the lack of antivirus software at the workplace is a problem. A user breaking Windows on their computer is an incident; user rights that exceed what the job requires are a problem. A computer breaking down is an incident; computers that have been in service at the company for more than 5 years are a problem, and so on.
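The distinction can be sketched in code: if the same incident keeps recurring on the same asset, that is a hint that a problem sits behind it. The incident log, the threshold of two and the function name problem_candidates are assumptions made up for this illustration; the root cause itself still has to be found by a person.

```python
from collections import defaultdict

# Hypothetical incident log: (asset, what happened).
incidents = [
    ("warehouse printer", "breakdown"),
    ("database server", "slow queries"),
    ("warehouse printer", "breakdown"),
    ("accounting PC", "virus infection"),
    ("database server", "slow queries"),
    ("database server", "slow queries"),
]


def problem_candidates(log, threshold=2):
    """Return (asset, incident) pairs seen at least `threshold` times.

    The result is ordered from the most to the least frequent pair.
    """
    counts = defaultdict(int)
    for asset, kind in log:
        counts[(asset, kind)] += 1
    repeated = {key: n for key, n in counts.items() if n >= threshold}
    return dict(sorted(repeated.items(), key=lambda item: item[1], reverse=True))


for (asset, kind), n in problem_candidates(incidents).items():
    print(f"{asset}: '{kind}' repeated {n} times, look for the underlying problem")
```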

Some problems you can solve on your own, others require investment in hardware and software, and still others require management decisions. There is no need to rush to solve the first problem you find; to begin with, do the main thing and try to collect information on all the problem areas in IT. Both the user survey you conducted earlier and your own expert assessment of the state of IT, based on the documentation you compiled, will help you here. Once you have identified all the IT problems that cause constant user complaints, as well as problems that may lead to repeated incidents in the future, do not forget to take another important step:

Perform operational risk analysis again.

Whereas identifying problems in the IT infrastructure lets you reduce the number of incidents, analyzing operational risks lets you plan your actions in advance in case those risks materialize, and so reduce the negative consequences. The maximum allowable IT downtime is also one of the parameters of the “normal functioning of the IT infrastructure”, but it is agreed not with users but directly with management. I wrote a lot about how to analyze operational risks in the previous article, so we will not dwell on it now and will go straight to the next stage:

Choose technology and IT strategic direction

Traditionally, in companies where the concept of “system administration” did not exist before you (there were system administrators, but no system administration), there is a zoo of technologies: some have already outlived their time and are due for retirement, while in others, for reasons understandable only to the previous sysadmin, the company invested (and invested seriously) just before you arrived. The absence of a unified IT development strategy over many years creates a mess of incredible scale in the IT infrastructure. In this situation, you need to choose the technologies and develop a unified strategy for developing the IT infrastructure that the company will follow for the next few years. It is also important to draw up a modernization plan that takes into account the useful life of the existing solutions that you inherited in one way or another and cannot yet abandon. When drawing up the modernization plan, do not forget one important point:

Relate IT strategy to company development plans

Perhaps management plans to move offices in six months, or to open regional offices. Perhaps they plan to close altogether the line of business whose database has now eaten all the server resources. Whatever the company's management has in mind, the IT department should always stay ahead of business changes, and your IT development strategy should reflect this.

Once you have understood (and documented) the existing IT infrastructure, the list of requirements for it, and the existing problems and operational risks, and have chosen a long-term development direction (and an action plan), proceed to drawing up the main document:

Build an annual IT budget

After these steps you know exactly what needs to be done or changed for the IT infrastructure to function “normally”, what problems exist now and what actions, solutions and purchases are needed to eliminate them, and you have a clear understanding of the appropriate direction for IT development and a plan of action. All this lets you easily calculate the funds needed to bring the IT infrastructure to a “normal” state. Most importantly, this sum will be the necessary and sufficient investment in IT for the coming year and will spare management the eternal “situational” purchases and financial surprises.
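The arithmetic can be sketched as follows: each planned purchase is tied to the problem or risk it addresses, and the annual budget is simply the sum. All items, costs (in arbitrary currency units) and names such as BudgetItem and annual_total are invented for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetItem:
    problem: str   # problem or risk the purchase addresses
    purchase: str  # what is to be bought or done
    cost: int      # hypothetical cost in arbitrary currency units


# Hypothetical budget lines derived from the identified problems and risks.
ITEMS = [
    BudgetItem("single warehouse printer", "second printer", 400),
    BudgetItem("database server short on CPU", "server upgrade", 2500),
    BudgetItem("no antivirus on workstations", "antivirus licences", 600),
    BudgetItem("long recovery after a failure", "spare server", 2000),
    BudgetItem("long recovery after a failure", "backup storage", 500),
]


def annual_total(items):
    """Total investment needed for the year."""
    return sum(item.cost for item in items)


def cost_by_problem(items):
    """Cost of eliminating each problem, summed over its purchases."""
    totals = defaultdict(int)
    for item in items:
        totals[item.problem] += item.cost
    return dict(totals)


print("Annual IT budget:", annual_total(ITEMS))
for problem, cost in cost_by_problem(ITEMS).items():
    print(f"  {problem}: {cost}")
```

Keeping every line tied to a named problem makes it easier to explain later what each amount buys.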

In principle, once the steps above are done, forming the IT budget poses no particular difficulty; the main challenge is to make sure that management can study and understand the results of your research. But even if layout and putting your thoughts on paper are not your strong points, there is one more (and no less important) stage in bringing the IT infrastructure to a “normal” state:

Justify the IT budget to management

Computer technology (unlike accounting) is not purely an expense item; it is an investment item. Companies invest in technology in order to work more easily, quickly and conveniently. A normally functioning IT infrastructure (in our understanding) is needed not by the system administrator but by the business. When you come to management with a finished annual IT budget, you are not asking them to “spend more money on IT”; you are offering them the chance to invest in:

The speed of their employees' work,
The convenience of their staff,
Additional information exchange tools,
Technological continuity of business processes,
Reduced operating costs.

Accordingly, your arguments for each item look like this: “If you invest one employee's annual salary in equipment, the remaining 30 will work with IT systems 3-4 times faster”; “and if we acquire an additional server, then after a failure the systems will be restored not in several days but in 3-4 hours”; “buying powerful printers will pay for itself within a year through savings on consumables, even while using original consumables.”

Once management has accepted that the IT budget you have drawn up is reasonable, only one formality remains:

Agree on a disbursement schedule and action plan.

As a rule, if you are only just introducing the tradition of IT budgeting, in the first year you will not receive all the money at once (but in subsequent years the budget will be eagerly awaited, because everyone will wonder what else can be improved in IT to make the company work more efficiently). For this reason, align your funding schedule and your action plan with your company's real capabilities. If the allocated funds are limited, do not forget to also agree with management (again, in writing) on a list of problems that cannot yet be solved for lack of money. Well, after this:

Act!

This is not to say that you were slacking off until now, but now that you have a clear plan for the allocation of funds, you can, without waiting for the first purchases, start preparing your IT infrastructure, a part of it, or even just the users (for example, through training) for the planned changes.

In reality, given limited funding, a year may not be enough to restore order in an IT infrastructure in which a succession of previous sysadmins bred chaos for 5 years. For this reason, the sooner you begin to act, the sooner you will finish, and the less time your IT infrastructure will spend in an “abnormal” state, in which everyone constantly wants something from you and keeps you from calmly doing your job of administering IT systems. In any case, regardless of whether you manage to carry out all the planned changes within the year or not:

A year later, repeat this cycle again.

Yes, as in the previous article, everything comes down to constantly repeating the life cycle of the IT infrastructure. For me, these are cycles of continuous improvement (search for this phrase). Going through them each year, I take pride in my work, seeing how our clients' IT infrastructures become more and more reliable every year, sparing them unnecessary work stoppages and sparing us the rush and the endless calls.

Good luck!

Ivan Kormachev
Company "IT Department"
www.depit.ru

Source: https://habr.com/ru/post/193186/
