How we raised the IT infrastructure [from the bottom]

Hello everybody!

Exactly one year after writing the article “Experience as an enikeyschik / system administrator in a budget organization”, and 2.5 years after my deputy director wrote the article “Resuscitation of IT Infrastructure”, I would like to continue this story.

I remember coming across this phrase in one of the comments:

Therefore, I still advise both Habr users from this dump to gather their courage and get out: you won't clean up anything in this swamp even after two more years, and sitting there suffering for 12,000 is a very idiotic way to kill time.

But, oddly enough, we did manage to get something done, and I would like to tell you how far we got:

an Active Directory Domain Services domain has been created, with automatic management of accounts and organizational units (OU);
Office 365 has been implemented;
Spacewalk has been deployed (management software for *nix operating systems);
an HA MySQL server has been set up with master-master replication (Active-Passive);
the Ajenti hosting panel has been deployed;
SSL access to company web resources has been configured;
VMware vCenter has been migrated from 4.0 to 5.1U3;
ESET NOD32 Business Edition version 5 has been introduced;
network access control is being built on the Cisco ASA 5525-X NGFW firewall with CDA;
the problems with air conditioning in the server room have been resolved.

Over the past year, my new colleague, who was mentioned in the last article (astrike; he has a read-only account, so we are writing this article together), and I divided up our areas of responsibility and began to bring the IT infrastructure into decent shape. He writes automation scripts (including in PowerShell), sets up synchronization and scanning, and supports the *nix servers and everything connected with them. I handle Microsoft Windows, MS SQL, Office 365, virtualization and data networks (Ethernet, SAN).

Brief description of the company's IT infrastructure

Cisco active network equipment:

core: Catalyst 6509;
access: Catalyst 35XX;
WLAN controller: 4404 with 50 APs.

1G fiber-optic links run to the switches: at best, every switch in a rack has its own link; at worst, only half of the switches in a rack do (the remaining switches are cascaded).

Server infrastructure:

two 42U racks with 11 Sun Fire X4170 M2 servers combined into an ESXi cluster;
Hitachi AMS 2100 storage;
two APC InRow in-row air conditioners (without redundancy), whose outdoor units are cooled with running water from garden sprinklers in hot weather.

AD DS and Office 365 (Azure)

The main task at the time of implementation was to automate the life cycle of the e-mail service. For this we used Office 365, which was already partially deployed in the enterprise. Employees began to use corporate mail, and quite heavily (especially after an order introducing the corporate mail system was signed).

Together with our 1C developer, we set up automatic account creation in AD. The process works as follows:

the 1C server exports data about employees and the organization structure (OU) to a file with a fixed structure;
a PowerShell script picks up the data and generates a patch file (the difference between the current state of AD and the 1C export). A summary of this patch is sent to interested parties (actions affecting some senior positions require manual confirmation). The patch is then applied at night: new OUs are created and existing ones are updated; accounts are added, updated, activated and deactivated (per department, of course). This approach ensures that an error in 1C cannot accidentally deactivate all accounts (there is a separate check on the number of changes in a patch) or the accounts of managers and system administrators;
after a local account is created, DirSync synchronizes the local AD with cloud Azure AD (AAD); synchronization normally runs once every half hour;
a script then assigns licenses and configures account settings in AAD depending on group membership.
All that remains is for the user to receive the password for the account. To automate issuing passwords, a dedicated PowerShell script was written and a “Print new password” menu item was added to AD. Clicking it automatically prints the user's password and instructions for using the information services on that user's default printer (figure below). After printing, all files are deleted.
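
The patch step from the list above can be illustrated with a minimal sketch. The real script is written in PowerShell and is not shown here; the Python below only models the idea of diffing the 1C export against the directory and refusing suspicious patches. The names Account, build_patch and review_patch, the protected flag and the 10% threshold are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    login: str
    ou: str
    enabled: bool
    protected: bool = False  # hypothetical flag: managers and administrators


def build_patch(directory, export):
    """Return (verb, login) actions that bring `directory` in line with `export`.

    Both arguments map login -> Account.
    """
    patch = []
    for login in sorted(export):
        wanted = export[login]
        current = directory.get(login)
        if current is None:
            patch.append(("create", login))
            continue
        if current.ou != wanted.ou:
            patch.append(("move", login))
        if wanted.enabled and not current.enabled:
            patch.append(("enable", login))
    # Accounts missing from the export are deactivated, never deleted.
    for login in sorted(directory):
        if login not in export and directory[login].enabled:
            patch.append(("disable", login))
    return patch


def review_patch(patch, directory, max_disable_ratio=0.1):
    """Split a patch into actions applied automatically and ones needing approval.

    Raises ValueError when the patch would deactivate an implausibly large
    share of enabled accounts (for example, after a broken export).
    """
    disables = [action for action in patch if action[0] == "disable"]
    enabled_total = sum(1 for acc in directory.values() if acc.enabled)
    if enabled_total and len(disables) / enabled_total > max_disable_ratio:
        raise ValueError("too many deactivations in one patch; refusing to apply")
    auto, manual = [], []
    for verb, login in patch:
        acc = directory.get(login)
        if acc is not None and acc.protected:
            manual.append((verb, login))  # requires manual confirmation
        else:
            auto.append((verb, login))
    return auto, manual
```

With an empty export, review_patch raises instead of returning a patch that disables everyone, which is the kind of protection the check on the number of changes is meant to provide.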

PCs are still being joined to the domain; more than 150 PCs running Windows 8.1 are already in it. In the future we plan to deploy Windows 10 on all of the company's PCs. For PC naming, we use stickers with a templated computer identifier (for example, COMP-00045), which we stick on the PC and register in the domain.

*nix OS and Spacewalk

Because we had a zoo of *nix servers, some of them already out of support (for example, Ubuntu 10.04) and others simply spoiled, for example by installing software via make install, we decided to create a “golden image” on which new *nix servers would be based. We decided that the CentOS distribution suits our purposes best, since it has a large body of well-structured documentation, a long support period, and a fairly conservative update policy. To administer these servers, we chose Spacewalk. Spacewalk is a management system for *nix operating systems that can proxy repositories (like WSUS on Windows), manage configuration files and installed packages, and execute commands on connected servers.

As a result, a template was created in VMware vSphere with a script that joins the server to the domain (for DNS and authorization) and configures the network and users. Deploying a new system is thus reduced to the following:

creating a virtual machine from the template in VMware vSphere;
connecting to this virtual machine and running the deployment script (in bash), which assigns the IP address (static or DHCP) and other parameters (including the hostname), joins the server to the AD domain, and connects the server to Spacewalk, which immediately installs the necessary packages and deploys the configuration files.

The result is a new *nix server with domain access policies (login with a domain account), automatic errata updates, preconfigured Zabbix monitoring, and centrally managed repositories and configuration files (Spacewalk).

Because we standardized on a single distribution, server administration has become much simpler (identical commands, configuration file locations, repositories and packages). And thanks to Spacewalk's centralized OS management, configuration files are not lost (and they can be updated on all servers at once, with variable substitution), you can always see which servers need updating, and critical updates are installed automatically.
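
To make “updating configuration files on all servers at once with variable substitution” more concrete, here is a minimal Python sketch of the idea. It does not use Spacewalk or its macro syntax; the standard library's string.Template stands in for it, and the template contents, server names and the render_configs helper are hypothetical.

```python
from string import Template

# Hypothetical monitoring agent config shared by every server.
AGENT_TEMPLATE = Template(
    "Hostname=$hostname\n"
    "Server=$monitoring_server\n"
)


def render_configs(template, servers, common):
    """Render one config per server from a shared template.

    `servers` maps a server name to per-server overrides; `common` holds
    values shared by all servers. Template.substitute raises KeyError when
    a variable is missing, so an incomplete config is never produced.
    """
    rendered = {}
    for name, overrides in servers.items():
        variables = {**common, "hostname": name, **overrides}
        rendered[name] = template.substitute(variables)
    return rendered


configs = render_configs(
    AGENT_TEMPLATE,
    servers={"web01": {}, "db01": {"monitoring_server": "10.0.0.9"}},
    common={"monitoring_server": "zabbix.example.org"},
)
```

Changing the shared template or a common value and re-rendering updates every server's file in one step, while per-server overrides keep the exceptions explicit.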

Currently, work is underway to transfer applications (mainly web applications) to new servers and to deploy new services.

HA MySQL DB

Almost any application needs an uninterrupted database in order to run continuously. And since most of the company's existing applications use MySQL, a MySQL server was deployed (not MariaDB, because at the time of deployment it had some bug that our developer told me about).

Database continuity is provided by master-master replication with an Active-Passive strategy (load balancing is not yet relevant): one server is always the main one (Active), but when it goes down (or the MySQL server on it crashes), all requests go to the Passive server. When the Active server comes back up, all requests go to it again.

This failover is achieved with a virtual IP (VIP), which is provided by the keepalived daemon. Unlike a proxy server, this approach makes it possible to restrict access to schemas by client IP and does not add an extra hop, and therefore an extra delay.
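
The Active-Passive behaviour described above can be modelled in a few lines of Python. This is not keepalived configuration and does not talk to keepalived; it only illustrates the rule that the healthy node with the highest priority holds the VIP, including the return to the Active server once it recovers. The node names and the vip_owner function are hypothetical.

```python
def vip_owner(nodes, healthy):
    """Return the node that should hold the VIP, or None if all are down.

    `nodes` is ordered by priority, highest first; `healthy` is the set of
    nodes whose MySQL health check currently passes.
    """
    for node in nodes:
        if node in healthy:
            return node
    return None


NODES = ["db-active", "db-passive"]

# Health over time: both up, Active fails, Active recovers, both down.
timeline = [
    {"db-active", "db-passive"},
    {"db-passive"},
    {"db-active", "db-passive"},
    set(),
]
owners = [vip_owner(NODES, healthy) for healthy in timeline]
```

Because clients always connect to the VIP itself, they see the database server directly, which is why IP-based access restrictions keep working and no extra hop is introduced.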

Of the basic database settings, in my opinion, only LDAP authorization remains to be configured. SSL could of course be configured too, but only after a certificate authority (CA) has been deployed.

vCenter and backups

Over the past year, we migrated a VMware vSphere cluster with more than 100 running virtual machines and configured HA, DRS, SDRS and DPM. VMware Tools was installed on all guest operating systems, which allowed us to use the virtual infrastructure to its full potential.

Later, a backup system was set up using Veeam Backup and Replication, which is currently deployed in trial mode; features such as automatic backup verification in a virtual lab and VSS for Windows applications are already configured. As practice has shown, VM backups are consistent and start without problems (knock on wood).

This is what the percentage breakdown of virtual machines in our cluster looks like now. By the end of the year, we plan to completely decommission Windows Server 2003, as well as replace the old Debian and Ubuntu servers with CentOS 7.

Protection of workstations and Windows servers

To protect workstations, we use ESET antivirus. Over several years of operation it has proven itself quite well and, most importantly, it offers good centralized management of computers outside the domain. We deployed version 5 because approximately 400 computers connect to the ERA management server without installed agents (which version 6 requires), and installing ESET agents on them is much easier once they are in a domain.

How we solved the problem with server room air conditioning

We created cold and hot aisles in the server room at minimal cost: the racks were enclosed with solid polycarbonate, and doors mounted on guide rails were made for entering the aisles.

As a result, the temperature in the server room dropped to 18 °C in the cold aisle and 23 °C in the hot aisle, and the load on the air conditioners decreased by 40%.

Setting up authorization for Internet access

To restrict Internet access, we are currently deploying network authorization based on the Cisco ASA 5525-X and a CDA server, but all of this is still in progress. Of course, I would like to do this with IEEE 802.1X for all devices on the network, but how it will turn out in the end is unknown.

Summing up

Strangely enough, this year my friend and I managed to implement quite a few large and interesting projects, graduating from university along the way. We are now polishing various details and writing regulations and orders on the use of the network, the domain, and so on.

After that, what remains is to choose and implement a HelpDesk solution, as well as to deploy some products from the System Center line: Configuration Manager, Operations Manager, and possibly Service Manager. Then almost all problems will be solved, except for the lack of serious financial investment in the IT department, which means no money for upgrading servers, switches and so on, or for raising salaries, but that is a completely different story...

If you are interested, we can describe our experience of implementing any of the software products mentioned above in more detail, with specific technical details.

Source: https://habr.com/ru/post/264815/
