
Purple screen recordings and 3 other Dell capabilities



The newer the server, the longer it takes to boot. This is especially annoying during diagnostics or firmware updates, so one day I dug through Dell's toolbox and found a few useful tools that save nerves.


Automatic migration and self-updating firmware


Dell has a free toolkit, OpenManage Essentials, for monitoring equipment and updating firmware. For an infrastructure built on physical servers, it is a good way to handle a number of administrative tasks:



If you run a virtualized environment, the Dell Management Plug-in for VMware vCenter comes in handy.




The plug-in lets vCenter learn about hardware failures promptly and also update the firmware, drivers and BIOS on ESXi hosts. Most useful of all, vCenter can use it to migrate virtual machines off a host before the firmware update starts.


Migration during a server update is available both for current Dell PowerEdge systems of the 12th and 13th generations and for repaired 11th-generation servers with an iDRAC controller.


What is iDRAC

iDRAC (Integrated Dell Remote Access Controller) is a proprietary version of IPMI (Intelligent Platform Management Interface). In essence it is the same "remote access to the physical console and BIOS", but with additional tools from Dell. HP's counterpart is iLO (Integrated Lights-Out); IBM's is RSA (Remote Supervisor Adapter).


The hardware update scenario on virtualization hosts now looks like this:


  1. Right in the vCenter console I see notices that an update is needed, and the Firmware Wizard can be launched from there as well. The wizard can pull updates from both the Dell online repository and local sources;


  2. Then I select the updates and set a convenient time. Be sure to tick BIOS, iDRAC and Lifecycle Controller to avoid problems during installation;


  3. Updating the cluster takes 30 to 60 minutes. If everything goes well, the virtual machines return to their host. The process can be followed through email notifications or on the Job Queue page.



For virtual machines to move to another host automatically, the vSphere cluster must have DRS (Distributed Resource Scheduler) configured in fully automated mode.
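The order of operations can be illustrated with a toy model. The sketch below is not the plug-in's code and does not talk to vCenter: the function name `rolling_update`, the dictionary layout and the "least-loaded host" placement rule are all assumptions made for illustration. It only shows why the DRS mode matters and how hosts are handled one at a time.

```python
from typing import Dict, List


def rolling_update(cluster: Dict[str, List[str]], drs_mode: str) -> List[str]:
    """Simulate a one-host-at-a-time firmware update on a toy cluster model.

    `cluster` maps a host name to the VMs running on it (hypothetical layout).
    """
    if drs_mode != "fullyAutomated":
        # Without fully automated DRS nobody moves the VMs for us.
        raise RuntimeError("DRS must be fully automated to evacuate hosts")
    if len(cluster) < 2:
        raise RuntimeError("need at least one other host to receive the VMs")

    log: List[str] = []
    for host in sorted(cluster):
        moved_to: Dict[str, str] = {}
        for vm in list(cluster[host]):
            # Crude stand-in for DRS placement: pick the least-loaded other host.
            target = min(
                (h for h in cluster if h != host),
                key=lambda h: (len(cluster[h]), h),
            )
            cluster[host].remove(vm)
            cluster[target].append(vm)
            moved_to[vm] = target
            log.append(f"migrate {vm}: {host} -> {target}")

        log.append(f"update firmware on {host}")

        # After a successful update the VMs go back where they came from.
        for vm, target in moved_to.items():
            cluster[target].remove(vm)
            cluster[host].append(vm)
            log.append(f"return {vm}: {target} -> {host}")
    return log
```

Calling `rolling_update({"esx1": ["vm-a", "vm-b"], "esx2": ["vm-c"]}, "fullyAutomated")` returns the sequence of migrations and updates, and the cluster ends up in its original layout.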


Without the plug-in, all of these operations had to be done by hand: migrating the machines beforehand, creating a bootable flash drive for the update, flashing each component separately, and so on. And all of it while sitting next to the server, which is neither pleasant nor comfortable.


Of course, you could try to build something similar on top of Zabbix and a set of scripts. But why not enjoy a convenience that costs the company a modest $495 for a plug-in license covering 5 hosts?


There were some concerns along the lines of "what if everything goes wrong and the servers do not work at all the next day?" For most such problems there is a safety net:


  1. I get an alert telling me whether the virtual machines migrated successfully. If the migration fails, the update does not start, and I can calmly sort things out the next day;


  2. Once the machines have moved to another host, our "patient" can even burn down and users will not notice. If the update fails, the host will not try to boot over and over again; it will simply wait for the firmware to be rolled back. Provided, of course, that you remembered to check Enable Alarms for Dell Hosts when configuring OpenManage Integration;


  3. If the firmware update did not go well, the restart logs and console screenshots will be ready.

Of course, a cluster is not a panacea, and you cannot insure yourself against everything.
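The safety net described above boils down to a small decision table. The sketch below restates it in code. The names `NextStep` and `next_step` are hypothetical and exist only for this illustration; they are not part of any Dell or VMware tool.

```python
from enum import Enum


class NextStep(Enum):
    SKIP_UPDATE = "update not started; investigate the next day"
    WAIT_FOR_ROLLBACK = "host waits for a firmware rollback; logs and screenshots are ready"
    DONE = "virtual machines return to the host"


def next_step(migration_ok: bool, update_ok: bool) -> NextStep:
    """Map the outcome of an update window to what happens next."""
    if not migration_ok:
        # The VMs never left the host, so the update must not start.
        return NextStep.SKIP_UPDATE
    if not update_ok:
        # The VMs already run elsewhere, so users are unaffected.
        return NextStep.WAIT_FOR_ROLLBACK
    return NextStep.DONE
```

The point of the ordering is that a failed migration short-circuits everything else: the update outcome is only relevant once the host has been emptied.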


Speaking of screenshots: they are the next useful feature.


Screenshots of blue and purple screens


You know how it goes: you add a driver to the host, reboot ESXi, and quietly disconnect from the remote console while waiting for the system to boot. And the host stays unreachable, and stays unreachable. You connect again and see a purple screen. Clearly, the first step is to roll back the module you added. But what if the problem lies elsewhere?


In such cases the first thing I do is reboot the host and watch the boot process, its stages and the messages that appear. On servers with the iDRAC 7 Enterprise module, though, a log of messages and console screens is kept while the server boots, hangs, or runs into errors. All the diagnostic information is therefore already at hand and does not require repeated reboots, each of which can take more than 5 minutes on modern systems.



You can view the recordings of the last three boots. The option requires a $75 iDRAC Enterprise license.
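Keeping only the last three boots behaves like a fixed-size ring buffer: each new recording pushes out the oldest one. The sketch below shows that retention rule only. It is an assumption about the visible behaviour, not iDRAC's actual storage format, and the class name `BootCaptureLog` is made up for the example.

```python
from collections import deque
from typing import List


class BootCaptureLog:
    """Toy model of a log that retains only the most recent boot recordings."""

    def __init__(self, keep: int = 3) -> None:
        # deque with maxlen silently drops the oldest entry when full.
        self._captures: deque = deque(maxlen=keep)

    def record(self, label: str) -> None:
        self._captures.append(label)

    def available(self) -> List[str]:
        # Oldest first, newest last.
        return list(self._captures)
```

In practice this means that, after several reboots in a row, the recording of the original failure may already be gone, so it is worth looking at the captures before restarting the host yet again.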


More iDRAC magic


The Dell hardware management module can cancel a failed firmware installation and restore any previous version. This is useful, for example, when a bug is found in a new version; such cases were discussed in the comments to the article about service contracts. You can simply put things back "as they were" and leave them alone. The rollback is available for several components at once within a single reboot, which saves a lot of time (remember how slowly modern hardware with tens of gigabytes of memory boots).


A couple of years ago, a BIOS flash on one of our servers went wrong and, after the reboot, we were greeted by an error message instead of a booting system.


All I did after the initial panic was open Lifecycle Controller from my workstation and choose Launch Firmware Rollback for the BIOS, which did its job successfully.




The firmware recovery wizard supports the following devices:



Additional useful features of iDRAC include:



Mobile reboot and monitoring


In our age of fashionable mobile technology, even a car can be unlocked or started from a phone. Dell has something similar: OpenManage Mobile, an application available for Android and iOS.




In fact, this is a stripped-down client of the OpenManage Essentials and iDRAC consoles, which can be installed on a smartphone or tablet and get the following features:



I have an old bad habit: after working on hardware or software at one of our sites, I head to the office and, on the way, check the latest state of the "patient" through email alerts. OpenManage Mobile took this habit to a new level and lets me watch the status in real time. Of course, I would not risk restarting a server remotely from outside the office, but viewing statuses and logs often comes in handy.




The application makes more sense on a tablet: there you can set up convenient VNC access and get a kind of server control panel. Naturally, the network with the iDRAC interfaces is reached through a VPN.


Down to earth


If you add up the cost of all these extras for a single server, it comes to $174:
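One way to arrive at that figure from the prices quoted earlier is to split the plug-in license across its 5 hosts and add the iDRAC Enterprise license. This breakdown is an assumption inferred from those two prices, sketched here as simple arithmetic:

```python
# Prices as quoted in the article, in US dollars.
PLUGIN_LICENSE_USD = 495      # vCenter plug-in license
HOSTS_PER_LICENSE = 5         # hosts covered by one plug-in license
IDRAC_ENTERPRISE_USD = 75     # iDRAC Enterprise license, per server


def per_server_cost(plugin: float, hosts: int, idrac: float) -> float:
    """Plug-in cost shared across its hosts plus the per-server iDRAC license."""
    return plugin / hosts + idrac


total = per_server_cost(PLUGIN_LICENSE_USD, HOSTS_PER_LICENSE, IDRAC_ENTERPRISE_USD)
print(f"${total:.0f} per server")
```

The plug-in share works out to $99 per host, and adding the $75 license gives the $174 total.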



Even without buying additional licenses, you get detailed monitoring of all PowerEdge components out of the box, along with OpenManage Essentials, a free monitoring system for the network and third-party equipment.


And which management tools brighten up your administrative routine?


Finally, some useful links:




Source: https://habr.com/ru/post/311314/

