
Recovering LSI RAID Controller firmware

Good day, Habr readers!

I want to tell you how I restored the firmware of an LSI MegaRAID controller after a failed update.
When this happened to me, I found practically no information on the subject, although I admit I may not have searched very thoroughly.

Anamnesis


In my work, I have been using Supermicro servers for quite a long time, since they have a large selection of platforms, a fairly affordable price and decent reliability.
Often, especially with 1U servers, I buy them with an integrated LSI MegaRAID controller.

The problem is that Supermicro itself is not very willing to publish firmware for the embedded controllers, so I usually flash them with the latest firmware from the equivalent LSI controller. Until now there had been no problems.

Recently we received several servers with onboard LSI 2208 controllers and fairly old firmware.
Since I also actively use discrete controllers based on the same chip, I booted from a Linux USB flash drive without a second thought and ran the usual:
./MegaCli64 -AdpFwFlash -f mr2208.rom -a0
and went off to do other things.

The next time I looked at the server terminal, I saw the same picture as before: “Flashing firmware ...” and no result. Trouble, thought Stirlitz.

I could not log into the server via SSH. On the VGA console I saw messages saying that the root file system had been remounted read-only, that everything was very bad, and that it could get even worse at any moment.

I reset the server and see this picture:

image

Yes, trouble. Searching the Internet turned up nothing. Apparently the problem is quite rare.

Treatment


I tried to boot from the flash drive and flash the controller again, but the MegaCli utility did not detect it at all, under either DOS or Linux, and so refused to flash it.

So I turned to LSI support, where a kind person with an Indian name pointed me to the MegaRAID documentation, namely page 305, which contains a rather inconspicuous section that does not really explain why you would do what it says:

image

Aha, thought the partisans, this must be flashing in recovery mode, and got down to business.

Under Windows, the easiest way to make a FreeDOS flash drive is the Rufus utility, literally in one click.
Under Linux, you can do the same with standard tools (syslinux or GRUB); there are many articles on the topic.

Copy MegaCli.exe and the firmware, found in the depths of ftp.supermicro.com, onto the drive.

Boot from it and run:
 MegaCli.exe -AdpM0Flash -f smc2208.rom 

Note that you do not need to specify an adapter (the -a option); apparently it flashes either every controller it finds or the first one found on the PCI bus.

And off it went:

image

Flashing in this mode takes quite a long time, about 15 minutes, so be patient.

When it finishes, power the server off, power it back on and wait for a miracle.
But instead of a miracle, we see this bleak picture:

image

Googling this error leads to a single link, to a compatriot's blog, where he advises, in plain English, disconnecting the BBU from the controller, removing the controller from the server and then putting it back.

In my case the card can only be removed from the server with a jigsaw, and I have no BBU, so that is not an option.
I try to flash the standard way: MegaCli now detects the controller but gives the same answer, F/W is in fault state, and refuses to do anything.

We turn to support again. They throw up their hands and advise trying the LSI Pre-Boot USB and CD tool and, if that does not help, returning the hardware.

OK, download the ISO, attach it to the server via IPMI and boot.
We select the recovmr item in the boot menu, after which we are prompted to type recover on the command line, and happiness will follow. It did not.
The BAT file cannot find the D: drive; apparently the CD-ROM driver in the FreeDOS on this LSI image does not get along with the IPMI virtual drive.

Well, let's look inside the BAT file and see what it was going to do:
 MegaCli.exe -AdpFwFlash -f D:\FW\RECOVER\TB_16MB.ROM -aALL 

Open the ISO, find this mysterious file and see that it is a full 16 megabytes in size (yes, we had already guessed that from the name), twice the size of the standard firmware. Apparently this ROM image completely rewrites the flash chip on the controller.
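
The size comparison above is easy to script before handing an image to MegaCli. The sketch below is only an illustration: the function names, the 12 MiB cut-off and the labels are assumptions made for this example, derived from the 16 MB recovery image and the roughly half-size regular firmware described here, not from any LSI documentation.

```python
import os

MIB = 1024 * 1024

# Assumption: any image clearly larger than a regular (about 8 MB) firmware
# file is treated as a full-chip recovery image. The 12 MiB cut-off is
# arbitrary and sits halfway between the two sizes mentioned in the article.
FULL_IMAGE_THRESHOLD = 12 * MIB


def classify_rom_size(size_bytes):
    """Guess the kind of ROM image from its size in bytes."""
    if size_bytes >= FULL_IMAGE_THRESHOLD:
        return "likely full-chip recovery image"
    return "likely regular firmware update"


def classify_rom(path):
    """Same guess, but for a file on disk."""
    return classify_rom_size(os.path.getsize(path))


if __name__ == "__main__":
    print(classify_rom_size(16 * MIB))  # size of TB_16MB.ROM per the article
    print(classify_rom_size(8 * MIB))   # about half of that
```

A size check like this says nothing about whether an image is valid for a given controller; it only flags which kind of file you are probably holding.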

We try to flash it the same way the BAT file was going to, but get the familiar: F/W is in fault state
Some recovery image LSI has prepared for us.
Okay, we use our earlier experience and try to flash this file through Mode 0.

This time flashing took about 30 minutes, since the file is twice as large as usual. After flashing, we power the server off, power it back on and see the coveted screen:

image

Fireworks, champagne, the server is saved!

But this life-giving image does not contain the latest firmware version, so with a light heart I booted from the FreeDOS flash drive again and went to flash the latest Supermicro firmware... and got stuck at the same stage as at the very beginning:
image

The circle has closed. To be sure, I even left it in this state overnight, but nothing changed.
After a reboot we again have corrupted firmware.

By trial and error, I found that after flashing the recovery image you need to reset the controller to factory defaults:
 MegaCli.exe -AdpFacDefSet -a0 

and power-cycle the server.

After that, the flash completes without hanging, and we see the latest firmware version:
image

That's it: this time it was a 100% victory over the stubborn hardware!
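
Put together, the sequence that finally worked can be written down as an ordered checklist. The sketch below only builds the command lines as text and executes nothing. The function name and the "POWER CYCLE" marker are hypothetical, the file names are the ones mentioned in this article, and using -AdpFwFlash for the final step is an assumption, since the exact command for the last flash is not shown in the text.

```python
def recovery_plan(recovery_rom, latest_rom, adapter=0, tool="MegaCli.exe"):
    """Return the recovery steps in order, as plain strings (nothing is executed)."""
    return [
        # 1. Write the full recovery image in Mode 0; no -a option is given.
        f"{tool} -AdpM0Flash -f {recovery_rom}",
        "POWER CYCLE",
        # 2. Reset the controller to factory defaults.
        f"{tool} -AdpFacDefSet -a{adapter}",
        "POWER CYCLE",
        # 3. Flash the latest firmware the regular way (assumed command).
        f"{tool} -AdpFwFlash -f {latest_rom} -a{adapter}",
    ]


if __name__ == "__main__":
    for number, step in enumerate(recovery_plan("TB_16MB.ROM", "smc2208.rom"), 1):
        print(f"{number}. {step}")
```

The power cycles are kept as explicit steps on purpose: in the story above, flashing the latest firmware right after the recovery image, without the factory reset and power cycle, led straight back to a hung flash.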

Discharge


The moral of this story: if you do not want to spend a couple of days on recovery, or even longer on returning the equipment, it is better to flash only the firmware intended by the hardware manufacturer (if they publish it; I found Supermicro's only by digging through the depths of their FTP, and there is no link to it on the server or motherboard pages), or to leave everything alone and live with what is already installed.
That said, I am not sure the problem was caused by the “foreign” firmware rather than by some random glitch, but I have no desire to test it again.

There are also cases when the firmware simply gets corrupted for some reason (the power went out during flashing, or a gamma-ray burst went off somewhere nearby in space), and then you have to resort to disaster recovery.

I hope that this article will help those who stumble upon a similar problem in the future.

Source: https://habr.com/ru/post/209348/

