Task:
Extract 2 hard drives from a RAID-10 at the logical level (i.e., without physically removing them from the server), assemble a RAID-1 from them, transfer the system there, and prepare everything possible before rebooting, thus minimizing the amount and duration of downtime.
What is the difficulty?
On the Adaptec 5 series the problem was solved with two commands:
1. Fail the disk:
arcconf setstate 1 device 0 0 ddd
2. Move it to the Ready state:
arcconf setstate 1 device 0 0 rdy
3. Do whatever we want with the disks.
On the 6 series this trick no longer works. Regardless of whether failover is enabled or not, the disks return to the Present state and nothing can be done with them (it should be clear that the array itself stays Degraded until the rebuild completes).
An attempt to contact official tech support was unsuccessful: I did receive an answer, but it left the impression that I was running a home-built box rather than a production server that cannot be bounced back and forth at will:
After you ran the “arcconf setstate 1 device 0 0 ddd” command, was the system rebooted? If not, then reboot and initialize both disks in the controller BIOS. There you can immediately create a RAID-1.
To erase the metadata on the disk under Arcconf, you can initialize the disk with the "arcconf task" command. For example: arcconf task start 1 device 0 0 initialize
After that, the drive must be accessible to create other logical drives.
However, if you throw two disks out of the RAID-10, it remains in the "Degraded" status. If one of the remaining disks in the array fails, the entire array may collapse. So maybe just back up all the data, then delete the RAID-10 array and create two separate RAID-1s.
I thought it over and resolved the issue with a series of experiments, after which I was able to complete the task.
Description:
We have a logical device: a RAID-10 on 4 disks.
Logical device segment information
--------------------------------------------------------
Group 0, Segment 0 : Present (0,0) J0VV3R8N
Group 0, Segment 1 : Present (0,1) J0VV3ZBN
Group 1, Segment 0 : Present (0,2) J0VV3YEN
Group 1, Segment 1 : Present (0,3) J0VX2WXN
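This listing comes from the controller's logical-device view. A small sketch for checking both views before (and between) the steps below; controller ID 1 is assumed throughout, matching the commands in this article, and ARCCONF is made overridable so the snippet can be exercised without the actual hardware:

```shell
#!/bin/sh
# Inspect the array state before doing anything. Controller ID 1 is assumed,
# as in the rest of the walkthrough; ARCCONF can be pointed at a stub since
# the real binary only exists on the server itself.
ARCCONF="${ARCCONF:-arcconf}"

show_state() {
    $ARCCONF getconfig 1 ld   # logical-device view: segment states (Present/Rebuilding/Missing)
    $ARCCONF getconfig 1 pd   # physical-device view: disk states (Online/Failed/Ready)
}
```

It is worth running something like `show_state` after every step to confirm the disks really changed state before moving on.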
We need to pull 2 disks out of it (one from each group) and build a RAID-1 from them.
Solution:
1. Make sure failover is enabled:
arcconf failover 1 on
2. Fail 2 disks from different groups:
arcconf setstate 1 device 0 0 ddd
arcconf setstate 1 device 0 2 ddd
The disks become Inconsistent in the logical-device view and Failed in the physical-device view.
3. Move these disks to the Ready state:
arcconf setstate 1 device 0 0 rdy
arcconf setstate 1 device 0 2 rdy
The disks become Missing in the logical-device view and Ready in the physical-device view.
4. Wait until failover starts rebuilding:
Group 0, Segment 0 : Rebuilding (0,0) J0VV3R8N
The disks rebuild one at a time, so as soon as one of them shows the Rebuilding state, perform step 5 for it immediately, then repeat for the next one.
5. Fail them again and move very quickly to step 6:
arcconf setstate 1 device 0 0 ddd
arcconf setstate 1 device 0 2 ddd
The disks become Inconsistent in the logical-device view and Failed in the physical-device view.
6. Move the disks to the Ready state and very quickly proceed to step 7:
arcconf setstate 1 device 0 0 rdy
arcconf setstate 1 device 0 2 rdy
The disks become Missing in the logical-device view and Ready in the physical-device view.
7. Disable failover and very quickly go to step 8:
arcconf failover 1 off
8. Initialize the disks:
arcconf task start 1 device 0 0 initialize
arcconf task start 1 device 0 2 initialize
Hooray, now we can assemble a RAID-1 from them:
arcconf CREATE 1 LOGICALDRIVE MAX 1 0 0 0 2
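Since steps 4 through 8 are time-critical, typing the commands by hand is risky. A minimal sketch that waits for the Rebuilding state and then fires the whole sequence in one go, under the same assumptions as above (controller 1, disks "0 0" and "0 2"); ARCCONF is overridable so the script can be dry-run without the hardware:

```shell
#!/bin/sh
# Sketch of steps 4-8: wait for a segment to enter Rebuilding, then fail both
# disks, flip them to Ready, disable failover, and initialize them - all
# before the controller has time to re-adopt the disks. Controller 1 and
# disks "0 0"/"0 2" are assumed, matching the walkthrough; set ARCCONF to a
# stub for a dry run.
ARCCONF="${ARCCONF:-arcconf}"

release_disks() {   # usage: release_disks "0 0" "0 2"
    # step 4: poll until failover starts rebuilding one of the disks
    until $ARCCONF getconfig 1 ld | grep -q Rebuilding; do
        sleep 1
    done
    # step 5: fail the disks again; step 6: move them to Ready
    for dev in "$@"; do $ARCCONF setstate 1 device $dev ddd; done
    for dev in "$@"; do $ARCCONF setstate 1 device $dev rdy; done
    # step 7: failover off; step 8: wipe the metadata
    $ARCCONF failover 1 off
    for dev in "$@"; do $ARCCONF task start 1 device $dev initialize; done
}
```

Note that `$dev` is deliberately left unquoted so that "0 0" expands into the channel and ID arguments. Per the UPD at the end of the article, in practice the controller may only let go of one disk at a time, in which case the sequence has to be run per disk rather than for both at once.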
The reader may wonder why we performed the same actions twice and why failover was not disabled right away.
Again, the 6 series of adapters does not allow you to safely remove disks from the array with failover disabled: after the command
arcconf setstate 1 device 0 0 rdy
the disk would have Present status in the logical drive and the array would be Degraded, while in the physical drive the disk would be Online rather than Ready.
And why do we do everything quickly starting from step 5? Simple: within a few seconds the controller manages to recover the disks and change their status, so the commands have to be executed before it gets there.
I could not find a ready-made solution and had to invent my own; I hope it will be useful to someone, since I am surely not the only one using the Adaptec 6 series.
UPD. Upgraded 10 servers; everything went well. The only correction: you can pull only one disk out of the array at a time, after which you repeat the whole procedure for the second one. If you managed to pull a disk out but the persistent controller keeps trying to reuse it, just put it into JBOD, free the second disk, then remove the first disk from JBOD, and you can create a RAID-1 on the two free disks.
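The JBOD shuffle from the UPD can be sketched as follows. The JBOD create/delete syntax here is an assumption on my part (it varies between arcconf versions), so verify it against the help output of `arcconf create` on your own box before relying on it; controller 1 and disks "0 0"/"0 2" are assumed as above:

```shell
#!/bin/sh
# Sketch of the JBOD workaround for a controller that keeps re-adopting a
# freed disk. CAUTION: the "create/delete 1 jbod" syntax is an assumption -
# check your arcconf version's help first. Controller 1, disks "0 0"/"0 2".
ARCCONF="${ARCCONF:-arcconf}"

jbod_shuffle() {
    $ARCCONF create 1 jbod 0 0 noprompt            # park the 1st freed disk as a JBOD
    # ...repeat the ddd/rdy/initialize steps for the 2nd disk (0 2) here...
    $ARCCONF delete 1 jbod 0 0 noprompt            # release the 1st disk from JBOD
    $ARCCONF create 1 logicaldrive max 1 0 0 0 2   # RAID-1 on the two free disks
}
```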