Responding to a Failed Logical Disk

When everRun software detects a missing or broken logical disk, it displays a failed logical disk alert on the DASHBOARD page of the everRun Availability Console. (For examples of alerts, see Logical Disk Management.) You can also view the alert on the ALERT HISTORY page. The everRun Availability Console continues to display the alert until you respond to the problem using one of the following methods, as appropriate for your situation:

Caution:  
  1. Clicking the Repair button removes all data on failed logical disks.
  2. If you attempt to recover a missing or failed logical disk with the Repair button in the masthead of the everRun Availability Console, the system may be slow to repair the disk. Although the system successfully removes the failed logical disk from its storage group, it may be slow to migrate the data from the failed disk to other disks in the storage group. The Alerts page may continue to report that the logical disk is not present, that volumes have failed, and that storage is not fault tolerant. Also, the Volumes page may continue to show volumes in the broken state. If this condition persists, contact your authorized Stratus service representative for assistance.
  3. Repairing storage causes virtual machines (VMs) that are using failed logical disks to become simplex until the repair is complete.
  4. Systems configured for UEFI boot only from the logical disk on which the everRun software was originally installed.
  5. In some legacy BIOS configurations, if the logical disk that you need to repair is the boot disk, you may need to reconfigure the RAID controller to boot from one of the remaining logical disks. To maximize overall availability, the everRun software mirrors the boot files for each node, so any logical disk that is not affected by the failed disk is able to boot the server. However, some systems can boot only from the predefined boot logical disk in the RAID controller, and cannot boot from an alternate logical disk if the predefined boot logical disk is present but not bootable. After the node has recovered and the logical disk with the replacement drive has been brought up to date, restore the boot device to its original value in the RAID controller.

To repair a failed logical disk
  1. Click the Repair button that appears in the masthead of the everRun Availability Console.
  2. Click Yes in the Confirm message box if you want to continue with the repair.

    After you click the Repair button, the everRun software attempts to repair all broken volumes by migrating data to other logical disks. When other logical disks have enough space for the data, the everRun software can successfully complete the repair. When other logical disks do not have enough space for the data, the everRun software generates the alert Not enough space for repair. In this case, you need to make more space available in the storage group, either by creating new logical disks or by deleting some existing volumes.

    When enough space for the data exists, the everRun software automatically re-mirrors broken volumes.
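
The space check described above can be pictured with a small model. This is only an illustrative sketch, not the everRun implementation: the LogicalDisk class, the repair_outcome function, and the assumption that the combined free space on healthy disks decides the outcome are all hypothetical. The actual placement logic may be stricter.

```python
from dataclasses import dataclass


@dataclass
class LogicalDisk:
    """Hypothetical model of a logical disk in a storage group."""
    name: str
    free_gb: float
    failed: bool = False


def repair_outcome(disks, data_to_migrate_gb):
    """Return "repair" if healthy disks can hold the data, else the alert text.

    Assumption: only the total free space on non-failed disks matters.
    """
    free_gb = sum(d.free_gb for d in disks if not d.failed)
    if free_gb >= data_to_migrate_gb:
        return "repair"
    return "Not enough space for repair"


group = [
    LogicalDisk("ld0", free_gb=0, failed=True),
    LogicalDisk("ld1", free_gb=100),
    LogicalDisk("ld2", free_gb=50),
]
print(repair_outcome(group, 120))
print(repair_outcome(group, 200))
```

In this model, creating a new logical disk adds another entry with free space, and deleting a volume raises free_gb on the disks that held it. Either change can turn the alert case into a successful repair.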

After the repair is complete, use RAID controller software to remove the failed logical disk and to create a new logical disk. everRun software automatically recognizes the new logical disk and brings it into service if the disk does not contain data. If the disk contains data, the DASHBOARD displays the message Logical Disk – n on PM noden is foreign and should be activated or removed. To activate the logical disk, see Activating a New Logical Disk.
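
The handling of a newly created logical disk comes down to one condition. The sketch below only restates that rule. The function name and return strings are hypothetical and are not part of any everRun interface.

```python
def new_disk_status(contains_data: bool) -> str:
    """Illustrative restatement of how a new logical disk is treated."""
    if contains_data:
        # Reported as foreign; an administrator must activate or remove it.
        return "foreign: activate or remove"
    # An empty disk is recognized and brought into service automatically.
    return "in service"


print(new_disk_status(False))
print(new_disk_status(True))
```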

Related Topics

Logical Disks and Physical Disks

The everRun Availability Console