Recovering a Failed Physical Machine

Recover a physical machine (PM), or node, when it cannot boot or when it fails to become a PM in the everRun system. In some cases, the everRun Availability Console displays the state of a failed PM as Unreachable (Syncing/Evacuating).

To recover a PM, you must reinstall the everRun release that the PM has been running. Recovering a failed PM, though, is different from installing the software for the first time. The recovery preserves all data, but it re-creates the /boot and root file systems, reinstalls the everRun system software, and attempts to connect to the existing system. (If you need to replace the PM hardware instead of recovering the system software, see Replacing Physical Machines, Motherboards, NICs, or RAID Controllers.)

To reinstall the system software, you can allow the system to automatically boot the replacement node from a temporary Preboot Execution Environment (PXE) server on the primary PM. As long as each PM contains a full copy of the most recently installed software kit (as displayed on the Upgrade Kits page of the everRun Availability Console), either PM can initiate the recovery of its partner PM with PXE boot installation. If needed, you can also manually boot the replacement node from DVD/USB installation media.

Use one of the following procedures, depending on whether you want to install from PXE or from DVD/USB installation media.

Caution: The recovery procedure deletes any software installed in the host operating system of the PM and all PM configuration information entered before the recovery. After you complete this procedure, you must manually reinstall all of your host-level software and reconfigure the PM to match your original settings.

Prerequisites:
  1. Determine which PM you need to recover.
  2. Check that a monitor and keyboard are connected to the PM.
  3. Check that Ethernet cables are connected from the PM you are recovering to the network or directly to the other PM, if the two everRun system PMs are in close proximity. The Ethernet cable should connect from the first embedded port on the PM you are recovering or from an option (that is, add-on or expansion) port if the PM does not have an embedded port.
  4. If you want to use DVD or USB media to install the system software on the replacement PM, obtain installation software for the release that the PM has been running by using one of the following methods:

    • Create a bootable USB medium on the Upgrade Kits page, as described in Creating a USB Medium with System Software.
    • Download an install ISO from your authorized Stratus service representative.
    • Extract an install ISO into the current working directory from the most recently installed upgrade kit by executing a command similar to the following (x.x.x.x is the release number and nnn is the build number):

      tar -xzvf everRun_upgrade-x.x.x.x-nnn.kit *.iso

    If you download or extract an install ISO, save it or burn it to a DVD or USB medium. See Obtaining everRun Software.
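
The extraction step above can be wrapped in a small shell function. This is a minimal sketch, assuming GNU tar and a gzip-compressed kit (as the -z option in the command above implies). The function name extract_install_iso is hypothetical and not part of the everRun software. The sketch quotes the *.iso pattern so that the shell does not expand it against files in the current directory, and it passes --wildcards so that GNU tar matches the pattern against archive member names. Depending on the shell and tar version, the unquoted form may behave differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of everRun): extract only the install ISO
# from an upgrade kit into the current working directory.
# Assumes GNU tar and a gzip-compressed kit file.
extract_install_iso() {
    kit="$1"

    # Stop early with a clear message if the kit file does not exist.
    if [ ! -f "$kit" ]; then
        echo "kit not found: $kit" >&2
        return 1
    fi

    # Quote the pattern so the shell passes it to tar unexpanded;
    # --wildcards tells GNU tar to match it against member names.
    tar -xzvf "$kit" --wildcards '*.iso'
}
```

For example, running extract_install_iso everRun_upgrade-x.x.x.x-nnn.kit (with the actual release and build numbers substituted) lists each extracted file name and leaves the ISO in the current working directory.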

Related Topics

Maintenance Mode

Managing Physical Machines

The everRun Availability Console

The Physical Machines Page