Replacing Physical Machines, Motherboards, NICs, or RAID Controllers
You can replace a physical machine (PM), or node, and its motherboard, NICs, or RAID controllers while the system is running. You can remove a PM to upgrade it or to replace a failed PM. Several types of hardware faults can hang or crash a PM, including a failure of the motherboard, CPU, mid-plane, or storage controller. (If you need to recover the system software on a failed PM instead of replacing the PM hardware, see Recovering a Failed Physical Machine.)
When you remove and replace a PM, the system completely erases all of the disks in the replacement PM in preparation for a full installation of the everRun system software. To install the software, you can allow the system to automatically boot the replacement node from a temporary Preboot Execution Environment (PXE) server on the primary PM. As long as each PM contains a full copy of the most recently installed software kit (as displayed on the Upgrade Kits page of the everRun Availability Console), either PM can initiate the replacement of its partner PM with PXE boot installation. If needed, you can also manually boot the replacement node from DVD/USB installation media.
Use one of the following procedures, depending on whether you want to install the software by PXE boot or from DVD/USB media.
When you replace a PM or a component, follow the vendor's instructions, but first read Physical Machine Hardware Maintenance Restrictions.
Caution: The replacement procedure deletes any software installed in the host operating system of the PM and all PM configuration information entered before the replacement. After you complete this procedure, you must manually re-install all of your host-level software and reconfigure the PM to match your original settings.
Caution: To prevent data loss, if the system log indicates that manual intervention is necessary to assemble a disk mirror, contact your authorized Stratus service representative for assistance. You may lose valuable data if you force a resynchronization and overwrite the most recent disk in the mirror.
Prerequisite: If you want to use DVD or USB media to install the system software on the replacement PM, obtain the installation software for the release that the PM has been running by using one of the following methods:
- Create a bootable USB medium on the Upgrade Kits page, as described in Creating a USB Medium with System Software.
- Download an install ISO from your authorized Stratus service representative.
- Extract an install ISO into the current working directory from the most recently installed upgrade kit by executing a command similar to the following (x.x.x.x is the release number and nnn is the build number):
tar -xzvf everRun_upgrade-x.x.x.x-nnn.kit *.iso
If you download or extract an install ISO, save it or burn it to a DVD or USB medium. See Obtaining everRun Software.
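The extraction step can also be scripted. The following is a minimal sketch, not a Stratus-supplied tool: it assumes only what the tar command above implies, namely that the kit is a gzip-compressed tar archive containing one or more files ending in .iso. The function name extract_install_iso is hypothetical. It lists the archive first and extracts each ISO by its exact member name, which avoids the shell expanding an unquoted *.iso against ISO files that already exist in the working directory.

```shell
#!/bin/sh
# Sketch only: extract_install_iso is a hypothetical helper, not part of everRun.
# Assumption: the kit is a gzip-compressed tar archive that contains *.iso members.
extract_install_iso() {
    kit="$1"
    [ -f "$kit" ] || { echo "kit not found: $kit" >&2; return 1; }

    # List the archive members and keep only names that end in .iso.
    isos=$(tar -tzf "$kit" | grep -E '\.iso$') || {
        echo "no ISO found in $kit (or the archive is unreadable)" >&2
        return 1
    }

    # Extract each ISO into the current directory by its exact member name.
    echo "$isos" | while IFS= read -r member; do
        tar -xzf "$kit" "$member" && echo "extracted: $member"
    done
}
```

For example, extract_install_iso everRun_upgrade-x.x.x.x-nnn.kit (with the real release and build numbers substituted) would place the ISO in the current working directory, ready to burn to a DVD or USB medium.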
Note: You must reactivate the product license for the everRun system after replacing a PM.
To remove and replace a failed PM or component (with PXE boot installation)
Use the following procedure to replace a failed PM, motherboard, NIC, or RAID controller and reinstall the system software by using PXE boot installation from the software kit on the primary PM.
- In the everRun Availability Console, click Physical Machines in the left-hand navigation panel.
- Select the appropriate PM (node0 or node1) and then click Work On, which changes the PM’s Overall State to Maintenance Mode and the Activity state to running (in Maintenance).
- After the PM displays running (in Maintenance), click Recover.
- When prompted to select the type of repair, click PXE PM Replace - Initialize All Disks.
Caution: Selecting PXE PM Replace - Initialize All Disks deletes all data on the replacement PM.
- Select one of the following PXE Settings:
- Only respond to PXE requests from the current partner node. Waits for a PXE boot request from the MAC address of the current partner node. Select this option if you are recovering the existing PM by completely wiping and reinstalling it (with no hardware changes). This process deletes all data on the PM, but restores its current network configuration.
- Only respond to PXE requests from the following MAC address. Waits for a PXE boot request from the MAC address that you specify. Select this option if you are replacing the PM with a new PM, or replacing network adapters in the existing PM. Enter the MAC address of the specific network adapter that will initiate PXE boot.
- Accept PXE requests from any system on priv0. Waits for a PXE boot request from priv0, the private network that connects the two everRun nodes. Select this option if you are replacing the PM with a new PM, or replacing network adapters in the existing PM, but you do not know the MAC address for the new PM.
- If prompted, under Assumed Network Settings, select one of the following options:
- Use below settings—The PM uses the displayed network settings. No user interaction is needed during the software installation process.
- Ask during install—The PM prompts for network settings. When the software installation begins, you must be present at the console of the replacement PM to enter the settings.
- Click Continue to begin the replacement process. The system shuts down and powers off the PM.
- After the PM is powered off, install the replacement PM or component, if applicable:
- If you are replacing a motherboard, NIC, or RAID controller, do so now. If you are replacing the PM, disconnect and remove it now, and then install the new PM. Connect a monitor and keyboard.
- Reconnect the network cables to their original ports. Check that Ethernet cables are connected from the replacement PM (or new NIC) to the network or directly to the running (primary) PM, if the two everRun system PMs are in close proximity. One Ethernet cable should connect from the first embedded port on the new PM or from a NIC port if the new PM does not have an embedded port.
- Manually power on the replacement PM. As the PM powers on, enter the firmware (BIOS or UEFI) setup utility, and enable PXE boot (boot from network). If you previously selected Only respond to PXE requests from the following MAC address, enable PXE boot on the NIC associated with that MAC address; otherwise, verify that PXE boot is enabled on the priv0 NIC. Save the setting and restart the system.
- The replacement process continues, as follows:
- The replacement PM begins to boot from a PXE server that temporarily runs on the primary node.
- The system automatically deletes all of the data on disks in the replacement PM.
- The replacement PM reboots again and automatically starts the system software installation, which runs from a copy of the installation kit on the primary node.
If you previously selected Ask during install to specify the network settings of the replacement PM during the installation, monitor the installation process and respond to prompts at the physical console of the replacement PM; otherwise, the installation proceeds without prompts, and you can skip the network setting screens described in the next steps.
- The Select interface for private Physical Machine connection screen sets the physical interface to use for the private network. To use the first embedded port, use the arrow keys to select em1 (if it is not already selected), and then press F12 to save your selection and go to the next screen.
Notes:
- If you are not sure of which port to use, use the arrow keys to select one of the ports, and click the Identify button. The LED on the selected port will then flash for 30 seconds, allowing you to identify it. Since the LED may also flash due to activity on that network, Stratus recommends that you leave the cable disconnected during the identification process. Reconnect the cable immediately after identification is complete.
- If the system contains no embedded ports, select the first option interface instead.
- The Select interface for managing the system (ibiz0) screen sets the physical interface to use for the management network. To use the second embedded port, use the arrow keys to select em2 (if it is not already selected), and then press F12 to save your selection and go to the next screen.
Note: If the system contains only one embedded port, select the first option interface. If the system contains no embedded ports, select the second option interface.
- The Select the method to configure ibiz0 screen sets the management network for the replacement PM as either a dynamic or static IP configuration. Typically, you set this as a static IP configuration, so use the arrow keys to select Manual configuration (Static Address) and press F12 to save your selection and go to the next screen. However, to set this as a dynamic IP configuration, select Automatic configuration via DHCP and press F12 to save your selection and go to the next screen.
- If you selected Manual configuration (Static Address) in the previous step, the Configure em2 screen appears. Enter the following information and press F12.
- IPv4 address
- Netmask
- Default gateway address
- Domain name server address
See your network administrator for this information.
Note: If you enter invalid information, the screen redisplays until you enter valid information.
- At this point, the software installation continues without additional prompts.
- When the software installation is complete, the replacement PM reboots from the newly installed system software.
Note: After the system software installation, the replacement PM may take up to 20 minutes to join the system and appear in the everRun Availability Console.
- As the replacement PM joins the system, you can view its activity on the Physical Machines page of the everRun Availability Console. The Activity column displays the PM as running (in Maintenance) after the recovery is complete.
- Assign logical disks from the replacement PM to storage groups on the everRun system, as described in Assigning a Logical Disk to a Storage Group.
Notes:
- When the replacement PM joins the everRun system, the system automatically adds the secondary everRun system disk to the Initial Storage Group; however, the system does not assign any other logical disks from the PM to existing storage groups.
- If you assigned logical disks to the Initial Storage Group or other storage groups on the first PM, you must manually add matching logical disks from the replacement PM to the same storage groups; otherwise, the everRun system cannot fully synchronize.
- To activate the replacement PM, reactivate the product license for the everRun system. On the Preferences page, click Product License, expand License Check and Activation, and click Check License Now to automatically activate the license (as described in Managing the Product License).
Note: The new PM cannot exit maintenance mode and run VMs until the everRun license is reactivated.
- If applicable, manually reinstall applications and any other host-level software, and reconfigure the replacement PM to match your original settings.
- When you are ready to bring the replacement PM online, click Finalize to exit maintenance mode. Verify that both PMs return to the running state and that the PMs finish synchronizing. The initial synchronization may take minutes or hours depending on your configuration, including the amount of storage and the number of VMs.
Note: When the replacement PM exits maintenance mode, the system automatically disables the PXE server on the primary node that was used for the replacement process.
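If you choose Only respond to PXE requests from the following MAC address, a mistyped address means the PXE server never answers the replacement PM. The following is a minimal sketch for checking an address before you enter it. It assumes the common notation of six two-digit hexadecimal octets separated by colons or hyphens; the exact format that the everRun Availability Console accepts is not stated here, and the function names is_valid_mac and normalize_mac are hypothetical.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of everRun.
# Assumption: a MAC address is six hex octets separated by ':' or '-'.

# Succeeds (exit status 0) if the argument looks like a MAC address.
is_valid_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}$'
}

# Prints the address in lowercase with colon separators, or fails if invalid.
normalize_mac() {
    is_valid_mac "$1" || return 1
    printf '%s\n' "$1" | tr 'A-F-' 'a-f:'
}
```

For example, normalize_mac 00-1A-2B-3C-4D-5E prints 00:1a:2b:3c:4d:5e, while an address with a missing octet or a non-hexadecimal character makes the function return a nonzero status.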
To remove and replace a failed PM or component (with DVD/USB installation)
Use the following procedure to replace a failed PM, motherboard, NIC, or RAID controller and reinstall the system software by using a DVD or USB medium.
- In the everRun Availability Console, click Physical Machines in the left-hand navigation panel.
- Select the appropriate PM (node0 or node1) and then click Work On, which changes the PM’s Overall State to Maintenance Mode and the Activity state to running (in Maintenance).
- After the PM displays running (in Maintenance), click Recover.
- When prompted to select the type of repair, click DVD/USB PM Replace - Initialize All Disks.
Caution: Selecting DVD/USB PM Replace - Initialize All Disks deletes all data on the replacement PM.
- Click Continue to begin the replacement process. The system shuts down the PM in preparation for the system software reinstallation.
- After the PM is powered off, install the replacement PM or component, if applicable:
- If you are replacing a motherboard, NIC, or RAID controller, do so now. If you are replacing the PM, disconnect and remove it now, and then install the new PM. Connect a monitor and keyboard.
- Reconnect the network cables to their original ports. Check that Ethernet cables are connected from the replacement PM (or new NIC) to the network or directly to the running (primary) PM, if the two everRun system PMs are in close proximity. One Ethernet cable should connect from the first embedded port on the new PM or from a NIC port if the new PM does not have an embedded port.
- Insert the bootable media or mount the ISO image on the replacement PM, and then manually power on the PM.
- As the replacement PM powers on, enter the firmware (BIOS or UEFI) setup utility and set the Optical Drive or USB media as the first boot device.
- Monitor the installation process at the physical console of the replacement PM.
- At the Welcome screen, use the arrow keys to select the country keyboard map for the installation.
- At the Install or Recovery screen, select Replace PM, Join system: Initialize Data and press Enter.
Caution: Selecting Replace PM, Join system: Initialize Data deletes all data on the replacement PM.
- The Select interface for private Physical Machine connection screen sets the physical interface to use for the private network. To use the first embedded port, use the arrow keys to select em1 (if it is not already selected), and then press F12 to save your selection and go to the next screen.
Notes:
- If you are not sure of which port to use, use the arrow keys to select one of the ports, and click the Identify button. The LED on the selected port will then flash for 30 seconds, allowing you to identify it. Since the LED may also flash due to activity on that network, Stratus recommends that you leave the cable disconnected during the identification process. Reconnect the cable immediately after identification is complete.
- If the system contains no embedded ports, select the first option interface instead.
- The Select interface for managing the system (ibiz0) screen sets the physical interface to use for the management network. To use the second embedded port, use the arrow keys to select em2 (if it is not already selected), and then press F12 to save your selection and go to the next screen.
Note: If the system contains only one embedded port, select the first option interface. If the system contains no embedded ports, select the second option interface.
- The Select the method to configure ibiz0 screen sets the management network for the replacement PM as either a dynamic or static IP configuration. Typically, you set this as a static IP configuration, so use the arrow keys to select Manual configuration (Static Address) and press F12 to save your selection and go to the next screen. However, to set this as a dynamic IP configuration, select Automatic configuration via DHCP and press F12 to save your selection and go to the next screen.
- If you selected Manual configuration (Static Address) in the previous step, the Configure em2 screen appears. Enter the following information and press F12.
- IPv4 address
- Netmask
- Default gateway address
- Domain name server address
See your network administrator for this information.
Note: If you enter invalid information, the screen redisplays until you enter valid information.
- At this point, the software installation continues without additional prompts.
- When the software installation is complete, the replacement PM reboots from the newly installed system software.
Note: After the system software installation, the replacement PM may take up to 20 minutes to join the system and appear in the everRun Availability Console.
- As the replacement PM joins the system, you can view its activity on the Physical Machines page of the everRun Availability Console. The Activity column displays the PM as running (in Maintenance) after the recovery is complete.
- Assign logical disks from the replacement PM to storage groups on the everRun system, as described in Assigning a Logical Disk to a Storage Group.
Notes:
- When the replacement PM joins the everRun system, the system automatically adds the secondary everRun system disk to the Initial Storage Group; however, the system does not assign any other logical disks from the PM to existing storage groups.
- If you assigned logical disks to the Initial Storage Group or other storage groups on the first PM, you must manually add matching logical disks from the replacement PM to the same storage groups; otherwise, the everRun system cannot fully synchronize.
- To activate the replacement PM, reactivate the product license for the everRun system. On the Preferences page, click Product License, expand License Check and Activation, and click Check License Now to automatically activate the license (as described in Managing the Product License).
Note: The new PM cannot exit maintenance mode and run VMs until the everRun license is reactivated.
- If applicable, manually reinstall applications and any other host-level software, and reconfigure the replacement PM to match your original settings.
- When you are ready to bring the replacement PM online, click Finalize to exit maintenance mode. Verify that both PMs return to the running state and that the PMs finish synchronizing.
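In either procedure, the Configure em2 screen redisplays until the static settings are valid, so it can save time to check the values from your network administrator beforehand. The following is a minimal sketch under stated assumptions: the helper names ip_to_int, is_valid_netmask, and same_subnet are hypothetical, and the checks (well-formed dotted-quad addresses, a contiguous netmask, and a default gateway on the same subnet as the PM address) are general IPv4 rules, not a description of what the everRun installer itself validates.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of everRun.

# Prints a dotted-quad IPv4 address as an integer; fails on malformed input.
ip_to_int() {
    # Four dot-separated octets, no leading zeros (avoids octal parsing).
    printf '%s\n' "$1" |
        grep -Eq '^((0|[1-9][0-9]{0,2})\.){3}(0|[1-9][0-9]{0,2})$' || return 1
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address into $1..$4 on the dots
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if the netmask is a contiguous run of 1 bits followed by 0 bits.
is_valid_netmask() {
    m=$(ip_to_int "$1") || return 1
    inv=$(( ~m & 0xFFFFFFFF ))
    [ $(( inv & (inv + 1) )) -eq 0 ]
}

# Usage: same_subnet ADDRESS NETMASK GATEWAY
# Succeeds if the address and gateway share a subnet; returns 2 on bad input.
same_subnet() {
    a=$(ip_to_int "$1") && m=$(ip_to_int "$2") && g=$(ip_to_int "$3") || return 2
    [ $(( a & m )) -eq $(( g & m )) ]
}
```

For example, same_subnet 192.168.1.10 255.255.255.0 192.168.1.1 succeeds, while the same address and netmask with gateway 192.168.2.1 fails because the gateway would be unreachable from the management interface.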