diff --git a/.spelling b/.spelling index 8fe3fdfe59194..b6148d36a2754 100644 --- a/.spelling +++ b/.spelling @@ -872,6 +872,10 @@ xnames zeroization zeroize zypper +Baremetal +baremetal +FabricManager +FMNs # Network Terms - Starting to organize a little bit here but it's not done 0.5m diff --git a/glossary.md b/glossary.md index a6d5b90f3fe85..3acf9ad736ce5 100644 --- a/glossary.md +++ b/glossary.md @@ -34,6 +34,8 @@ Glossary of terms used in CSM documentation. * [EX Compute Cabinet](#ex-compute-cabinet) * [EX TDS Cabinet](#ex-tds-cabinet) * [Fabric](#fabric) +* [Fabric Manager](#fabric-manager) +* [Fabric Manager Node](#fabric-manager-node) * [Firmware Action Service (FAS)](#firmware-action-service-fas) * [Floor Standing CDU](#floor-standing-cdu) * [Hardware Management Network (HMN)](#hardware-management-network-hmn) @@ -376,6 +378,17 @@ compute blades and 16 [High Speed Network (HSN)](#high-speed-network-hsn) switch The [Slingshot](#slingshot) fabric consists of the switches, cables, ports, topology policy, and configuration settings for the Slingshot [High-Speed Network](#high-speed-network-hsn). +## Fabric Manager + +The [Slingshot](#slingshot) Fabric Manager software includes a suite of software which configures, manages, and monitors the network. It runs on an +external server and communicates with the switches over the out-of-band management network. + +## Fabric Manager Node + +The [Slingshot](#slingshot) Fabric Manager runs on at least one dedicated server referred to as the HPE Slingshot Fabric Manager Node (FMN). It also runs on the +HPE Slingshot switches. HPE Slingshot Fabric Manager software is installed on a bare metal server (FMN) instead of using Kubernetes pods in order to support +systems with HPE Slingshot version 3.0.0 and above and High Availability (HA) requirements. + ## Firmware Action Service (FAS) The Firmware Action Service (FAS) provides an interface for managing firmware versions of Redfish-enabled hardware in the system. 
diff --git a/install/README.md b/install/README.md index 1c17d3b8b37b4..b3c56e0e7c9b6 100644 --- a/install/README.md +++ b/install/README.md @@ -74,6 +74,7 @@ shown here with numbered topics. 1. [Kubernetes encryption](#1-kubernetes-encryption) 1. [Export Nexus data](#2-export-nexus-data) - [Installation of additional HPE Cray EX software products](#installation-of-additional-hpe-cray-ex-software-products) +- [Fabric Manager Node redeployment](#fabric-manager-node-redeployment) > **`NOTE`** If problems are encountered during the installation, > [Troubleshooting installation problems](#12-troubleshooting-installation-problems) and @@ -336,3 +337,11 @@ See the [Install or upgrade additional products with IUF](../operations/iuf/work procedure to continue with the installation of additional HPE Cray EX software products. For additional information on the IUF, see [Install and Upgrade Framework](../operations/iuf/IUF.md). + +## Fabric Manager Node redeployment + +> **OPTIONAL:** This section is only applicable if Fabric Manager nodes were deployed during the CSM installation. + +After additional HPE Cray EX software products have been installed, Fabric Manager nodes need to be redeployed with a new customized image. + +See [Redeploy Fabric Manager Nodes](../operations/fm_on_baremetal/Redeploy_Fabric_Manager_Nodes.md). diff --git a/install/deploy_non-compute_nodes.md b/install/deploy_non-compute_nodes.md index 77159bb9f3c3b..822351cafcd77 100644 --- a/install/deploy_non-compute_nodes.md +++ b/install/deploy_non-compute_nodes.md @@ -2,7 +2,7 @@ The following procedure deploys Linux and Kubernetes software to the management NCNs. Deployment of the nodes starts with booting the storage nodes, followed by the master nodes -and worker nodes together. +and worker nodes together. Optionally, HPE Slingshot Fabric Manager nodes can also be deployed. After the operating system boots on each node, there are some configuration actions which take place. 
Watching the console or the console log for certain nodes can help to understand @@ -26,7 +26,8 @@ the number of storage and worker nodes. 1. [Deploy management nodes](#2-deploy-management-nodes) 1. [Deploy storage NCNs](#21-deploy-storage-ncns) 1. [Deploy Kubernetes NCNs](#22-deploy-kubernetes-ncns) - 1. [Configure `kubectl` on the PIT](#23-configure-kubectl-on-the-pit) + 1. [Deploy HPE Slingshot Fabric Manager nodes (optional)](#23-deploy-hpe-slingshot-fabric-manager-nodes-optional) + 1. [Configure `kubectl` on the PIT](#24-configure-kubectl-on-the-pit) 1. [Validate deployment](#3-validate-deployment) 1. [Next topic](#next-topic) @@ -52,9 +53,11 @@ Preparation of the environment must be done before attempting to deploy the mana > These values do not need to be altered from what is shown. ```bash - export IPMI_PASSWORD ; mtoken='ncn-m(?!001)\w+-mgmt' ; stoken='ncn-s\w+-mgmt' ; wtoken='ncn-w\w+-mgmt' + export IPMI_PASSWORD ; mtoken='ncn-m(?!001)\w+-mgmt' ; stoken='ncn-s\w+-mgmt' ; wtoken='ncn-w\w+-mgmt' ; ftoken='fmn\w+-mgmt' ``` + > **NOTE:** The `ftoken` variable is used for HPE Slingshot Fabric Manager nodes, which are optional and not present on all systems. + ### 1.2. BIOS baseline 1. (`pit#`) If the NCNs are HPE hardware, then ensure that DCMI/IPMI is enabled. @@ -68,14 +71,16 @@ Preparation of the environment must be done before attempting to deploy the mana 1. (`pit#`) Check power status of all NCNs. ```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power status ``` + > **NOTE:** If the system does not have HPE Slingshot Fabric Manager nodes, the `ftoken` pattern will not match any entries. + 1. (`pit#`) Power off all NCNs. 
```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power off ``` @@ -89,16 +94,16 @@ Preparation of the environment must be done before attempting to deploy the mana - Disable VT-x, AMD-V, SVM, VT-d, and AMD IOMMU for Virtualization, on both AMD and Intel CPUs; there is no way to enable at this time. ```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} chassis bootdev none options=clear-cmos ``` 1. (`pit#`) Boot NCNs to BIOS to allow the CMOS to reinitialize. ```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} chassis bootdev bios options=efiboot - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power on ``` @@ -113,7 +118,7 @@ Preparation of the environment must be done before attempting to deploy the mana 1. (`pit#`) Power off the nodes. ```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power off ``` @@ -134,8 +139,8 @@ for all nodes, the Ceph storage will have been initialized and the Kubernetes cl 1. (`pit#`) Set each node to always UEFI network boot, and ensure that they are powered off. 
```bash - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} chassis bootdev pxe options=efiboot,persistent - grep -oP "(${mtoken}|${stoken}|${wtoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power off + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} chassis bootdev pxe options=efiboot,persistent + grep -oP "(${mtoken}|${stoken}|${wtoken}|${ftoken})" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power off ``` > **NOTE:** The NCN boot order is further explained in [NCN Boot Workflow](../background/ncn_boot_workflow.md). @@ -243,7 +248,54 @@ for all nodes, the Ceph storage will have been initialized and the Kubernetes cl > **NOTE:** To exit a conman console, press `&` followed by a `.` (e.g. keystroke `&.`) -### 2.3 Configure `kubectl` on the PIT +### 2.3 Deploy HPE Slingshot Fabric Manager nodes (optional) + +> **NOTE:** This section only applies to systems with HPE Slingshot Fabric Manager nodes. If the system does not have Fabric Manager nodes, skip this section and proceed to [Configure `kubectl` on the PIT](#24-configure-kubectl-on-the-pit). + +HPE Slingshot Fabric Manager nodes have hostnames like `fmn001`, `fmn002`, etc., with corresponding BMC names like `fmn001-mgmt`, `fmn002-mgmt`, etc. + +1. (`pit#`) Verify that Fabric Manager nodes are present in the system. + + ```bash + grep -oP "${ftoken}" /etc/dnsmasq.d/statics.conf | sort -u + ``` + + If this command returns no output, there are no Fabric Manager nodes to deploy. Skip the remaining steps in this section. + +1. (`pit#`) Check power status of Fabric Manager nodes. 
+ + ```bash + grep -oP "${ftoken}" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power status + ``` + +1. (`pit#`) Boot the **Fabric Manager nodes**. + + ```bash + grep -oP "${ftoken}" /etc/dnsmasq.d/statics.conf | sort -u | xargs -t -i ipmitool -I lanplus -U "${USERNAME}" -E -H {} power on + ``` + +1. (`pit#`) Observe the installation through the console of the first Fabric Manager node. + + ```bash + conman -j fmn001-mgmt + ``` + + > **NOTES:** + > + > - If the nodes have PXE boot issues (e.g. getting PXE errors, not pulling the `ipxe.efi` binary), then see [Troubleshooting PXE Boot](troubleshooting_pxe_boot.md). + > - To exit a conman console, press `&` followed by a `.` (e.g. keystroke `&.`) + +1. (`pit#`) Wait for the Fabric Manager nodes to complete `cloud-init`. + + The following text should appear in the console: + + ```text + The system is finally up, after XXXX.XX seconds cloud-init has come to completion. + ``` + + > **NOTE:** The duration reported will vary. + +### 2.4 Configure `kubectl` on the PIT 1. (`pit#`) This was done in a previous step, but if the user is resuming/starting here then the first master needs to be redefined. diff --git a/operations/fm_on_baremetal/Configure_FM_On_Baremetal.md b/operations/fm_on_baremetal/Configure_FM_On_Baremetal.md new file mode 100644 index 0000000000000..789cc0170fcad --- /dev/null +++ b/operations/fm_on_baremetal/Configure_FM_On_Baremetal.md @@ -0,0 +1,489 @@ +# Configure FM (Fabric Manager) On `Baremetal` + +This document describes the procedure for customizing and deploying the base FMN OS image along with provisioning storage volumes, +and configuring the necessary networking to support Fabric Manager on `baremetal` following the CSM upgrade. 
+ +## Requirements + +* Hardware requirements - 2 bare-metal nodes with dedicated boot and data disks +* Software requirements - OS (SLES 15 SP7), CSM services like CANU, HSM, SLS, BSS, CSI, CFS, Ansible playbooks for FMN + +## Notes + +* Fabric Manager Nodes (`FMNs`) can be added only after the CSM upgrade has been completed. +* By default, Fabric Manager runs on Kubernetes as a pod. +* After Fabric Manager is migrated from a Kubernetes pod to bare-metal infrastructure, it cannot be reverted. + +## Post upgrade of CSM to 1.7.1 + +After the upgrade to CSM 1.7.1, an administrator who wishes to enable Fabric Manager on baremetal must follow the procedure below. + +* Step 1: [FMN Prerequisites](#fmn-prerequisites) +* Step 2: [FMN Pre Boot](#fmn-pre-boot) + * [FMN Base Image Creation](#fmn-base-image-creation) + * [Add FMN nodes to CSM](#add-fmn-nodes-to-csm) +* Step 3: [FMN Booting](#fmn-booting) +* Step 4: [FMN Post Boot](#fmn-post-boot) + * [Join Fabric Manager nodes to Spire](#join-fabric-manager-nodes-to-spire) + * [Validation](#validation) + * [Install Fabric Manager on FM baremetal nodes](#install-fabric-manager-on-fm-baremetal-nodes) + +## FMN Prerequisites + +### Update SHCD with FMN (Fabric Manager Node) Information + +The administrator must update the SHCD to include the placement and cabling details of the new FMNs. + +### Configure FMN BMC + +Verify that the BMC of each FMN is configured with the correct root user credentials. + +## FMN Pre Boot + +### FMN Base Image Creation + +The FMN base image creation process includes node discovery, configuration, and base image customization. The base image contains only the essential artifacts required for deployment. + +The following steps detail the process for generating the FMN image. + +#### Create FMN base image (only base OS; no Fabric Manager) + +Adapt and customize the current NCN Kubernetes image for compatibility with FMN node requirements. 
+ +##### FMN Boot Preparation + +Create a `sat bootprep` configuration file (`fmn_bootprep.yaml`) for FMN as shown below. + +**Note:** Ensure that the `fmn_bootprep.yaml` configuration file is updated with the official CSM released versions and the appropriate playbook commits before proceeding. + +For example: + +```bash +ncn-m001:~ # cat fmn_bootprep.yaml +``` + +```yaml +schema_version: 1.0.2 +configurations: +- name: fmn-bm-default-configuration + layers: + - name: fmn-nodes-bm + playbook: ncn_nodes.yml + git: + commit: 64c8753fbc3143ec8b889a755a445b5bbc8007fd + url: https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git + - name: fmn-initrd-bm + playbook: ncn-initrd.yml + git: + commit: 64c8753fbc3143ec8b889a755a445b5bbc8007fd + url: https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git +images: +- name: fabricmanager-bm-node-image-1.0.0 + base: + product: + name: csm + version: 1.7.1-beta.10 + type: image + filter: + prefix: secure-kubernetes + configuration: fmn-bm-default-configuration + configuration_group_names: + - Management_Fabric +``` + +##### New FMN base image creation and upload to S3 + +Execute the commands below on any master node to generate the new FMN image and upload it to S3 storage. + +(ncn-m#) First, set the `bootprep` file path: + +```bash +BOOTPREP_FILE_PATH=./fmn_bootprep.yaml +``` + +(ncn-m#) Now execute the `sat bootprep run` command below to generate the new base image and upload it to S3. + +```bash +sat bootprep run \ + --limit images --limit configurations \ + --overwrite-images --overwrite-configs \ + --format json \ + --cfs-version v3 \ + --bos-version v2 \ + $BOOTPREP_FILE_PATH +``` + +**Note:** Using the `--overwrite-images` option in the command above will overwrite any previously uploaded images in S3. + +### Add FMN nodes to CSM + +Follow the steps below to register FMNs in CSM (SLS/HSM/BSS) and configure the required network, storage, and cloud-init settings in BSS. 
+These configurations will be provisioned automatically during node boot. + +#### Allocate NCN IP Addresses + +Follow [`Step-1 of NCN Add Procedure`](../node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md#add-ncn-procedure) for allocating NCN IP addresses. + +#### Add NCN data + +Follow [`Step-3 of NCN Add Procedure`](../node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md#add-ncn-procedure) for adding NCN data. + +#### Generate Switch Configuration With CANU + +For example: + +```bash +canu generate network config -a TDS --csm 1.7 --custom-config custom_switch_config.yaml --edge Arista --sls-file sls_input_file.json --ccj surtur-ccj.json --folder output (--enable-nmn-isolation --nmn-pvlan ) +``` + +#### Validate the generated switch configuration against the network switches + +* TDS-style systems have the management nodes plugged directly into the spine switches; most have only a single leaf-bmc switch. +* Systems that use the "Full" architecture will have the management nodes plugged into the leaf switches. + +The configuration generated here will contain updates for the leaf-bmc switch(`es`) for the Fabric Manager node BMCs and updates to either the spine switches or the leaf switches for the bonded connection. + +For example: + +```bash +canu validate switch config --ip 10.254.0.4 --generated output/sw-leaf-bmc-001.cfg +``` + +**Note:** CANU will likely suggest removing the `snmpv3` user. This is because the SNMP configuration is not held in the `custom_config.yaml` file, since secrets may not be stored in GitHub. Do NOT remove this configuration from the switch. + +Take extreme care when manipulating ACLs. If CANU suggests moving a "permit any ..." rule, be sure to create the new rule before removing the old one. It is possible to lose access to the switch if the ACLs are not applied in the correct order. 
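When several switches need to be checked, the `canu validate switch config` command shown in this section can be repeated once per switch. The following is a minimal dry-run sketch: it only prints the commands so that they can be reviewed before being run by hand. The `SWITCHES` array, the `build_validate_commands` helper, and the spine switch names and IP addresses are assumptions made for illustration; only the leaf-bmc entry and the `output` folder come from the examples in this procedure.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one `canu validate switch config` command per switch.
# Nothing is executed against a switch; the commands are only printed.
set -euo pipefail

# Each entry is "<switch hostname> <management IP>".
# The first entry matches the example in this procedure; the spine
# entries are hypothetical placeholders and must be replaced.
SWITCHES=(
  "sw-leaf-bmc-001 10.254.0.4"
  "sw-spine-001 10.254.0.2"
  "sw-spine-002 10.254.0.3"
)

# Folder that was passed to `canu generate network config --folder`.
OUTPUT_DIR="output"

build_validate_commands() {
  local entry name ip
  for entry in "${SWITCHES[@]}"; do
    # Split the entry into hostname and IP address.
    read -r name ip <<< "${entry}"
    printf 'canu validate switch config --ip %s --generated %s/%s.cfg\n' \
      "${ip}" "${OUTPUT_DIR}" "${name}"
  done
}

build_validate_commands
```

Printing the commands instead of running them keeps the review step explicit, which matters here because CANU suggestions such as removing the `snmpv3` user must not be applied blindly.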
+ +## FMN Booting + +Upon completion of the FMN add procedure, the corresponding FMN entries will be populated in SLS, HSM, and BSS. The required network, storage, and other cloud-init configurations are added to BSS and are applied when the FMN boots. + +Proceed to boot the FMNs (using iPXE boot commands) with the FMN bare-metal base image; see [Boot NCN](../node_management/Add_Remove_Replace_NCNs/Boot_NCN.md#boot-ncn). + +## FMN Post Boot + +### Join Fabric Manager nodes to Spire + +After the Fabric Manager nodes have been deployed and are running, join them to Spire to avoid issues with Spire tokens. + +```bash +ncn-m001:~ # /opt/cray/platform-utils/spire/fix-spire-on-fmn.sh +``` + +### Validation + +#### Validate the successful bring-up of the base FMNs + +1. Check that both FMNs (`fmn001` and `fmn002`) are accessible: + +```bash +ncn-m001:~ # ssh fmn001 +Last login: Thu Dec 4 11:25:30 2025 from 10.252.1.10 +... +``` + +```bash +ncn-m001:~ # ssh fmn002 +Last login: Thu Dec 4 05:03:46 2025 from 10.252.1.10 +... +``` + +1. Check that both FMNs are shown in the `sat status` output: + +```bash +ncn-m001:~ # sat status | grep fmn +``` + +```text +INFO: All values for 'Most Recent Session Template' are 'MISSING', omitting key. +| x3000c0s28b0n0 | fmn001 | Node | 100011 | On | OK | True | X86 | River | Management | FabricManager | Sling | True | fmn-bm-default-configuration | configured | 0 | stable | MISSING | MISSING | +| x3000c0s29b0n0 | fmn002 | Node | 100012 | On | OK | True | X86 | River | Management | FabricManager | Sling | True | fmn-bm-default-configuration | configured | 0 | stable | MISSING | MISSING | +``` + +1. 
Optionally check more details on the FMN nodes + +For Example: + +```bash +ncn-m001:~ # XNAME=x3000c0s28b0n0 +``` + +```bash +ncn-m001:~ # cray hsm state components describe "${XNAME}" --format toml +``` + +```text +ID = "x3000c0s28b0n0" +Type = "Node" +State = "On" +Flag = "OK" +Enabled = true +Role = "Management" +SubRole = "FabricManager" +NID = 100011 +NetType = "Sling" +Arch = "X86" +Class = "River" +``` + +#### Validate FMN required networking configuration + +Check NMN, CMN, HMN, CHN, metal and virtual IP configuration for both FMN nodes (`fmn001` and `fmn002`). +**Note:** NMN and HMN should be having additional FMN VIPs also allocated. + +```bash +ncn-m001:~ # cray sls networks list +``` + +```text +... +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-cmn", "time-cmn", "time-cmn.local",] +Comment = "x3000c0s28b0n0" +IPAddress = "10.102.193.42" +Name = "fmn001" + +... +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-mtl", "time-mtl", "time-mtl.local",] +Comment = "x3000c0s28b0n0" +IPAddress = "10.1.1.10" +Name = "fmn001" +... + +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-nmn", "time-nmn", "time-nmn.local", "x3000c0s28b0n0", "fmn001.local",] +Comment = "x3000c0s28b0n0" +IPAddress = "10.252.1.13" +Name = "fmn001" + +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn-vip.local",] +Comment = "fmn-virtual-ip" +IPAddress = "10.252.1.4" +Name = "fmn-vip" +... + +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-mgmt",] +Comment = "x3000c0s28b0" +IPAddress = "10.254.1.21" +Name = "x3000c0s28b0" + +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-hmn", "time-hmn", "time-hmn.local",] +Comment = "x3000c0s28b0n0" +IPAddress = "10.254.1.22" +Name = "fmn001" + +[[results.ExtraProperties.Subnets.IPReservations]] +Comment = "fmn-virtual-ip" +IPAddress = "10.254.1.2" +Name = "fmn-vip" +... 
+ +[[results.ExtraProperties.Subnets.IPReservations]] +Aliases = [ "fmn001-chn", "time-chn", "time-chn.local",] +Comment = "x3000c0s28b0n0" +IPAddress = "10.102.193.206" +Name = "fmn001" +``` + +#### SLS hardware should list the new nodes + +For Example: + +```bash +cray sls hardware describe x3000c0s28b0n0 +``` + +Example Output: + +```text +Parent = "x3000c0s28b0" +Xname = "x3000c0s28b0n0" +Type = "comptype_node" +Class = "River" +TypeString = "Node" +LastUpdated = 1770352943 +LastUpdatedTime = "2026-02-06 04:42:23.048807 +0000 +0000" + +[ExtraProperties] +Aliases = [ "fmn001",] +NID = 100011 +Role = "Management" +SubRole = "FabricManager" +``` + +#### HSM `ethernetInterfaces` should be updated with the same allocated IPs + +For Example: + +```bash +cray hsm inventory ethernetInterfaces list --component-id x3000c0s28b0n0 --format json +``` + +Example Output: + +```json +[ + { + "ID": "1423f200029a", + "Description": "", + "MACAddress": "14:23:f2:00:02:9a", + "LastUpdate": "2026-02-06T12:57:53.593753Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [] + }, + { + "ID": "1423f2028e93", + "Description": "", + "MACAddress": "14:23:f2:02:8e:93", + "LastUpdate": "2026-02-06T12:57:53.47447Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [] + }, + { + "ID": "1423f200029b", + "Description": "", + "MACAddress": "14:23:f2:00:02:9b", + "LastUpdate": "2026-02-06T12:57:53.515843Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [] + }, + { + "ID": "00e0ed3210ed", + "Description": "CSI Handoff MAC", + "MACAddress": "00:e0:ed:32:10:ed", + "LastUpdate": "2026-02-06T12:57:53.362998Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [] + }, + { + "ID": "1423f2028e92", + "Description": "Bond0 - bond0.nmn0- kea", + "MACAddress": "14:23:f2:02:8e:92", + "LastUpdate": "2026-02-06T13:00:14.114892Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [ + { + "IPAddress": 
"10.252.1.13" + }, + { + "IPAddress": "10.102.193.42" + }, + { + "IPAddress": "10.1.1.10" + }, + { + "IPAddress": "10.102.193.205" + }, + { + "IPAddress": "10.254.1.22" + } + ] + }, + { + "ID": "00e0ed3210ec", + "Description": "CSI Handoff MAC", + "MACAddress": "00:e0:ed:32:10:ec", + "LastUpdate": "2026-02-06T12:57:53.327619Z", + "ComponentID": "x3000c0s28b0n0", + "Type": "Node", + "IPAddresses": [] + } +] +``` + +#### BSS should be updated with new hosts entries for FMN with proper configurations + +**Note:** BSS global parameters also should have been populated with FMN IPs and FMN VIP + +For Example: + +```bash +cray bss bootparameters list --format json --name x3000c0s28b0n0 +``` + +```bash +cray bss bootparameters list --hosts Global --format json +``` + +#### Validate FMN required storage configuration (LVM partitions) + +Check if both LVM partitions `/dev/mapper/metalvg0-SCFIRMWARE` and `/dev/mapper/metalvg0-SLINGSHOT` created and mounted under `/opt/cray/FW/sc-firmware` and `/opt/slingshot` respectively on both FMN nodes (`fmn001` and `fmn002`). 
+ +```bash +fmn001:~ # lsblk +``` + +```text +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +loop0 7:0 0 2.2G 1 loop /run/rootfsbase +sda 8:0 0 3.5T 0 disk +├─sda1 8:1 0 476M 0 part +│ └─md127 9:127 0 475.9M 0 raid1 /metal/recovery +├─sda2 8:2 0 22.8G 0 part +│ └─md125 9:125 0 22.8G 0 raid1 /run/initramfs/live +├─sda3 8:3 0 139.7G 0 part +│ └─md124 9:124 0 139.6G 0 raid1 /run/initramfs/overlayfs +└─sda4 8:4 0 139.7G 0 part + └─md126 9:126 0 279.1G 0 raid0 + ├─metalvg0-SCFIRMWARE 254:0 0 80G 0 lvm /opt/cray/FW/sc-firmware + └─metalvg0-SLINGSHOT 254:1 0 120G 0 lvm /opt/slingshot +sdb 8:16 0 3.5T 0 disk +├─sdb1 8:17 0 476M 0 part +│ └─md127 9:127 0 475.9M 0 raid1 /metal/recovery +├─sdb2 8:18 0 22.8G 0 part +│ └─md125 9:125 0 22.8G 0 raid1 /run/initramfs/live +├─sdb3 8:19 0 139.7G 0 part +│ └─md124 9:124 0 139.6G 0 raid1 /run/initramfs/overlayfs +└─sdb4 8:20 0 139.7G 0 part + └─md126 9:126 0 279.1G 0 raid0 + ├─metalvg0-SCFIRMWARE 254:0 0 80G 0 lvm /opt/cray/FW/sc-firmware + └─metalvg0-SLINGSHOT 254:1 0 120G 0 lvm /opt/slingshot +sdc 8:32 0 3.5T 0 disk +sdd 8:48 0 3.5T 0 disk +``` + +```bash +fmn001:~ # mount | grep /opt/cray/FW/sc-firmware +/dev/mapper/metalvg0-SCFIRMWARE on /opt/cray/FW/sc-firmware type ext4 (rw,relatime,stripe=256) +``` + +```bash +fmn001:~ # mount | grep /opt/slingshot +/dev/mapper/metalvg0-SLINGSHOT on /opt/slingshot type ext4 (rw,relatime,stripe=256) +``` + +#### Validate addition of FM required repositories + +Check if all the required repos are added on both FMN nodes (`fmn001` and `fmn002`) in order to install prerequisite OS RPMs required during Slingshot Software installation. + +For Example: + +```bash +fmn001:~ # zypper lr +``` + +```text +Repository priorities are without effect. All enabled repositories share the same priority. 
+ +# | Alias | Name | Enabled | GPG Check | Refresh +---+-----------------------------------------------------------------------+--------------------------------------------------+---------+-----------+-------- + 1 | SUSE-25.7.250709-SLE-Module-Development-Tools-15-SP6-x86_64-Pool | SUSE-25.7.250709-SLE-Module-Development-Tools--> | Yes | ( ) No | Yes + 2 | SUSE-25.7.250709-SLE-Module-Legacy-15-SP7-x86_64-Updates | SUSE-25.7.250709-SLE-Module-Legacy-15-SP7-x86_-> | Yes | ( ) No | Yes + 3 | SUSE-25.7.250709-SLE-Module-Server-Applications-15-SP7-x86_64-Pool | SUSE-25.7.250709-SLE-Module-Server-Application-> | Yes | ( ) No | Yes + 4 | SUSE-SLE-Module-Basesystem-15-SP6-x86_64-Pool | SUSE-SLE-Module-Basesystem-15-SP6-x86_64-Pool | Yes | ( ) No | Yes + 5 | SUSE-SLE-Module-Containers-15-SP7-x86_64-Updates | SUSE-SLE-Module-Containers-15-SP7-x86_64-Updates | Yes | ( ) No | Yes + 6 | csm-embedded | csm-embedded | Yes | ( ) No | Yes + ... +``` + +### Install Fabric Manager on FM baremetal nodes + +To install or upgrade Fabric Manager on the FMNs, refer to section "3 Install HPE Slingshot Fabric Manager software on bare metal servers" in the _HPE Slingshot Installation Guide for CSM_ PDF. diff --git a/operations/fm_on_baremetal/README.md b/operations/fm_on_baremetal/README.md new file mode 100644 index 0000000000000..2c9887002f2e6 --- /dev/null +++ b/operations/fm_on_baremetal/README.md @@ -0,0 +1,62 @@ +# Slingshot Fabric Manager on baremetal + +- [Introduction](#introduction) +- [Terminology and Components](#terminology-and-components) +- [Architecture](#architecture) +- [Configure FM on baremetal](#configure-fm-on-baremetal) +- [Slingshot Switch Firmware Update](#slingshot-switch-firmware-update) + +## Introduction + +Enabling the Slingshot Fabric Manager (FM) on bare metal within the Cray System Management (CSM) framework introduces dedicated Fabric Manager Nodes (FMNs) +that manage and monitor Slingshot fabric operations outside of the Kubernetes environment. 
+ +CSM 1.7.1 includes bare-metal FM support, which provides the necessary base OS image, networking, and storage configurations for running the Slingshot Fabric Manager natively within the CSM environment. + +Please note that this feature is only available from CSM 1.7.1 onwards. + +**NOTE**: + +- `FMNs` are considered Management nodes. +- After Fabric Manager is migrated from a Kubernetes pod to bare-metal infrastructure, it cannot be reverted. +- The two `FMNs` must be part of two different management racks to support Rack Resiliency. +- This feature will not be supported on systems with Dell/ Mellanox based management networks. + +## Terminology and Components + +| *Component* | *Reference* | +| --------------------------------------------- | ------------------------------------------------------------------------------------- | +| SHS | [Slingshot Host Software](../../glossary.md#slingshot-host-software-shs) | +| FM | [Fabric Manager](../../glossary.md#fabric-manager) | +| FMN | [Fabric Manager Node](../../glossary.md#fabric-manager-node) | +| SLS | [System Layout Service](../../glossary.md#system-layout-service-sls) | +| HSM | [Hardware State Manager](../../glossary.md#hardware-state-manager-hsm) | +| BSS | [Boot Script Service](../../glossary.md#boot-script-service-bss) | +| CANU | [CSM Automatic Network Utility](../../glossary.md#csm-automatic-network-utility-canu) | +| SAT | [System Admin Toolkit](../../glossary.md#system-admin-toolkit-sat) | +| SMA | [System Monitoring Application](../../glossary.md#system-monitoring-application-sma) | + +## Architecture + +In CSM versions <= 1.7.0, the deployment of the Fabric Manager within CSM uses native Kubernetes capabilities—both during upgrades and in failure/HA scenarios. +Kubernetes itself provides health checks and a scheduler that can rebalance workloads across nodes based on load, administrative policies, and other criteria. +The Fabric Manager is deployed as a single pod in Kubernetes. 
A traditional HA model for Fabric Manager doesn’t map cleanly into Kubernetes, so instead, Kubernetes’ +built-in mechanisms detect failures and spin up a replacement pod, minimizing downtime. + +In theory, this model should satisfy HA requirements: if the pod fails (or needs to be moved during an upgrade), Kubernetes can detect the fault and +recreate the Fabric Manager on another node, providing continuity. + +In practice, however, this approach does not meet the contractual HA obligations. Because of Kubernetes "best‑effort" scheduling and the resource demands +of the Fabric Manager, real service outages can exceed 5 minutes. + +To address these issues, CSM 1.7.1 includes FM on baremetal support, which provides the necessary base OS image, networking, and storage configurations +for running the Slingshot Fabric Manager natively within the CSM environment to achieve HA. + +## Configure FM on baremetal + +To configure FM on baremetal, follow the [procedure](Configure_FM_On_Baremetal.md). + +## Slingshot Switch Firmware Update + +- For clusters using the FM pod: CSM will continue to handle switch firmware uploads and updates as [before](../../operations/iuf/workflows/slingshot_management_network_switch_updates.md#perform-slingshot-switch-and-management-network-switch-firmware-updates). +- For clusters with bare-metal FM: the FMN will host the switch firmware, and FM will be responsible for managing switch updates. Refer to section "3.2.6 (Optional) Update HPE Slingshot switch firmware" in the _HPE Slingshot Installation Guide for CSM_ PDF. diff --git a/operations/fm_on_baremetal/Redeploy_Fabric_Manager_Nodes.md b/operations/fm_on_baremetal/Redeploy_Fabric_Manager_Nodes.md new file mode 100644 index 0000000000000..82655dd7379a9 --- /dev/null +++ b/operations/fm_on_baremetal/Redeploy_Fabric_Manager_Nodes.md @@ -0,0 +1,130 @@ +# Redeploy Fabric Manager Nodes + +> **OPTIONAL:** This procedure is only applicable if Fabric Manager nodes were deployed during the CSM installation. 
+ +Although Fabric Manager Nodes (FMNs) were deployed during the CSM installation, the initial deployment would not include the final image +with all necessary components. Once the other HPE Cray EX software products have been installed via the Install and Upgrade Framework (IUF), +the FMNs need to be redeployed with the new customized image. + +## Prerequisites + +- CSM installation has been completed +- Additional HPE Cray EX software products have been installed via IUF +- The Cray CLI is configured and authenticated +- SAT is configured and authenticated + +## Procedure + +### 1. Build the FMN image + +Follow the procedure in the **FMN Base Image Creation** section of [Configure FM (Fabric Manager) On Baremetal](Configure_FM_On_Baremetal.md#fmn-base-image-creation) to build the new FMN base image. + +This procedure will: + +- Create a `sat bootprep` configuration file for FMN +- Execute `sat bootprep run` to generate the new FMN image and upload it to S3 +- Produce a new IMS image ID for the customized FMN image + +### 2. Update BSS with the new FMN image + +Once the new FMN image has been built and uploaded to S3, update the boot parameters in the Boot Script Service (BSS) to point the FMNs to the new image. + +1. (`ncn-mw#`) Set an environment variable for the new IMS image ID. + + After running `sat bootprep run`, obtain the IMS resultant image ID from the output or from the CFS session: + + ```bash + NEW_IMS_IMAGE_ID="" + ``` + +1. (`ncn-mw#`) Determine the component names (xnames) of the FMNs. + + ```bash + cray hsm state components list --role Management --subrole FabricManager --format json | jq -r '.Components[].ID' + ``` + + Example output: + + ```text + x3000c0s28b0n0 + x3000c0s29b0n0 + ``` + +1. (`ncn-mw#`) Update the boot parameters for the FMNs. + + Replace the `` placeholders with the actual xnames of the FMNs. 
+ + ```bash + /usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \ + -p "${NEW_IMS_IMAGE_ID}" + ``` + + For example: + + ```bash + /usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \ + -p "${NEW_IMS_IMAGE_ID}" x3000c0s28b0n0 x3000c0s29b0n0 + ``` + +1. (`ncn-mw#`) Verify the boot parameters have been updated. + + ```bash + cray bss bootparameters list --name --format json | jq -r '.[0].params' | grep metal.server + ``` + + The output should show the new IMS image ID in the `metal.server` parameter. + +1. (`ncn-mw#`) Set `metal.no-wipe=0` to allow the disk to be wiped during redeployment. + + Run the following command for each FMN. + + For example: + + ```bash + TARGET_XNAME=x3000c0s28b0n0 + csi handoff bss-update-param --set metal.no-wipe=0 --limit "${TARGET_XNAME}" + ``` + + Expected output: + + ```text + 2026/06/05 11:33:35 TOKEN was not set. Attempting to read API token from Kubernetes directly ... + 2026/06/05 11:33:35 Getting management NCNs from SLS... + 12 + 2026/06/05 11:33:35 Done getting management NCNs from SLS. + 2026/06/05 11:33:35 Updating NCN kernel parameters... + 2026/06/05 11:33:35 Successfully PUT BSS entry for x3000c0s28b0n0 + 2026/06/05 11:33:35 Done updating NCN kernel parameters. + ``` + + Repeat for each FMN. + +1. (`ncn-mw#`) Verify the change. + + ```bash + cray bss bootparameters list --name "${TARGET_XNAME}" --format=json | jq -r '.[0].params' | grep metal.no-wipe + ``` + + The output should show `metal.no-wipe=0`. + +### 3. Redeploy the FMNs + +After updating BSS with the new image and setting `metal.no-wipe=0`, redeploy the FMNs to apply the new image. + +Follow the [Boot NCN](../node_management/Add_Remove_Replace_NCNs/Boot_NCN.md) procedure for each Fabric Manager node. 
This procedure will: + +- Set the PXE boot option and power on the node +- Monitor the boot process +- Set `metal.no-wipe=1` after successful boot to preserve data on future reboots + +**Note:** Skip the sections in Boot NCN that are specific to master, worker, or storage nodes (such as verifying cluster membership or Ceph operations). + +### 4. Join Fabric Manager nodes to Spire + +After the Fabric Manager nodes have been redeployed and are running with the new image, join them to Spire to avoid issues with Spire tokens. + +1. (`ncn-mw#`) Join Spire on the Fabric Manager nodes. + + ```bash + /opt/cray/platform-utils/spire/fix-spire-on-fmn.sh + ``` diff --git a/operations/node_management/Add_Remove_Replace_NCNs/Add_NCN_Data.md b/operations/node_management/Add_Remove_Replace_NCNs/Add_NCN_Data.md index 05e7cf6545c73..0265e5e7ac8f7 100644 --- a/operations/node_management/Add_Remove_Replace_NCNs/Add_NCN_Data.md +++ b/operations/node_management/Add_Remove_Replace_NCNs/Add_NCN_Data.md @@ -357,6 +357,23 @@ The NCN MAC addresses need to be collected using the [Collect NCN MAC Addresses] --mac-lan1 b8:59:9f:d9:9d:e9 ``` + * For FMNs (Fabric Manager Nodes), where the alias matches `fmn00*`, pass the additional `--fmn-image-id` parameter with the FMN base image ID + generated in the [FMN base image creation stage](../../fm_on_baremetal/Configure_FM_On_Baremetal.md#fmn-base-image-creation). + + For example, if the FMN base image ID is `06135c73-bcd9-4d38-928f-ada20bdf6a6f`: + + ```bash + cd /usr/share/doc/csm/scripts/operations/node_management/Add_Remove_Replace_NCNs/ + ./add_management_ncn.py ncn-data \ + --xname "${XNAME}" \ + --alias "${NODE}" \ + --fmn-image-id 06135c73-bcd9-4d38-928f-ada20bdf6a6f \ + --mac-mgmt0 a4:bf:01:65:6a:aa \ + --mac-mgmt1 a4:bf:01:65:6a:ab \ + --mac-lan0 b8:59:9f:d9:9d:e8 \ + --mac-lan1 b8:59:9f:d9:9d:e9 + ``` + 1. 
(`ncn-mw#`) Run the `add_management_ncn.py` script again, adding the `--perform-changes` argument to the command run in the previous step: > ***NOTE*** Depending on the networking configuration of the system the CMN or CAN networks diff --git a/operations/node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md b/operations/node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md index d8c5487c99a3e..81f2d30dd403b 100644 --- a/operations/node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md +++ b/operations/node_management/Add_Remove_Replace_NCNs/Add_Remove_Replace_NCNs.md @@ -11,7 +11,7 @@ Add, remove, replace, or move non-compute nodes (NCNs). This applies to worker, The following workflows are available: * [Prerequisites](#prerequisites) -* [Add worker, storage, or master NCNs](#add-worker-storage-or-master-ncns) +* [Add worker, storage, master or FMN NCNs](#add-worker-storage-master-or-fmn-ncns) * [Add NCN prerequisites](#add-ncn-prerequisites) * [Add NCN procedure](#add-ncn-procedure) * [Remove worker, storage, or master NCNs](#remove-worker-storage-or-master-ncns) @@ -38,7 +38,15 @@ The latest CSM documentation has been installed on the master nodes. See [Check ./ncn_add_pre-req.py ``` - The script will ask the following question: + Note: When adding FMNs (Fabric Manager Nodes) to CSM, the script also prompts to confirm whether the nodes being added are FMNs: + + ```text + Please answer with yes or no. + Are the NCNs to be added are Fabric Manager Nodes (FMNs)? [y/N] + y + ``` + + Overall, the `ncn_add_pre-req.py` script prompts the user with the following questions: ```text How many NCNs would you like to add? Do not include NCNs to be removed or moved. @@ -53,6 +61,10 @@ The latest CSM documentation has been installed on the master nodes. See [Check How many NCNs would you like to add? Do not include NCNs to be removed or moved. 10 + Please answer with yes or no. 
+ Are the NCNs to be added are Fabric Manager Nodes (FMNs)? [y/N] + N + You are about to make DESTRUCTIVE changes to the system. If you are sure you want to proceed. Please type: PROCEED @@ -150,9 +162,9 @@ The latest CSM documentation has been installed on the master nodes. See [Check Restarting cray-dhcp-kea ``` -## Add worker, storage, or master NCNs +## Add worker, storage, master or FMN NCNs -Use this procedure to add a worker, storage, or master NCN. +Use this procedure to add a worker, storage, master, or FMN (Fabric Manager Node) NCN. ### Add NCN prerequisites @@ -255,4 +267,4 @@ In general, scaling master nodes is not recommended because it can cause Etcd la The following is a high-level overview of the replace NCN workflow: 1. [Remove Worker, Storage, or Master NCNs](#remove-worker-storage-or-master-ncns) -1. [Add Worker, Storage, or Master NCNs](#add-worker-storage-or-master-ncns) +1. [Add worker, storage, master or FMN NCNs](#add-worker-storage-master-or-fmn-ncns) diff --git a/operations/node_management/Add_Remove_Replace_NCNs/Boot_NCN.md b/operations/node_management/Add_Remove_Replace_NCNs/Boot_NCN.md index 0b46e98dea2b4..7fc447304b469 100644 --- a/operations/node_management/Add_Remove_Replace_NCNs/Boot_NCN.md +++ b/operations/node_management/Add_Remove_Replace_NCNs/Boot_NCN.md @@ -262,3 +262,7 @@ Follow [Add Ceph Node](../../utility_storage/Add_Ceph_Node.md) to join the added Proceed to [Redeploy Services](Redeploy_Services.md) or return to the main [Add, Remove, Replace, or Move NCNs](Add_Remove_Replace_NCNs.md) page. + +**Note:** + +* For FMNs, the remaining steps can be skipped. 
diff --git a/scripts/operations/configuration/apply_csm_configuration.sh b/scripts/operations/configuration/apply_csm_configuration.sh index 4887fff783335..90c100f9a7a7e 100755 --- a/scripts/operations/configuration/apply_csm_configuration.sh +++ b/scripts/operations/configuration/apply_csm_configuration.sh @@ -2,7 +2,7 @@ # # MIT License # -# (C) Copyright 2021-2025 Hewlett Packard Enterprise Development LP +# (C) Copyright 2021-2026 Hewlett Packard Enterprise Development LP # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), @@ -196,7 +196,8 @@ BACKUP_NCN_CONFIG_FILE=$(run_mktemp --tmpdir="${TMPDIR}" "backup-${CONFIG_NAME}- if [[ -z ${XNAMES} ]]; then echo "Retrieving a list of all management node component names (xnames)" - XNAMES=$(cray hsm state components list --role Management --type Node --format json | jq -r '.Components | map(.ID) | join(",")') + echo "NOTE: FabricManager nodes are excluded from configuration" + XNAMES=$(cray hsm state components list --role Management --type Node --format json | jq -r '.Components | map(select(.SubRole != "FabricManager")) | map(.ID) | join(",")') [[ -n ${XNAMES} ]] || err_exit "No management nodes found in HSM" fi XNAME_LIST=${XNAMES//,/ } diff --git a/upgrade/README.md b/upgrade/README.md index fa6cf626c5c63..161c3f9db30cd 100644 --- a/upgrade/README.md +++ b/upgrade/README.md @@ -9,6 +9,7 @@ software. Choose the appropriate procedure from the sections below. * [Option 2: Upgrade only additional HPE Cray EX software products](#option-2-upgrade-only-additional-hpe-cray-ex-software-products) * [Option 3: Upgrade only CSM](#option-3-upgrade-only-csm) * [CSM patch version upgrade](#csm-patch-version-upgrade) +* [FM On Baremetal](#fm-on-baremetal) ## Release Notes @@ -56,3 +57,7 @@ If there are multiple patch versions available, note that there is no need to pe CSM 1.7.1 patch upgrades. 
Instead, consider upgrading to the latest CSM 1.7.1 patch release. * [CSM 1.7.1 Patch Installation Instructions](1.7.1/README.md) + +## FM On Baremetal + +After upgrading from CSM 1.7.0 to CSM 1.7.1, administrators who want to enable Fabric Manager on baremetal must follow the [procedure](../operations/fm_on_baremetal/README.md#fm-fabric-manager-on-baremetal).