Enterprise SAN Switch Upgrade
08/10/2022

Introduction
In an Enterprise setting, upgrading storage infrastructure is quite different from running updates on your home PC, or at least it should be. While updates expand functionality, simplify interfaces, fix bugs and close vulnerabilities, they can also introduce new bugs and vulnerabilities. Some of these new bugs only surface under conditions that happen to exist in your environment. In an Enterprise environment where many users, and sometimes customers, rely upon the storage infrastructure, the impact of an issue caused by an upgrade can be broad and can affect business credibility, potentially with legal ramifications. Therefore, a process that mitigates as many risks as possible is a necessity. The process presented here is a general framework with specific steps for Cisco and Brocade SAN switch upgrades.
Overview
The process described here at a high level is a good general framework for any shared infrastructure upgrade in an Enterprise environment.
- Planning
  - Document a cross section of the current environment from the CMDB and/or direct system inquiry.
    - (Server hardware model, OS and adapter model/firmware/driver, as well as SAN switch model/firmware and current storage model/code level)
  - Ensure the SAN infrastructure is under vendor support so that code may be downloaded and support may be engaged if any problems are encountered.
  - Download and review the release notes for the three most recent code releases.
  - Use vendor interoperability documents or web applications to validate supportability in your environment using the previously gathered information.
  - Choose the target code level. (N-1 is often preferred over N, the bleeding-edge latest release, unless N-1 has significant vulnerabilities or is incompatible with your environment.)
- Preparation
  - Download the target release installation code and any upgrade test utilities provided by the vendor.
  - Upload the target code and test utility, then run the test utility. (Clean up old diagnostics and install images that are no longer needed to free space for the new code and the upgrade process.)
  - Run initial health checks on the storage systems.
  - Gather connectivity information from SAN and storage devices and verify connection and path redundancy.
  - Initiate a resolution plan for any identified issues before scheduling the upgrade.
  - Submit change control and obtain approval for the upgrade.
- Upgrade
  - Rerun the upgrade test utility to verify that issues are still resolved.
  - Perform health checks and gather interface status showing pre-upgrade connectivity.
  - Clear stats and logs so that all subsequent events relate to the upgrade.
  - Run a configuration backup and a diagnostic snapshot and write the logs to a file, downloading each to a central configuration repository.
  - Initiate any prerequisite component microcode upgrades (transceiver firmware, etc.) and validate completion.
  - Initiate the system update and monitor the upgrade process.
  - Upon completion, validate the upgrade, perform health checks, gather post-upgrade interface status and validate the connectivity of dependent systems.
SAN Switch Upgrade Planning
1. The first step is to identify all Cisco and Brocade SAN switches by querying the CMDB or inventory lists and to document them along with their current code levels. Verify that the switches are covered by a vendor support and maintenance contract.
| Switch Name | Access URL/IP | MFG Type-Model | Location | Serial Number | Version |
|---|---|---|---|---|---|
2. Next, query the SAN switches for lists of the hosts attached to them and import these lists into a spreadsheet. Then query the CMDB to obtain a list of the systems in the environment along with their OS and hardware model information, and pull this information into the same spreadsheet. Cross-reference the lists and create a report by OS and hardware.
Device, Software, Host OS, and SAN
| Component | Type | Vendor | Model-Type | Code Levels | HBA Models | HBA Drivers |
|---|---|---|---|---|---|---|
| Power VC | App | IBM | | 1.3.2.1 | | |
| HMC | Appliance | IBM | 7042-CR7 | 8.2.0 | | |
| Power 750 | Server | IBM | 8408-E8D | VIOS 2.2.3.3, VIOS 2.2.4.10, VIOS 2.2.4.22 | FC 5273, FC 5735 | 10DF:F100-202307, 10DF:F100-202307, 10DF:F100-203305 |
| Power S824 | Server | IBM | 8286-428 | | FC 5273, FC 5735 | 10DF:F100-202307, 10DF:F100-202307, 10DF:F100-203305 |
| Redhat Linux | OS | Redhat | RHEL, CENTOS | 7.9, 8.2, 8.4 | | |
| Fibre Switch | SAN Switch | Cisco | MDS 9148 | 8.4(2c) | | |
| FS900 | Storage | IBM | 9840-AE2 | 1.6.4.1 | | |
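The cross-referencing in step 2 is usually done in a spreadsheet, but it can also be scripted. The following is a minimal sketch, assuming the switch host list and the CMDB export have already been loaded into Python; the function name and the record keys `host`, `os` and `model` are hypothetical and not taken from any vendor tool.

```python
from collections import defaultdict


def cross_reference(switch_hosts, cmdb_records):
    """Group SAN-attached hosts by (OS, hardware model) using CMDB data.

    switch_hosts: host names gathered from the SAN switches.
    cmdb_records: dicts with the hypothetical keys "host", "os", "model".
    Returns (report, unmatched): report maps (os, model) to a sorted list
    of hosts; unmatched lists switch hosts with no CMDB record.
    """
    cmdb_by_host = {rec["host"].lower(): rec for rec in cmdb_records}
    report = defaultdict(list)
    unmatched = []
    # Normalize case and de-duplicate before matching against the CMDB.
    for host in sorted({name.lower() for name in switch_hosts}):
        rec = cmdb_by_host.get(host)
        if rec is None:
            unmatched.append(host)
        else:
            report[(rec["os"], rec["model"])].append(host)
    return dict(report), unmatched
```

Hosts that appear on a switch but not in the CMDB are worth chasing down before the upgrade, since their OS and HBA levels cannot be checked against the interoperability matrices.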
3. Download the release notes for the three latest microcode releases from the vendors supporting the SAN infrastructure identified previously.
Cisco – MDS SAN Switch NX-OS
Cisco MDS 9000 Recommended Releases
All Cisco MDS 9000 NX-OS Documentation
4. Use vendor interoperability documents or web applications to validate supportability in your environment. Cross-reference the previously gathered information with the support matrices, or enter it into the interoperability databases, to determine whether the target SAN microcode is supported and whether it imposes any code requirements on host adapters and storage arrays.
Cisco MDS 9000 SAN Switch Interoperability
IBM Storage, SAN and Server Interoperability Database
Dell EMC eLab Interoperability Database
FlashArray Compatibility Matrix – Pure Technical Services (purestorage.com)
5. Review the documentation, including release notes, interoperability data and upgrade path information. Determine the target code level from the releases that support your hardware, giving priority to N-1 code levels unless significant vulnerabilities are fixed by the latest (N) code level.
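As a rough illustration of the N-1 preference, the sketch below orders NX-OS style release strings such as 8.4(2c) and picks the second-newest one. It assumes that "N-1" simply means the release immediately before the newest in the list of releases you have already validated as supported; the function names are hypothetical, and the choice still needs the human review of release notes described above.

```python
import re

# Matches release strings such as "8.4(2c)" or "9.3(2)".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def version_key(release):
    """Return a sortable tuple for an NX-OS style release string."""
    match = _VERSION_RE.match(release.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, maint, suffix = match.groups()
    return (int(major), int(minor), int(maint), suffix)


def pick_target(supported_releases, prefer_latest=False):
    """Pick N-1 from the supported releases, or N when prefer_latest is set.

    prefer_latest models the case where the latest release fixes a
    significant vulnerability and therefore overrides the N-1 preference.
    """
    ordered = sorted(set(supported_releases), key=version_key)
    if not ordered:
        raise ValueError("no supported releases supplied")
    if prefer_latest or len(ordered) == 1:
        return ordered[-1]
    return ordered[-2]
```

For example, `pick_target(["8.4(2c)", "9.3(2)", "9.2(2)"])` selects the middle release, while passing `prefer_latest=True` selects the newest.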
6. Review the vendors' published best practices for SAN switch upgrades and determine whether any existing procedures need to be updated.
| Current Release | Nondisruptive Upgrade Paths and Ordered Upgrade Steps |
|---|---|
| MDS NX-OS | |
| 9.3(1) | Upgrade directly to MDS NX-OS Release 9.3(2) |
| 9.2(x) | Upgrade directly to MDS NX-OS Release 9.3(2) |
| 8.1(x) and above releases[1] | Upgrade directly to MDS NX-OS Release 9.3(2) |
| All 7.3(x) releases | Step 1. Upgrade directly to MDS NX-OS Release 8.1(1b) Step 2. Upgrade to MDS NX-OS Release 9.3(2) |
NX-OS upgrade Best Practices for MDS switches – Cisco Community
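The upgrade path table above can be encoded as a small lookup so that the ordered steps are generated rather than read off by eye. This is only a sketch of the four rows shown, with 9.3(2) as the target; the function name is hypothetical, releases outside the table raise an error instead of guessing, and the vendor's current upgrade guide remains the authority.

```python
import re

TARGET = "9.3(2)"
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\(\w+\)$")


def upgrade_steps(current):
    """Return the ordered releases to install to reach TARGET.

    Encodes only the rows of the table above. Ordering inside a release
    train is not checked, so the input is assumed to be older than TARGET.
    """
    release = current.strip()
    match = _RELEASE_RE.match(release)
    if match is None:
        raise ValueError(f"unrecognized release string: {current!r}")
    train = (int(match.group(1)), int(match.group(2)))
    if release == TARGET:
        return []  # already on the target release
    if (8, 1) <= train <= (9, 3):
        return [TARGET]  # 8.1(x) and above: direct upgrade
    if train == (7, 3):
        return ["8.1(1b)", TARGET]  # 7.3(x): two-step upgrade
    raise ValueError(f"{release} is not covered by the upgrade path table")
```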
SAN Switch Upgrade Preparation
- Prior to the upgrade, use the Cisco Device Manager to gather the latest Interfaces->FC-All and ->Flogi output, saving it to a directory under (%UserProfile% %OneDrive%)/{Org|Client}/reference/{data-center}/(san_fc-all|san_flogi|zones-all)/ with the file name {switchname}_(san_fc-all|san_flogi|zones)_YYYY-MM-DD.txt. Import these files into an Excel spreadsheet to verify that hosts have redundant connections to the fabric.
- Perform health checks on the redundant switches in the fabric to ensure that the alternate fabrics are healthy. This includes listing the last 200 entries in the log to look for dormant issues, listing hardware to confirm it is online, and listing any locks that may need to be cleared. Also note the count of up interfaces and flogi logins for comparison after the upgrade. The same number of connections should persist after the upgrade.
<code>terminal length 0
show interface brief | grep up | wc
show flogi database | wc
show version
show hardware
show environment
show system redundancy status
show module
show cfs lock
show zone status
show zone pending-diff
show log last 200 | include 'error|warning|alarm|failure|critical'
## OR
show logging | include 'error|warning|alarm|failure|critical' </code>
- Compare the count of up interfaces and flogi database entries between switch pairs; they should be the same or very close.
- Check the current versions of BIOS, kickstart and system against the upgrade path.
SAN01# show version
- Use show hardware, show module and show system redundancy status to check that every status is OK or active. Any failures must be resolved before the upgrade.
SAN01# show system redundancy status
- Ensure zones are applied and CFS locks are cleared.
- Check for any pending uncommitted zones.
- Check the logs for any indication of failing components.
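The redundancy and count comparisons in this section are normally done in a spreadsheet. A minimal scripted sketch follows, assuming the flogi exports have already been reduced to (host, fabric) pairs; the function names and the default tolerance of 2 are assumptions standing in for "the same or very close", not vendor guidance.

```python
from collections import defaultdict


def single_fabric_hosts(logins):
    """Return hosts that are logged in to fewer than two fabrics.

    logins: iterable of (host, fabric) pairs, for example built from the
    flogi exports of each switch in a redundant pair.
    """
    fabrics_by_host = defaultdict(set)
    for host, fabric in logins:
        fabrics_by_host[host].add(fabric)
    return sorted(
        host for host, fabrics in fabrics_by_host.items() if len(fabrics) < 2
    )


def counts_close(count_a, count_b, tolerance=2):
    """Return True when two switch counts differ by no more than tolerance."""
    return abs(count_a - count_b) <= tolerance
```

Any host returned by `single_fabric_hosts` would lose storage access while its only switch reloads, so it should be fixed or scheduled for an outage before the change.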
3. Verify that you have a copy of the current firmware on your TFTP/SCP server so that you have a backup in the event that you must return to the original version. If you do not, copy it from the switch to the TFTP/SCP server at this time.
4. Copy both microcode files, kickstart and system, to the switch, either by pushing them to the switch with scp or by pulling them from the switch with copy tftp/scp. List the bootflash and check the md5sum of each file to ensure the microcode is valid.
copy scp://{USER}@{HostIP}/var/mds/depot/m9100-s5ek9-kickstart-mz.8.4.2c.bin bootflash:
copy scp://{USER}@{HostIP}/var/mds/depot/m9100-s5ek9-mz.8.4.2c.bin bootflash:
dir bootflash:
show file bootflash:m9100-s5ek9-mz.8.4.2c.bin md5sum
show file bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin md5sum
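To confirm that the image on the switch matches the copy in your depot, the md5sum reported by the switch can be compared with one computed on the depot server. The sketch below uses Python's standard hashlib; the function names are hypothetical, and the vendor's published checksum for the image should still be the final reference.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, switch_md5):
    """Compare the depot copy with the md5sum value reported by the switch."""
    return file_md5(path) == switch_md5.strip().lower()
```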
5. Run the impact and incompatibility analysis against these files to see if there are any issues with your current switch hardware being targeted for upgrade.
show install all impact kickstart bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin system bootflash:m9100-s5ek9-mz.8.4.2c.bin
show incompatibility system bootflash:m9100-s5ek9-mz.8.4.2c.bin
6. Check whether there are any custom port-monitor configurations, make a note of them, and then remove them from the ports indicated.
show running-config | beg port-monitor
## Remove any port monitors
config t
no port-monitor name {policy-name from show running-config}
end
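When several switches carry custom policies, the removal commands can be generated from a saved running configuration instead of being typed by hand. This is a sketch only: it assumes each policy appears on a line beginning with `port-monitor name`, mirroring the command form used above, and the function name is hypothetical. Keep the saved configuration so the policies can be reapplied after the upgrade.

```python
def port_monitor_removal_commands(running_config):
    """Build the commands that remove every named port-monitor policy.

    running_config: text of a saved running configuration.
    Returns a list of CLI lines, or an empty list when no policy is found.
    """
    names = []
    for line in running_config.splitlines():
        parts = line.split()
        # A policy definition is assumed to look like: port-monitor name <policy>
        if len(parts) >= 3 and parts[:2] == ["port-monitor", "name"]:
            if parts[2] not in names:
                names.append(parts[2])
    if not names:
        return []
    removals = [f"no port-monitor name {name}" for name in names]
    return ["config t"] + removals + ["end"]
```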
SAN Switch Upgrade Preparation for Smart Licensing
Starting with Cisco MDS NX-OS version 9.3, support for standard licensing is deprecated and Smart Licensing is required. Your standard licenses need to be converted into Smart licenses by following the instructions in the link under "Registering Smart Licenses". The switches also need to be able to communicate with the Cisco licensing service over the internet: either the switches need HTTPS (port 443) access to the internet, or a local Cisco SSM On-Prem proxy server needs to be set up with access to the Cisco CSSM servers. Additionally, a switch may be using a permanent PAK port license, which is not affected by Smart Licensing; only port COD licenses and other licenses such as Enterprise features are affected.
Default and PAK port licenses that were enabled in SL 1.0 will continue to work after upgrading to SLP. Any new port licenses will need an authorization code to be installed. If DLC was not performed in SL 1.0, automatic DLC will not trigger in SLP. Contact Cisco TAC for migrating the licenses.
Registering Smart Licenses and Configuring Switches
Cisco MDS 9000 Series Licensing Guide, Release 9.x – Smart Licensing Using Policy [Cisco MDS 9000 NX-OS and SAN-OS Software] – Cisco
Cisco MDS Smart Licensing Using Policy Data Sheet – Cisco
Check the License Inventory Here:
https://software.cisco.com/software/smart-licensing/inventory
Enabling Smart Licensing on MDS Switches
If the switches have direct HTTPS access to the CSSM servers on the internet, configure them using the following steps.
1. Configure DNS on the MDS switches so they can resolve the CSSM server names.
show host
## If there are incorrect entries or default entries
## These will have to be removed with the no prefix before applying the following configurations
config t
no ip domain-name {incorrect.domain.name}
no ip domain-list {incorrect.domain.name}
end
copy r s
config t
ip domain-lookup
ip domain-name my.domain.com
ip domain-list my.other.domain.com
ip name-server 10.10.10.1 10.10.10.2
2. Gather current license configuration and store a copy on another system
## On newer NX-OS versions, "default" is replaced by "summary" and "host-id" by "udi"
show license default
show license usage
show license feature package mapping
show license host-id
show port-lic
copy licenses bootflash:$(SWITCHNAME)_licenses.tar
copy $(SWITCHNAME)_licenses.tar scp://{USER}@{HostIP}/var/mds/cfgs/
3. Enable smart licensing
config t
license smart enable
feature license smart
4. Obtain the Smart License trust token and install it on the switch
## Install the smart trust token
license smart trust idtoken YXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX all
license smart sync all
5. SLP configuration for the Connected Directly to CSSM topology. (Note: older NX-OS code versions may not support enabling smart transport, so this may need to be enabled after the upgrade.)
config t
license smart transport smart
license smart url smart https://smartreceiver.cisco.com/licservice/license
copy r s
exit
## Enable the ivr feature on core switches for the Enterprise Feature set
config t
feature ivr
copy r s
exit
## NX-OS 9.x and above; check status of smart connection
show license summary
show license status
show license usage
SAN Switch Upgrade Implementation
1. Make backups of the running configuration, flogi database, interface stats and event logs, and gather a diagnostic snapshot. The diagnostic snapshot can be supplied when opening a case with support if the switch becomes inaccessible. Copy the files generated by these commands to a central repository server, or gather them to your local system using WinSCP.
show logging logfile > $(SWITCHNAME)-$(TIMESTAMP)_logs.log
show flogi database > $(SWITCHNAME)-$(TIMESTAMP)_flogi.log
show interface > $(SWITCHNAME)-$(TIMESTAMP)_int.log
show interface counters > $(SWITCHNAME)-$(TIMESTAMP)_count.log
show tech detail > $(SWITCHNAME)-$(TIMESTAMP)_tech.log
copy running-config bootflash:$(SWITCHNAME)-$(TIMESTAMP).cfg
dir bootflash:
## Examples for Sterling and Dallas
# {Location}
copy $(SWITCHNAME)-*_flogi.log scp://{USER}@{HostIP}/var/mds/logs/
copy $(SWITCHNAME)-*_logs.log scp://{USER}@{HostIP}/var/mds/logs/
copy $(SWITCHNAME)-*_tech.log scp://{USER}@{HostIP}/var/mds/logs/
copy $(SWITCHNAME)-*_int.log scp://{USER}@{HostIP}/var/mds/logs/
copy $(SWITCHNAME)-*_count.log scp://{USER}@{HostIP}/var/mds/logs/
copy $(SWITCHNAME)-*.cfg scp://{USER}@{HostIP}/var/mds/logs/
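Before clearing anything on the switch, it is worth confirming that every artifact actually reached the repository. The sketch below checks a directory listing for the six file suffixes used in the commands above; the function name is hypothetical, and it assumes the `{switch}-{timestamp}{suffix}` naming produced by those commands.

```python
EXPECTED_SUFFIXES = (
    "_logs.log",
    "_flogi.log",
    "_int.log",
    "_count.log",
    "_tech.log",
    ".cfg",
)


def missing_artifacts(switch_name, filenames):
    """Return the expected suffixes that have no file for this switch.

    filenames: names found in the repository directory, for example the
    result of os.listdir() on the logs directory.
    """
    prefix = switch_name + "-"
    missing = []
    for suffix in EXPECTED_SUFFIXES:
        found = any(
            name.startswith(prefix) and name.endswith(suffix)
            for name in filenames
        )
        if not found:
            missing.append(suffix)
    return missing
```

An empty result means all six artifacts are present for that switch; anything else should be re-copied before the logs and diagnostics are cleared in the next step.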
2. Clear the logs, cores, counters and diagnostics from the bootflash to free up space.
clear logging logfile
clear cores
clear counters interface all
delete bootflash:*_tech.log
3. Make sure the configuration has been saved by copying the running configuration to the startup configuration.
copy r s
4. Perform the switch upgrade. Replace the kickstart and system file names with those specific to the switch model being upgraded and the target code level. Review the install validation output and answer yes or no at the prompt to continue or abort the upgrade.
dir bootflash:
install all kickstart bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin system bootflash:m9100-s5ek9-mz.8.4.2c.bin
5. Upon upgrade completion, verify that connections to the fabric persist.
show interface brief | grep up | wc
show flogi database | wc
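Matching counts show that nothing was lost overall, but not which login is missing when the numbers differ. As a hedged sketch, the saved pre-upgrade flogi log can be compared with a post-upgrade capture by port WWN. It assumes the first colon-separated 8-byte WWN on each line is the port name; the function names are hypothetical.

```python
import re

# Eight colon-separated hex bytes, e.g. 10:00:00:00:c9:aa:bb:01
_WWN_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}\b")


def port_wwns(flogi_text):
    """Collect the first WWN found on each line of flogi database output."""
    wwns = set()
    for line in flogi_text.splitlines():
        match = _WWN_RE.search(line)
        if match:
            wwns.add(match.group(0).lower())
    return wwns


def login_diff(before_text, after_text):
    """Return (missing, new) port WWNs between two flogi captures."""
    before = port_wwns(before_text)
    after = port_wwns(after_text)
    return sorted(before - after), sorted(after - before)
```

Any WWN in the missing list can be looked up in the spreadsheet built during planning to identify the affected host.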
6. Verify the installation status and version and check the logs for any issues. Save command output artifacts for change validation.
terminal length 0
show install all status
show version
show install all impact
show logging last 100
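If the show version output is saved as a change artifact, the running kickstart and system versions can be checked against the target automatically. This sketch assumes the output contains lines of the form `kickstart: version 8.4(2c)` and `system: version 8.4(2c)`; that layout is an assumption about the output and may differ between releases, and the function names are hypothetical.

```python
import re

# Assumed line layout: "<component>:   version <release>"
_COMPONENT_RE = re.compile(
    r"^\s*(BIOS|kickstart|system):\s+version\s+(\S+)", re.MULTILINE
)


def running_versions(show_version_text):
    """Map each component name found in the output to its version."""
    return dict(_COMPONENT_RE.findall(show_version_text))


def upgrade_succeeded(show_version_text, target):
    """Return True when kickstart and system both report the target release."""
    versions = running_versions(show_version_text)
    return versions.get("kickstart") == target and versions.get("system") == target
```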
Back-Out Procedures
https://www.cisco.com/c/en/us/td/docs/dcn/mds9000/sw/9x/upgrade/upgrade.html#pgfId-723981
| To MDS NX-OS Release | Nondisruptive Downgrade Paths and Ordered Downgrade Steps |
|---|---|
| MDS NX-OS: | |
| 9.3(x) | Downgrade to the target release |
| 9.2(x) | Downgrade to the target release |
| 8.1(x) and above releases | Downgrade to the target release |
| All 7.3(x) releases | 1. Downgrade directly to MDS NX-OS Release 8.1(1b) 2. Downgrade to the target release |
| 6.2(29) and above releases | 1. Downgrade directly to MDS NX-OS Release 8.4(2c) 2. Downgrade to the target release |
| 6.2(13a) until 6.2(27) | 1. Downgrade directly to MDS NX-OS Release 8.1(1b) 2. Downgrade to the target release |
| All 6.2(x) releases prior to 6.2(13a) | 1. Downgrade directly to MDS NX-OS Release 8.1(1b) 2. Downgrade to MDS NX-OS Release 6.2(13a) 3. Downgrade to the target release |
Table 8 Nondisruptive Downgrade Paths from NX-OS Release 9.3(2a)
- If you are downgrading from Cisco MDS NX-OS Release 9.x to a release prior to Cisco MDS NX-OS Release 9.2(1), ensure that you use the clear logging onboard txwait command after downgrading. Otherwise, logging to the OBFL TxWait file may cease with an error. For more information, see the Cisco MDS 9000 Series Interfaces Configuration Guide, Release 9.x.
- If you copy firmware using the SFTP or SCP clients after enabling the feature scp-server or feature sftp-server command on your switch, ensure that you close the SFTP or SCP connection using the no feature scp-server or no feature sftp-server command before performing ISSD. Otherwise, ISSD will be disruptive. To avoid this issue, we recommend that you transfer files to the switch using the copy command instead or using the DCNM client.
- Prior to upgrade or downgrade, reset the switch’s logging levels to the system defaults via the no logging level all configuration command. If this is not done, the upgrade or downgrade may be disruptive due to excessive logging causing control plane downtime exceeding 80 seconds.
Before entering the no logging level all command, ensure that the switch’s current logging configuration is saved. This will need to be restored after the upgrade or downgrade.
Follow these steps:
1. Enter the show running-config | i "logging level" command and save the output. These are the switch’s current settings.
2. Enter the no logging level all command in configuration mode.
3. Perform upgrade or downgrade.
4. Restore logging level configuration using the output that was saved from Step 1.
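Step 4 amounts to replaying the lines saved in step 1 in configuration mode. A minimal sketch of turning the saved output into a restore script follows; the function name is hypothetical and the facility names in the usage example are illustrative only.

```python
def logging_restore_script(saved_output):
    """Turn saved 'logging level' lines into a configuration script.

    saved_output: text captured in step 1 before the levels were reset.
    Returns the CLI lines to paste after the upgrade or downgrade, or an
    empty list when no logging level lines were saved.
    """
    lines = [line.strip() for line in saved_output.splitlines()]
    levels = [line for line in lines if line.startswith("logging level ")]
    if not levels:
        return []
    return ["config t"] + levels + ["end"]
```

For instance, a saved output containing `logging level zone 6` and `logging level port 5` yields those two lines wrapped between `config t` and `end`.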
- To determine if high-bandwidth capability is enabled, use the show hardware fabric-mode command. The following example shows that the higher bandwidth capability is not activated:
switch# show hardware fabric-mode
Fabric mode supports only one configuration of 8G FC modules – 4/44 Host-Optimized 8G FC module.
switch#
The following example shows that the higher bandwidth capability is activated:
switch# show hardware fabric-mode
fabric mode supports FCoE, Gen2 and above linecards
switch#
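When this check has to be repeated across many switches, the two example outputs above can be told apart with a simple text match. The sketch below keys on phrases taken from those two examples only; the function name is hypothetical, and any other output raises an error rather than being guessed at.

```python
def high_bandwidth_enabled(fabric_mode_output):
    """Classify 'show hardware fabric-mode' output using the two examples.

    Returns True when the higher bandwidth capability is activated and
    False when it is not; raises ValueError for unrecognized output.
    """
    text = fabric_mode_output.lower()
    if "fcoe, gen2 and above" in text:
        return True
    if "only one configuration of 8g fc modules" in text:
        return False
    raise ValueError("unrecognized fabric-mode output")
```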