Enterprise SAN Switch Upgrade

Introduction

In an Enterprise setting, upgrading storage infrastructure is quite different from running updates on your home PC, or at least it should be. While updates expand functionality, simplify interfaces, fix bugs and close vulnerabilities, they can also introduce new bugs and vulnerabilities. Some of these new bugs are triggered only by particular conditions, and if those conditions exist in your environment you will encounter the issue. In an Enterprise environment, where many users and sometimes customers rely upon the storage infrastructure, the impact of an issue caused by an upgrade can be broad, damaging business credibility and potentially even carrying legal ramifications. Therefore, having a process that mitigates as many risks as possible is a necessity. The process presented here is a general framework, followed by specific steps for Cisco and Brocade SAN switch upgrades.

Overview

The process outlined here at a high level is a good general framework for any shared infrastructure upgrade in an Enterprise environment.

  1. Planning
    • Document current environment cross section from CMDB and/or direct system inquiry (Server Hardware Model, OS and Adapter Model/Firmware/Driver, as well as SAN Switch Model/Firmware and current Storage Model/Code Level).
    • Ensure the SAN infrastructure is under vendor support so that code may be downloaded and support may be engaged if any problems are encountered.
    • Download and review the release notes for the three most recent code releases.
    • Use vendor interoperability documents or web applications to validate supportability in your environment using the information gathered above.
    • Choose the target code level. (N-1 is often preferred over N, the bleeding-edge latest release, unless N fixes significant vulnerabilities or N-1 is incompatible with your environment.)
  2. Preparation
    • Download the target release installation code and any upgrade test utilities provided by the vendor.
    • Upload the target code and test utility, then run the test utility.
    • Run initial health checks on the storage systems.
    • Gather connectivity information from SAN and Storage devices and verify connection and path redundancy.
    • Initiate a resolution plan before scheduling the upgrade for any identified issues.
    • Submit change control and obtain approval for upgrade.
  3. Upgrade
    • Rerun the upgrade test utility to verify issues are still resolved.
    • Perform health checks.
    • Run a configuration backup, take a diagnostic snapshot and list the logs to a file, downloading each to a central configuration repository.
    • Clear logs and clean up diagnostic snapshots.
    • Initiate any prerequisite component microcode upgrades (drive firmware, etc.) and validate completion.
    • Initiate the system update and monitor the upgrade process.
    • Upon completion, validate the upgrade, perform health checks and validate connectivity of the dependent systems.

SAN Switch Upgrade Planning

1. The first step is to identify all Cisco and Brocade SAN switches by querying the CMDB or inventory lists, and to document them along with their current code levels. Verify that the switches are covered by a vendor support and maintenance contract.

Switch Name | Access URL/IP | MFG Type-Model | Location | Serial Number | Version

2. Next, query the SAN switches for lists of the hosts attached to them and import these lists into a spreadsheet. Then query the CMDB to obtain a list of the systems in the environment along with their OS and hardware model information, and pull this information into the same spreadsheet. Cross-reference the lists and create a report by OS and hardware.

Device, Software, Host OS, and SAN

Component | Type | Vendor | Model-Type | Code Levels | HBA Models | HBA Drivers
Power VC | App | IBM | | 1.3.2.1 | |
HMC | Appliance | IBM | 7042-CR7 | 8.2.0 | |
Power 750 | Server | IBM | 8408-E8D | VIOS 2.2.3.3, VIOS 2.2.4.10, VIOS 2.2.4.22 | FC 5273, FC 5735 | 10DF:F100-202307, 10DF:F100-202307, 10DF:F100-203305
Power S824 | Server | IBM | 8286-428 | | FC 5273, FC 5735 | 10DF:F100-202307, 10DF:F100-202307, 10DF:F100-203305
Redhat Linux | OS | Redhat | RHEL, CENTOS | 7.9, 8.2, 8.4 | |
Fibre Switch | SAN Switch | Cisco | MDS 9148 | 8.4(2c) | |
FS900 | Storage | IBM | 9840-AE2 | 1.6.4.1 | |
Example Environment Cross Section (CMDB Data in Excel may be similarly summarized using a pivot table)
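
The cross-reference in step 2 can also be scripted instead of built by hand in a spreadsheet. The following is only a minimal sketch: the function name, the dictionary keys (host, os, model) and the sample rows are assumptions made for illustration, not the layout of any particular CMDB or switch export.

```python
from collections import Counter


def summarize_by_os_and_hardware(switch_hosts, cmdb_rows):
    """Count SAN-attached hosts per (OS, hardware model) pair.

    switch_hosts: host names taken from the switch export (hypothetical input).
    cmdb_rows: dicts with the assumed keys 'host', 'os' and 'model'.
    Returns the per-pair counts and the hosts that the CMDB does not know.
    """
    cmdb = {row["host"].strip().lower(): row for row in cmdb_rows}
    summary = Counter()
    missing = []
    # A host usually logs in on several ports, so de-duplicate before counting.
    for host in sorted({name.strip().lower() for name in switch_hosts}):
        row = cmdb.get(host)
        if row is None:
            missing.append(host)
        else:
            summary[(row["os"], row["model"])] += 1
    return summary, missing


# Illustrative rows only; real data would come from the switch and CMDB exports.
example_cmdb = [
    {"host": "aix01", "os": "VIOS 2.2.4.22", "model": "8408-E8D"},
    {"host": "lnx01", "os": "RHEL 7.9", "model": "8408-E8D"},
]
example_hosts = ["AIX01", "aix01", "lnx01", "unknown-host"]

summary, missing = summarize_by_os_and_hardware(example_hosts, example_cmdb)
for (os_name, model), count in sorted(summary.items()):
    print(f"{os_name:15} {model:10} {count}")
print("Not in CMDB:", missing)
```

Hosts reported as missing are worth chasing down before the upgrade, since an attached system that is absent from the CMDB cannot be checked against the interoperability matrices.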

3. Download the release notes for the three latest microcode releases from the vendors supporting the SAN infrastructure identified previously.

Cisco – MDS SAN Switch NX-OS

Cisco MDS Release Notes

Cisco MDS 9000 Recommended Releases

All Cisco MDS 9000 NX-OS Documentation

Cisco MDS 9000 Code Download

4. Use vendor interoperability documents or web applications to validate supportability in your environment. Cross-reference the previously gathered information against the support matrices, or enter it into the interoperability databases, to determine whether the target SAN microcode is supported and whether it imposes any code requirements on host adapters and storage arrays.

Cisco MDS 9000 SAN Switch Interoperability

IBM Storage, SAN and Server Interoperability Database

Dell EMC eLab Interoperability Database

5. Review the documentation, including release notes, interoperability data and upgrade path information. Determine the target code level from the releases that support your hardware, giving priority to N-1 code levels unless the latest (N) code level fixes significant vulnerabilities.
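
The N-1 rule in step 5 can be expressed as a small helper. This is a sketch under stated assumptions: it treats N-1 simply as the second-newest release in the list you pass in, it only understands NX-OS style strings such as 8.4(2c), and the function names and the sample release list are made up for illustration. The list must already be limited to releases that the interoperability checks showed to be supported.

```python
import re


def parse_nxos_version(version):
    """Turn a release string such as '8.4(2c)' into a sortable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)", version)
    if match is None:
        raise ValueError(f"unrecognized release string: {version!r}")
    major, minor, maintenance, letter = match.groups()
    return (int(major), int(minor), int(maintenance), letter)


def choose_target(supported_releases, latest_fixes_critical=False):
    """Prefer N-1 unless the newest release (N) closes a significant vulnerability."""
    if not supported_releases:
        raise ValueError("no supported releases to choose from")
    ordered = sorted(supported_releases, key=parse_nxos_version)
    if latest_fixes_critical or len(ordered) == 1:
        return ordered[-1]
    return ordered[-2]


# Illustrative list; use the releases your own review found to be supported.
candidates = ["8.4(2c)", "8.4(1a)", "8.4(2b)"]
print(choose_target(candidates))
print(choose_target(candidates, latest_fixes_critical=True))
```

The decision about whether N fixes something significant still comes from reading the release notes; the flag only records that judgment.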

6. Review the vendors' published best practices for SAN switch upgrades and determine whether any existing procedures need to be updated.

NX-OS upgrade Best Practices for MDS switches – Cisco Community

SAN Switch Upgrade Preparation

  1. Prior to the upgrade, use the Cisco Device Manager to gather the latest Interfaces->FC-All and ->Flogi output, saving it to a directory under (%UserProfile% %OneDrive%)/{Org|Client}/reference/{data-center}/(san_fc-all|san_flogi|zones-all)/ with the file name {switchname}_(san_fc-all|san_flogi|zones)_YYYY-MM-DD.txt. Import these into an Excel spreadsheet to verify that hosts have redundant connections to the fabric.
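
The redundancy check on the imported data could look something like the sketch below. It assumes the exports have already been reduced to (host, fabric) pairs, for example by mapping each logged-in WWPN to a host name and tagging it with the fabric of the switch it came from. That reduction step, the function name and the sample pairs are all hypothetical.

```python
from collections import defaultdict


def hosts_without_redundancy(logins, required_fabrics=2):
    """Return hosts that are logged in to fewer than `required_fabrics` fabrics.

    logins: iterable of (host, fabric) pairs derived from the flogi exports.
    """
    fabrics_by_host = defaultdict(set)
    for host, fabric in logins:
        fabrics_by_host[host.strip().lower()].add(fabric)
    return sorted(
        host
        for host, fabrics in fabrics_by_host.items()
        if len(fabrics) < required_fabrics
    )


# Illustrative pairs only: lnx01 has two logins, but both are on fabric A.
example_logins = [
    ("aix01", "A"),
    ("aix01", "B"),
    ("lnx01", "A"),
    ("lnx01", "A"),
]
at_risk = hosts_without_redundancy(example_logins)
print("Hosts lacking a second fabric:", at_risk)
```

Any host in the result would lose all paths while its only fabric is being upgraded, so it belongs in the resolution plan before the change is scheduled.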
     
  2. Perform health checks on the redundant switches in the fabric to ensure that the alternate fabrics are healthy. This includes listing the last 200 log entries to look for dormant issues, listing the hardware to confirm it is online, and listing any locks that may need to be cleared. Also note the count of up interfaces and flogi entries for comparison after the upgrade; the same number of connections should persist.
terminal length 0
show interface brief | grep up | wc
show flogi database | wc
show version
show hardware
show module
show cfs lock
show zone status
show log last 200
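
When the log output is saved to a file, a short filter can help surface the entries worth reading first. The sketch below assumes the usual Cisco syslog tag of the form %FACILITY-SEVERITY-MNEMONIC: and treats severity 0 through 3 as serious. The regular expression, the threshold and the sample lines are illustrative assumptions, not captured switch output.

```python
import re

# Matches tags such as "%MODULE-2-MOD_FAIL:" and captures the severity digit.
SYSLOG_TAG = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+):")


def serious_entries(log_lines, max_severity=3):
    """Return log lines whose severity is at or below `max_severity` (0 is worst)."""
    hits = []
    for line in log_lines:
        match = SYSLOG_TAG.search(line)
        if match and int(match.group(2)) <= max_severity:
            hits.append(line.rstrip())
    return hits


# Made-up sample lines for illustration only.
sample_log = [
    "2021 Jun 10 12:00:01 sw1 %PORT-5-IF_UP: Interface fc1/1 is up",
    "2021 Jun 10 12:05:00 sw1 %MODULE-2-MOD_FAIL: Module 1 reported a failure",
    "2021 Jun 10 12:06:30 sw1 free-form line without a syslog tag",
]
for entry in serious_entries(sample_log):
    print(entry)
```

A filter like this only prioritizes reading; lines without a recognizable tag are skipped, so the full log should still be kept with the other artifacts.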

3. Verify that you have a copy of the current firmware on your TFTP/scp server so that you have a backup in the event that you must return to the original version. If you do not, copy it from the switch to the TFTP/scp server at this time.

4. Transfer the microcode, both the kickstart and system files, to the switch, either by uploading it with scp or by downloading it from the switch using copy tftp/scp. List the bootflash and check the md5sum of each file to ensure the microcode is valid.

copy scp://c-t9d5@10.234.16.93/var/mds/depot/m9100-s5ek9-kickstart-mz.8.4.2c.bin bootflash:
copy scp://c-t9d5@10.234.16.93/var/mds/depot/m9100-s5ek9-mz.8.4.2c.bin bootflash:

dir bootflash:
show file bootflash:m9100-s5ek9-mz.8.4.2c.bin md5sum
show file bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin md5sum
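
The same check can be made on the depot server before the files are copied, so that a corrupt download is caught early. This is a minimal sketch using Python's standard hashlib module; the helper names are hypothetical, and the published checksum must be taken from the vendor's download page rather than from this example.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Compute the MD5 of a file in chunks so large images need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(reported, published):
    """Compare two checksum strings, ignoring case and surrounding whitespace."""
    return reported.strip().lower() == published.strip().lower()
```

For example, md5_matches(file_md5(local_image), published_md5) should be true before the upload, and the value printed by the switch's md5sum command should match the same published string afterwards. Here local_image and published_md5 are placeholders for your own path and the vendor's checksum.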

5. Run the impact and incompatibility analysis against these files to see if there are any issues with your current switch hardware being targeted for upgrade.

show install all impact kickstart bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin system bootflash:m9100-s5ek9-mz.8.4.2c.bin
show incompatibility system bootflash:m9100-s5ek9-mz.8.4.2c.bin

6. Check whether any custom port-monitor policies are configured, make a note of them, and then remove them from the ports indicated.

show running-config | beg port-monitor

## Remove any port monitors
config t
no port-monitor name <policy from show running-config>
end

SAN Switch Upgrade Implementation

1. Back up the running configuration, flogi database, interface statistics and event logs, and gather a diagnostic snapshot. The diagnostic snapshot can be supplied when opening a case with support if the switch becomes inaccessible. Copy the files generated by these commands to a central repository server, or gather them to your local system using WinSCP.

show logging logfile > $(SWITCHNAME)-$(TIMESTAMP)_logs.log
show flogi database > $(SWITCHNAME)-$(TIMESTAMP)_flogi.log
 
show interface > $(SWITCHNAME)-$(TIMESTAMP)_int.log
show interface counters > $(SWITCHNAME)-$(TIMESTAMP)_count.log
 
show tech detail > $(SWITCHNAME)-$(TIMESTAMP)_tech.log
 
copy running-config bootflash:$(SWITCHNAME)-$(TIMESTAMP).cfg
 
dir bootflash:

## Examples for Sterling and Dallas
Sterling
copy $(SWITCHNAME)-*_flogi.log scp://c-t9d5@10.230.16.99/var/mds/logs/
copy $(SWITCHNAME)-*_logs.log scp://c-t9d5@10.230.16.99/var/mds/logs/
copy $(SWITCHNAME)-*_tech.log scp://c-t9d5@10.230.16.99/var/mds/logs/
copy $(SWITCHNAME)-*_int.log scp://c-t9d5@10.230.16.99/var/mds/logs/
copy $(SWITCHNAME)-*_count.log scp://c-t9d5@10.230.16.99/var/mds/logs/
copy $(SWITCHNAME)-*.cfg scp://c-t9d5@10.230.16.99/var/mds/logs/
 
Dallas
copy $(SWITCHNAME)-*_flogi.log scp://c-t9d5@10.234.16.93/var/mds/logs/
copy $(SWITCHNAME)-*_logs.log scp://c-t9d5@10.234.16.93/var/mds/logs/
copy $(SWITCHNAME)-*_tech.log scp://c-t9d5@10.234.16.93/var/mds/logs/
copy $(SWITCHNAME)-*_int.log scp://c-t9d5@10.234.16.93/var/mds/logs/
copy $(SWITCHNAME)-*_count.log scp://c-t9d5@10.234.16.93/var/mds/logs/
copy $(SWITCHNAME)-*.cfg scp://c-t9d5@10.234.16.93/var/mds/logs/

2. Clear the logs, cores, counters and diagnostics from the bootflash to free up space.

clear logging logfile
clear cores
clear counters interface all

delete bootflash:*_tech.log
 

3. Make sure the configuration has been saved by copying the running configuration to the startup configuration.

copy running-config startup-config

4. Perform the switch upgrade. In the command below, substitute the kickstart and system files specific to the switch model being upgraded and the target code level. Review the install validation output and answer yes or no when prompted to continue the upgrade.

dir bootflash:
install all kickstart bootflash:m9100-s5ek9-kickstart-mz.8.4.2c.bin system bootflash:m9100-s5ek9-mz.8.4.2c.bin

5. Upon upgrade completion, verify that connections to the fabric persist by comparing these counts with the ones noted during the pre-upgrade health checks.

show interface brief | grep up | wc
show flogi database | wc
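
If the pre-upgrade and post-upgrade counts are recorded, the comparison can be reduced to a few lines. The sketch below is an assumption about how you might store the numbers (a dictionary per run with made-up key names); it is not something the switch produces. The wc output also counts header lines, which is harmless as long as the same command is used before and after.

```python
def compare_counts(before, after):
    """Return {name: (before, after)} for every counter that differs."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Illustrative numbers only: one fabric login did not come back.
pre_upgrade = {"up_interfaces": 24, "flogi_entries": 22}
post_upgrade = {"up_interfaces": 24, "flogi_entries": 21}

differences = compare_counts(pre_upgrade, post_upgrade)
if differences:
    for name, (old, new) in differences.items():
        print(f"{name}: {old} before, {new} after")
else:
    print("All counts match the pre-upgrade baseline.")
```

Any difference points to an interface or host that did not log back in and should be investigated before the change is closed.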

6. Verify the installation status and version and check the logs for any issues.  Save command output artifacts for change validation.

terminal length 0
show install all status
show version
show install all impact
show logging last 100

