CP switch is down due to partition image failure
Applies to
Brocade Switch
Issue
- While upgrading Brocade switch FOS from v9.1.1c3 to v9.2.1a, the IP address of the CP went offline for 30-35 minutes and the CP went down due to a partition image failure.
- The standby partition failover failed during the upgrade.
- Slot 2 was the active CP when the switch became inaccessible; access was regained after the forced failover to slot 1.
firmwaredownloadstatus:
[1]: Fri Aug 16 21:58:10 2024
Slot 2 (CP1, active): Firmware is being downloaded to standby CP. This step may take up to 30 minutes.
[2]: Fri Aug 16 22:12:38 2024
Slot 2 (CP1, active): Firmware has been downloaded successfully to Standby CP.
[3]: Fri Aug 16 22:12:39 2024
Slot 2 (CP1, active): Standby CP is going to reboot with new firmware.
[4]: Fri Aug 16 22:15:00 2024
Slot 2 (CP1, active): Standby CP booted successfully with new firmware.
[5]: Fri Aug 16 22:16:55 2024
Slot 1 (CP0, active): Forced failover succeeded. New Active CP is running new firmware
[6]: Fri Aug 16 22:17:30 2024
Slot 1 (CP0, active): Firmware is being downloaded to standby CP. This step may take up to 30 minutes.
[7]: Fri Aug 16 22:20:13 2024
Slot 1 (CP0, active): Firmware has been downloaded successfully on Standby CP.
[8]: Fri Aug 16 22:20:14 2024
Slot 1 (CP0, active): Standby CP reboots.
[9]: Fri Aug 16 22:22:10 2024
Slot 1 (CP0, active): Firmware commit operation has started on both active and standby CPs.
[10]: Fri Aug 16 22:22:11 2024
Slot 1 (CP0, active): Standby CP booted successfully with new firmware.
[11]: Fri Aug 16 22:22:11 2024
Slot 1 (CP0, active): The firmware commit operation has started. This may take up to 10 minutes.
[12]: Fri Aug 16 22:26:41 2024
Slot 1 (CP0, active): The commit operation has completed successfully.
[13]: Fri Aug 16 22:26:42 2024
Slot 1 (CP0, active): Firmware commit operation has completed successfully on active CP.
[14]: Fri Aug 16 22:26:42 2024
Slot 1 (CP0, active): Firmwaredownload command has completed successfully. Use firmwareshow to verify the firmware versions.
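To see how long each upgrade step took, the timestamps in the output above can be compared pairwise. The following is a minimal Python sketch that works on pasted text only, assuming the output keeps the two-line layout shown here (a `[n]: timestamp` line followed by a message line). The names `parse_status` and `step_gaps` are hypothetical helpers, not part of FOS.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt: "[n]: Fri Aug 16 21:58:10 2024"
ENTRY_RE = re.compile(r"^\[(\d+)\]:\s+(.+)$")
TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # day/month names assume an English locale

SAMPLE = """\
[1]: Fri Aug 16 21:58:10 2024
Slot 2 (CP1, active): Firmware is being downloaded to standby CP. This step may take up to 30 minutes.
[2]: Fri Aug 16 22:12:38 2024
Slot 2 (CP1, active): Firmware has been downloaded successfully to Standby CP.
[3]: Fri Aug 16 22:12:39 2024
Slot 2 (CP1, active): Standby CP is going to reboot with new firmware.
"""

def parse_status(text):
    """Return a list of (step, timestamp, message) tuples from pasted output."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    entries = []
    i = 0
    while i < len(lines):
        match = ENTRY_RE.match(lines[i])
        if match and i + 1 < len(lines):
            stamp = datetime.strptime(match.group(2), TS_FORMAT)
            entries.append((int(match.group(1)), stamp, lines[i + 1]))
            i += 2  # skip the message line that belongs to this entry
        else:
            i += 1
    return entries

def step_gaps(entries):
    """Return (step, seconds since the previous step) for every step after the first."""
    return [
        (cur[0], int((cur[1] - prev[1]).total_seconds()))
        for prev, cur in zip(entries, entries[1:])
    ]

if __name__ == "__main__":
    print(step_gaps(parse_status(SAMPLE)))  # [(2, 868), (3, 1)]
```

In this excerpt the download to the standby CP (step 1 to step 2) took 868 seconds, roughly 14.5 minutes, which is within the "up to 30 minutes" stated in the message.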
- There are faults on the CP blade in Slot 2 in the emtraceshow2 output:
EMTRACESHOW2
Slot 2 FLT(20035) ON(10014) ---------- FLT(20035) 020003 1 0 Aug 16 22:15:50
Slot 2 ON(20000) ON(10035) ---------- ON(20000) 020003 1 0 Aug 16 22:20:30
Slot 2 FLT(20035) ON(10014) ---------- FLT(20035) 020003 1 0 Aug 16 22:20:33
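When the trace is long, the fault lines can be picked out with a small filter. This is a rough Python sketch over pasted text, assuming each line looks like the three lines above: `Slot <n>`, then a state token such as `FLT(20035)` or `ON(20000)`, and a trailing `Mon DD HH:MM:SS` timestamp. Treating the first token after the slot number as the state of interest is an assumption drawn from this excerpt, not from documented column definitions, and `find_faults` is a hypothetical helper name.

```python
import re

# Assumed line shape, based only on the excerpt above.
LINE_RE = re.compile(
    r"^Slot\s+(?P<slot>\d+)\s+(?P<state>[A-Z]+)\((?P<code>\d+)\).*?"
    r"(?P<when>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s*$"
)

SAMPLE = """\
Slot 2 FLT(20035) ON(10014) ---------- FLT(20035) 020003 1 0 Aug 16 22:15:50
Slot 2 ON(20000) ON(10035) ---------- ON(20000) 020003 1 0 Aug 16 22:20:30
Slot 2 FLT(20035) ON(10014) ---------- FLT(20035) 020003 1 0 Aug 16 22:20:33
"""

def find_faults(text):
    """Return (slot, code, timestamp) for every line whose first state token is FLT."""
    faults = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("state") == "FLT":
            faults.append(
                (int(match.group("slot")), match.group("code"), match.group("when"))
            )
    return faults

if __name__ == "__main__":
    for slot, code, when in find_faults(SAMPLE):
        print(f"Slot {slot}: FLT({code}) at {when}")
```

Against the excerpt, this reports two fault entries for Slot 2, at 22:15:50 and 22:20:33, which can then be lined up with the firmwaredownloadstatus timestamps.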
- A hung process can have multiple causes, including:
  - A software defect (for example, a memory leak)
  - Excessive polling of the switch over the IP interface (for example, by Tenable)
  - Hardware faults