NS224 modules firmware mismatch and multiple environmental errors
- Category: disk-shelves
- Specialty: hw
- Last Updated: 1/30/2025, 4:59:26 PM
Applies to
- NS224 shelf
- NSM100 Firmware upgrade
Issue
- Both modules are operational, and the system is in a MultiPath configuration and serving data properly.
- During a firmware upgrade of an NSM100 module, the automatic upgrade process never completes. Example:
Tue Jun 14 18:42:52 [node_name-02: dsa_disc: ses.mismatch.fw.version:error]: The disk shelf modules on disk shelf 0x.0 are running two different firmware versions. Disk shelf module A is running 0121, and disk shelf module B is running .
Tue Jun 14 18:42:52 [node_name-02: dsa_disc: sfu.firmwareDownrev.shelf:error]: Shelf 0x.shelf0 has downrev firmware.
- Multiple errors are reported against only one module, NSM100 B. Example:
::> storage shelf show -shelf 1.0 -instance
Shelf Name: 1.0
Stack ID: 1
Shelf ID: 0
...
Shelf State: Online
Status: Normal
Boot device "2" error detected.
Temperature reported by temperature sensor "17" exceeds the specifications for the disk shelf or its components.
Temperature reported by temperature sensor "16" exceeds the specifications for the disk shelf or its components.
Temperature reported by temperature sensor "15" exceeds the specifications for the disk shelf or its components.
Temperature reported by temperature sensor "14" exceeds the specifications for the disk shelf or its components.
Temperature reported by temperature sensor "13" exceeds the specifications for the disk shelf or its components.
Temperature reported by temperature sensor "12" exceeds the specifications for the disk shelf or its components.
DIMM "8" error detected. DIMM is located in the DIMM slot 4 in the bottom shelf module (B).
DIMM "7" error detected. DIMM is located in the DIMM slot 3 in the bottom shelf module (B).
DIMM "6" error detected. DIMM is located in the DIMM slot 2 in the bottom shelf module (B).
DIMM "5" error detected. DIMM is located in the DIMM slot 1 in the bottom shelf module (B).
Critical error detected in module "2".
Coin cell battery "2" error detected.
- Module B is not reported in the "sysconfig -M" output. Example:
::> system node run -node node_name-01 -command sysconfig -M
...
!NS224NSM100-MODULE!012345678910!111-04256+B3!1D!!
!NS224NSM100-MODULE!!!!!
- The issue persists after NSM100 B is rebooted, reseated, and replaced.
- A soft reboot of NSM100 A allows module B to be discovered without issues, but only temporarily.
- A hard reboot of NSM100 A reproduces the initial issue, with module B reporting errors.
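The symptoms above can be checked in saved command output with a short script. The following is a minimal Python sketch, not a NetApp tool: the function names are hypothetical, and the "!"-delimited field layout and the message wording are assumed only from the sample lines shown in this article. It flags NSM100 rows in "sysconfig -M" output whose fields are all blank, and modules whose firmware version is missing from a ses.mismatch.fw.version message.

```python
import re

# Assumption: NSM100 rows in "sysconfig -M" output start with this prefix,
# as in the sample output in this article.
MODULE_PREFIX = "!NS224NSM100-MODULE!"


def blank_module_rows(sysconfig_output: str) -> list[str]:
    """Return NSM100 rows whose fields after the model name are all empty."""
    rows = []
    for line in sysconfig_output.splitlines():
        line = line.strip()
        if not line.startswith(MODULE_PREFIX):
            continue
        # split("!") yields ['', model, field, field, ...]; skip the first two.
        fields = line.split("!")[2:]
        if not any(field.strip() for field in fields):
            rows.append(line)
    return rows


def modules_missing_fw_version(ems_message: str) -> list[str]:
    """Return module letters (A/B) reported with an empty firmware version."""
    # Assumption: the wording "module X is running <version>" matches the
    # ses.mismatch.fw.version example in this article.
    matches = re.findall(r"module ([AB]) is running\s*(\w*)", ems_message)
    return [module for module, version in matches if not version]
```

Passing the two "sysconfig -M" sample rows to blank_module_rows returns only the second, empty row. Passing the ses.mismatch.fw.version sample message to modules_missing_fw_version returns module B.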