Disks missing through IOM12E, with PSU, fan, and temperature errors
Applies to
- AFF A220
- FAS2750
- FAS2720
- IOM12E
Issue
- Disks missing from one or both paths
- If drives are missing from both paths, both nodes could reboot and fail to boot up
- Both nodes down after a physical disk re-seat
- One IOM12E module missing in sysconfig -A output:
Shelf 0: DS224-12 Firmware rev. A: ---- IOM12E B: 0220
- Unexpected IOM12E reboot. Shelf log message example:
Thu Jan 1 00:00:00 1970 ( 0+00:00:00.497); 02000027; U?; HAL; hal; 04; NDU Data Invalid.
Thu Jan 1 00:00:00 1970 ( 0+00:00:00.497); 02000093; U?; HAL; hal; 04; Module Reboot: Startup type 1-Cold boot
Or
Sun Dec 31 03:04:33 +0900 [Node-01: storlog_admin: sla.shelf.mod.reboot.unexp:error]: Unexpected reboot event reported by module A in shelf: 0b.00.99.0, log: Thu Jan 1 00:00:00 1970 ( 0+00:00:00.474); 02000093; U?; HAL; hal; 04; Module Reboot: Startup type 5-Crash reset
Sun Dec 31 03:04:33 +0900 [Node-01: storlog_admin: sla.shelf.message:debug]: params: {'type': 'SEVERITY', 'log': 'Thu Jan 1 00:00:00 1970 ( 0+00:00:00.681); 0200013D; U?; HAL; hal; 02; Failure info:EXT NMI Handler'}
- EMS environmental error examples:
Thu Apr 08 06:12:57 +0000 [node_name: dsa_worker2: ses.status.electronicsWarn:error]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0b environmental monitoring warning for SES electronics 1: not installed. This module is on the rear of the shelf at the top left.
Thu Apr 08 06:12:57 +0000 [node_name: dsa_worker2: ses.status.procCplxError:alert]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0b Processor Complex error for Processor Complex 1: PCM on partner not installed This module is on the unknown location.
Mon Apr 12 08:17:00 +0000 [node_name: dsa_worker4: ses.status.temperatureWarning:alert]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0a temperature warning for Temperature sensor 11: not operating. Current temperature: 49 C (120 F). This module is on the rear of the shelf at the top left, on shelf module A.
Mon Apr 12 08:17:00 +0000 [node_name: dsa_worker4: ses.status.temperatureWarning:alert]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0a temperature warning for Temperature sensor 12: not operating. Current temperature: 35 C (95 F). This module is on the rear of the shelf at the top left, on shelf module A.
Thu Apr 22 08:40:42 +0000 [node_name: dsa_worker3: ses.status.ModuleError:alert]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0a SAS expander error for SAS shelf electronics 1: critical status. This module is on the rear of the shelf at the top left, on shelf module A.
Thu Apr 22 10:17:28 +0000 [node_name: dsa_worker5: ses.status.psWarning:error]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0b power warning for Power supply 1: not installed. This module is on the rear of the shelf at the bottom left.
Thu Apr 22 10:17:28 +0000 [node_name: dsa_worker5: ses.status.psWarning:error]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0b power warning for Power supply 2: not installed. This module is on the rear of the shelf at the bottom right.
Thu Apr 22 10:27:50 +0000 [node_name: dsa_worker5: ses.status.temperatureWarning:alert]: DS224-12 (S/N SHFNC0123456789) shelf 0 on channel 0b temperature warning for Temperature sensor 14: not installed or failed. Current temperature: 28 C (82 F). This module is on the rear of the shelf at the top right, on shelf module B.
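When checking saved output for the symptoms listed above, a small script can help spot them quickly. The following Python is a minimal sketch, not a NetApp tool: the function names (missing_iom_modules, parse_ems_line, summarize) are hypothetical, and the regular expressions are assumptions derived only from the example lines in this article, so they may need adjusting for other ONTAP releases or log formats.

```python
import re
from collections import Counter

# Assumed layout of an EMS line, inferred from the examples in this article:
#   <timestamp> [<node>: <process>: <event>:<severity>]: <message>
EMS_RE = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<process>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)"
)
# Assumed wording used by the ses.status.* messages shown above.
SHELF_RE = re.compile(r"shelf (?P<shelf>\d+) on channel (?P<channel>\w+)")
# Module letter followed by its firmware revision, e.g. "A: ----" or "B: 0220".
MODULE_RE = re.compile(r"\b([AB]):\s+(\S+)")


def missing_iom_modules(sysconfig_line):
    """Return module letters whose firmware revision is shown only as dashes."""
    return [
        letter
        for letter, rev in MODULE_RE.findall(sysconfig_line)
        if set(rev) == {"-"}
    ]


def parse_ems_line(line):
    """Split one EMS line into fields, or return None if it does not match."""
    match = EMS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Shelf and channel are only present in some messages.
    where = SHELF_RE.search(fields["message"])
    fields["shelf"] = where.group("shelf") if where else None
    fields["channel"] = where.group("channel") if where else None
    return fields


def summarize(lines):
    """Count (event, severity) pairs across a list of EMS lines."""
    counts = Counter()
    for line in lines:
        fields = parse_ems_line(line)
        if fields is not None:
            counts[(fields["event"], fields["severity"])] += 1
    return counts


if __name__ == "__main__":
    # Sample inputs copied from the examples in this article (message shortened).
    fw = "Shelf 0: DS224-12 Firmware rev. A: ---- IOM12E B: 0220"
    print("Modules with no firmware revision:", missing_iom_modules(fw))

    ems = [
        "Thu Apr 22 10:17:28 +0000 [node_name: dsa_worker5: "
        "ses.status.psWarning:error]: DS224-12 (S/N SHFNC0123456789) "
        "shelf 0 on channel 0b power warning for Power supply 1: not installed.",
    ]
    for key, count in summarize(ems).items():
        print(key, count)
```

A burst of ses.status.* events for PSU, fan, and temperature elements on the same shelf, together with a module whose firmware revision is reported as dashes, corresponds to the combination of symptoms described in this article. The script only reads text that has already been collected; it does not query the storage system.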