EF-Series unexpected controller reboot due to HIC PCI-E errors
Applies to
- E-Series
- SANtricity OS 11.70 or later
Issue
- The controller rebooted unexpectedly and/or locked down.
- Corresponding Major Event Log (MEL) entry:
B:2/7/21, 11:46:27 AM (11:46:27) 2320 4010 Controller reset - Shelf 99, Bay B
----> Reboot Reason : REBOOT_SX_NMI_PCI_AND_OTHER_ERRORS
- And excLogShow entries:
STATE-CAPTURE-DATA
Executing excLogShow(0,0,0,0,0,0,0,0,0,0) on controller A:
・・・・
---- Log Entry #4 (Core 0) Jul-27-2023 11:16:16 PM ----
WARNING: Reset by alternate controller
---- Log Entry #14 (Core 0) Jul-27-2023 11:24:12 PM ----
PCI SERR Exception
SERR Summary: 0x00000002
SERR Return Status: 0x00000003
SERR Collection Count: 0x2
PCI-E Root Port to Host Card (Unit 6) VID 0xxxxx DID 0xxxxx B0:D1:F0
  PCI-E AER Root Error Status = 0x00000001
  PCI-E XP Global Status = 0x000xxxx
  SERR Summary Flags = 0x2
Host-side Fibre (Unit 7) VID 0x2222 DID 0x2222 B16:D0:F0
  PCI-E Device Status = 0x0001
  PCI-E AER Correctable Status = 0x00000001
  SERR Summary Flags = 0x2
Host-side Fibre (Unit 6) VID 0x1077 DID 0x2871 B16:D0:F1
  PCI-E Device Status = 0x0001
  PCI-E AER Correctable Status = 0x00000001
  SERR Summary Flags = 0x2
Host-side Fibre (Unit 5) VID 0x1077 DID 0x2871 B16:D0:F2
  PCI-E Device Status = 0x0001
  PCI-E AER Correctable Status = 0x00000001
  SERR Summary Flags = 0x2
Host-side Fibre (Unit 4) VID 0x1077 DID 0x2871 B16:D0:F3
  PCI-E Device Status = 0x0001
  PCI-E AER Correctable Status = 0x00000001
  SERR Summary Flags = 0x2
NOTE: Recovered from Correctable PCI SERR
Processor Global Correctable Error: 40
Processor reported PCI SERR
---- Log Entry #15 (Core 0) Jul-27-2023 11:24:34 PM ----
PCI correctable error flood detected!
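
To check whether a saved support-bundle text file shows the same symptoms, the log signatures above can be searched for with a short script. The following is a minimal sketch, not a NetApp tool: the function names and the SIGNATURES table are hypothetical, and the search strings are simply copied from the MEL and excLogShow excerpts in this article. It assumes the log has already been exported to plain text.

```python
import re

# Hypothetical table: search strings copied verbatim from the excerpts in this article.
SIGNATURES = {
    "reboot_reason": "REBOOT_SX_NMI_PCI_AND_OTHER_ERRORS",
    "serr_exception": "PCI SERR Exception",
    "correctable_flood": "PCI correctable error flood detected",
}

# Matches entry headers such as
# "---- Log Entry #14 (Core 0) Jul-27-2023 11:24:12 PM ----"
ENTRY_HEADER = re.compile(r"---- Log Entry #(\d+) \(Core (\d+)\) (.+?) ----")


def find_signatures(text):
    """Return the sorted names of the signatures that occur in text."""
    return sorted(name for name, needle in SIGNATURES.items() if needle in text)


def list_entries(text):
    """Return (entry_number, timestamp) for each excLogShow entry header."""
    return [(int(m.group(1)), m.group(3)) for m in ENTRY_HEADER.finditer(text)]


if __name__ == "__main__":
    # Short sample built from lines quoted in this article.
    sample = (
        "---- Log Entry #14 (Core 0) Jul-27-2023 11:24:12 PM ----\n"
        "PCI SERR Exception\n"
        "---- Log Entry #15 (Core 0) Jul-27-2023 11:24:34 PM ----\n"
        "PCI correctable error flood detected!\n"
    )
    print(find_signatures(sample))
    print(list_entries(sample))
```

A match on these strings only indicates that the log resembles the excerpts shown here; it does not by itself confirm the cause of the reboot.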