Procedure to troubleshoot fault LEDs on various ONTAP platforms
Applies to
- AFF Systems
- FAS Systems
- ONTAP 9
- Clustered Data ONTAP 8
- Data ONTAP 8 7-Mode
Issue
Introduction
Modern FAS systems include a number of amber fault LEDs, located to help the operator identify Field Replaceable Units (FRUs) that need attention.
Most FRUs are contained within another FRU. For example:
- Controller, IOXM, power supply, fan and disk drive FRUs are contained within a chassis FRU
- PCI card FRUs are contained in both controller and IOXM FRUs
- DIMM and boot device FRUs are contained in controller FRUs
When an FRU needs operator attention, its FRU fault LED is illuminated. If that FRU is inside another FRU, the outer FRU's fault LED is also illuminated. This step repeats for each containing FRU until the outermost FRU is reached, resulting in a path of amber fault LEDs that can be followed to find the innermost FRU that needs attention.
While not all products have FRUs of all types, the hierarchy of FRUs is the same on all FAS products. Starting with the outermost FRU, the hierarchy appears similar to the following:
Chassis
 \
  +- Power Supply
  +- Fan
  +- Disk Drive
  +- Controller
  |   \
  |    +- PCI Card
  |    +- DIMM
  |    +- NV-DIMM
  |    +- Boot Device
  |    +- Coin Cell Battery
  |    +- NVMEM Battery
  +- IOXM
      \
       +- PCI Card
For example, if a DIMM in the controller requires attention, its fault LED will be lit along with the controller's fault LED and the chassis fault LED.
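The LED path described above can be sketched as a small model. This is an illustration only, not ONTAP code: the PARENT table, the function names lit_leds and find_innermost, and the labels "Controller PCI Card" and "IOXM PCI Card" are hypothetical names introduced here, and the sketch assumes a single faulty FRU.

```python
# Hypothetical model of the FRU containment hierarchy (child -> containing FRU).
# The two "PCI Card" labels are invented here to tell the two locations apart.
PARENT = {
    "Power Supply": "Chassis",
    "Fan": "Chassis",
    "Disk Drive": "Chassis",
    "Controller": "Chassis",
    "IOXM": "Chassis",
    "Controller PCI Card": "Controller",
    "DIMM": "Controller",
    "NV-DIMM": "Controller",
    "Boot Device": "Controller",
    "Coin Cell Battery": "Controller",
    "NVMEM Battery": "Controller",
    "IOXM PCI Card": "IOXM",
}


def lit_leds(faulty_fru):
    """Return the FRUs whose fault LEDs are lit, innermost first."""
    path = [faulty_fru]
    # Walk outward until the outermost FRU (the chassis) is reached.
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def find_innermost(lit):
    """Follow lit LEDs inward from the chassis; assumes a single fault."""
    if "Chassis" not in lit:
        return None  # no fault path to follow
    current = "Chassis"
    while True:
        lit_children = [c for c, p in PARENT.items() if p == current and c in lit]
        if not lit_children:
            return current
        current = lit_children[0]


print(lit_leds("DIMM"))                       # ['DIMM', 'Controller', 'Chassis']
print(find_innermost(set(lit_leds("DIMM"))))  # DIMM
```

With more than one fault present, several children of the same FRU could be lit at once, and each lit branch would have to be followed separately.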
FRU fault LEDs that are not visible from the outside of the system remain on when the containing FRU (usually a controller or IOXM) is removed from the chassis, so the FRU requiring attention can be located easily. Current versions of Data ONTAP do not detect when FRUs have been serviced and therefore do not turn off the FRU fault LEDs after the faulty FRU has been replaced. As a result, even after a faulty FRU that is not visible from outside has been replaced, the path of fault LEDs remains on until Data ONTAP is explicitly commanded to turn them off. Typically, the fault LEDs can be turned off either disruptively, by running the halt -s command, or non-disruptively, by running the (diag privilege) command fru_led off all.
Current versions of Data ONTAP do not maintain a database of faults that have occurred. Instead, when a fault occurs, a notification message is logged and the hierarchy of fault LEDs is lit to create a path to the faulty FRU. Determining the cause of the fault therefore requires some investigation.