SwitchFanNotPresent or SwitchPowerNotPresent reported by CSHM for Cisco cluster network switches
Applies to
- ONTAP 9
- Cisco cluster network switches
Issue
- “SwitchFanNotPresent_Alert” is reported by CSHM for one or more fan modules of the cluster network switch:
Wed Apr 29 15:53:23 AEST [nodename: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process cshm: SwitchFanNotPresent_Alert[switch(XXXXXXXXXXX)/Fan Module-1].
Wed Apr 29 15:28:23 AEST [nodename: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process cshm: SwitchFanNotPresent_Alert[switch(XXXXXXXXXXX)/Fan Module-2]
- "SwitchPowerNotPresent_Alert" is reported by CSHM for one or more PSUs of the cluster network switch:
Wed Apr 29 15:28:23 AEST [nodename: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process cshm: SwitchPowerNotPresent_Alert[switch(XXXXXXXXXX)/PowerSupply-1].
- No maintenance activity was performed during the time the alert was reported by CSHM.
- The alerts clear on their own after some time, as shown in the event log:
cluster::> event log show
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
4/29/2020 15:49:12 nodename ALERT hm.alert.raised: Alert Id = SwitchFanNotPresent_Alert , Alerting Resource = switch(XXXXXXXXXXX)/Fan Module-1 raised by monitor cluster-switch
4/29/2020 16:10:20 nodename NOTICE hm.alert.cleared: Alert Id = SwitchFanNotPresent_Alert , Alerting Resource = switch(XXXXXXXXXXX)/Fan Module-1 cleared by monitor cluster-switch
- Removing the switch from monitoring and then re-adding it to poll the sensors does not resolve the issue:
::> system switch ethernet delete -device <switch_name>
::> system switch ethernet create -device <switch_name> -address <ip_address> -snmp-version <version> -community-or-username cshm1! -model OTHER -type cluster-network
- Fan modules are operating in good condition on both switches.
Switch> enable
Switch# show environment fan detail
Fan:
---------------------------------------------------------------------------
Fan Model Hw Direction Status
---------------------------------------------------------------------------
Fan1(sys_fan1) NXA-FAN-30CFM-F 0.0 front-to-back Ok
Fan2(sys_fan2) NXA-FAN-30CFM-F 0.0 front-to-back Ok
Fan3(sys_fan3) NXA-FAN-30CFM-F 0.0 front-to-back Ok
Fan4(sys_fan4) NXA-FAN-30CFM-F 0.0 front-to-back Ok
Fan_in_PS1 N2200-PAC-400W -- front-to-back Ok
Fan_in_PS2 N2200-PAC-400W -- front-to-back Ok
Fan Zone Speed: Zone 1: 0x32
- The switch logs ("show tech-support") show that the switches had problems fetching fan-related information at the time the alert was reported:
`show system internal platform all`
1)Event:E_DEBUG, length:98, at 034398 usecs after Wed Apr 29 11:07:55 2020
[103] pfm_pss_fan_restore_cfg_info_from_startup(2028):
pss fan config fetch from startup failed
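To confirm that an alert was transient, the raised and cleared entries from the event log output shown earlier can be paired up and timed. The following is a minimal Python sketch, not a NetApp tool: the function name `alert_durations` and the regular expression are assumptions based only on the two sample `hm.alert.raised` / `hm.alert.cleared` lines in this article, and other ONTAP releases may format these lines differently.

```python
import re
from datetime import datetime

# Matches the hm.alert.raised / hm.alert.cleared rows in the sample output.
# Assumption: the layout is "date time node severity event-text".
EVENT_RE = re.compile(
    r"^(?P<ts>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2})\s+\S+\s+\S+\s+"
    r"hm\.alert\.(?P<state>raised|cleared): Alert Id = (?P<alert>\S+)\s*, "
    r"Alerting Resource = (?P<resource>.+?) (?:raised|cleared) by monitor"
)


def alert_durations(lines):
    """Pair raised/cleared events and return (durations, still_open)."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%m/%d/%Y %H:%M:%S")
            events.append((ts, m["state"], (m["alert"], m["resource"])))
    events.sort(key=lambda e: e[0])  # oldest first, whatever the input order

    still_open, durations = {}, []
    for ts, state, key in events:
        if state == "raised":
            still_open[key] = ts
        elif key in still_open:
            durations.append((key, ts - still_open.pop(key)))
    return durations, still_open


# The two sample lines from the event log output in this article.
sample = [
    "4/29/2020 15:49:12 nodename ALERT hm.alert.raised: Alert Id = "
    "SwitchFanNotPresent_Alert , Alerting Resource = "
    "switch(XXXXXXXXXXX)/Fan Module-1 raised by monitor cluster-switch",
    "4/29/2020 16:10:20 nodename NOTICE hm.alert.cleared: Alert Id = "
    "SwitchFanNotPresent_Alert , Alerting Resource = "
    "switch(XXXXXXXXXXX)/Fan Module-1 cleared by monitor cluster-switch",
]

durations, still_open = alert_durations(sample)
for (alert, resource), elapsed in durations:
    print(f"{alert} on {resource} cleared after {elapsed}")
for (alert, resource), raised_at in still_open.items():
    print(f"{alert} on {resource} still raised since {raised_at}")
```

Any alert left in `still_open` was raised but never cleared in the captured log, which would point to a persistent rather than a one-time condition.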
Cause
- This issue is likely to occur when the cluster switches fail to fetch fan-related information.
- The cluster switch health monitor (CSHM) can detect a missing fan or power supply only indirectly: it reports one as not present when its queries to the switch fail to return information about all of the fans or power supplies.
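The effect described above can be illustrated with a toy model. This is only a sketch of the idea, not CSHM's actual implementation: the function `missing_components` and the component names are hypothetical, and the real monitor's polling logic is not documented here. It shows why an incomplete query response looks the same as absent hardware.

```python
def missing_components(expected, returned):
    """Return the expected components absent from a poll response, in order."""
    seen = set(returned)
    return [name for name in expected if name not in seen]


# Hypothetical inventory the monitor expects to see on the switch.
expected = ["Fan Module-1", "Fan Module-2", "PowerSupply-1"]

# A complete response: nothing is flagged.
print(missing_components(expected, expected))

# An incomplete response (for example, the switch could not fetch fan data):
# both fan modules are flagged even though the hardware is present.
print(missing_components(expected, ["PowerSupply-1"]))
```

Under this model, the alert clears as soon as a later poll returns the full set of components, which matches the raise-then-clear pattern seen in the event log.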
Solution
- If this is a one-time event and the alerts clear on their own, they can be safely ignored and no further action is required.
- This issue can be caused by congestion on the management network that runs between the nodes and the cluster switch. Make sure this network is isolated from data traffic.
- If the issue persists,
