SP module reports THRIFT_POLL timed out error
Applies to
- A800
- ONTAP 9.15.1P7
- BMC firmware 10.10
Issue
The Service Processor (SP) module on the source node frequently logs THRIFT_POLL (timed out) errors. As a result, the HA partner node often reports missed hardware-assisted takeover keep-alive messages from the source node.
Example:
node-01 (SP-MGMT-MLOG):
ERR: Servprocd::CLI: sp_get_sensor_info_worker : TTX ERROR: 2 (THRIFT_POLL (timed out))
ERR: Servprocd::CLI: sp_get_sensor_info_worker : EXCEPTION Occured in closing the spcc transport
Thrift: Thu Jun 12 00:14:21 2025 SSL_shutdown: shutdown while in init (SSL_error_code = 1)
node-02 (EMS):
node-02 ERROR cf.hwassist.missedKeepAlive: HW-assisted takeover missing keep-alive messages from HA partner (node-01).
node-02 INFORMATIONAL cf.hwassist.recvKeepAlive: hw_assist: Received hw_assist KeepAlive alert from partner(node-01).
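To gauge how often the symptom occurs, the messages above can be counted in exported log text. The following is a minimal Python sketch, not a NetApp tool. It assumes the SP and EMS logs have already been saved as plain text, and the function name `count_symptoms` and the `SAMPLE` list are hypothetical, introduced only for illustration. The patterns are taken directly from the example lines above.

```python
import re

# Patterns derived from the example log lines in this article.
THRIFT_TIMEOUT = re.compile(r"TTX ERROR: \d+ \(THRIFT_POLL \(timed out\)\)")
MISSED_KEEPALIVE = re.compile(r"cf\.hwassist\.missedKeepAlive")
RECV_KEEPALIVE = re.compile(r"cf\.hwassist\.recvKeepAlive")


def count_symptoms(lines):
    """Count each symptom in an iterable of log lines (hypothetical helper)."""
    counts = {"thrift_poll_timeout": 0, "missed_keepalive": 0, "recv_keepalive": 0}
    for line in lines:
        if THRIFT_TIMEOUT.search(line):
            counts["thrift_poll_timeout"] += 1
        if MISSED_KEEPALIVE.search(line):
            counts["missed_keepalive"] += 1
        if RECV_KEEPALIVE.search(line):
            counts["recv_keepalive"] += 1
    return counts


# Sample input: the log lines shown in the example above.
SAMPLE = [
    "ERR: Servprocd::CLI: sp_get_sensor_info_worker : TTX ERROR: 2 (THRIFT_POLL (timed out))",
    "ERR: Servprocd::CLI: sp_get_sensor_info_worker : EXCEPTION Occured in closing the spcc transport",
    "Thrift: Thu Jun 12 00:14:21 2025 SSL_shutdown: shutdown while in init (SSL_error_code = 1)",
    "node-02 ERROR cf.hwassist.missedKeepAlive: HW-assisted takeover missing keep-alive messages from HA partner (node-01).",
    "node-02 INFORMATIONAL cf.hwassist.recvKeepAlive: hw_assist: Received hw_assist KeepAlive alert from partner(node-01).",
]

if __name__ == "__main__":
    print(count_symptoms(SAMPLE))
```

In practice the same function could be passed an open file handle for a saved log instead of `SAMPLE`. A high count of THRIFT_POLL timeouts alongside repeated missedKeepAlive events would match the issue described here.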