NetApp Knowledge Base

An error message "Cf.hwassist.missedKeepAlive" appears after SP Firmware update failure

Applies to

  • AFF A300
  • ONTAP 9
  • Service Processor (SP) firmware upgrade from 5.6P2 to 5.8

Issue

  • The auto-update to SP firmware version 5.8 fails at 5%.

Example:

[Sat Jun 20 01:23:06.047 2020] netapp::*> system service-processor image update -node *2 -package 308-03991_A0-FAS26X0-FAS8200_5.8_SP_FW.zip
[Sat Jun 20 01:23:24.218 2020] 
[Sat Jun 20 01:23:24.218 2020] Note: Firmware update will need to reboot the SP on completion. If your console
[Sat Jun 20 01:23:24.234 2020]       connection is through the SP, it will be disconnected
[Sat Jun 20 01:23:24.246 2020]       Do you want to proceed with the firmware update ? {y|n}: y
[Sat Jun 20 01:23:34.613 2020] SP firmware update has been successfully scheduled.                                                                   
[Sat Jun 20 01:23:34.649 2020] 1 entry was acted on.
[Sat Jun 20 01:23:34.665 2020] 
[Sat Jun 20 01:23:34.665 2020] netapp::*> system service-processor image update-progress show
[Sat Jun 20 01:23:38.711 2020]                  In                           Percent
[Sat Jun 20 01:23:38.751 2020] Node             Progress Start Time          Done    End Time
[Sat Jun 20 01:23:38.751 2020] ---------------- -------- ------------------- ------- -------------------
[Sat Jun 20 01:23:38.755 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:23:38.755 2020] netapp-02     yes      6/20/2020 01:24:34  1       -
[Sat Jun 20 01:23:38.755 2020] 2 entries were displayed.

[Sat Jun 20 01:41:58.776 2020] netapp::*> system service-processor image update-progress show
[Sat Jun 20 01:42:09.978 2020]                  In                           Percent
[Sat Jun 20 01:42:09.978 2020] Node             Progress Start Time          Done    End Time
[Sat Jun 20 01:42:10.002 2020] ---------------- -------- ------------------- ------- -------------------
[Sat Jun 20 01:42:10.002 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:42:10.030 2020] netapp-02     no       6/20/2020 01:24:34  5       6/20/2020 01:27:13
[Sat Jun 20 01:42:10.038 2020] 2 entries were displayed.
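The failure pattern in the capture above (In Progress is "no" while Percent Done stays at 5) can be checked mechanically. The following is a minimal Python sketch, not a NetApp tool: the function name `stalled_updates` is hypothetical, the row layout is taken from this capture only, and it assumes that a completed update reports 100 in the Percent Done column.

```python
import re

# Terminal-logger prefix such as "[Sat Jun 20 01:42:10.002 2020] ", as seen in the capture.
STAMP = re.compile(r"^\[[^\]]*\]\s?")
# One data row: node, in-progress flag, start date and time, percent done, end time or "-".
ROW = re.compile(
    r"^(?P<node>\S+)\s+(?P<in_progress>yes|no)\s+"
    r"(?P<start>\S+ \S+)\s+(?P<percent>\d+)\s+(?P<end>-|\S+ \S+)\s*$"
)

def stalled_updates(capture: str) -> list[tuple[str, int]]:
    """Return (node, percent) for updates that stopped before reaching 100%.

    Assumption: a finished update shows 100 in the Percent Done column.
    """
    stalled = []
    for line in capture.splitlines():
        match = ROW.match(STAMP.sub("", line))
        if match and match["in_progress"] == "no" and int(match["percent"]) < 100:
            stalled.append((match["node"], int(match["percent"])))
    return stalled

sample = """\
[Sat Jun 20 01:42:10.002 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:42:10.030 2020] netapp-02     no       6/20/2020 01:24:34  5       6/20/2020 01:27:13
"""
print(stalled_updates(sample))  # [('netapp-01', 5), ('netapp-02', 5)]
```

Rows that are still in progress (In Progress is "yes") are ignored, so the check only reports updates that have ended early.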

  • The "event log show" output contains the following event.

Example:

6/20/2020 01:23:32  netapp-01     DEBUG sp.servprocd.upd.unexpt.evts: reason="Unable to transfer SP firmware image using network interface"

  • "hwassist stats show" indicates that the system did not receive keep-alive messages after the update failure.

Example:

[Sat Jun 20 01:13:51.163 2020] netapp::> hwassist stats show
[Sat Jun 20 01:13:53.185 2020]   (storage failover hwassist stats show)
[Sat Jun 20 01:13:53.185 2020] 
[Sat Jun 20 01:13:53.201 2020]                    Node: netapp-01
[Sat Jun 20 01:13:53.201 2020]           Local Enabled: true
[Sat Jun 20 01:13:53.201 2020] Partner Inactive Reason: -
[Sat Jun 20 01:13:53.217 2020] 
[Sat Jun 20 01:13:53.217 2020] Alert Type   Alert Event           Count Takeover  Last Received
[Sat Jun 20 01:13:53.217 2020] ------------ -------------------- ------ --------- --------------------
[Sat Jun 20 01:13:53.237 2020] system_down  power_loss                0 Yes       ---
[Sat Jun 20 01:13:53.256 2020] system_down  l2_watchdog_reset         0 Yes       ---
[Sat Jun 20 01:13:53.256 2020] system_down  power_off_via_rlm         0 Yes       ---
[Sat Jun 20 01:13:53.273 2020] system_down  power_cycle_via_rlm       0 Yes       ---
[Sat Jun 20 01:13:53.273 2020] system_down  reset_via_rlm             0 Yes       ---
[Sat Jun 20 01:13:53.292 2020] system_down  power_off_via_sp          0 Yes       ---
[Sat Jun 20 01:13:53.292 2020] system_down  power_cycle_via_sp        0 Yes       ---
[Sat Jun 20 01:13:53.312 2020] system_down  reset_via_sp              0 Yes       ---
[Sat Jun 20 01:13:53.312 2020] system_down  post_error                0 No        ---
[Sat Jun 20 01:13:53.328 2020] system_down  abnormal_reboot           0 No        ---
[Sat Jun 20 01:13:53.328 2020] system_down  loss_of_heartbeat         0 No        ---
[Sat Jun 20 01:13:53.344 2020] keep_alive   periodic_message     130800 No        Fri Jun 19 22:59:52 JST 2020
[Sat Jun 20 01:13:53.364 2020] test         test                      0 No        ---
[Sat Jun 20 01:13:53.364 2020] ID_mismatch  ---                       0 ---       ---
[Sat Jun 20 01:13:53.384 2020] Key_mismatch ---                       0 ---       ---
[Sat Jun 20 01:13:53.384 2020] Unknown      ---                       0 ---       ---
[Sat Jun 20 01:13:53.396 2020]              alerts_throttled          0 ---       ---
[Sat Jun 20 01:13:53.396 2020] 
[Sat Jun 20 01:13:53.396 2020]                    Node: netapp-02
[Sat Jun 20 01:13:53.400 2020]           Local Enabled: true
[Sat Jun 20 01:13:53.400 2020] Partner Inactive Reason: -
[Sat Jun 20 01:13:53.404 2020] 
[Sat Jun 20 01:13:53.404 2020] Alert Type   Alert Event           Count Takeover  Last Received
[Sat Jun 20 01:13:53.416 2020] ------------ -------------------- ------ --------- --------------------
[Sat Jun 20 01:13:53.428 2020] system_down  power_loss                0 Yes       ---
[Sat Jun 20 01:13:53.436 2020] system_down  l2_watchdog_reset         0 Yes       ---
[Sat Jun 20 01:13:53.448 2020] system_down  power_off_via_rlm         0 Yes       ---
[Sat Jun 20 01:13:53.456 2020] system_down  power_cycle_via_rlm       0 Yes       ---
[Sat Jun 20 01:13:53.468 2020] system_down  reset_via_rlm             0 Yes       ---
[Sat Jun 20 01:13:53.476 2020] system_down  power_off_via_sp          0 Yes       ---
[Sat Jun 20 01:13:53.484 2020] system_down  power_cycle_via_sp        0 Yes       ---
[Sat Jun 20 01:13:53.495 2020] system_down  reset_via_sp              0 Yes       ---
[Sat Jun 20 01:13:53.503 2020] system_down  post_error                0 No        ---
[Sat Jun 20 01:13:53.515 2020] system_down  abnormal_reboot           0 No        ---
[Sat Jun 20 01:13:53.529 2020] system_down  loss_of_heartbeat         0 No        ---
[Sat Jun 20 01:13:53.533 2020] keep_alive   periodic_message     130995 No        Sat Jun 20 01:14:49 JST 2020
[Sat Jun 20 01:13:53.547 2020] test         test                      0 No        ---
[Sat Jun 20 01:13:53.559 2020] ID_mismatch  ---                       0 ---       ---
[Sat Jun 20 01:13:53.567 2020] Key_mismatch ---                       0 ---       ---
[Sat Jun 20 01:13:53.575 2020] Unknown      ---                       0 ---       ---
[Sat Jun 20 01:13:53.587 2020]              alerts_throttled          0 ---       ---
[Sat Jun 20 01:13:53.599 2020] 2 entries were displayed.
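One way to read this output is to compare the "Last Received" time of the keep_alive row on each node. The Python sketch below is illustrative only: `last_keepalives` is a hypothetical helper, the row layout is assumed from this capture, the time zone token (JST here) is dropped on the assumption that both nodes report in the same zone, and month names are assumed to parse under an English locale.

```python
import re
from datetime import datetime

NODE = re.compile(r"Node:\s+(?P<node>\S+)")
# keep_alive row: count, takeover flag, then "Fri Jun 19 22:59:52 JST 2020".
KEEPALIVE = re.compile(
    r"keep_alive\s+periodic_message\s+(?P<count>\d+)\s+\S+\s+"
    r"\w{3} (?P<mon>\w{3}) (?P<day>\d{1,2}) (?P<time>[\d:]{8}) \S+ (?P<year>\d{4})"
)

def last_keepalives(capture: str) -> dict[str, datetime]:
    """Map each node to the time its last keep-alive message was received."""
    result = {}
    node = None
    for line in capture.splitlines():
        found = NODE.search(line)
        if found:
            node = found["node"]
            continue
        found = KEEPALIVE.search(line)
        if found and node:
            # The time zone token is skipped; both nodes are assumed to share it.
            stamp = f'{found["mon"]} {found["day"]} {found["time"]} {found["year"]}'
            result[node] = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
    return result

sample = """\
[Sat Jun 20 01:13:53.201 2020]                    Node: netapp-01
[Sat Jun 20 01:13:53.344 2020] keep_alive   periodic_message     130800 No        Fri Jun 19 22:59:52 JST 2020
[Sat Jun 20 01:13:53.396 2020]                    Node: netapp-02
[Sat Jun 20 01:13:53.533 2020] keep_alive   periodic_message     130995 No        Sat Jun 20 01:14:49 JST 2020
"""
times = last_keepalives(sample)
print(times["netapp-02"] - times["netapp-01"])  # 2:14:57
```

In this capture the last keep-alive recorded on netapp-01 is more than two hours older than the one recorded on netapp-02.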

  • "system health alert show" reports an SP Config alert like the following.

Example:

[Sat Jun 20 03:01:44.937 2020] netapp::> system health alert show
[Sat Jun 20 03:01:56.184 2020]                Node: netapp-01
[Sat Jun 20 03:01:56.184 2020]            Resource: SP Config
[Sat Jun 20 03:01:56.200 2020]            Severity: Major
[Sat Jun 20 03:01:56.200 2020]     Indication Time: Sat Jun 20 02:56:30 2020
[Sat Jun 20 03:01:56.200 2020]            Suppress: false
[Sat Jun 20 03:01:56.216 2020]         Acknowledge: false
[Sat Jun 20 03:01:56.216 2020]      Probable Cause: Service Processor is not properly configured.
[Sat Jun 20 03:01:56.232 2020]     Possible Effect: You might not be able to use the Service Processor to
[Sat Jun 20 03:01:56.232 2020]                      remotely access, monitor, and troubleshoot your
[Sat Jun 20 03:01:56.256 2020]                      storage system.
[Sat Jun 20 03:01:56.276 2020]  Corrective Actions: 1. Use the "system service-processor network modify" command to configure the network interface of the Service Processor.
[Sat Jun 20 03:01:56.292 2020]                      2. Use the "system service-processor image modify -node netapp-01 -autoupdate true" to configure AutoUpdate feature of the Service Processor.
[Sat Jun 20 03:01:56.310 2020]                      3. Contact the technical support if the alert persists.

  • After the SP network settings are configured again, the SP recovers temporarily.

Example:

[Sat Jun 20 03:06:35.524 2020] netapp::> system service-processor network modify -node netapp-02 -enable true -address-family IPv4 -ip-address 192.168.132.214 -netmask 255.255.255.0 -gateway 192.168.132.1 

[Sat Jun 20 03:07:24.333 2020] netapp::> hwassist show
[Sat Jun 20 03:07:26.325 2020]   (storage failover hwassist show)
[Sat Jun 20 03:07:26.325 2020] Node
[Sat Jun 20 03:07:26.325 2020] -----------------
[Sat Jun 20 03:07:26.337 2020] netapp-01
[Sat Jun 20 03:07:26.337 2020]                              Partner: netapp-02
[Sat Jun 20 03:07:26.357 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:07:26.357 2020]                          Hwassist IP: 192.168.132.211
[Sat Jun 20 03:07:26.378 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:07:26.378 2020]                       Monitor Status: active
[Sat Jun 20 03:07:26.394 2020]                      Inactive Reason: -
[Sat Jun 20 03:07:26.394 2020]                    Corrective Action: -
[Sat Jun 20 03:07:26.394 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:07:26.414 2020] netapp-02
[Sat Jun 20 03:07:26.414 2020]                              Partner: netapp-01
[Sat Jun 20 03:07:26.414 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:07:26.430 2020]                          Hwassist IP: 192.168.132.212
[Sat Jun 20 03:07:26.430 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:07:26.442 2020]                       Monitor Status: active
[Sat Jun 20 03:07:26.442 2020]                      Inactive Reason: -
[Sat Jun 20 03:07:26.462 2020]                    Corrective Action: -
[Sat Jun 20 03:07:26.462 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:07:26.474 2020] 2 entries were displayed.

  • After a while, the error occurs again.

Example:

[Sat Jun 20 03:27:07.648 2020] netapp::> hwassist show
[Sat Jun 20 03:27:09.822 2020]   (storage failover hwassist show)
[Sat Jun 20 03:27:09.826 2020] Node
[Sat Jun 20 03:27:09.830 2020] -----------------
[Sat Jun 20 03:27:09.830 2020] netapp-01
[Sat Jun 20 03:27:09.834 2020]                              Partner: netapp-02
[Sat Jun 20 03:27:09.842 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:27:09.850 2020]                          Hwassist IP: 192.168.132.211
[Sat Jun 20 03:27:09.860 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:27:09.868 2020]                       Monitor Status: active
[Sat Jun 20 03:27:09.876 2020]                      Inactive Reason: -
[Sat Jun 20 03:27:09.884 2020]                    Corrective Action: -
[Sat Jun 20 03:27:09.892 2020]                    Keep-Alive Status: Error: did not receive hwassist keep alive alerts from partner.
[Sat Jun 20 03:27:09.908 2020] netapp-02
[Sat Jun 20 03:27:09.912 2020]                              Partner: netapp-01
[Sat Jun 20 03:27:09.920 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:27:09.928 2020]                          Hwassist IP: 192.168.132.212
[Sat Jun 20 03:27:09.940 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:27:09.946 2020]                       Monitor Status: active
[Sat Jun 20 03:27:09.950 2020]                      Inactive Reason: -
[Sat Jun 20 03:27:09.960 2020]                    Corrective Action: -
[Sat Jun 20 03:27:09.968 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:27:09.976 2020] 2 entries were displayed.
[Sat Jun 20 03:27:09.980 2020] 
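Because the error returns only after a while, it can help to scan repeated "hwassist show" captures for any node whose Keep-Alive Status is not "healthy". The following Python sketch is a hedged illustration: `keepalive_problems` is a hypothetical name, and the parsing relies on the indentation and field labels seen in this capture rather than on any documented output format.

```python
import re

# Terminal-logger prefix such as "[Sat Jun 20 03:27:09.892 2020] ".
STAMP = re.compile(r"^\[[^\]]*\]\s?")

def keepalive_problems(capture: str) -> dict[str, str]:
    """Map node name to its Keep-Alive Status text for every node that is not healthy."""
    problems = {}
    node = None
    for raw in capture.splitlines():
        line = STAMP.sub("", raw).rstrip()
        if not line:
            continue
        if line[0].isspace() and ":" in line:
            # Indented "Field: value" line that belongs to the current node.
            field, _, value = line.partition(":")
            if node and field.strip() == "Keep-Alive Status" and value.strip() != "healthy":
                problems[node] = value.strip()
        elif not line[0].isspace() and " " not in line and not line.startswith("-"):
            # Unindented single token: treated as a node name. This also matches the
            # "Node" column header, which never owns a status line.
            node = line
    return problems

sample = """\
[Sat Jun 20 03:27:09.830 2020] netapp-01
[Sat Jun 20 03:27:09.834 2020]                              Partner: netapp-02
[Sat Jun 20 03:27:09.892 2020]                    Keep-Alive Status: Error: did not receive hwassist keep alive alerts from partner.
[Sat Jun 20 03:27:09.908 2020] netapp-02
[Sat Jun 20 03:27:09.912 2020]                              Partner: netapp-01
[Sat Jun 20 03:27:09.968 2020]                    Keep-Alive Status: healthy
"""
print(keepalive_problems(sample))
```

For the sample rows above, only netapp-01 is reported, together with the error text from its Keep-Alive Status field.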

  • The SP hangs during the SP update. The "SP-LATEST CONFIGURATION" section in AutoSupport shows the following output.

Example:

Service Processor           Status: Online
  Firmware Version:   5.6P2
  Mgmt MAC Address:   00:A0:XX:XX:XX:XX
  Ethernet Link:      down, full duplex, auto-neg complete  
  Using DHCP:         no
IPv4 configuration:
  IP Address:         unknown  
  Netmask:            unknown  
  Gateway:            unknown  
IPv6 configuration:         Disabled

 

