NetApp Knowledge Base

An error message "Cf.hwassist.missedKeepAlive" appears after SP Firmware update failure

Category:
fas-systems
Specialty:
hw

Applies to

  • AFF A300
  • ONTAP 9
  • Service Processor (SP) firmware upgrade from 5.6P2 to 5.8

Issue

  • The auto-update to SP firmware version 5.8 fails at 5%.

Example:

[Sat Jun 20 01:23:06.047 2020] netapp::*> system service-processor image update -node *2 -package 308-03991_A0-FAS26X0-FAS8200_5.8_SP_FW.zip
[Sat Jun 20 01:23:24.218 2020] 
[Sat Jun 20 01:23:24.218 2020] Note: Firmware update will need to reboot the SP on completion. If your console
[Sat Jun 20 01:23:24.234 2020]       connection is through the SP, it will be disconnected
[Sat Jun 20 01:23:24.246 2020]       Do you want to proceed with the firmware update ? {y|n}: y
[Sat Jun 20 01:23:34.613 2020] SP firmware update has been successfully scheduled.                                                                   
[Sat Jun 20 01:23:34.649 2020] 1 entry was acted on.
[Sat Jun 20 01:23:34.665 2020] 
[Sat Jun 20 01:23:34.665 2020] netapp::*> system service-processor image update-progress show
[Sat Jun 20 01:23:38.711 2020]                  In                           Percent
[Sat Jun 20 01:23:38.751 2020] Node             Progress Start Time          Done    End Time
[Sat Jun 20 01:23:38.751 2020] ---------------- -------- ------------------- ------- -------------------
[Sat Jun 20 01:23:38.755 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:23:38.755 2020] netapp-02     yes      6/20/2020 01:24:34  1       -
[Sat Jun 20 01:23:38.755 2020] 2 entries were displayed.

[Sat Jun 20 01:41:58.776 2020] netapp::*> system service-processor image update-progress show
[Sat Jun 20 01:42:09.978 2020]                  In                           Percent
[Sat Jun 20 01:42:09.978 2020] Node             Progress Start Time          Done    End Time
[Sat Jun 20 01:42:10.002 2020] ---------------- -------- ------------------- ------- -------------------
[Sat Jun 20 01:42:10.002 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:42:10.030 2020] netapp-02     no       6/20/2020 01:24:34  5       6/20/2020 01:27:13
[Sat Jun 20 01:42:10.038 2020] 2 entries were displayed.
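
When a console capture like the one above is saved to a file, the stalled rows can be picked out with a short script. The following is a minimal sketch, not a NetApp tool: the function name incomplete_updates is hypothetical, the row layout is assumed to match the capture above, and it is assumed that a finished update reports 100 in the Percent Done column (the capture does not show a successful run).

```python
import re

# Optional console timestamp prefix such as "[Sat Jun 20 01:42:10.002 2020] ".
PREFIX = re.compile(r"^\[[^\]]*\]\s*")
# Columns: node, in-progress flag, start date and time, percent done, end time or "-".
ROW = re.compile(
    r"^(?P<node>\S+)\s+(?P<in_progress>yes|no)\s+"
    r"(?P<start>\d+/\d+/\d+ \d+:\d+:\d+)\s+(?P<percent>\d+)\s+"
    r"(?P<end>-|\d+/\d+/\d+ \d+:\d+:\d+)\s*$"
)

def incomplete_updates(console_text):
    """Return (node, percent) for updates that ended before reaching 100%.

    Assumption: a completed update shows 100 in the Percent Done column.
    """
    stalled = []
    for line in console_text.splitlines():
        match = ROW.match(PREFIX.sub("", line))
        if match and match["in_progress"] == "no" and int(match["percent"]) < 100:
            stalled.append((match["node"], int(match["percent"])))
    return stalled

SAMPLE = """\
[Sat Jun 20 01:42:10.002 2020] netapp-01     no       6/20/2020 01:20:53  5       6/20/2020 01:23:32
[Sat Jun 20 01:42:10.030 2020] netapp-02     no       6/20/2020 01:24:34  5       6/20/2020 01:27:13
"""

print(incomplete_updates(SAMPLE))  # prints [('netapp-01', 5), ('netapp-02', 5)]
```

Rows that still show "yes" under In Progress are skipped, so the check only reports updates that have already stopped.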

  • "event log show" shows the following event.

Example:

6/20/2020 01:23:32  netapp-01     DEBUG sp.servprocd.upd.unexpt.evts: reason="Unable to transfer SP firmware image using network interface"

  • "hwassist stats show" indicates that the system did not receive keep-alive messages after the update failure.

Example:

[Sat Jun 20 01:13:51.163 2020] netapp::> hwassist stats show
[Sat Jun 20 01:13:53.185 2020]   (storage failover hwassist stats show)
[Sat Jun 20 01:13:53.185 2020] 
[Sat Jun 20 01:13:53.201 2020]                    Node: netapp-01
[Sat Jun 20 01:13:53.201 2020]           Local Enabled: true
[Sat Jun 20 01:13:53.201 2020] Partner Inactive Reason: -
[Sat Jun 20 01:13:53.217 2020] 
[Sat Jun 20 01:13:53.217 2020] Alert Type   Alert Event           Count Takeover  Last Received
[Sat Jun 20 01:13:53.217 2020] ------------ -------------------- ------ --------- --------------------
[Sat Jun 20 01:13:53.237 2020] system_down  power_loss                0 Yes       ---
[Sat Jun 20 01:13:53.256 2020] system_down  l2_watchdog_reset         0 Yes       ---
[Sat Jun 20 01:13:53.256 2020] system_down  power_off_via_rlm         0 Yes       ---
[Sat Jun 20 01:13:53.273 2020] system_down  power_cycle_via_rlm       0 Yes       ---
[Sat Jun 20 01:13:53.273 2020] system_down  reset_via_rlm             0 Yes       ---
[Sat Jun 20 01:13:53.292 2020] system_down  power_off_via_sp          0 Yes       ---
[Sat Jun 20 01:13:53.292 2020] system_down  power_cycle_via_sp        0 Yes       ---
[Sat Jun 20 01:13:53.312 2020] system_down  reset_via_sp              0 Yes       ---
[Sat Jun 20 01:13:53.312 2020] system_down  post_error                0 No        ---
[Sat Jun 20 01:13:53.328 2020] system_down  abnormal_reboot           0 No        ---
[Sat Jun 20 01:13:53.328 2020] system_down  loss_of_heartbeat         0 No        ---
[Sat Jun 20 01:13:53.344 2020] keep_alive   periodic_message     130800 No        Fri Jun 19 22:59:52 JST 2020
[Sat Jun 20 01:13:53.364 2020] test         test                      0 No        ---
[Sat Jun 20 01:13:53.364 2020] ID_mismatch  ---                       0 ---       ---
[Sat Jun 20 01:13:53.384 2020] Key_mismatch ---                       0 ---       ---
[Sat Jun 20 01:13:53.384 2020] Unknown      ---                       0 ---       ---
[Sat Jun 20 01:13:53.396 2020]              alerts_throttled          0 ---       ---
[Sat Jun 20 01:13:53.396 2020] 
[Sat Jun 20 01:13:53.396 2020]                    Node: netapp-02
[Sat Jun 20 01:13:53.400 2020]           Local Enabled: true
[Sat Jun 20 01:13:53.400 2020] Partner Inactive Reason: -
[Sat Jun 20 01:13:53.404 2020] 
[Sat Jun 20 01:13:53.404 2020] Alert Type   Alert Event           Count Takeover  Last Received
[Sat Jun 20 01:13:53.416 2020] ------------ -------------------- ------ --------- --------------------
[Sat Jun 20 01:13:53.428 2020] system_down  power_loss                0 Yes       ---
[Sat Jun 20 01:13:53.436 2020] system_down  l2_watchdog_reset         0 Yes       ---
[Sat Jun 20 01:13:53.448 2020] system_down  power_off_via_rlm         0 Yes       ---
[Sat Jun 20 01:13:53.456 2020] system_down  power_cycle_via_rlm       0 Yes       ---
[Sat Jun 20 01:13:53.468 2020] system_down  reset_via_rlm             0 Yes       ---
[Sat Jun 20 01:13:53.476 2020] system_down  power_off_via_sp          0 Yes       ---
[Sat Jun 20 01:13:53.484 2020] system_down  power_cycle_via_sp        0 Yes       ---
[Sat Jun 20 01:13:53.495 2020] system_down  reset_via_sp              0 Yes       ---
[Sat Jun 20 01:13:53.503 2020] system_down  post_error                0 No        ---
[Sat Jun 20 01:13:53.515 2020] system_down  abnormal_reboot           0 No        ---
[Sat Jun 20 01:13:53.529 2020] system_down  loss_of_heartbeat         0 No        ---
[Sat Jun 20 01:13:53.533 2020] keep_alive   periodic_message     130995 No        Sat Jun 20 01:14:49 JST 2020
[Sat Jun 20 01:13:53.547 2020] test         test                      0 No        ---
[Sat Jun 20 01:13:53.559 2020] ID_mismatch  ---                       0 ---       ---
[Sat Jun 20 01:13:53.567 2020] Key_mismatch ---                       0 ---       ---
[Sat Jun 20 01:13:53.575 2020] Unknown      ---                       0 ---       ---
[Sat Jun 20 01:13:53.587 2020]              alerts_throttled          0 ---       ---
[Sat Jun 20 01:13:53.599 2020] 2 entries were displayed.
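
In the capture above, the keep_alive "Last Received" time listed under netapp-01 is more than two hours older than the one listed under netapp-02. The sketch below shows one way that comparison could be automated from a saved capture. The function names and the 5-minute threshold are hypothetical (the article does not state the keep-alive interval), and the time-zone token is dropped on the assumption that both nodes report the same zone.

```python
import re
from datetime import datetime, timedelta

NODE = re.compile(r"Node:\s*(\S+)")
# keep_alive row: alert type, event, count, takeover flag, then the timestamp.
KEEP_ALIVE = re.compile(
    r"keep_alive\s+periodic_message\s+\d+\s+\S+\s+"
    r"(\w{3} \w{3}\s+\d+ \d+:\d+:\d+) \w+ (\d{4})"
)

def last_keep_alive(stats_text):
    """Map each node name to the time of its last received keep_alive."""
    seen, node = {}, None
    for line in stats_text.splitlines():
        node_match = NODE.search(line)
        if node_match:
            node = node_match.group(1)
            continue
        row = KEEP_ALIVE.search(line)
        if node and row:
            # Assumption: the zone token (for example JST) is the same on all nodes.
            stamp = f"{row.group(1)} {row.group(2)}"
            seen[node] = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return seen

def stale_nodes(stats_text, max_lag=timedelta(minutes=5)):
    """Return nodes whose last keep_alive lags the newest one by more than max_lag.

    max_lag is an illustrative threshold, not a documented ONTAP value.
    """
    seen = last_keep_alive(stats_text)
    if not seen:
        return {}
    newest = max(seen.values())
    return {n: newest - t for n, t in seen.items() if newest - t > max_lag}

SAMPLE = """\
[Sat Jun 20 01:13:53.201 2020]                    Node: netapp-01
[Sat Jun 20 01:13:53.344 2020] keep_alive   periodic_message     130800 No        Fri Jun 19 22:59:52 JST 2020
[Sat Jun 20 01:13:53.396 2020]                    Node: netapp-02
[Sat Jun 20 01:13:53.533 2020] keep_alive   periodic_message     130995 No        Sat Jun 20 01:14:49 JST 2020
"""

for name, lag in stale_nodes(SAMPLE).items():
    print(f"{name}: last keep_alive lags by {lag}")
# prints "netapp-01: last keep_alive lags by 2:14:57"
```

The nodes are compared with each other rather than with the console timestamp because the console clock and the cluster clock in the capture do not agree exactly.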

  • "system health alert show" reports an SP Config alert such as the following.

Example:

[Sat Jun 20 03:01:44.937 2020] netapp::> system health alert show
[Sat Jun 20 03:01:56.184 2020]                Node: netapp-01
[Sat Jun 20 03:01:56.184 2020]            Resource: SP Config
[Sat Jun 20 03:01:56.200 2020]            Severity: Major
[Sat Jun 20 03:01:56.200 2020]     Indication Time: Sat Jun 20 02:56:30 2020
[Sat Jun 20 03:01:56.200 2020]            Suppress: false
[Sat Jun 20 03:01:56.216 2020]         Acknowledge: false
[Sat Jun 20 03:01:56.216 2020]      Probable Cause: Service Processor is not properly configured.
[Sat Jun 20 03:01:56.232 2020]     Possible Effect: You might not be able to use the Service Processor to
[Sat Jun 20 03:01:56.232 2020]                      remotely access, monitor, and troubleshoot your
[Sat Jun 20 03:01:56.256 2020]                      storage system.
[Sat Jun 20 03:01:56.276 2020]  Corrective Actions: 1. Use the "system service-processor network modify" command to configure the network interface of the Service Processor.
[Sat Jun 20 03:01:56.292 2020]                      2. Use the "system service-processor image modify -node netapp-01 -autoupdate true" to configure AutoUpdate feature of the Service Processor.
[Sat Jun 20 03:01:56.310 2020]                      3. Contact the technical support if the alert persists.

  • After the SP network settings are configured again, the SP recovers temporarily.

Example:

[Sat Jun 20 03:06:35.524 2020] netapp::> system service-processor network modify -node netapp-02 -enable true -address-family IPv4 -ip-address 192.168.132.214 -netmask 255.255.255.0 -gateway 192.168.132.1 

[Sat Jun 20 03:07:24.333 2020] netapp::> hwassist show
[Sat Jun 20 03:07:26.325 2020]   (storage failover hwassist show)
[Sat Jun 20 03:07:26.325 2020] Node
[Sat Jun 20 03:07:26.325 2020] -----------------
[Sat Jun 20 03:07:26.337 2020] netapp-01
[Sat Jun 20 03:07:26.337 2020]                              Partner: netapp-02
[Sat Jun 20 03:07:26.357 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:07:26.357 2020]                          Hwassist IP: 192.168.132.211
[Sat Jun 20 03:07:26.378 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:07:26.378 2020]                       Monitor Status: active
[Sat Jun 20 03:07:26.394 2020]                      Inactive Reason: -
[Sat Jun 20 03:07:26.394 2020]                    Corrective Action: -
[Sat Jun 20 03:07:26.394 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:07:26.414 2020] netapp-02
[Sat Jun 20 03:07:26.414 2020]                              Partner: netapp-01
[Sat Jun 20 03:07:26.414 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:07:26.430 2020]                          Hwassist IP: 192.168.132.212
[Sat Jun 20 03:07:26.430 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:07:26.442 2020]                       Monitor Status: active
[Sat Jun 20 03:07:26.442 2020]                      Inactive Reason: -
[Sat Jun 20 03:07:26.462 2020]                    Corrective Action: -
[Sat Jun 20 03:07:26.462 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:07:26.474 2020] 2 entries were displayed.

  • After a while, the error occurs again.

Example:

[Sat Jun 20 03:27:07.648 2020] netapp::> hwassist show
[Sat Jun 20 03:27:09.822 2020]   (storage failover hwassist show)
[Sat Jun 20 03:27:09.826 2020] Node
[Sat Jun 20 03:27:09.830 2020] -----------------
[Sat Jun 20 03:27:09.830 2020] netapp-01
[Sat Jun 20 03:27:09.834 2020]                              Partner: netapp-02
[Sat Jun 20 03:27:09.842 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:27:09.850 2020]                          Hwassist IP: 192.168.132.211
[Sat Jun 20 03:27:09.860 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:27:09.868 2020]                       Monitor Status: active
[Sat Jun 20 03:27:09.876 2020]                      Inactive Reason: -
[Sat Jun 20 03:27:09.884 2020]                    Corrective Action: -
[Sat Jun 20 03:27:09.892 2020]                    Keep-Alive Status: Error: did not receive hwassist keep alive alerts from partner.
[Sat Jun 20 03:27:09.908 2020] netapp-02
[Sat Jun 20 03:27:09.912 2020]                              Partner: netapp-01
[Sat Jun 20 03:27:09.920 2020]                     Hwassist Enabled: true
[Sat Jun 20 03:27:09.928 2020]                          Hwassist IP: 192.168.132.212
[Sat Jun 20 03:27:09.940 2020]                        Hwassist Port: 4444
[Sat Jun 20 03:27:09.946 2020]                       Monitor Status: active
[Sat Jun 20 03:27:09.950 2020]                      Inactive Reason: -
[Sat Jun 20 03:27:09.960 2020]                    Corrective Action: -
[Sat Jun 20 03:27:09.968 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:27:09.976 2020] 2 entries were displayed.
[Sat Jun 20 03:27:09.980 2020] 
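
Because the error comes back some time after the SP is reconfigured, it can help to check the Keep-Alive Status field repeatedly from saved "hwassist show" captures. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the layout is assumed to match the capture above (a node name on its own line followed by "Field: value" lines), and any status other than "healthy" is treated as a problem.

```python
import re

# Optional console timestamp prefix such as "[Sat Jun 20 03:27:09.830 2020] ".
PREFIX = re.compile(r"^\[[^\]]*\]\s*")
# A node name is assumed to be a single token that starts with a letter or digit.
NODE_NAME = re.compile(r"[A-Za-z0-9][\w.-]*")

def keep_alive_status(show_text):
    """Map each node name to the text of its Keep-Alive Status field."""
    status, node = {}, None
    for raw in show_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        key, sep, value = line.partition(":")
        if sep:
            if node and key.strip() == "Keep-Alive Status":
                status[node] = value.strip()
        elif NODE_NAME.fullmatch(line) and line != "Node":
            node = line  # a bare token opens a new per-node section
    return status

def unhealthy_nodes(show_text):
    """Return nodes whose Keep-Alive Status is anything other than 'healthy'."""
    return [n for n, s in keep_alive_status(show_text).items() if s != "healthy"]

SAMPLE = """\
[Sat Jun 20 03:27:09.826 2020] Node
[Sat Jun 20 03:27:09.830 2020] -----------------
[Sat Jun 20 03:27:09.830 2020] netapp-01
[Sat Jun 20 03:27:09.892 2020]                    Keep-Alive Status: Error: did not receive hwassist keep alive alerts from partner.
[Sat Jun 20 03:27:09.908 2020] netapp-02
[Sat Jun 20 03:27:09.968 2020]                    Keep-Alive Status: healthy
[Sat Jun 20 03:27:09.976 2020] 2 entries were displayed.
"""

print(unhealthy_nodes(SAMPLE))  # prints ['netapp-01']
```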

  • The SP hangs during the SP update. The "SP-LATEST CONFIGURATION" section in AutoSupport shows the following output.

Example:

Service Processor           Status: Online
  Firmware Version:   5.6P2
  Mgmt MAC Address:   00:A0:XX:XX:XX:XX
  Ethernet Link:      down, full duplex, auto-neg complete  
  Using DHCP:         no
IPv4 configuration:
  IP Address:         unknown  
  Netmask:            unknown  
  Gateway:            unknown  
IPv6 configuration:         Disabled
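
The signs of the hung SP in this output are the Ethernet link reported as down and the IPv4 fields reported as unknown. A minimal sketch of a check for those symptoms follows. The function name sp_network_problems is hypothetical, and the field names are taken only from the output above; other ONTAP or SP versions may label them differently.

```python
def sp_network_problems(config_text):
    """List the symptoms seen in the SP configuration output above.

    Assumption: fields appear as "Name: value" lines with the names used here.
    """
    fields = {}
    for line in config_text.splitlines():
        # Split at the first colon so MAC addresses stay intact in the value.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    problems = []
    if fields.get("Ethernet Link", "").startswith("down"):
        problems.append("Ethernet link is down")
    for name in ("IP Address", "Netmask", "Gateway"):
        if fields.get(name) == "unknown":
            problems.append(f"{name} is unknown")
    return problems

SAMPLE = """\
Service Processor           Status: Online
  Firmware Version:   5.6P2
  Mgmt MAC Address:   00:A0:XX:XX:XX:XX
  Ethernet Link:      down, full duplex, auto-neg complete
  Using DHCP:         no
IPv4 configuration:
  IP Address:         unknown
  Netmask:            unknown
  Gateway:            unknown
IPv6 configuration:         Disabled
"""

for problem in sp_network_problems(SAMPLE):
    print(problem)
```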

 
