NetApp Knowledgebase

StorageGRID Appliance compute controller loses access to E-Series storage controller due to a reboot caused by a memory leak

Category: storagegrid-webscale
Specialty: sgrid

Applies to

StorageGRID Appliances (SGA) models: SG56XX and SG57XX

  • This issue can impact non-SGA E-Series systems running in a simplex configuration.
  • This issue should not impact the SG6000 series, because its E-Series storage controller shelf uses two controllers in a duplex configuration, which allows storage controller failover. Additionally, interoperability qualifications require the E-Series storage controller shelf to run 08.40/11.40 or later.

E-Series SANtricity software release 08.30/11.30 or earlier

  • The issue has not been observed on 08.40/11.40 and 08.50/11.50 releases

Issue

On SG56XX/SG57XX series appliances, the compute controller can temporarily lose access to the storage controller.

The resulting behavior varies with the severity of the interruption. In most instances, the impact was minimal and limited to alarms, because the node recovered on its own. In one occurrence, the impact to the compute controller was more critical and required rebuilding it from the surviving SGRID storage node cluster.

The following related PANIC and ASSERT strings appear in the excLogShow output:

  • (ProcessEvents): PANIC: Caught bad allocation in processEventsMethod for 0x0 null
  • (bdbmSync): PANIC: Unhandled C++ exception triggered terminate().
  • (symTask0): ASSERT: Assertion failed: response, file sas2PhyErrorMgr.cc, line 1732
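
To check a saved copy of the excLogShow output for the strings listed above, a plain substring search is usually enough. The following is a minimal sketch, not a NetApp tool: it assumes the output has already been captured to a text file, and the function name and the command-line file argument are hypothetical. The trailing "for 0x0 null" and the source line number are left out of the patterns on the assumption that they may differ between occurrences or releases.

```python
import sys

# Signatures taken from the list above. The trailing "for 0x0 null" and the
# "line 1732" suffix are omitted on the assumption that they can vary.
SIGNATURES = (
    "(ProcessEvents): PANIC: Caught bad allocation in processEventsMethod",
    "(bdbmSync): PANIC: Unhandled C++ exception triggered terminate().",
    "(symTask0): ASSERT: Assertion failed: response, file sas2PhyErrorMgr.cc",
)


def find_signatures(log_text):
    """Return (line_number, signature) pairs for every line that matches."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for signature in SIGNATURES:
            if signature in line:
                hits.append((number, signature))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: pass the path of the saved excLogShow output.
    if len(sys.argv) != 2:
        sys.exit("usage: python find_signatures.py <saved-excLogShow-file>")
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        matches = find_signatures(handle.read())
    for number, signature in matches:
        print(f"line {number}: {signature}")
    if not matches:
        print("No matching PANIC/ASSERT strings found.")
```

A match only indicates that the log contains one of these strings. Confirm that the system also meets the conditions under "Applies to" before concluding that this issue is the cause.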