CONTAP-520139: System becomes unresponsive due to low FreeBSD memory availability - high dma_buffer allocations
Issue
- When BSD memory is exhausted, critical system processes may become unresponsive, leading to an unexpected reboot with an error:
Process vldb unresponsive for 210 seconds in process nodewatchdog on release 9.16.1
- Customers may notice slow administrative operations, delays in command execution, and frequent system alerts.
- Before the unexpected reboot, rdma.rlib errors might be observed in the EMS log on the affected node and on all other nodes with which it has NVRAM mirroring relationships.
Example:
12/12/2025 09:35:52 ClusterA-01 DEBUG rdma.rlib.connected: misc:DR:A QP is now connected.
12/12/2025 09:35:51 ClusterA-01 DEBUG rdma.rlib.event.error: QP raid event error: client disconnect.
12/12/2025 09:34:12 ClusterA-01 DEBUG rdma.rlib.connected: raid:DR:A QP is now connected.
12/12/2025 09:32:32 ClusterA-01 DEBUG rdma.rlib.connected: misc:DR:A QP is now connected.
12/12/2025 09:32:31 ClusterA-01 DEBUG rdma.rlib.event.error: QP raid event error: client disconnect.
12/12/2025 09:29:12 ClusterA-01 DEBUG rdma.rlib.connected: raid:DR:A QP is now connected.
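
To check whether a saved EMS log shows the same repeating disconnect/reconnect pattern as the example above, the lines can be tallied per node and event name. The following is a minimal sketch, not a NetApp tool: the function name summarize_rdma_rlib, the regular expression, and the assumption that timestamps are in month/day/year order are all illustrative and inferred only from the sample lines above.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the sample lines: date, time, node, severity, event name, message.
EMS_LINE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<node>\S+) (?P<severity>\S+) "
    r"(?P<event>rdma\.rlib\.[\w.]+): (?P<message>.*)$"
)


def summarize_rdma_rlib(lines):
    """Count rdma.rlib events per (node, event) and collect error timestamps."""
    counts = Counter()
    error_times = []
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not rdma.rlib events
        counts[(match["node"], match["event"])] += 1
        if match["event"] == "rdma.rlib.event.error":
            # Assumption: timestamps are month/day/year.
            stamp = datetime.strptime(
                f"{match['date']} {match['time']}", "%m/%d/%Y %H:%M:%S"
            )
            error_times.append(stamp)
    return counts, sorted(error_times)


SAMPLE = """\
12/12/2025 09:35:52 ClusterA-01 DEBUG rdma.rlib.connected: misc:DR:A QP is now connected.
12/12/2025 09:35:51 ClusterA-01 DEBUG rdma.rlib.event.error: QP raid event error: client disconnect.
12/12/2025 09:34:12 ClusterA-01 DEBUG rdma.rlib.connected: raid:DR:A QP is now connected.
12/12/2025 09:32:32 ClusterA-01 DEBUG rdma.rlib.connected: misc:DR:A QP is now connected.
12/12/2025 09:32:31 ClusterA-01 DEBUG rdma.rlib.event.error: QP raid event error: client disconnect.
12/12/2025 09:29:12 ClusterA-01 DEBUG rdma.rlib.connected: raid:DR:A QP is now connected.
""".splitlines()

if __name__ == "__main__":
    counts, error_times = summarize_rdma_rlib(SAMPLE)
    for (node, event), n in sorted(counts.items()):
        print(node, event, n)
    # Seconds between consecutive disconnect errors.
    gaps = [(b - a).total_seconds() for a, b in zip(error_times, error_times[1:])]
    print("seconds between errors:", gaps)
```

Several rdma.rlib.event.error entries at short, regular intervals on the affected node and its mirroring partners would match the symptom described above; the counts alone do not confirm the root cause.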
