CONTAP-687329: ONTAP internal timer's dispatch race causes context timer stall
Issue
- A rare timing condition in ONTAP's internal timer management can cause protocol-level timers on a single network processing thread to stop firing. Once triggered, the condition does not self-resolve and persists until the node is rebooted.
- EMS logs report that the CIFS subsystem has exceeded its memory allotment, for example:
- Thu Apr 01 00:00:00 -0700 [node-01: kernel: Nblade.cifs.budgetAllocFailure:error]: Memory Allocation failed for SpinNp Manager. The CIFS subsystem on this node has exceeded its allotment of 13999747891 bytes of node memory. CIFS subsystems that have consumed the most memory are FlexGroup Path/MSID Cache '11499898160 bytes', NameStr Objects '2427788176 bytes' and Accessed Path List '67663184 bytes'.
- Thu Apr 01 00:00:00 -0700 [node-01: nblade2: Nblade.cifsMemExceeded:error]: The CIFS subsystem on this node has exceeded its allotment of 13999747891 bytes of node memory with currently 5627 memory allocation failures since boot time. This might result in unexpected CIFS application failures.
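When triaging, it can help to pull the numbers out of these EMS messages to see which cache dominates the allotment. The following is a minimal sketch, not a NetApp tool: the function name `parse_cifs_budget_event` and the sample constants are hypothetical, and the regular expressions assume the message wording matches the two examples above exactly.

```python
import re

# Message text copied from the two EMS examples above (constant names are hypothetical).
SAMPLE_BUDGET = (
    "Memory Allocation failed for SpinNp Manager. The CIFS subsystem on this node "
    "has exceeded its allotment of 13999747891 bytes of node memory. CIFS subsystems "
    "that have consumed the most memory are FlexGroup Path/MSID Cache '11499898160 bytes', "
    "NameStr Objects '2427788176 bytes' and Accessed Path List '67663184 bytes'."
)
SAMPLE_EXCEEDED = (
    "The CIFS subsystem on this node has exceeded its allotment of 13999747891 bytes "
    "of node memory with currently 5627 memory allocation failures since boot time."
)


def parse_cifs_budget_event(message):
    """Return (allotment_bytes, {consumer: bytes}, failures or None).

    Returns None when the message contains no allotment figure.
    """
    allotment = re.search(r"allotment of (\d+) bytes", message)
    if allotment is None:
        return None

    consumers = {}
    parts = message.split("consumed the most memory are", 1)
    if len(parts) == 2:
        # Each consumer appears as: <name> '<N> bytes'
        for name, size in re.findall(r"([A-Za-z][^']*?) '(\d+) bytes'", parts[1]):
            # The last entry is joined with "and"; drop that connective.
            consumers[re.sub(r"^and\s+", "", name)] = int(size)

    failures = re.search(r"(\d+) memory allocation failures", message)
    return (
        int(allotment.group(1)),
        consumers,
        int(failures.group(1)) if failures else None,
    )


if __name__ == "__main__":
    allot, consumers, _ = parse_cifs_budget_event(SAMPLE_BUDGET)
    for name, size in sorted(consumers.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {size} bytes ({size / allot:.1%} of allotment)")
```

Run against the first sample, this lists the FlexGroup Path/MSID Cache as the largest consumer by a wide margin, which matches the ordering in the EMS text itself.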
