CONTAP-436651: High latency and LUN power-on resets due to CG snap failure and 120-second SnapCenter timeout fencing I/O
Issue
- SnapCenter (or any ZAPI client) sends two cg-start ZAPI requests, each with a 120-second timeout, similar to the following:
2025-03-20T12:12:37.6410145-07:00 DEBUG SMCore PID=[7200] TID=[86] <cg-start><snapshot>snapshot1</snapshot><volumes><volume-name>vol1</volume-name></volumes><user-timeout>120</user-timeout></cg-start>
2025-03-20T12:12:38.6009943-07:00 DEBUG SMCore PID=[7200] TID=[86] <cg-start><snapshot>snapshot1</snapshot><volumes><volume-name>vol1</volume-name><volume-name>vol2</volume-name></volumes><user-timeout>120</user-timeout></cg-start>
- The snapshot timeout (user-timeout) in each request is 120 seconds.
- Both requests use the same snapshot name (snapshot1).
- vol1 appears in both requests.
- High latency, in the hundreds of milliseconds or even seconds, is seen on a volume.
- Power-on resets are seen on a Linux host.
- A disk group, volume group, or other striped set of LUNs spans multiple FlexVols.
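The request pattern described above can be checked for in a saved SMCore debug log with a short script. The following is a minimal sketch, not a NetApp tool: it assumes each cg-start request is logged on a single line in the same form as the two sample lines above, and the names parse_cg_start, find_overlaps, and SAMPLE_LINES are hypothetical. It reports any volume named in more than one cg-start request for the same snapshot name.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: the whole <cg-start> element sits on one log line,
# as in the two sample lines shown in this article.
CG_START = re.compile(r"<cg-start>.*?</cg-start>")

# The two sample requests from this article, used as test input.
SAMPLE_LINES = [
    "2025-03-20T12:12:37.6410145-07:00 DEBUG SMCore PID=[7200] TID=[86] "
    "<cg-start><snapshot>snapshot1</snapshot><volumes>"
    "<volume-name>vol1</volume-name></volumes>"
    "<user-timeout>120</user-timeout></cg-start>",
    "2025-03-20T12:12:38.6009943-07:00 DEBUG SMCore PID=[7200] TID=[86] "
    "<cg-start><snapshot>snapshot1</snapshot><volumes>"
    "<volume-name>vol1</volume-name><volume-name>vol2</volume-name></volumes>"
    "<user-timeout>120</user-timeout></cg-start>",
]


def parse_cg_start(line):
    """Return (snapshot, volumes, timeout) for a cg-start log line, else None."""
    match = CG_START.search(line)
    if match is None:
        return None
    root = ET.fromstring(match.group(0))
    snapshot = root.findtext("snapshot")
    volumes = [node.text for node in root.iter("volume-name")]
    timeout = root.findtext("user-timeout")
    return snapshot, volumes, int(timeout) if timeout else None


def find_overlaps(lines):
    """Map each snapshot name to the volumes named in more than one request."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        parsed = parse_cg_start(line)
        if parsed is None:
            continue  # not a cg-start request
        snapshot, volumes, _timeout = parsed
        for volume in set(volumes):  # count each volume once per request
            counts[snapshot][volume] += 1
    overlaps = {}
    for snapshot, per_volume in counts.items():
        repeated = sorted(v for v, n in per_volume.items() if n > 1)
        if repeated:
            overlaps[snapshot] = repeated
    return overlaps


if __name__ == "__main__":
    for snapshot, volumes in find_overlaps(SAMPLE_LINES).items():
        print(f"{snapshot}: volumes in multiple cg-start requests: {volumes}")
```

To run it against a real log, pass the lines of the file to find_overlaps instead of SAMPLE_LINES. A non-empty result only indicates that the request pattern matches; it does not by itself confirm this issue.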