ONTAP Tools for VMware vSphere: Large number of ONTAP quotas causes OTV discoveries to fail
Applies to
- ONTAP Tools for VMware vSphere (OTV)
- ONTAP 9
Issue
A large number of quotas in ONTAP causes OTV discovery to time out.
Errors similar to the following appear in OTV's vsc.log:
finished loading : [svms]
[2024-04-10T15:40:49,727Z] [Controllercluster1] [ INFO] [192.168.0.36 - cluster1] Updated the task progress to 96%
[2024-04-10T15:40:49,733Z] [Controllercluster1] [ INFO] [192.168.0.36 - cluster1] Updated the task progress to 96%
[2024-04-10T15:40:49,740Z] [Controllercluster1] [ INFO] [192.168.0.36 - cluster1] Updated the task progress to 100%
[2024-04-10T15:40:49,746Z] [Controllercluster1] [DEBUG] [192.168.0.36 - cluster1] Successfully set the task to success
[2024-04-10T15:40:49,749Z] [Controllercluster1] [ INFO] [192.168.0.36 - cluster1] Updated the task progress to 100%
[2024-04-10T15:40:49,755Z] [Controllercluster1] [DEBUG] [192.168.0.36 - cluster1] Successfully set the task to success
[2024-04-10T15:40:49,756Z] [Controllercluster1] [DEBUG] CachingControllerProxy.create exit.
[2024-04-10T15:40:49,756Z] [Controllercluster1] [DEBUG] addConnection: add controller end
[2024-04-10T15:40:49,756Z] [Controllercluster1] [ERROR] addConnection: timedout - Lock duration and processing addController went beyond 45000
[2024-04-10T15:40:49,756Z] [Controllercluster1] [ERROR] discoverWithAeonFlux: discovery failed with exception.
java.util.concurrent.TimeoutException: Excesive lock duration and processing addController: 190689
    at com.netapp.offtap3.factories.ControllerFactory.addConnection(ControllerFactory.java:168) ~[aeonflux.jar:?]
    at com.netapp.vscv.server.controller.VscControllers.addController(VscControllers.java:490) ~[classes/:?]
    at com.netapp.vscv.server.controller.VscControllers.discoverController(VscControllers.java:310) ~[classes/:?]
    at com.netapp.vscv.server.ControllerDiscoverer.discover(ControllerDiscoverer.java:185) [classes/:?]
    at com.netapp.vscv.server.ControllerDiscoverer.call(ControllerDiscoverer.java:154) [classes/:?]
    at com.netapp.vscv.server.ControllerDiscoverer.call(ControllerDiscoverer.java:40) [classes/:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:835) [?:?]
[2024-04-10T15:40:49,757Z] [Controllercluster1] [DEBUG] Discovery finished for controller cluster1
[2024-04-10T15:40:49,761Z] [controllerPersistencePool-1-thread-1] [DEBUG] Did not refresh the following resources on object "AbstractController: id: b8170700-c0a0-11e7-8475-00a098afcdc1name: cluster1_vidsecip address: 192.168.0.36" because they do not appear to be loaded: [svm_peers]
[2024-04-10T15:40:49,761Z] [controllerPersistencePool-1-thread-1] [DEBUG] AbstractController: id: b8170700-c0a0-11e7-8475-00a098afcdc1name: cluster1_vidsecip address: 192.168.0.36 is going to start loading: [svm_peers]
The timeout occurs because cluster discovery takes approximately two and a half minutes, whereas the default timeout (i.e., timeout.controller.connection) is 45 seconds.
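
To check whether a given vsc.log shows this signature, the two ERROR messages can be matched and their numeric values compared. The following is a minimal sketch, not an OTV tool: the function name find_discovery_timeouts and the sample string are illustrative assumptions, and the patterns are taken only from the log excerpt above (including the product's own spelling "Excesive"). The log does not state units, so the values are reported as logged.

```python
import re

# Patterns copied from the two ERROR messages in the vsc.log excerpt.
# "Excesive" is spelled as it appears in the log.
LIMIT_RE = re.compile(
    r"addConnection: timedout - Lock duration and processing "
    r"addController went beyond (\d+)"
)
ELAPSED_RE = re.compile(
    r"TimeoutException: Excesive lock duration and processing "
    r"addController: (\d+)"
)


def find_discovery_timeouts(log_text):
    """Return (limit, elapsed) pairs found in log_text.

    Assumption: each "went beyond" message is followed by its matching
    TimeoutException message, so the two lists are paired in order.
    """
    limits = [int(m.group(1)) for m in LIMIT_RE.finditer(log_text)]
    elapsed = [int(m.group(1)) for m in ELAPSED_RE.finditer(log_text)]
    return list(zip(limits, elapsed))


# Sample built from the excerpt above, with the stack trace left out.
sample = (
    "[2024-04-10T15:40:49,756Z] [Controllercluster1] [ERROR] addConnection: "
    "timedout - Lock duration and processing addController went beyond 45000\n"
    "[2024-04-10T15:40:49,756Z] [Controllercluster1] [ERROR] "
    "discoverWithAeonFlux: discovery failed with exception. "
    "java.util.concurrent.TimeoutException: Excesive lock duration and "
    "processing addController: 190689\n"
)

for limit, elapsed in find_discovery_timeouts(sample):
    print(f"limit={limit} elapsed={elapsed} over_limit={elapsed > limit}")
# prints: limit=45000 elapsed=190689 over_limit=True
```

In practice the sample string would be replaced by the contents of the real vsc.log, read with open(...).read(); the path to that file depends on the OTV deployment and is not assumed here.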