CONTAP-82578: Slow DCs can cause CIFS and NFS authentication failures or high latency
Issue
- Slow DCs can cause CIFS and NFS authentication failures or high latency
- secd logs:
[kern_secd:info:10667] [SECD MASTER THREAD] SecD RPC Server: Too many outstanding Generic RPC requests: sending System Error to RPC 151
[kern_secd:info:10667] | [xxx.xxx.xxx] CRIT : _maxConnectionSem held for 405 seconds for cache ref ID xxxxxxxxx, conn type LDAP (Active Directory) { in semaphoreReleased() at src/include/secd_thread_data_manager.h:219 }
- In combination with the _maxConnectionSem entry in the secd logs, the following symptoms may also be observed:
- CIFS shares are inaccessible on one or more nodes
- EMS logs:
[node1: secd: secd.cifsAuth.problem:error]: vserver (SVM1) General CIFS authentication problem. Error: User authentication procedure failed
CIFS SMB2 Share mapping - Client Ip = 1.2.3.4
[ 0 ms] Login attempt by domain user 'DOMAIN\user1' using NTLMv2 style security
[ 0] Unable to connect to NetLogon service on dc2.domain.com (Error: RESULT_ERROR_SPINCLIENT_SOCKET_CONNECT_ERROR)
[ 0] Unable to connect to NetLogon service on dc1.domain.com (Error: RESULT_ERROR_SPINCLIENT_SOCKET_CONNECT_ERROR)
[ 1] No servers available for MS_NETLOGON, vserver: 8, domain: domain.com.
**[ 1] FAILURE: Unable to make a connection (NetLogon:domain.com), Result: RESULT_ERROR_SECD_NO_SERVER_AVAILABLE
[ 2] CIFS authentication failed
- secd logs:
[kern_secd:info:10331] [ 0] Unable to connect to NetLogon service on dc1.domain.com (Error: RESULT_ERROR_SPINCLIENT_SOCKET_CONNECT_ERROR)
- secd may run out of memory
- EMS logs:
ERROR Nblade.CifsOperationTimedOut: Detected a timed out CIFS operation. SMB command for this operation: SMB2_COM_SESSION_SETUP, Number of times this command was suspended: 1, Number of times this command was restarted: 0, Last CSM error during this operation: CSM_OK, Remote blade UUID: 00000000-0000-0000-0000-000000000000, Is QoS enabled: QoS_disabled, Last nBlade error during this operation: SPINNP_NO_FO_ERROR, Client IP address: 10.11.12.188, Local IP address: 10.11.12.111, Target Vserver ID: 4, Target disk's DSID: 0, Target Vserver Name: svm1
[node1: secd: secd.rpc.server.request.dropped:debug]: The RPC secd_rpc_auth_extended sent from NBLADE_CIFS was dropped by SecD due to memory pressure.
- secd logs:
[kern_secd:info:14025] | [000.000.007] debug: [SECD SERVER THREAD] SecD RPC Server received RPC from NBLADE_NFS. RPC 206: secd_rpc_auth_user_id_to_name { in secd_prog_1() at src/server/secd_rpc_server.cpp:1649 }
[kern_secd:info:14025] | [000.000.017] ERR : RESULT_ERROR_GENERAL_OUT_OF_MEMORY:15 in pushRpcTask() at src/server/secd_rpc_server.cpp:1473
[kern_secd:info:14025] | [000.000.021] ERR : RESULT_ERROR_GENERAL_OUT_OF_MEMORY:15 in secd_prog_1() at src/server/secd_rpc_server.cpp:1680
[kern_secd:info:10039] [SECD MASTER THREAD] SecD RPC Server: Too many outstanding Generic RPC requests: sending System Error to RPC 151:secd_rpc_auth_extended Request ID:33830.
- High CIFS latencies may be observed
- Clients observe high latency only when their traffic passes through a specific node
- CIFS session_setup_latency peaks above 50 seconds
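
When reviewing a collected secd log, the _maxConnectionSem entries described above can be picked out with a short script. The following is a minimal sketch, not a NetApp tool. The regular expression is modeled only on the sample log line in this article and may need adjusting for other ONTAP releases. The function name `find_long_sem_holds` and the 60-second default threshold are hypothetical choices made for illustration.

```python
import re

# Pattern modeled on the sample secd log line in this article (assumption:
# other releases may format the entry differently).
SEM_HELD = re.compile(
    r"_maxConnectionSem held for (?P<seconds>\d+) seconds"
    r".*?conn type (?P<conn_type>.+?)\s*\{"
)


def find_long_sem_holds(log_text, threshold_seconds=60):
    """Return (seconds, conn_type) for each _maxConnectionSem entry that was
    held for at least threshold_seconds (hypothetical default of 60)."""
    hits = []
    for match in SEM_HELD.finditer(log_text):
        seconds = int(match.group("seconds"))
        if seconds >= threshold_seconds:
            hits.append((seconds, match.group("conn_type")))
    return hits


if __name__ == "__main__":
    sample = (
        "[kern_secd:info:10667] | [xxx.xxx.xxx] CRIT : _maxConnectionSem held "
        "for 405 seconds for cache ref ID xxxxxxxxx, conn type LDAP (Active "
        "Directory) { in semaphoreReleased() at "
        "src/include/secd_thread_data_manager.h:219 }"
    )
    for seconds, conn_type in find_long_sem_holds(sample):
        print(f"{conn_type}: semaphore held for {seconds} seconds")
```

Long hold times on a domain controller connection type, such as LDAP (Active Directory) in the sample, are consistent with the slow DC condition this article describes.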