NDMPCopy fails with "job aborted" or "Write to socket failed"
Applies to
- ONTAP 9
- NDMPCopy
- FSx
Issue
- The dump side (source) of the NDMPCopy failed with "Write to socket failed."
- The restore side (destination) of the NDMPCopy failed with "received CLOSE command from [10.36.42.229].x"
- In NDMPCopy, the DMA (data management application) is the node-mgmt LIF of the node on which the command is executed; in this case, 10.36.42.229.
- Simultaneous packet traces were collected on both systems, on the interfaces hosting the IP addresses used by NDMPCopy
- The traces indicated that a device in the network blocks the restore traffic after the restore has started: the source of the copy retransmits multiple times
- The abort from the DMA (10.36.42.229) is a response to the source of the copy resetting the connection after those retransmissions
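The trace comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming the data segments of the copy's TCP flow have already been exported from each capture as (sequence number, payload length) pairs. The function names and the sample values are hypothetical and are not taken from the actual traces.

```python
from collections import Counter

def retransmitted_seqs(segments):
    """Return {seq: times_sent} for data segments that appear more than once."""
    counts = Counter(seq for seq, length in segments if length > 0)
    return {seq: n for seq, n in counts.items() if n > 1}

def never_delivered(sent, received):
    """Return sequence numbers sent by the source but absent at the destination."""
    received_seqs = {seq for seq, length in received if length > 0}
    sent_seqs = {seq for seq, length in sent if length > 0}
    return sorted(sent_seqs - received_seqs)

# Made-up sample data: (tcp_sequence_number, payload_length) per data segment.
source_capture = [(1000, 1448), (2448, 1448), (3896, 1448), (3896, 1448), (3896, 1448)]
dest_capture = [(1000, 1448), (2448, 1448)]

retrans = retransmitted_seqs(source_capture)
lost = never_delivered(source_capture, dest_capture)
print("retransmitted:", retrans)
print("never seen at destination:", lost)
```

Segments that the source retransmits and that never show up in the destination capture point to a device between the two systems dropping the traffic, rather than to either endpoint.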
Backup log excerpt:
rst Tue Jan 21 10:07:31 EST 2025 ndmpcopy:/src_svm/src_vol//dst_svm/dst_vol10.47.65.12610.36.43.112 Error (job aborted)
NDMP log excerpt:
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: DMA>>S V4 sequence=9 (0x9)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: Time_stamp=0x678fb833 (Jan 21 10:07:31 2025)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: message type=0 (NDMP4_MESSAGE_REQUEST)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: message_code=0x902 (NDMP4_CONNECT_CLOSE)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: reply_sequence=0 (0x0)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: error_code=0 (NDMP4_NO_ERR)
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: service_terminate called
Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: received CLOSE command from [10.36.42.229].x
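
When reviewing a longer NDMP debug log, the fields that matter here (the message code, the hex Time_stamp, and the address that sent the CLOSE) can be pulled out with a short script. The following is a minimal sketch, not a NetApp tool. The sample lines are copied from the excerpt above, while the function names and the fixed -05:00 offset are assumptions made for this example.

```python
import re
from datetime import datetime, timedelta, timezone

# Three entries copied from the NDMP log excerpt in this article.
LOG_LINES = [
    "Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: Time_stamp=0x678fb833 (Jan 21 10:07:31 2025)",
    "Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: message_code=0x902 (NDMP4_CONNECT_CLOSE)",
    "Tue Jan 21 2025 10:07:31 -05:00 [kern_ndmpd:info:7757] [19259] DEBUG: received CLOSE command from [10.36.42.229].x",
]

def decode_timestamp(hex_value, utc_offset_hours=-5):
    """Convert the hex Time_stamp (seconds since the Unix epoch) to a datetime."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(int(hex_value, 16), tz=tz)

def summarize(lines):
    """Collect the timestamp, message codes, and CLOSE sender found in the lines."""
    summary = {"timestamp": None, "message_codes": [], "close_from": None}
    for line in lines:
        m = re.search(r"Time_stamp=(0x[0-9a-fA-F]+)", line)
        if m:
            summary["timestamp"] = decode_timestamp(m.group(1))
        m = re.search(r"message_code=(0x[0-9a-fA-F]+) \((\w+)\)", line)
        if m:
            summary["message_codes"].append((m.group(1), m.group(2)))
        m = re.search(r"received CLOSE command from \[([0-9.]+)\]", line)
        if m:
            summary["close_from"] = m.group(1)
    return summary

result = summarize(LOG_LINES)
print(result["timestamp"].isoformat())
print(result["message_codes"])
print(result["close_from"])
```

For the excerpt above, the CLOSE sender resolves to 10.36.42.229, the node-mgmt LIF acting as the DMA, which is consistent with the DMA aborting the job after the data connection was reset.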