StorageGRID slow progress on Decommissioning Replicated Data or ILM evaluation
Applies to
NetApp StorageGRID
Issue
- The decommissioning process for StorageGRID Storage Nodes experiences slow or no progress at a specific stage
- Log in to the Grid Manager.
- Navigate to Maintenance > Decommission.
- For StorageGRID versions older than 11.4, the decommissioning stage is "Evaluating ILM".
- For StorageGRID versions newer than 11.4, the decommissioning stage is "Decommissioning Replicated Data".
- The completion percentage of the decommissioning task is progressing slowly or not at all
- On the Grid Manager, navigate to Support > Grid topology > Primary Admin Node > CMN > Grid Tasks.
- Check the Stage and the "% Complete" value of the Storage Node Decommissioning task.
- The decommissioning node is waiting for the truncation of the same object data file:
- Log in to the decommissioning node via an SSH session.
- Switch to the root user:
su -
- Enable debug logging:
(echo "moduledebuglevel DCOM 1";sleep 1) | telnet 0 1402
- Monitor the decommissioning logs:
tail -f /var/local/log/bycast.log | grep DCOM
- Check whether the logs repeatedly show messages similar to the following:
Feb 24 14:48:59 <nodename> ADE: |12983731 0734583769 DCOM CSRT 2023-02-24T14:48:59.113673| INFO 0405 e3e6699e31d46b7e DCOM: Waiting for data file /var/local/rangedb/0/p/02/1F/00qLyqjT>z-sgPRdr4$h to be truncated
- Disable debug logging:
(echo "moduledebuglevel DCOM -1";sleep 1) | telnet 0 1402
- After confirming that the decommissioning node is repeatedly waiting for the same object data file to be truncated, wait for at least two scan periods, and then repeat the steps above to check the decommissioning debug logs again. The scan period can be viewed on the Dashboard in the Grid Manager.
- If, after multiple scan periods, the node is still waiting for the same object data file to be truncated, this KB applies.
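Reading the "Waiting for data file ... to be truncated" messages by eye from a live tail can be error-prone. The following is a minimal sketch, not a NetApp-provided tool, that counts how often each data file path appears in those messages. The function name count_truncation_waits is hypothetical, and the sketch assumes the log lines follow the format shown in the sample message above and that DCOM debug logging was enabled while the lines were written.

```shell
#!/bin/sh
# Hypothetical helper (not part of StorageGRID): summarize which data files
# DCOM reports it is waiting on.
# Input: one or more log files as arguments, or log text on stdin.
# Output: one line per data file path, prefixed by its occurrence count,
# with the most frequent path first.
count_truncation_waits() {
    # Keep only the path between "Waiting for data file " and
    # " to be truncated"; lines without that message are dropped.
    sed -n 's/.*Waiting for data file \(.*\) to be truncated.*/\1/p' "$@" |
        sort |
        uniq -c |
        sort -rn
}
```

One possible usage, assuming the log location used in the steps above, is count_truncation_waits /var/local/log/bycast.log. If the same path is still at the top of the output in captures taken several scan periods apart, that matches the condition described in this article.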