ONTAP upgrade paused on error after self-encrypting drive failure

Category: disk-drives

Specialty: CORE

Applies to

ONTAP 9
AFF systems with NVMe self-encrypting drives (SEDs), including but not limited to:
- X4014S173315TNTE
- X4014S173A15TNTE
- X4013S17337T6NTE
- X4013WBORA7T6NTF

Issue

During an ONTAP upgrade, one or more partitioned disks fail during boot with the following errors:

[cluster-01:nse.op.failed:error]: Control failure on self-encrypting drive 0n.30; security provider: None, authority: None, during operation "tcg_tper_properties_sm".

[cluster-01:disk.init.failure.error:error]: Drive 0n.30 failed initialization due to error 5, sense code(5 2c 0 c).

[cluster-01:disk.init.failure.error:error]: Drive 0n.36 failed initialization due to error 5, sense code(5 2c 0 c).

[cluster-01:disk.init.failure.error:error]: Drive 0n.25 failed initialization due to error 5, sense code(5 2c 0 c).

[cluster-01:disk.init.failure.error:error]: Drive 0n.31 failed initialization due to error 5, sense code(5 2c 0 c).

[cluster-01:disk.init.failure.error:error]: Drive 0n.26 failed initialization due to error 5, sense code(5 2c 0 c).

[cluster-01:disk.init.failure.error:error]: Drive 0n.24 failed initialization due to error 5, sense code(5 2c 0 c).
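
When many drives are affected, it can help to list them from a saved copy of the event log. The following Python is a minimal sketch, assuming the messages appear as plain text in the form shown above. The pattern and the function name `failed_drives` are illustrative and are not part of ONTAP; real event output may be formatted differently.

```python
import re

# Pattern modeled on the sample messages in this article (an assumption;
# adjust it if your log export uses a different layout).
INIT_FAILURE = re.compile(
    r"\[(?P<node>[^:\]]+):disk\.init\.failure\.error:error\]: "
    r"Drive (?P<drive>\S+) failed initialization due to error (?P<error>\d+), "
    r"sense code\((?P<sense>[^)]*)\)"
)


def failed_drives(log_text):
    """Return sorted, de-duplicated (node, drive, sense code) tuples."""
    seen = set()
    for match in INIT_FAILURE.finditer(log_text):
        seen.add((match.group("node"), match.group("drive"), match.group("sense")))
    return sorted(seen)


sample = """
[cluster-01:nse.op.failed:error]: Control failure on self-encrypting drive 0n.30; security provider: None, authority: None, during operation "tcg_tper_properties_sm".
[cluster-01:disk.init.failure.error:error]: Drive 0n.30 failed initialization due to error 5, sense code(5 2c 0 c).
[cluster-01:disk.init.failure.error:error]: Drive 0n.24 failed initialization due to error 5, sense code(5 2c 0 c).
"""

for node, drive, sense in failed_drives(sample):
    print(f"{node}: drive {drive} failed initialization (sense code {sense})")
```

The sketch only matches disk.init.failure.error events, so the nse.op.failed line in the sample is ignored.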

This may result in a disk inventory mismatch on the partner node:

[cluster-01: svc_queue_thread: cf.disk.inventory.mismatch:error]: Status of the disk 0n.30P2 (xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx:00000000:00000000:00000000:00000000) has recently changed or the node (cluster-01) is missing the disk.

The partner can fail to give back because of missing disks:

[cluster-02: svc_queue_thread: cf.giveback.disk.check.fail:alert]: cf giveback failed: Partner is missing disks.

Giveback may also be vetoed by the fabric pools subsystem:

sfo.giveback.failed: Giveback of aggregate aggr1 failed due to destination check failed.

sfo.sendhome.subsystemAbort: The giveback operation of 'aggr1' was aborted by 'fabric pools'.

gb.netra.ca.check.failed: Giveback of aggregate 'aggr1' (uuid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) failed due to Object store is not reachable on destination preventing object store access on the destination node.
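
To check quickly which of the giveback symptoms above are present in a saved log, a simple substring scan may be enough. This is a rough sketch, assuming the event names appear verbatim in the log text. The table `GIVEBACK_EVENTS` and the function `giveback_symptoms` are hypothetical helpers, and the descriptions paraphrase the messages in this article.

```python
# Event names copied from the messages above; descriptions are paraphrases.
GIVEBACK_EVENTS = {
    "cf.giveback.disk.check.fail": "giveback failed because the partner is missing disks",
    "sfo.giveback.failed": "aggregate giveback failed the destination check",
    "sfo.sendhome.subsystemAbort": "giveback was aborted by a subsystem",
    "gb.netra.ca.check.failed": "object store not reachable on the destination node",
}


def giveback_symptoms(log_text):
    """Return the known giveback event names found in log_text, in table order."""
    return [name for name in GIVEBACK_EVENTS if name in log_text]


sample = """
sfo.giveback.failed: Giveback of aggregate aggr1 failed due to destination check failed.
sfo.sendhome.subsystemAbort: The giveback operation of 'aggr1' was aborted by 'fabric pools'.
"""

for name in giveback_symptoms(sample):
    print(f"{name}: {GIVEBACK_EVENTS[name]}")
```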

If multiple disks in the root aggregate have failed, the node may stop during boot at:

Waiting for giveback...(Press Ctrl-C to abort wait)Entering FM state:5 because mbFound:0 local in headswap:0

Entering FM state:5 because mbFound:0 local in headswap:0