Duplicate aggregates are displayed after ONTAP upgrade
Applies to
- FAS2750
- Automated nondisruptive upgrade (ANDU)
- Background disk firmware update (BDFU)
- Upgrade from ONTAP 9.7P15 to 9.10.1P17 via 9.8P21
Issue
- During an ONTAP upgrade from 9.8P21 to 9.10.1P17, a disk (0b.00.11) went offline and was marked as missing.
- The disk firmware update caused the disk to be taken offline.
- The aggregate aggr01 was degraded and was missing disks.
node04 EMS log:
[?] Thu Dec 12 21:17:46 +0900 [node04: cf_giveback: ha.giveback.sysCommit:info]: Subsystem qos_ll_sfo_giveback took 151 msecs to commit giveback of aggregate 'aggr01'.
[?] Thu Dec 12 21:17:46 +0900 [node04: config_thread: raid.disk.assign.offline_ref:debug]: aggregate /aggr01/plex0/rg0/0b.00.5 assigned as an offline reference storage for /aggr01/plex0/rg0/0b.00.11.
[?] Thu Dec 12 21:17:46 +0900 [node04: config_thread: raid.disk.assign.offline_ref:debug]: aggregate /aggr01/plex0/rg0/0a.01.3 assigned as an offline reference storage for /aggr01/plex0/rg0/0b.00.11.
[?] Thu Dec 12 21:17:46 +0900 [node04: config_thread: raid.rg.degraded:notice]: : Raid group /aggr01/plex0/rg0 is degraded
[?] Thu Dec 12 21:17:46 +0900 [node04: config_thread: raid.disk.offline:notice]: Marking Disk /aggr01/plex0/rg0/0b.00.11 Shelf 0 Bay 11 [NETAPP X343_SSKBE1T8A10 NA02] S/N [WXXXXXXN] UID [5000C500:DE81263B:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000] offline.
[?] Thu Dec 12 21:17:46 +0900 [node04: bg_disk_fw_update_admin: bdfu.selected:info]: Disk 0b.00.11 [NETAPP X343_SSKBE1T8A10 NA02] S/N [WXXXXXXN] selected for background disk firmware update.
[?] Thu Dec 12 21:17:46 +0900 [node04: config_thread: raid.disk.online:notice]: Onlining Disk /aggr01/plex0/rg0/0b.00.11 Shelf 0 Bay 11 [NETAPP X343_SSKBE1T8A10 NA02] S/N [WXXXXXXN] UID [5000C500:DE81263B:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]
- After giveback, the missing disk is reconstructed using spare disk 0b.00.23.
node03 EMS log:
[?] Thu Dec 12 21:17:47 +0900 [node03: config_thread: raid.rg.recons.missing:notice]: RAID group /aggr01/plex0/rg0 is missing 1 disk(s).
[?] Thu Dec 12 21:17:47 +0900 [node03: config_thread: raid.rg.recons.info:notice]: Spare disk 0b.00.23 will be used to reconstruct one missing disk in RAID group /aggr01/plex0/rg0.
[?] Thu Dec 12 21:17:47 +0900 [node03: config_thread: raid.rg.recons.start:notice]: Disk /aggr01/plex0/rg0/0b.00.23 Shelf 0 Bay 23 [NETAPP X343_SSKBE1T8A10 NA02] S/N [WXXXXXXG] UID [5000C500:DE8204D7:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]: starting reconstruction, using disk 0b.00.23, disk block 5248.
[?] Thu Dec 12 21:17:47 +0900 [node03: config_thread: raid.vol.undestroy.info.missing:info]: params: {'disk_info': 'Disk /aggr01/plex0/rg0/0b.00.23 Shelf 0 Bay 23 [NETAPP X343_SSKBE1T8A10 NA02] S/N [WXXXXXXG] UID [5000C500:DE8204D7:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]', 'shelf': '0', 'bay': '23', 'vendor': 'NETAPP ', 'model': 'X343_SSKBE1T8A10', 'firmware_revision': 'NA02', 'serialno': 'WXXXXXXG', 'disk_type': '4', 'disk_rpm': '10000', 'carrier': '', 'site': 'Local'}
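The EMS lines above share a common shape: a timestamp, then a bracketed `[node: process: event:severity]` header, then the message. As a rough aid for tracing one disk through such logs, the following Python sketch splits that header and lists the events that mention a given disk. The regular expression is inferred only from the lines shown in this article, and the names `parse_ems_line` and `events_mentioning` are hypothetical helpers, not part of any ONTAP tooling.

```python
import re

# Pattern inferred from the EMS lines quoted in this article (an assumption,
# not an official EMS grammar): [node: process: event:severity]: message
EMS_HEADER = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[^:\]]+):(?P<severity>\w+)\]: (?P<message>.*)"
)


def parse_ems_line(line):
    """Return a dict of header fields for one EMS line, or None if it does not match."""
    match = EMS_HEADER.search(line)
    return match.groupdict() if match else None


def events_mentioning(lines, disk):
    """List (node, event, severity) for every EMS line whose message names the disk."""
    hits = []
    for line in lines:
        fields = parse_ems_line(line)
        if fields and disk in fields["message"]:
            hits.append((fields["node"], fields["event"], fields["severity"]))
    return hits
```

Running `events_mentioning` over the node04 log with `"0b.00.11"` would list the offline, firmware-update, and online events for that disk in the order they were logged.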
- After another failed disk was replaced, the failover status of node04 changed to partial giveback.
::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node03         node04         true     Connected to node04
node04         node03         true     Connected to node03, Partial giveback
2 entries were displayed.
- aggr01 is displayed on both HA nodes. On node04, only the missing disk is shown, while the other disks are marked as FAILED.
node04 sysconfig -r:
Aggregate aggr01 (failed, raid_dp, partial, fast zeroed) (block checksums)
  Plex /aggr01/plex0 (offline, failed, inactive)
    RAID group /aggr01/plex0/rg0 (partial, block checksums)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 1713523/ -
parity FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data 0b.00.11 0b 0 11 SA:B 0 SAS 10000 1713523/3509295616 1716957/3516328368 (fast zeroed)
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
data FAILED N/A 1713523/ -
Raid group is missing 18 disks.
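
To cross-check a long listing like the one above against its "missing disks" summary, the rows can be tallied with a short script. This is a minimal Python sketch that assumes the `sysconfig -r` output has been saved as text and that each RAID disk row starts with `dparity`, `parity`, or `data` followed by either `FAILED` or a device name, as in the listing shown here. The function name `tally_raid_rows` is hypothetical.

```python
# Row labels taken from the sysconfig -r listing in this article.
RAID_ROLES = ("dparity", "parity", "data")


def tally_raid_rows(sysconfig_text):
    """Return (failed_count, present_devices) for the RAID disk rows in the text."""
    failed = 0
    present = []
    for line in sysconfig_text.splitlines():
        fields = line.split()
        # Skip headers, separators, and summary lines.
        if len(fields) < 2 or fields[0] not in RAID_ROLES:
            continue
        if fields[1] == "FAILED":
            failed += 1
        else:
            present.append(fields[1])
    return failed, present
```

Applied to the node04 listing above, the tally should be 18 FAILED rows with 0b.00.11 as the only device present, which agrees with the "Raid group is missing 18 disks" line.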