CFDISK-1350: X359_S165330TATE drives failing after a power cycle on NA50
Issue
- On a newly configured system with no user data saved, multiple disks fail at random after the system is powered off and back on.
EMS
[hamsg: disk.partner.diskFail:debug]: The partner with sysid xxxxxxxxx has failed 0d.11.15P1.
[hamsg: disk.partner.diskFail:debug]: The partner with sysid xxxxxxxxx has failed 0d.11.15P2.
[disk_server_0: scsi.debug:debug]: shm_setup_for_failure disk 0d.11.15 (S/N xxxxxxxxx) error 40000000h
[config_thread: raid.disk.offline:notice]: Marking Disk /aggr1/plex0/rg0/0d.11.15P3 Shelf 11 Bay 15 [NETAPP X359_S165330TATE NA50] S/N [xxxxxxxxx] UID [60025380:0470D710:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000] offline.
[disk_server_0: disk.fail.ssdstats:info]: Disk 0d.11.15 (xxxxxxxxx) failed with rated life used 0 %, percent spare blocks 0 %, spare blocks N/A.
- The node may panic due to the multi-disk failure:
[config_thread: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. aggr aggr_aggr2: raid volfsm, fatal multi-disk error.. Raid type - raid_dp Group name plex0/rg0 state DOUBLERECONS. 1 disk failed in the group. Disk 0d.11.14P1 Shelf 11 Bay 14 [NETAPP X359_S165330TATE NA50] S/N [XXXXXXXXXXXX] UID [60025380:0470DB90:500A0981:00000001:00000000:00000000:00000000:00000000:00000000:00000000] error: adapter error prevents command from being sent to device. Raid type - raid_dp Group name plex0/rg1 state DOUBLERECONS. 1 disk failed in the group. Disk /aggr_aggr2/plex0/rg1/0d.11.14P2 Shelf 11 Bay 14 [NETAPP X359_S165330TATE NA50] S/N [XXXXXXXXXXXX] UID [60025380:0470DB90:500A0981:00000002:00000000:00000000:00000000:00000000:00000000:00000000] error: adapter error prevents command from being sent to device..
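When triaging a saved EMS log for this symptom, it can help to group events by the disk they mention. The following is a minimal Python sketch, not a NetApp tool: the function name `events_by_disk` and both regular expressions are assumptions, based only on the line format of the excerpts above (`[source: event.name:severity]: message`, with disk names such as `0d.11.15` and an optional partition suffix such as `P1`).

```python
import re
from collections import defaultdict

# Assumed disk-name shape, taken from the excerpts above: 0d.11.15,
# optionally followed by a partition suffix such as P1.
DISK_RE = re.compile(r"\b(\d+[a-z]\.\d+\.\d+)(P\d+)?\b")

# Assumed EMS prefix shape: "[source: event.name:severity]:".
EVENT_RE = re.compile(r"^\[[^:\]]+:\s*([\w.]+):\w+\]:")


def events_by_disk(lines):
    """Map each base disk name to the set of EMS event names mentioning it."""
    hits = defaultdict(set)
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # not an EMS line in the assumed format
        event = match.group(1)
        # Partition suffixes (P1, P2, ...) are folded into the base disk name.
        for disk, _partition in DISK_RE.findall(line):
            hits[disk].add(event)
    return dict(hits)


if __name__ == "__main__":
    sample = [
        "[hamsg: disk.partner.diskFail:debug]: The partner with sysid xxxxxxxxx has failed 0d.11.15P1.",
        "[disk_server_0: scsi.debug:debug]: shm_setup_for_failure disk 0d.11.15 (S/N xxxxxxxxx) error 40000000h",
        "[disk_server_0: disk.fail.ssdstats:info]: Disk 0d.11.15 (xxxxxxxxx) failed with rated life used 0 %, percent spare blocks 0 %, spare blocks N/A.",
    ]
    for disk, events in sorted(events_by_disk(sample).items()):
        print(disk, sorted(events))
```

Several distinct disks in the same shelf reporting `disk.partner.diskFail` or `raid.disk.offline` shortly after a power cycle would be consistent with the symptom described here; the script only groups log lines and does not diagnose the cause.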