H610S node reboots unexpectedly with critical medium errors on multiple drives
Applies to
- NetApp H610S
- NetApp Element software
Issue
- NetApp SolidFire Active IQ reports alert:
  - Error Code: nodeOffline
  - Details: The SolidFire Application cannot communicate with node ID <#>
- When the node starts up, kern.log shows errors:
kernel: [ 11.576397] nvme nvme1: Shutdown timeout set to 8 seconds
kernel: [ 11.596298] nvme nvme3: Shutdown timeout set to 8 seconds
kernel: [ 11.596312] nvme nvme2: Shutdown timeout set to 8 seconds
kernel: [ 11.602910] nvme nvme1: Not submitting Async Event
kernel: [ 11.615649] No UUID available providing old NGUID
kernel: [ 11.616298] nvme nvme4: Shutdown timeout set to 8 seconds
kernel: [ 11.616583] print_req_error: critical medium error, dev nvme1n1, sector 0
kernel: [ 11.616585] Buffer I/O error on dev nvme1n1, logical block 0, async page read
$ grep "print_req_error: critical medium error" kern.log
kernel: [ 11.456135] print_req_error: critical medium error, dev nvme0n1, sector 0
kernel: [ 11.468133] print_req_error: critical medium error, dev nvme0n1, sector 0
kernel: [ 11.616583] print_req_error: critical medium error, dev nvme1n1, sector 0
kernel: [ 11.619333] print_req_error: critical medium error, dev nvme0n1, sector 3750748672
kernel: [ 11.628581] print_req_error: critical medium error, dev nvme1n1, sector 0
kernel: [ 11.631331] print_req_error: critical medium error, dev nvme0n1, sector 3750748672
kernel: [ 11.637082] print_req_error: critical medium error, dev nvme3n1, sector 0
kernel: [ 11.637110] print_req_error: critical medium error, dev nvme2n1, sector 0
kernel: [ 11.649082] print_req_error: critical medium error, dev nvme3n1, sector 0
kernel: [ 11.649110] print_req_error: critical medium error, dev nvme2n1, sector 0
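To confirm that the errors span multiple drives rather than a single device, the grep output above can be tallied per device. The following is a minimal sketch only, not a NetApp tool: the function names `count_medium_errors` and `count_medium_errors_in_file` are hypothetical, and it assumes the log lines follow the exact `print_req_error: critical medium error, dev <device>, sector <n>` pattern shown in the excerpt and that `kern.log` is in the current directory, as in the grep command.

```python
import re
from collections import Counter

# Assumed line format, taken from the kern.log excerpt in this article:
# kernel: [ 11.616583] print_req_error: critical medium error, dev nvme1n1, sector 0
MEDIUM_ERROR = re.compile(
    r"print_req_error: critical medium error, "
    r"dev (?P<dev>[^,\s]+), sector (?P<sector>\d+)"
)


def count_medium_errors(lines):
    """Return a Counter mapping device name to its critical medium error count."""
    counts = Counter()
    for line in lines:
        match = MEDIUM_ERROR.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


def count_medium_errors_in_file(path="kern.log"):
    """Hypothetical convenience wrapper: tally errors from a kern.log file."""
    with open(path, errors="replace") as handle:
        return count_medium_errors(handle)


# Three lines copied from the excerpt above, used as a quick self-check.
sample = [
    "kernel: [ 11.456135] print_req_error: critical medium error, dev nvme0n1, sector 0",
    "kernel: [ 11.616583] print_req_error: critical medium error, dev nvme1n1, sector 0",
    "kernel: [ 11.628581] print_req_error: critical medium error, dev nvme1n1, sector 0",
]
for dev, count in sorted(count_medium_errors(sample).items()):
    print(f"{dev}: {count}")
```

Calling `count_medium_errors_in_file("kern.log")` on the full log returns the same kind of per-device tally. More than one device name in the result matches the multiple-drive symptom described in this article.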
