NetApp Knowledgebase

Host-side SCSI timeouts when using Microsoft iSCSI software initiator

Category:
data-ontap-8
Specialty:
san

Applies to

  • SAN 
  • FlexPod 
  • Data ONTAP 8 7-Mode 
  • Data ONTAP 7 and earlier 

Answer

ERROR_IO_DEVICE

  • This article discusses the general host-side SCSI timeout issues when using iSCSI.
  • In summary, if the storage system is heavily loaded, it is possible for the storage system to exhibit long latency while processing incoming requests.
  • Then host-side SCSI timeout handling mechanisms are triggered.
  • Occasional timeouts are not necessarily a major issue, because these requests will be re-submitted several times before the application receives IO errors. But repeated timeouts might cause application errors. Depending on the configurable SCSI timeout values, applications might receive IO errors after several minutes.
  • On Windows, this error is typically ERROR_IO_DEVICE (The request could not be performed because of an I/O device error).
  • This can be viewed in the application's log, if the application logs such messages.
  • With the Microsoft iSCSI software initiator, repeated messages similar to the following can be observed in the system event log, because of host-side SCSI timeouts:

iScsiPrt Information None 34 N/A SOUTHPOINT
A connection to the target was lost, but the Initiator successfully reconnected to the target. Dump data contains the target name.
iScsiPrt Error None 39 N/A SOUTHPOINT
The Initiator sent a task management command to reset the target. The target name is given in the dump data.
iScsiPrt Error None 9 N/A SOUTHPOINT
Target did not respond in time for a SCSI request. The CDB is given in the dump data.
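
As a rough aid for spotting repeated timeouts, the summary lines above can be tallied by event ID. This is a minimal sketch, assuming the event log has been exported as text with one summary line per event in the layout shown above (source, level, category, event ID, user, computer). The function name and the export format are assumptions for illustration, not part of any Microsoft or NetApp tool.

```python
from collections import Counter

# Event IDs and meanings taken from the sample messages in this article.
ISCSIPRT_EVENTS = {
    9: "Target did not respond in time for a SCSI request",
    34: "Connection to the target was lost, initiator reconnected",
    39: "Initiator sent a task management command to reset the target",
}


def count_iscsiprt_events(lines):
    """Count iScsiPrt events per event ID in exported summary lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Assumed layout: source, level, category, event ID, user, computer.
        if len(fields) >= 4 and fields[0] == "iScsiPrt" and fields[3].isdigit():
            counts[int(fields[3])] += 1
    return counts


sample_events = [
    "iScsiPrt Information None 34 N/A SOUTHPOINT",
    "iScsiPrt Error None 39 N/A SOUTHPOINT",
    "iScsiPrt Error None 9 N/A SOUTHPOINT",
]

event_counts = count_iscsiprt_events(sample_events)
for event_id, count in sorted(event_counts.items()):
    print(event_id, count, ISCSIPRT_EVENTS.get(event_id, "unknown event"))
```

A steadily growing count for event 9 alongside events 39 and 34 would match the timeout, reset, and reconnect pattern described in this article.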

Because of the host-side SCSI timeouts, you might also see repeated messages like the following in the storage system's /etc/messages file:

Error message: [iscsi.notice:notice]: ISCSI: iswta, LUN Reset (from initiator iqn.1991-05.com.microsoft:exch), aborting all SCSI commands on lun 0
Error message: [iscsi.notice:notice]: ISCSI: iswta, New session from initiator iqn.1991-05.com.microsoft:exch at IP addr 10.0.0.1
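
To see which initiators are triggering resets, the two message types above can be counted per initiator. The following is a small sketch that assumes the message wording matches the samples shown; the regular expressions and the function name are illustrative and may need adjusting for other Data ONTAP releases.

```python
import re
from collections import defaultdict

# Patterns derived from the two sample messages in this article.
LUN_RESET = re.compile(r"LUN Reset \(from initiator (?P<iqn>[^)]+)\)")
NEW_SESSION = re.compile(
    r"New session from initiator (?P<iqn>\S+) at IP addr (?P<ip>\S+)"
)


def summarize_iscsi_messages(lines):
    """Return per-initiator counts of LUN resets and new sessions."""
    summary = defaultdict(lambda: {"lun_resets": 0, "new_sessions": 0})
    for line in lines:
        reset = LUN_RESET.search(line)
        if reset:
            summary[reset.group("iqn")]["lun_resets"] += 1
            continue
        session = NEW_SESSION.search(line)
        if session:
            summary[session.group("iqn")]["new_sessions"] += 1
    return dict(summary)


sample_messages = [
    "[iscsi.notice:notice]: ISCSI: iswta, LUN Reset (from initiator "
    "iqn.1991-05.com.microsoft:exch), aborting all SCSI commands on lun 0",
    "[iscsi.notice:notice]: ISCSI: iswta, New session from initiator "
    "iqn.1991-05.com.microsoft:exch at IP addr 10.0.0.1",
]

message_summary = summarize_iscsi_messages(sample_messages)
print(message_summary)
```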

The following sections describe how to mitigate the host-side SCSI timeout issue described above when using the Microsoft iSCSI initiator against NetApp storage appliances.

Note: If a Disk Removal event is observed in the Windows system event log, the host has probably lost its TCP connections to the target. That case is discussed in the KB "Parameters that control how MS iSCSI survives lost TCP connections without causing applications harm".

 

Summary

  • Increase either one of the two registry values to increase the host-side tolerance for longer latencies of SCSI requests:
    • Disk class driver's TimeOutValue
      -OR-
    • Microsoft iSCSI initiator specific SrbTimeoutDelta.
  • NetApp recommends increasing SrbTimeoutDelta. Increase the registry value to 60 (seconds) first, then monitor host and application behavior as well as storage system performance. If 60 is not sufficient for the workload, gradually increase it to 120, 180, and so on until the issue is resolved.

Configurable parameters

If the Microsoft iSCSI software initiator is being used in a Windows 2000 or Windows Server 2003 host, there are two configurable registry values that can affect the SCSI timeout handling:

  1. System-wide disk class driver TimeOutValue
    • This registry value, TimeOutValue, is under registry key:
      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
      The default value is 10 (seconds).
  2. Microsoft iSCSI initiator specific SrbTimeoutDelta
    • This registry value, SrbTimeoutDelta, is under registry key:
      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0001\Parameters
    • On Windows 2008, SrbTimeoutDelta was found under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0005\Parameters.
      The instance number will be 0001 or some other value (for example, 0002, 0003, or 0004).

The instance number might differ depending on the number of SCSI controllers in the host. The default value of SrbTimeoutDelta is 15 (seconds). With the Microsoft iSCSI initiator, SrbTimeoutDelta is added to the disk class driver's TimeOutValue when SCSI requests are built. Hence, the default effective SCSI timeout is 10 + 15 = 25 seconds.
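
The arithmetic above can be written out as a short sketch. The default values are the ones stated in this article; the function and constant names are hypothetical and only illustrate how the two registry values combine.

```python
# Defaults as stated in this article (seconds).
DEFAULT_TIMEOUT_VALUE = 10      # disk class driver TimeOutValue
DEFAULT_SRB_TIMEOUT_DELTA = 15  # Microsoft iSCSI initiator SrbTimeoutDelta


def effective_scsi_timeout(timeout_value=DEFAULT_TIMEOUT_VALUE,
                           srb_timeout_delta=DEFAULT_SRB_TIMEOUT_DELTA):
    """Return the SCSI timeout in seconds for disks behind the iSCSI initiator."""
    return timeout_value + srb_timeout_delta


print(effective_scsi_timeout())                      # defaults
print(effective_scsi_timeout(srb_timeout_delta=60))  # SrbTimeoutDelta raised to 60
```

With SrbTimeoutDelta raised to 60 and TimeOutValue left at its default, the effective timeout for iSCSI-attached disks becomes 70 seconds, while other disks keep the 10-second TimeOutValue.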

What should be modified?

  • If the timeout values need to be changed, modify either one of the above registry values. It is not necessary to modify both of them.
  • If the disk class driver's TimeOutValue is modified, it will affect all SCSI requests to the disk device in the host.
  • If you modify SrbTimeoutDelta, only those disks connected through Microsoft iSCSI initiator will be affected.
  • Choose the value to modify based on this scope. In most cases, modifying SrbTimeoutDelta is recommended, because it limits the change to disks connected through the Microsoft iSCSI initiator.
  • However, if there are multiple SCSI initiators/HBAs in the host and it is required to change the SCSI timeout handling for all of them, modify the TimeOutValue.

Note: After modifying SrbTimeoutDelta, if the Microsoft iSCSI initiator is removed from the host and installed again, its value will be reset to default.

How to modify?

  • If the Microsoft iSCSI initiator specific SrbTimeoutDelta is to be modified, perform the following steps:
    1. Open the Windows registry using regedit.
    2. Locate SrbTimeoutDelta under registry key:
      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\<instance number>\Parameters
    3. Double-click and change to the value recommended below.
  • If the disk class driver's TimeOutValue is to be modified, perform the following steps:
    1. Open the Windows registry using regedit.
    2. Add or modify TimeOutValue under registry key:
      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
      • If TimeOutValue already exists, double-click it and change it to the value recommended below.
      • If TimeOutValue does not exist, select Edit > New > DWORD Value, enter the name as TimeOutValue, then double-click it and change it to the value recommended below.

Note: Remember to reboot the host for the new value to take effect.
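
As an alternative to editing the value by hand, the SrbTimeoutDelta change can be prepared as a .reg file and reviewed before import. The sketch below only builds the file text; it does not touch the registry. The function name is hypothetical, and the instance number ("0001" here) is an assumption that must be replaced with the instance that actually holds SrbTimeoutDelta on the host.

```python
# Class key for SCSI adapters, as given in this article.
CLASS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class"
    r"\{4D36E97B-E325-11CE-BFC1-08002BE10318}"
)


def srb_timeout_reg(instance, seconds):
    """Build .reg file text that sets SrbTimeoutDelta for one adapter instance."""
    if not (len(instance) == 4 and instance.isdigit()):
        raise ValueError("instance must be four digits, for example '0001'")
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a DWORD")
    key = "\\".join([CLASS_KEY, instance, "Parameters"])
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # DWORD values are written as eight hexadecimal digits.
        f'"SrbTimeoutDelta"=dword:{seconds:08x}',
        "",
    ]
    return "\n".join(lines)


# Assumed instance number; check the registry on the actual host first.
reg_text = srb_timeout_reg("0001", 60)
print(reg_text)
```

The generated text can be saved to a file, inspected, and then imported with regedit; the reboot requirement noted above still applies.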

What value to recommend?

  • It is recommended to keep the default values unless it is clear that you are running into the SCSI timeout issue described in this article and in the KB "Host-side SCSI Timeouts when using iSCSI".
  • Larger values will increase the host-side tolerance for longer latencies of SCSI requests.
  • It is recommended to begin by increasing the registry value to 60 (seconds), then monitor the application and host behavior.
  • Use the other means described in the KB "Host-side SCSI Timeouts when using iSCSI" to monitor controller performance as well.
  • If 60 is not sufficient for the workload, increase it to 120, 180, and so on.
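
The stepwise increase described above can be sketched as a simple schedule. The starting value and the 60-second step come from this article; the ceiling of 180 is only an assumption for illustration, since the article does not state an upper limit.

```python
def timeout_schedule(start=60, step=60, ceiling=180):
    """Return the sequence of timeout values (seconds) to try, in order."""
    if start <= 0 or step <= 0 or ceiling < start:
        raise ValueError("expected 0 < start <= ceiling and step > 0")
    return list(range(start, ceiling + 1, step))


# Try each value in turn, monitoring the host, application, and controller
# after every change, and stop at the first value that resolves the issue.
for seconds in timeout_schedule():
    print(f"Set the registry value to {seconds}, reboot, then monitor")
```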

Additional Information

N/A