Host-side SCSI timeouts when using Microsoft iSCSI software initiator
Applies to
- SAN
- FlexPod
- Data ONTAP 8 7-Mode
- Data ONTAP 7 and earlier
Answer
- This article discusses the general host-side SCSI timeout issues when using iSCSI.
- In summary, if the storage system is heavily loaded, it can take a long time to process incoming requests.
- When that latency exceeds the host's SCSI timeout, the host-side SCSI timeout handling mechanisms are triggered.
- Occasional timeouts are not necessarily a major issue, because timed-out requests are resubmitted several times before the application receives I/O errors. Repeated timeouts, however, might cause application errors. Depending on the configured SCSI timeout values, applications might receive I/O errors only after several minutes.
- On Windows, this error is typically ERROR_IO_DEVICE (The request could not be performed because of an I/O device error).
- This can be viewed in the application's log, if the application logs such messages.
- With the Microsoft iSCSI software initiator, repeated messages similar to the following can be observed in the system event log, because of host-side SCSI timeouts:
iScsiPrt Information None 34 N/A SOUTHPOINT
A connection to the target was lost, but the Initiator successfully reconnected to the target. Dump data contains the target name.
iScsiPrt Error None 39 N/A SOUTHPOINT
The Initiator sent a task management command to reset the target. The target name is given in the dump data.
iScsiPrt Error None 9 N/A SOUTHPOINT
Target did not respond in time for a SCSI request. The CDB is given in the dump data.
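To gauge how often these events recur, the header lines of an exported event log can be tallied by event ID. The following is a minimal Python sketch, not a NetApp or Microsoft tool. It assumes the log has been exported as text with the same column layout as the excerpt above (Source, Type, Category, Event ID, User, Computer); the function name count_iscsiprt_events is hypothetical.

```python
from collections import Counter

# Event IDs and short descriptions taken from the excerpt above.
TIMEOUT_RELATED_EVENTS = {
    9: "Target did not respond in time for a SCSI request",
    34: "Connection lost, initiator reconnected",
    39: "Initiator sent a task management command to reset the target",
}


def count_iscsiprt_events(lines):
    """Count timeout-related iScsiPrt events in exported event log text.

    Assumed column layout (hypothetical export format):
    Source Type Category EventID User Computer
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip message text and lines from other event sources.
        if len(fields) < 6 or fields[0] != "iScsiPrt":
            continue
        if not fields[3].isdigit():
            continue
        event_id = int(fields[3])
        if event_id in TIMEOUT_RELATED_EVENTS:
            counts[event_id] += 1
    return counts


sample = [
    "iScsiPrt Information None 34 N/A SOUTHPOINT",
    "A connection to the target was lost, but the Initiator successfully reconnected to the target.",
    "iScsiPrt Error None 39 N/A SOUTHPOINT",
    "iScsiPrt Error None 9 N/A SOUTHPOINT",
]
counts = count_iscsiprt_events(sample)
for event_id, count in sorted(counts.items()):
    print(event_id, count, TIMEOUT_RELATED_EVENTS[event_id])
```

A steadily growing count of event 9 alongside events 39 and 34 would match the repeated-timeout pattern described in this article.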
You might also see repeated messages like the following in the storage system's /etc/messages file because of the host-side SCSI timeouts:
Error message: [iscsi.notice:notice]: ISCSI: iswta, LUN Reset (from initiator iqn.1991-05.com.microsoft:exch), aborting all SCSI commands on lun 0
Error message: [iscsi.notice:notice]: ISCSI: iswta, New session from initiator iqn.1991-05.com.microsoft:exch at IP addr 10.0.0.1
The rest of this article describes how to mitigate the host-side SCSI timeout issue described above when using the Microsoft iSCSI initiator against NetApp storage appliances.
Note: If a Disk Removal event is observed in the Windows system event log, the host has probably lost its TCP connections to the target. That case is discussed in KB Parameters that control how MS iSCSI survives lost TCP connections without causing applications harm.
Summary
- Increase either one of the following two registry values to increase the host-side tolerance for longer latencies of SCSI requests:
  - Disk class driver's TimeOutValue
  -OR-
  - Microsoft iSCSI initiator specific SrbTimeoutDelta
- NetApp recommends increasing SrbTimeoutDelta. Increase the registry value to 60 (seconds) first, then monitor your host and application behavior, as well as the storage system performance. If 60 is not enough for the workload, gradually increase it to 120, 180, and so on until the issue is resolved.
Configurable parameters
If the Microsoft iSCSI software initiator is being used on a Windows 2000 or Windows Server 2003 host, two configurable registry values can affect the SCSI timeout handling:
- System-wide disk class driver TimeOutValue
  - This registry value, TimeOutValue, is under registry key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
  - The default value is 10 (seconds).
- Microsoft iSCSI initiator specific SrbTimeoutDelta
  - This registry value, SrbTimeoutDelta, is under registry key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0001\Parameters
  - For Windows 2008, the SrbTimeoutDelta was in:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0005\Parameters
  - The number will be 0001 or some other iteration (for example, 0002, 0003, 0004). The number might differ depending on the number of SCSI controllers in the host.
  - Its default value is 15 (seconds).
With the Microsoft iSCSI initiator, SrbTimeoutDelta is added to the disk class driver's TimeOutValue when SCSI requests are built. Hence, the default SCSI timeout with the initiator is 25 (seconds).
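The arithmetic behind the 25-second figure can be written out as a small Python sketch. The function name and constants are hypothetical; the only facts used are the two defaults and the additive relationship stated above.

```python
# Defaults as stated in this article (seconds).
DEFAULT_TIME_OUT_VALUE = 10     # disk class driver TimeOutValue
DEFAULT_SRB_TIMEOUT_DELTA = 15  # Microsoft iSCSI initiator SrbTimeoutDelta


def effective_scsi_timeout(time_out_value=DEFAULT_TIME_OUT_VALUE,
                           srb_timeout_delta=DEFAULT_SRB_TIMEOUT_DELTA):
    """Return the SCSI timeout applied to disks behind the iSCSI initiator.

    Per the description above, the initiator adds SrbTimeoutDelta to
    TimeOutValue when it builds a SCSI request.
    """
    return time_out_value + srb_timeout_delta


print(effective_scsi_timeout())                      # defaults: 10 + 15
print(effective_scsi_timeout(srb_timeout_delta=60))  # delta raised to 60
```

Under this relationship, raising only SrbTimeoutDelta to 60 would give iSCSI-attached disks an effective timeout of 70 seconds while leaving other disks at the disk class driver's value.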
What should be modified?
- If the timeout values need to be changed, modify either one of the above registry values. It is not necessary to modify both of them.
- If the disk class driver's TimeOutValue is modified, it will affect all SCSI requests to the disk devices in the host.
- If you modify SrbTimeoutDelta, only those disks connected through the Microsoft iSCSI initiator will be affected.
- Choose the value to modify based on the requirement. It is recommended to modify SrbTimeoutDelta, so that only those disks connected through the Microsoft iSCSI initiator will be affected.
- However, if there are multiple SCSI initiators/HBAs in the host and it is required to change the SCSI timeout handling for all of them, modify TimeOutValue.
Note: After modifying SrbTimeoutDelta, if the Microsoft iSCSI initiator is removed from the host and installed again, its value will be reset to default.
How to modify?
- If the Microsoft iSCSI initiator specific SrbTimeoutDelta is to be modified, perform the following steps:
  - Open the Windows registry using regedit.
  - Locate SrbTimeoutDelta under registry key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\<Instance Number>\Parameters
    where <Instance Number> is the four-digit subkey (0001, 0002, and so on) described under Configurable parameters.
  - Double-click and change to the value recommended below.
- If the disk class driver's TimeOutValue is to be modified, perform the following steps:
  - Open the Windows registry using regedit.
  - Add or modify TimeOutValue under registry key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
  - If TimeOutValue already exists, double-click and change it to the value recommended below.
  - If TimeOutValue does not exist, select Edit > New > DWord Value, enter the name as TimeOutValue, then double-click and change it to the value recommended below.
Note: Remember to reboot the host for the new value to take effect.
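As an alternative to editing by hand in regedit, the same change can be expressed as a reg add command line. The Python sketch below only builds the command and does not run it. It assumes the registry paths are as reconstructed in this article (with backslashes), that the initiator's instance subkey (for example 0001) has already been confirmed in regedit, and that the helper names are hypothetical.

```python
# Registry locations as described in this article; HKLM is the reg.exe
# abbreviation for HKEY_LOCAL_MACHINE.
DISK_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Disk"
ISCSI_CLASS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Control\Class"
    r"\{4D36E97B-E325-11CE-BFC1-08002BE10318}"
)


def reg_add_dword(key, name, seconds):
    """Build a 'reg add' argument list that sets a DWORD value."""
    return ["reg", "add", key, "/v", name,
            "/t", "REG_DWORD", "/d", str(seconds), "/f"]


def time_out_value_command(seconds):
    """Command for the system-wide disk class driver TimeOutValue."""
    return reg_add_dword(DISK_KEY, "TimeOutValue", seconds)


def srb_timeout_delta_command(instance, seconds):
    """Command for SrbTimeoutDelta under the given instance subkey.

    'instance' is the four-digit subkey (an assumption to verify per host).
    """
    key = ISCSI_CLASS_KEY + "\\" + instance + r"\Parameters"
    return reg_add_dword(key, "SrbTimeoutDelta", seconds)


cmd = srb_timeout_delta_command("0001", 60)
print(" ".join(cmd))
# To apply on a Windows host from an elevated prompt (not done here):
#   import subprocess; subprocess.run(cmd, check=True)
```

The reboot requirement in the note above still applies after the value is written.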
What value to recommend?
- Keep the default values unless it is clear that you are running into the SCSI timeout issue described in this article and in KB Host-side SCSI Timeouts when using iSCSI.
- Larger values increase the host-side tolerance for longer latencies of SCSI requests.
- Begin by increasing the registry value to 60 (seconds), then monitor the application and host behavior.
- Also use the other means described in KB Host-side SCSI Timeouts when using iSCSI to monitor the controller performance.
- If 60 is not enough for the workload, increase it to 120, 180, and so on.
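The stepwise tuning above can be sketched as a simple loop in Python. This is an illustration only: the still_timing_out callback stands in for the manual monitoring of host, application, and controller behavior, and the ceiling parameter is a hypothetical stopping point that this article does not specify.

```python
def tune_timeout(still_timing_out, start=60, step=60, ceiling=300):
    """Return the first candidate value at which timeouts stop, else None.

    still_timing_out(value) is a hypothetical check that returns True while
    host-side SCSI timeouts are still observed with the given value.
    Candidates are 60, 120, 180, ... up to 'ceiling' (an assumed limit).
    """
    for value in range(start, ceiling + 1, step):
        if not still_timing_out(value):
            return value
    return None


# Example: pretend the workload stops timing out once the value reaches 120.
print(tune_timeout(lambda value: value < 120))
```

In practice each step involves a registry change, a reboot, and a period of observation, so the loop is a planning aid rather than something to run unattended.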
Additional Information
N/A