What do FCP Partner Path Misconfigured messages mean?

Last updated

Mar 31, 2022


Category:: data-ontap-8

Specialty:: san

Last Updated:: 3/31/2022, 1:16:37 AM

Applies to

SAN
ONTAP 9
Data ONTAP 8 7-Mode
Data ONTAP 7 and earlier

Answer

AutoSupport message: FCP PARTNER PATH MISCONFIGURED

Syslog and EMS messages
[hostname: scsitarget.partnerPath.misconfigured:error]: FCP Partner Path Misconfigured.
[hostname: scsitarget.partnerPath.misconfigured:error]: FCP Partner Path Misconfigured - Host I/O access through a non-primary and non-optimal path was detected.

Terminology

Partner Path: Any path to LUNs that utilizes the partner node to access a LUN hosted by the local node. The LUNs are in an active-active cluster.
Non-primary path: Synonymous with partner path, proxy path, and secondary path.
FCP target port: The fibre channel interface that provides FCP service to hosts.
Virtual Target Interconnect (VTIC): The virtual FCP target interface seen in the initiator group list. VTIC is used to indicate that the initiator has access to a secondary path.

Problem Description
NetApp active-active clustered storage controllers allow access to logical units (LUNs) through FCP ports on both nodes of the cluster. Under normal circumstances, hosts should only access LUNs through ports on the cluster node that hosts the LUN. I/O paths that utilize the ports of the cluster node that hosts the LUN are referred to as primary paths or optimized paths. I/O paths that utilize the partner cluster node are known as secondary paths, partner paths, or non-optimized paths. A LUN should only be accessed through the partner cluster node when the primary ports are unavailable.

I/O access to LUNs using a secondary path indicates one or both of the following conditions: the primary path(s) between host and storage controller have failed, or host MPIO software is not configured correctly. These conditions indicate that the redundancy and performance of the SAN have been compromised. Corrective action should be taken immediately to restore primary paths to the storage controllers.

In some circumstances, this error may also be triggered by non-I/O activity such as MPIO path management operations or host clustering software performing status checks to LUNs. If either of these situations is determined to be responsible for triggering the error, steps may be taken to decrease this activity or specify custom thresholds for the trigger conditions. Beginning with Data ONTAP 7.2.2, non-read and non-write operations will not trigger this error message.

FCP Partner Path Misconfigured errors may spuriously occur following storage controller boots, storage controller cluster takeover and giveback operations or host reboots. These instances of the error are normal and are usually corrected once the host's MPIO software detects the changed path status. These occurrences of the error may be ignored if they are not continuous. Data ONTAP 7.2.2 and later releases have enhancements to prevent spurious triggers of this error following storage controller boot or cluster takeover or giveback operation.

NetApp now recommends adjusting the thresholds that control the trigger for the FCP Partner Path Misconfigured message to avoid spurious and unnecessary occurrences. Please issue the following commands on both nodes of the storage controller cluster.

options lun.use_partner.cc.warn_limit 300
options lun.use_partner.cc.bytes 2457600

The first option increases the time interval from 10 seconds to 300 seconds and the second option increases the bytes transferred threshold from 512000 bytes to 2457600 bytes.
If the warning message continues after making these changes then please follow the steps outlined in the remainder of this document.
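
The effect of the two settings can be illustrated with a little arithmetic. The sketch below assumes, as an interpretation of the description above rather than documented behavior, that the message is raised when the bytes transferred over the partner path within one interval exceed the byte threshold. The names DEFAULT, RECOMMENDED, average_rate, and would_warn are illustrative only and are not part of Data ONTAP.

```python
# Threshold values quoted in this article: defaults and recommended settings.
DEFAULT = {"warn_limit_s": 10, "bytes": 512000}
RECOMMENDED = {"warn_limit_s": 300, "bytes": 2457600}


def average_rate(settings):
    """Average bytes per second needed to reach the byte threshold in one interval."""
    return settings["bytes"] / settings["warn_limit_s"]


def would_warn(partner_bytes_in_interval, settings):
    """ASSUMPTION: the warning fires when the partner-path bytes seen within
    one interval exceed the configured byte threshold."""
    return partner_bytes_in_interval > settings["bytes"]


if __name__ == "__main__":
    burst = 1_000_000  # hypothetical short burst of partner-path traffic, in bytes
    for label, settings in (("default", DEFAULT), ("recommended", RECOMMENDED)):
        print(label, average_rate(settings), would_warn(burst, settings))
```

Under this assumed model, a hypothetical 1,000,000-byte burst exceeds the default threshold but stays below the recommended one, which is consistent with the stated goal of avoiding spurious messages.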

The following diagrams illustrate I/O access through a primary path and non-primary path.

Identifying Affected LUNs and Hosts
The following steps provide a procedure that can be used to identify the I/O access that is responsible for triggering the error. The process of identifying the I/O access begins by identifying the LUNs receiving the I/O through the partner node's FCP target port. With the LUNs identified, the next step is to identify the initiators performing the I/O. Next, identify the FCP target ports on the storage controller that host the LUNs. The host should have access to at least one port on each node of the storage controller cluster. Once the host initiators and primary FCP target ports have been identified, the fabric can be inspected for correct and functioning connections and finally the hosts' multipath configurations can be inspected for correct operation.

1. Identify the LUNs being accessed through the partner node's FCP target port:
   a. lun stats -o (LUN STATISTICS)
2. Identify the host initiators that are performing the I/O through the partner path:
   a. lun config_check -A (LUN CONFIG CHECK) - output is only valid when viewed in AutoSupport
   b. lun show -v (LUN CONFIGURATION)
   c. igroup show -v (INITIATOR GROUPS)
3. Identify the primary storage controller FCP target ports available for access to the LUN:
   a. fcp show cfmode (FCP CFMODE)
   b. fcp show adapters (FCP TARGET ADAPTERS)
4. Verify the host initiator connectivity to primary FCP target ports and the host MPIO software configuration.
5. Verify use of the partner path has ceased from both cluster nodes:
   a. sysstat -b 1

Procedure and Example Data

1. Identify the LUNs being accessed through the partner node's FCP target port as well as the type of operation.

The error threshold may be triggered by either the number of kilobytes read and written or the number of non-read and non-write operations performed. Examples of non-read and non-write SCSI operations are Inquiry, Persistent Reserve, Report LUNS, and Test Unit Ready. Data ONTAP 7.2.2 and later releases will not trigger this warning for non-read and non-write operations.

The LUN STATISTICS section of the AutoSupport displays the read and write operations for both local and partner paths per LUN. This output can also be obtained using the command lun stats -o. The counters can be zeroed using the command lun stats -z, which is useful for determining how quickly the counters are increasing. The output of lun stats might show that no counters have exceeded the threshold. In this case, the LUNs that are being accessed by a partner path may reside on the partner node. Once the affected LUNs have been identified, continue to Step 2 to locate the responsible host.

Example of lun stats -o output from AutoSupport:

===== LUN STATISTICS =====
/vol/esx_luns/guest001.lun  (32 minutes, 39 seconds)
    Read (kbytes)  Write (kbytes)  Read Ops   Write Ops  Other Ops  QFulls  Partner Ops  Partner KBytes
    13510011357    12648494826     707504214  932701251  251651     0       263445977    2123914089

In the example above, both Partner Ops and Partner KBytes have exceeded the threshold in the given time interval. The hosts accessing the LUN in this way should be identified and the reason for the access evaluated. Possible solutions are to restrict access or to tune the host MPIO software so that it does not attempt access through the partner path.
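
When many LUNs are involved, the check described above can be scripted. The following is a minimal sketch that parses text laid out like the example (a LUN path line followed by a header line and one line of eight counters) and lists the LUNs with non-zero partner counters. The column layout is reconstructed from the example and real output may differ; parse_lun_stats, luns_with_partner_io, and the COLUMNS names are hypothetical helpers, not NetApp tooling.

```python
# Sample text modeled on the LUN STATISTICS example in this article.
SAMPLE = """\
/vol/esx_luns/guest001.lun  (32 minutes, 39 seconds)
    Read (kbytes)  Write (kbytes)  Read Ops   Write Ops  Other Ops  QFulls  Partner Ops  Partner KBytes
    13510011357    12648494826     707504214  932701251  251651     0       263445977    2123914089
"""

# Hypothetical short names for the eight counters, in the order shown above.
COLUMNS = ["read_kb", "write_kb", "read_ops", "write_ops",
           "other_ops", "qfulls", "partner_ops", "partner_kb"]


def parse_lun_stats(text):
    """Return {lun_path: {counter_name: value}} for each LUN in the text."""
    stats = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("/vol/"):
            current = stripped.split()[0]          # LUN path is the first token
        elif current and stripped and stripped[0].isdigit():
            values = [int(v) for v in stripped.split()]
            if len(values) == len(COLUMNS):        # ignore lines with another shape
                stats[current] = dict(zip(COLUMNS, values))
    return stats


def luns_with_partner_io(stats):
    """LUN paths whose partner counters are non-zero."""
    return sorted(path for path, s in stats.items()
                  if s["partner_ops"] > 0 or s["partner_kb"] > 0)


if __name__ == "__main__":
    print(luns_with_partner_io(parse_lun_stats(SAMPLE)))
```

Because the counters are cumulative, a non-zero value alone does not show that partner access is ongoing; comparing two captures taken after lun stats -z gives a better picture.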

2. Identify the host initiators that are performing the I/O through the partner path.

Using the LUNs identified in Step 1, locate the suspect initiators by cross-referencing the LUN configuration with the initiator group mapping. This information can be found in the AutoSupport section titled LUN CONFIGURATION and by using the command lun show -v. Once the LUN's initiator groups have been identified, the initiator WWPNs that are members of the initiator group can be found in the AutoSupport section INITIATOR GROUPS and with the command igroup show -v.

Example of lun show -v and igroup show -v from the AutoSupport:

===== LUN CONFIGURATION =====
/vol/esx_luns/guest001.lun 2.0t (2194459852800) (r/w, online, mapped)
    Comment:
    Serial#: XXXXXXXXX
    Share: none
    Space Reservation: enabled
    Multiprotocol Type: linux
    Maps: igroupA=0 igroupB=0

===== INITIATOR GROUPS =====
igroupA (FCP) (ostype: vmware):
    21:00:00:e0:8b:92:da:ef (logged in on: 0a, vtic)
igroupB (FCP) (ostype: vmware):
    21:00:00:e0:8b:82:d0:09 (logged in on: 0c, vtic)

In this example, the LUN /vol/esx_luns/guest001.lun is mapped to two initiator groups, igroupA and igroupB. Each initiator group contains one WWPN. These WWPNs belong to different ports which may be on the same host or on different hosts, depending on the deployment. The WWPNs that were identified in the initiator groups comprise a list of suspect initiators that could be accessing the LUN through a non-primary path. This list of suspect initiators will need to be checked in Step 4.
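
The cross-reference described above can be sketched in a few lines. The example below assumes the lun show -v and igroup show -v text is laid out one field per line, as reconstructed from the example in this article; real output may differ. The function names parse_lun_maps, parse_igroups, and suspect_initiators are hypothetical.

```python
import re

# Sample text modeled on the LUN CONFIGURATION and INITIATOR GROUPS example.
LUN_CONFIG = """\
/vol/esx_luns/guest001.lun 2.0t (2194459852800) (r/w, online, mapped)
    Maps: igroupA=0 igroupB=0
"""
IGROUPS = """\
igroupA (FCP) (ostype: vmware):
    21:00:00:e0:8b:92:da:ef (logged in on: 0a, vtic)
igroupB (FCP) (ostype: vmware):
    21:00:00:e0:8b:82:d0:09 (logged in on: 0c, vtic)
"""

# A WWPN is eight colon-separated hexadecimal byte pairs.
WWPN_RE = re.compile(r"^((?:[0-9a-f]{2}:){7}[0-9a-f]{2})\b", re.IGNORECASE)


def parse_lun_maps(text):
    """Return {lun_path: [igroup_name, ...]} from 'Maps:' lines."""
    maps = {}
    lun = None
    for line in text.splitlines():
        s = line.strip()
        if s.startswith("/vol/"):
            lun = s.split()[0]
            maps[lun] = []
        elif lun and s.startswith("Maps:"):
            # Each entry looks like igroupA=0 (igroup name, then LUN ID).
            maps[lun] = [m.split("=")[0] for m in s[len("Maps:"):].split()]
    return maps


def parse_igroups(text):
    """Return {igroup_name: [wwpn, ...]}."""
    groups = {}
    name = None
    for line in text.splitlines():
        s = line.strip()
        match = WWPN_RE.match(s)
        if match and name:
            groups[name].append(match.group(1).lower())
        elif s.endswith(":"):
            name = s.split()[0]                    # igroup header line
            groups[name] = []
    return groups


def suspect_initiators(lun, lun_maps, igroups):
    """WWPNs in every igroup the LUN is mapped to."""
    return [w for g in lun_maps.get(lun, []) for w in igroups.get(g, [])]


if __name__ == "__main__":
    print(suspect_initiators("/vol/esx_luns/guest001.lun",
                             parse_lun_maps(LUN_CONFIG), parse_igroups(IGROUPS)))
```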

To reduce the suspect list, Data ONTAP 7.2.2 and later releases provide a list of initiators performing I/O access using the partner path. This information can be found in the LUN CONFIG CHECK section of the AutoSupport and the output of the command lun config_check -A (Caution: The output of lun config_check -A is only valid when included in AutoSupport). In addition to listing initiator access through the partner path, lun config_check -v will check and display a variety of other possible misconfigurations. Any issues presented in the output should be addressed before proceeding.

Example of lun config_check -A from the AutoSupport:

===== LUN CONFIG CHECK =====
The following FCP Initiators are sending Read/Write i/o over the FCP Partner Paths during the last 15 seconds

WWPN                     Partner's Port  ops   bytes
21:00:00:e0:8b:25:0c:10  0c              34    17408
21:00:00:e0:8b:25:03:66  0c              186   1117696
21:00:00:e0:8b:25:0c:18  0c              1618  10866688
21:00:00:e0:8b:25:0b:b0  0c              1693  13290496

In this example, four initiators are accessing LUNs through the partner path. A small number of operations is acceptable and normal for MPIO path management. The initiators with high operation and byte counts are the ones that must be examined for MPIO configuration issues.
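
To pick out the heaviest users of the partner path, the table can be sorted by byte count. The sketch below parses rows shaped like the example (WWPN, partner port, ops, bytes) and ranks them. The cutoff ILLUSTRATIVE_BYTE_CUTOFF is an arbitrary value chosen for this example and is not a NetApp recommendation; parse_config_check and heavy_initiators are hypothetical helper names.

```python
# Sample rows modeled on the LUN CONFIG CHECK example in this article.
SAMPLE = """\
WWPN                     Partner's Port  ops   bytes
21:00:00:e0:8b:25:0c:10  0c              34    17408
21:00:00:e0:8b:25:03:66  0c              186   1117696
21:00:00:e0:8b:25:0c:18  0c              1618  10866688
21:00:00:e0:8b:25:0b:b0  0c              1693  13290496
"""

# Arbitrary cutoff for this illustration only; choose one that suits the site.
ILLUSTRATIVE_BYTE_CUTOFF = 1_000_000


def parse_config_check(text):
    """Return one dict per initiator row; header and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has four fields and starts with an 8-byte WWPN.
        if len(parts) == 4 and parts[0].count(":") == 7:
            rows.append({"wwpn": parts[0], "port": parts[1],
                         "ops": int(parts[2]), "bytes": int(parts[3])})
    return rows


def heavy_initiators(rows, cutoff=ILLUSTRATIVE_BYTE_CUTOFF):
    """WWPNs at or above the cutoff, heaviest first."""
    ranked = sorted(rows, key=lambda r: r["bytes"], reverse=True)
    return [r["wwpn"] for r in ranked if r["bytes"] >= cutoff]


if __name__ == "__main__":
    for wwpn in heavy_initiators(parse_config_check(SAMPLE)):
        print(wwpn)
```

With the sample data and this cutoff, the initiator that sent 17408 bytes is left out of the list, while the other three are reported in descending order of bytes.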

3. Identify the primary storage controller FC target ports available for access to the LUN.

In the active-active storage controller configuration, primary versus secondary ports will depend on the cluster failover mode (cfmode) employed. For example, the single_image cfmode allows each FC target port to be used for primary path access to LUNs hosted on the local node and partner path access for LUNs hosted on the partner node. For information about cfmode and port configurations, please see the Data ONTAP Block Access Management Guide for your version of Data ONTAP.

The cfmode can be identified using the command fcp show cfmode. It can also be found in the FCP CFMODE section of the AutoSupport.

===== FCP CFMODE =====
fcp show cfmode: single_image

The FCP target ports can be displayed using the command fcp show adapter and in the AutoSupport section FCP TARGET ADAPTERS.

This example shows the target adapters when using the cfmode single_image. When using single_image cfmode, either port may be used to access LUNs hosted on both the local and partner nodes.

> fcp show adapters
Slot:            0c
Description:     Fibre Channel Target Adapter 0c (Dual-channel, QLogic 2322 (2362) rev. 3)
Adapter Type:    Local
Status:          ONLINE
FC Nodename:     50:0a:09:80:86:17:c3:ac (500a09808617c3ac)
FC Portname:     50:0a:09:81:96:17:c3:ac (500a09819617c3ac)

Slot:            0d
Description:     Fibre Channel Target Adapter 0d (Dual-channel, QLogic 2322 (2362) rev. 3)
Adapter Type:    Local
Status:          ONLINE
FC Nodename:     50:0a:09:80:86:17:c3:ac (500a09808617c3ac)
FC Portname:     50:0a:09:82:96:17:c3:ac (500a09829617c3ac)

4. Verify the host initiator connectivity to primary FCP target ports and the host MPIO software configuration, using the suspect list generated in Step 2.

Once the primary and secondary FCP target ports have been identified, confirm that the host initiators are logged in. Connectivity of host initiators to the fabric and FCP target ports can be checked from the host. While the storage controller is capable of determining which initiators have been logged in, this information is not always current. Only the host initiators are able to provide current login status information. Tools such as Emulex's HBAnyware and QLogic's SANsurfer may be used to determine the state of the initiators. If the initiator is no longer logged into the primary FCP target port, then the fabric should be inspected for link failures or zone restrictions that would prevent connectivity to the FCP target ports.

The following articles provide detailed procedures to verify host initiator connectivity and MPIO configuration for each host operating system:
How to verify HP-UX fibre channel configurations with multipathing I/O (MPIO)
What are the recommended settings for VMware ESX/ESXi 4.x & ESX/ESXi 3.5 when connected to NetApp Storage Systems?
How to verify Windows fibre channel configurations with multipathing I/O (MPIO)
How to verify Solaris fibre channel configurations with multipathing I/O (MPIO)
How to verify Linux fibre channel configurations with multipathing I/O (MPIO)
How to verify AIX fibre channel configurations with multipathing I/O (MPIO)

5. Verify use of the partner path has ceased from both cluster nodes.

The sysstat -b 1 command can be used to monitor SAN-related summary performance counters. The 'Partner' counters report I/O operations and data transfers that use the partner path. Monitoring these counters is an effective method to verify that the source of partner path traffic has been corrected. Be sure to check for partner path access on both cluster nodes, as partner path traffic could be entering the cluster through either node.

 CPU   FCP  iSCSI  Partner  Total    FCP kB/s    iSCSI kB/s   Partner kB/s    Disk kB/s      CP   CP  Disk
                                      in   out     in   out     in    out     read  write   time   ty  util
  0%     0      0        0      0      0     0      0     0      0      0        8     24     0%    -    2%
  0%     0      0        0      0      0     0      0     0      0      0       16      8     0%    -    1%
  0%     0      1        0      1      0     0      1     0      0      0        0      0     0%    -    0%
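
The verification can be automated over a captured sample of sysstat -b output. The sketch below assumes each data line carries sixteen whitespace-separated fields in the order shown in the example, with the Partner operations in the fourth field and the Partner kB/s in and out values in the tenth and eleventh. Real output may leave the CP ty field blank or format it differently, in which case the parser would need adjusting. parse_sysstat_b and partner_path_quiet are hypothetical helper names.

```python
# Sample text modeled on the sysstat -b example in this article.
SAMPLE = """\
 CPU   FCP  iSCSI  Partner  Total    FCP kB/s    iSCSI kB/s   Partner kB/s    Disk kB/s      CP   CP  Disk
                                      in   out     in   out     in    out     read  write   time   ty  util
  0%     0      0        0      0      0     0      0     0      0      0        8     24     0%    -    2%
  0%     0      0        0      0      0     0      0     0      0      0       16      8     0%    -    1%
  0%     0      1        0      1      0     0      1     0      0      0        0      0     0%    -    0%
"""


def parse_sysstat_b(text):
    """Return the Partner counters from each data line; header lines are skipped."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        # ASSUMPTION: data lines have 16 fields and start with a CPU percentage.
        if len(parts) == 16 and parts[0].endswith("%") and parts[0][:-1].isdigit():
            samples.append({
                "partner_ops": int(parts[3]),
                "partner_in_kb": int(parts[9]),
                "partner_out_kb": int(parts[10]),
            })
    return samples


def partner_path_quiet(samples):
    """True when at least one sample exists and every Partner counter is zero."""
    return bool(samples) and all(
        s["partner_ops"] == 0 and s["partner_in_kb"] == 0 and s["partner_out_kb"] == 0
        for s in samples
    )


if __name__ == "__main__":
    print(partner_path_quiet(parse_sysstat_b(SAMPLE)))
```

A capture from one node only shows traffic arriving at that node, so the same check would need to be run against output from both cluster nodes.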
