NetApp Knowledge Base

How to troubleshoot pBlk exhaustion due to vscan server on Data ONTAP 8 7-Mode

Applies to

  • Data ONTAP 8 operating in 7-Mode
  • Data ONTAP 7

Description

  • The response time of an external Vscan server directly impacts the ability of a Storage Controller to respond to client requests.
  • In Data ONTAP 7 and Data ONTAP 8 operating in 7-Mode, Vscan servers are external to the Storage Controller.
  • Each client request consumes a pBlk, so pBlk usage rises with client load.
  • When the Vscan server opens the file to scan it, that access consumes an additional pBlk.
  • The faster the Vscan server completes the file scan, the sooner Data ONTAP can respond to the original client request and free the pBlk(s).
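
The link between scan speed and pBlk usage can be approximated with Little's law (concurrency = arrival rate × time in system). The sketch below is only an illustration under that assumption; the function name, the example rates, and the idea of treating every request as needing a scan are hypothetical and do not come from Data ONTAP.

```python
def estimated_pblks_held(requests_per_sec, scan_latency_sec):
    """Rough steady-state estimate of client pBlks held while scans are pending.

    Assumption: every client request needs a scan and holds one pBlk for the
    full scan latency (Little's law: L = lambda * W). The extra pBlk used
    while the Vscan server reads the file is ignored here.
    """
    return requests_per_sec * scan_latency_sec


# Hypothetical numbers: the same client load against a fast and a slow scanner.
fast = estimated_pblks_held(requests_per_sec=200, scan_latency_sec=0.05)
slow = estimated_pblks_held(requests_per_sec=200, scan_latency_sec=2.0)
print(f"fast scanner: ~{fast:.0f} pBlks held, slow scanner: ~{slow:.0f} pBlks held")
```

With the load held constant, the estimate grows linearly with scan latency, which is why a degraded Vscan server can push a controller toward pBlk exhaustion.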

Consider the following four aspects of pBlk exhaustion and external Vscan servers:

  • Number of Vscan servers:
    • The Storage Controller can send a maximum of 50 scan requests to a Vscan server at any given time.
      (If MultiStore is used, 100 requests can be sent to one server with two vFiler units; 50 per vFiler unit.)
    • If 100 requests arrive at the same time, a single Vscan server has to process the first 50 requests before it can start on the second 50.
    • In this scenario, the Max gOffloadQueue depth becomes 50, because the Storage Controller has to wait for some of the first block of 50 to finish before sending the requests held in the second block of 50.
    • pBlk exhaustion might not occur in this example, but the scenario shows why the AV infrastructure must perform optimally as more clients are added to a Storage Controller.
  • Speed of Vscan Servers:
    • Because the speed of external Vscan servers is critical, it is recommended to run Vscan servers on dedicated hardware rather than as virtual machines (for current information on Vscan in Data ONTAP 7.x environments, see TR-3107: Antivirus Scanning Best Practices Guide).
    • If the performance of the external Vscan server is degraded, it takes longer to respond to Storage Controller Vscan requests, so pBlks are held for longer periods of time.
    • If the Vscan server is sufficiently degraded and enough clients send requests in a short period of time, pBlk exhaustion can occur.
  • Configuration of Vscan servers:
    • Vscan vendors control the tunable options for their application.
    • The best place to start is the installation and configuration guides from the Vscan server vendor to make sure that the best practices are being met for the Vscan product.
    • Configurations that do not meet the vendor's best practices will likely result in decreased performance, putting the Storage Controller at risk of pBlk exhaustion.
  • Vscan options timeout:
    • In addition to a properly sized Vscan infrastructure, Data ONTAP has options that control the amount of time it will wait for virus scans to complete.
    • It is imperative that these values are set according to the Vscan vendor's specifications and best practices.
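
The 50-request limit described under "Number of Vscan servers" can be sketched as simple arithmetic. This is a simplified model, not Data ONTAP code: the function names are hypothetical, and it treats requests as being served in whole waves, whereas the controller is described as sending queued requests as earlier ones finish.

```python
import math

MAX_REQUESTS_PER_SERVER = 50  # per-server limit stated in this article


def split_burst(burst_size, servers, per_server_limit=MAX_REQUESTS_PER_SERVER):
    """Return (in_flight, queued) for a burst of simultaneous scan requests."""
    capacity = servers * per_server_limit
    in_flight = min(burst_size, capacity)
    queued = burst_size - in_flight  # requests waiting on the controller
    return in_flight, queued


def waves_needed(burst_size, servers, per_server_limit=MAX_REQUESTS_PER_SERVER):
    """Number of full rounds needed to scan the burst (simplified model)."""
    return math.ceil(burst_size / (servers * per_server_limit))


# The article's example: 100 simultaneous requests, one Vscan server.
print(split_burst(100, servers=1))   # 50 in flight, 50 queued
print(waves_needed(100, servers=1))  # two rounds of 50

# Adding a second Vscan server removes the queue for the same burst.
print(split_burst(100, servers=2))
```

Every queued request still holds its client pBlk while it waits, so the queued count is a rough indicator of how much extra pBlk pressure an undersized Vscan pool adds.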

Example of pBlk consumption (the assumption here is that the file needs to be scanned by the Vscan server):

This is a very high-level overview of the process. The numbers below are not real-world values and are for example purposes only.
  1. The client issues a read request for fileA.txt on the storage controller. The storage controller allocates a pBlk for the client read request.
     Total pBlks consumed = 1
  2. The storage controller issues an RPC call to the Vscan server (VSCAN01) requesting a scan of fileA.txt.
     Total pBlks consumed = 1
  3. The Vscan server, VSCAN01, makes a request over a share on the storage controller, ONTAP_ADMIN$, to retrieve the file to be scanned. To scan the file, the Vscan server has to read all or part of it.
     Total pBlks consumed = 2. Note the increase.
  4. The Vscan server, VSCAN01, finishes reading the file and completes the scan operation by sending a reply to the storage controller.
     Total pBlks consumed = 1. Note the decrease.
  5. The storage controller does internal accounting to mark fileA.txt as scanned.
     Total pBlks consumed = 1
  6. The storage controller responds to the client's read request.
     Total pBlks consumed = 0
From step 1 until step 6 completes, the client request holds a pBlk; it is not released until the virus scan of the file is complete and the read has been answered.
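
The running total in the walkthrough above can be reproduced with a small trace. This is only an illustration of the example's arithmetic; the step descriptions and the per-step deltas are taken from the walkthrough, and the names `STEPS` and `trace_pblks` are hypothetical.

```python
# Each entry: (what happens, change in pBlks consumed), per the example above.
STEPS = [
    ("client read request arrives", +1),
    ("controller sends scan RPC to VSCAN01", 0),
    ("VSCAN01 reads fileA.txt via ONTAP_ADMIN$", +1),
    ("VSCAN01 finishes reading and replies", -1),
    ("controller marks fileA.txt as scanned", 0),
    ("controller answers the client read", -1),
]


def trace_pblks(steps=STEPS):
    """Return a list of (description, running total of pBlks consumed)."""
    total = 0
    history = []
    for description, delta in steps:
        total += delta
        history.append((description, total))
    return history


for number, (description, total) in enumerate(trace_pblks(), start=1):
    print(f"step {number}: {description:45s} total pBlks = {total}")
```

The trace peaks at two pBlks for a single client read, which is why a slow scan roughly doubles the per-request pBlk cost for as long as the Vscan server is reading the file.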

 


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.