NetApp Knowledgebase

How to troubleshoot pBlk exhaustion due to vscan server on Data ONTAP 8 7-Mode

Applies to

  • Data ONTAP 7
  • Data ONTAP 8 operating in 7-Mode

Description

The response time of an external Vscan server directly affects how quickly a Storage Controller can respond to client requests. In Data ONTAP 7 and Data ONTAP 8 operating in 7-Mode, Vscan servers are external to the Storage Controller. pBlk usage increases because the client request consumes a pBlk and, in addition, the Vscan server consumes another pBlk when it opens the file to scan it. The sooner the Vscan server completes the scan, the sooner Data ONTAP can respond to the original client request and free the pBlk(s). There are four aspects to consider when looking at pBlk exhaustion and external Vscan servers:

  • The number of Vscan servers: The Storage Controller can send at most 50 scan requests to a Vscan server at any given time. If MultiStore is used, the limit applies per vfiler, so two vfilers can send a total of 100 requests to one Vscan server, 50 per vfiler. If 100 requests arrive at the same time, a single Vscan server has to process the first 50 before it can start on the second 50. In this scenario, the Max gOffloadQueue depth would become 50, because the Storage Controller has to wait for some of the first block of 50 to finish before it can send the requests held in the second block. pBlk exhaustion might not occur in this example, but it highlights that as more clients are added to a Storage Controller, the AV infrastructure needs to perform optimally.
  • The speed of Vscan servers: The speed of external Vscan servers is critical. For this reason, it is recommended to run Vscan servers on dedicated hardware rather than as virtual machines (for current, up-to-date information on Vscan in Data ONTAP 7.x environments, see TR-3107: Antivirus Scanning Best Practices Guide). If the performance of the external Vscan server is degraded, it takes longer to respond to Storage Controller Vscan requests, so pBlks are held for longer periods of time. If the Vscan server is sufficiently degraded and enough clients send requests in a short period of time, pBlk exhaustion can occur.
  • The configuration of Vscan servers: Vscan vendors control the tunable options for their applications. The best place to start is the installation and configuration guides from the Vscan server vendor, to make sure that the vendor's best practices for the product are being met. Configurations that do not meet the vendor's best practices are likely to result in decreased performance, putting the Storage Controller at risk of pBlk exhaustion.

  • Vscan options timeout: In addition to a properly sized Vscan infrastructure, Data ONTAP has options that control the amount of time it will wait for virus scans to complete. It is imperative that these values are set to the Vscan vendor's specification and are defined based on best practices.
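
The interaction between the 50-request limit and Vscan server speed described above can be illustrated with a small calculation. This is a rough sketch, not a model of how Data ONTAP schedules scans: the function name offload_estimate and its parameters are hypothetical, and it assumes every scan takes the same time and that requests are dispatched in complete waves. Only the limit of 50 outstanding requests per Vscan server comes from the text.

```python
import math

# Limit stated in the article: at most 50 outstanding scan requests
# per Vscan server at any given time.
MAX_OUTSTANDING_PER_SERVER = 50


def offload_estimate(pending_requests, vscan_servers, scan_seconds):
    """Estimate queueing for a burst of scan requests.

    Hypothetical simplification: every scan takes `scan_seconds`, and a
    new wave is sent only after the previous wave has fully completed.
    """
    if vscan_servers < 1:
        raise ValueError("at least one Vscan server is required")
    capacity = MAX_OUTSTANDING_PER_SERVER * vscan_servers
    in_flight = min(pending_requests, capacity)
    queued = max(0, pending_requests - capacity)
    waves = math.ceil(pending_requests / capacity)
    return {
        "in_flight": in_flight,
        "queued": queued,
        "waves": waves,
        # Longest time a client request (and its pBlk) is held waiting.
        "worst_case_wait_seconds": waves * scan_seconds,
    }


if __name__ == "__main__":
    # 100 simultaneous requests, one Vscan server, 0.5 s per scan.
    print(offload_estimate(100, 1, 0.5))
    # The same burst with two Vscan servers.
    print(offload_estimate(100, 2, 0.5))
```

Under these assumptions, adding a second Vscan server or halving the scan time both shorten the period during which the queued requests hold their pBlks.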

Example of pBlk consumption (assuming that the file needs to be scanned by the Vscan server):

This is a very high-level overview of the process. The numbers below are not real-world values and are for example purposes only.
  1. The client issues a read request for fileA.txt on the storage controller. The storage controller allocates a pBlk for the client read request.
    Total pBlks consumed = 1
  2. The storage controller issues an RPC call to the Vscan server (VSCAN01) requesting a scan of fileA.txt.
    Total pBlks consumed = 1
  3. The Vscan server, VSCAN01, makes a request over a share on the storage controller, ONTAP_ADMIN$, to retrieve the file to be scanned. In order to scan the file, the Vscan server has to read all or part of the file.
    Total pBlks consumed = 2 **Note the increase
  4. The Vscan server, VSCAN01, finishes reading the file and then satisfies the scan operation by sending a reply back to the storage controller.
    Total pBlks consumed = 1 **Note the decrease
  5. The storage controller does internal accounting to mark fileA.txt as scanned.
    Total pBlks consumed = 1
  6. Finally, the storage controller responds to the client's read request.
    Total pBlks consumed = 0
During the entire time between steps 1 and 6, the client request holds a pBlk until the virus scan of the file is complete.
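
The running totals in the example above can be reproduced with a short script. This is only an illustration of the bookkeeping in the example, not Data ONTAP code. The per-step deltas are inferred from the totals listed above, and the names STEPS, pblk_timeline, and peak_pblks are hypothetical.

```python
# Each entry is (description, change in pBlks consumed), inferred from
# the running totals in the example above.
STEPS = [
    ("Client issues read request for fileA.txt", +1),
    ("Storage controller sends scan RPC to VSCAN01", 0),
    ("VSCAN01 reads the file over ONTAP_ADMIN$", +1),
    ("VSCAN01 finishes reading and replies", -1),
    ("Storage controller marks fileA.txt as scanned", 0),
    ("Storage controller responds to the client", -1),
]


def pblk_timeline(steps):
    """Return a list of (description, total pBlks consumed after the step)."""
    total = 0
    timeline = []
    for description, delta in steps:
        total += delta
        timeline.append((description, total))
    return timeline


def peak_pblks(steps):
    """Return the highest number of pBlks held at any point."""
    return max(total for _, total in pblk_timeline(steps))


if __name__ == "__main__":
    for number, (description, total) in enumerate(pblk_timeline(STEPS), start=1):
        print(f"{number}. {description}: total pBlks consumed = {total}")
    print(f"Peak pBlks for one scanned read: {peak_pblks(STEPS)}")
```

The point the sketch makes is that a single client read that triggers a scan holds one pBlk for the whole sequence and briefly holds two, so slow scans multiply pBlk usage across concurrent clients.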

 
