StorageGRID high CPU utilization by Cassandra database results in increased S3 client latency


Last updated

Jul 26, 2023


Applies to

NetApp StorageGRID
Software release 11.6 and later versions

Issue

All the metrics presented below can be found in StorageGRID Grid Manager under Support > Metrics.
A Storage Node shows Cassandra CPU utilization greater than 85% with minimal I/O wait. This can be found under the Node (Internal Use) Grafana dashboard.

[Figure: CPU Utilization (by service)]

[Figure: CPU Utilization]

The Cassandra ReadStage thread pool is consistently at its maximum level. This can be found under the Cassandra Node Overview Grafana dashboard.

[Figure: Threadpools Active Tasks]

A single Cassandra table shows extremely high latency, in the hundreds of seconds. This can be found under the Cassandra Node Overview Grafana dashboard.
The example below shows object_by_version with such high latency, but any table can be affected.

[Figure: Read Latency By Table]

In the node's Cassandra jstack log directory, the following command returns more than 100 entries. Consider also checking the rotated log file (for example, jstack.log.1).

# cd /var/local/log/cassandra/jstack/
# grep -c Murmur3 jstack.log
8393
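
The check above can be repeated across the current and rotated jstack logs with a small helper. This is only a sketch: the function name count_murmur3 and the JSTACK_DIR variable are hypothetical and not part of StorageGRID, and the default threshold of 100 is taken from the symptom described in this article. It assumes the rotated file is named jstack.log.1 and silently skips any file that does not exist.

```shell
#!/bin/sh
# Sketch only: count_murmur3 and JSTACK_DIR are illustrative names,
# not StorageGRID tooling.
# Usage: count_murmur3 <jstack log directory> [threshold]
count_murmur3() {
    dir="$1"
    threshold="${2:-100}"   # 100 is the entry count named in this article
    for f in "$dir"/jstack.log "$dir"/jstack.log.1; do
        # Skip files that are missing (for example, no rotation yet).
        [ -f "$f" ] || continue
        # grep -c prints the number of matching lines, including 0.
        n=$(grep -c Murmur3 "$f")
        if [ "$n" -gt "$threshold" ]; then
            echo "$f: $n (above $threshold)"
        else
            echo "$f: $n"
        fi
    done
}

# Default path is the directory used in this article.
count_murmur3 "${JSTACK_DIR:-/var/local/log/cassandra/jstack}"
```

A file reported as above the threshold matches the symptom described here. A low count in jstack.log alone is not conclusive if the log was rotated recently, which is why jstack.log.1 is included.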