After a non-clean reboot or shutdown, StorageGRID Cassandra fails to start

Category: storagegrid
Specialty: sgrid

Applies to

NetApp StorageGRID 11.3 and above

Issue

  • A non-clean reboot or shutdown of a Storage Node (caused by, for example, a power failure, hardware fault, or Linux kernel panic) can prevent Cassandra from starting.
  • The node appears blue (offline) in Grid Manager.
  • The Cassandra system.log shows:

Example 1:

ERROR [main] 2019-09-26 23:29:20,814 LogTransaction.java (line 492) Unexpected disk state: failed to read transaction log [mc_txn_compaction_69307590-e080-11e9-9e4d-89535f6a3331.log in /var/local/cassandra/data/0/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41]
Files and contents follow:
/var/local/cassandra/data/0/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc_txn_compaction_69307590-e080-11e9-9e4d-89535f6a3331.log
  ADD:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86651-big,0,8][2595327302]
  REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86648-big,1569517652000,8][3604952]
  REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86647-big,1569517605000,8][1703752985]
  REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86650-big,1569517773000,8][717261549]
    ***Unexpected files detected for sstable [mc-86650-big-]: last update time [17:09:22] should have been [17:09:33]
  REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86649-big,1569517723000,8][1388813582]
  COMMIT:[,0,0][2613697770]

Check logs before last shutdown for any errors and ensure transaction log files were not edited manually.
ERROR [main] 2019-09-26 23:29:20,818 CassandraDaemon.java (line 725) Cannot remove temporary or obsoleted files for storagegrid.s3_usage_delta due to a problem with transaction log files. Please check records with problems in the log messages above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a description of transaction log files.

Example 2:

ERROR [main] 2021-08-26 21:32:59,392 LogTransaction.java (line 492) Unexpected disk state: failed to read transaction log [mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log in /var/local/cassandra/data/0/accounts/accounts]
Files and contents follow:
/var/local/cassandra/data/0/accounts/accounts/mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log
    ADD:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21121-big,0,8][2511397250]
    REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21117-big,1629929198000,8][2020317670]
    REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21118-big,1629929326000,8][2017647165]
    REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21120-big,1629929684000,8][3438069446]
    REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21119-big,1629929608000,8][746831909]
        ***Unexpected files detected for sstable [mc-21119-big-]: last update time [22:12:38] should have been [22:13:28]
    COMMIT:[,0,0][2613697770]

Check logs before last shutdown for any errors, and ensure txn log files were not edited manually.
ERROR [main] 2021-08-26 21:32:59,396 CassandraDaemon.java (line 725) Cannot remove temporary or obsoleted files for accounts.accounts due to a problem with transaction log files. Please check records with problems in the log messages above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a description of transaction log files.
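
The signature in both examples can be located programmatically. The following is a minimal Python sketch, not a NetApp-provided tool: it scans system.log lines for the "failed to read transaction log" error and reports the transaction log file, its directory, the affected table, and any SSTable flagged with a mismatched update time. The function name find_txn_log_errors and the regular expressions are illustrative assumptions derived only from the log excerpts above; other Cassandra versions may word these messages differently.

```python
import re

# Patterns are assumptions based on the log excerpts in this article.
HEADER = re.compile(
    r"Unexpected disk state: failed to read transaction log "
    r"\[(?P<log>\S+) in (?P<dir>[^\]]+)\]"
)
FLAGGED = re.compile(
    r"\*\*\*Unexpected files detected for sstable \[(?P<sstable>[^\]]+)\]: "
    r"last update time \[(?P<actual>[^\]]+)\] "
    r"should have been \[(?P<expected>[^\]]+)\]"
)
TABLE = re.compile(
    r"Cannot remove temporary or obsoleted files for (?P<table>\S+) due to"
)


def find_txn_log_errors(lines):
    """Return one dict per 'failed to read transaction log' error found."""
    findings = []
    current = None
    for line in lines:
        m = HEADER.search(line)
        if m:
            # A new error block starts here.
            current = {
                "txn_log": m["log"],
                "directory": m["dir"],
                "table": None,
                "flagged": [],
            }
            findings.append(current)
            continue
        if current is None:
            continue  # ignore lines before the first error header
        m = FLAGGED.search(line)
        if m:
            current["flagged"].append(
                (m["sstable"], m["actual"], m["expected"])
            )
            continue
        m = TABLE.search(line)
        if m:
            current["table"] = m["table"]
    return findings


# Lines copied from Example 2 above (subset).
SAMPLE = """\
ERROR [main] 2021-08-26 21:32:59,392 LogTransaction.java (line 492) Unexpected disk state: failed to read transaction log [mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log in /var/local/cassandra/data/0/accounts/accounts]
    REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21119-big,1629929608000,8][746831909]
        ***Unexpected files detected for sstable [mc-21119-big-]: last update time [22:12:38] should have been [22:13:28]
ERROR [main] 2021-08-26 21:32:59,396 CassandraDaemon.java (line 725) Cannot remove temporary or obsoleted files for accounts.accounts due to a problem with transaction log files.
"""

for finding in find_txn_log_errors(SAMPLE.splitlines()):
    print(finding)
```

The sketch only reads the log and reports what it finds; it does not modify or remove any files.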

  • The servermanager.log shows:

2019-09-26 23:28:37 +0000 | cassandra | starting cassandra
2019-09-26 23:28:51 +0000 | cassandra | cassandra ended
2019-09-26 23:28:54 +0000 | cassandra | starting cassandra
2019-09-26 23:29:05 +0000 | cassandra | cassandra ended
2019-09-26 23:29:08 +0000 | cassandra | starting cassandra
2019-09-26 23:29:22 +0000 | cassandra | cassandra ended
2019-09-26 23:29:25 +0000 | cassandra | Too many failed attempts, entering error state
2019-09-26 23:29:25 +0000 | cassandra | cassandra ended
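
The restart loop shown above can also be recognized in a script. The following minimal Python sketch counts start attempts for a service and notes when the "entering error state" message appears. It assumes the "timestamp | service | message" layout seen in the excerpt; the function name summarize_service is hypothetical and not part of StorageGRID.

```python
from datetime import datetime


def summarize_service(lines, service="cassandra"):
    """Count start attempts and find the error-state timestamp, if any."""
    starts = []
    error_state_at = None
    for line in lines:
        # Assumed layout: "<timestamp> | <service> | <message>"
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3 or parts[1] != service:
            continue
        try:
            ts = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S %z")
        except ValueError:
            continue  # skip lines whose timestamp does not parse
        message = parts[2]
        if message == f"starting {service}":
            starts.append(ts)
        elif "entering error state" in message:
            error_state_at = ts
    return {
        "start_attempts": len(starts),
        "first_start": starts[0] if starts else None,
        "error_state_at": error_state_at,
    }


# Lines copied from the servermanager.log excerpt above.
SERVERMANAGER_SAMPLE = """\
2019-09-26 23:28:37 +0000 | cassandra | starting cassandra
2019-09-26 23:28:51 +0000 | cassandra | cassandra ended
2019-09-26 23:28:54 +0000 | cassandra | starting cassandra
2019-09-26 23:29:05 +0000 | cassandra | cassandra ended
2019-09-26 23:29:08 +0000 | cassandra | starting cassandra
2019-09-26 23:29:22 +0000 | cassandra | cassandra ended
2019-09-26 23:29:25 +0000 | cassandra | Too many failed attempts, entering error state
2019-09-26 23:29:25 +0000 | cassandra | cassandra ended
"""

summary = summarize_service(SERVERMANAGER_SAMPLE.splitlines())
print(summary["start_attempts"], summary["error_state_at"])
```

Several start attempts within a short window followed by an error-state entry match the pattern in this article; the matching system.log errors are still needed to confirm the cause.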

 
