After a non-clean reboot or shutdown, StorageGRID Cassandra fails to start
Applies to
NetApp StorageGRID 11.3 and above
Issue
- A non-clean reboot or shutdown of a Storage Node, caused by a power failure, hardware fault, Linux kernel panic, or similar event, can prevent Cassandra from starting.
- The node appears blue (offline) in Grid Manager.
- Cassandra system.log shows:
Example 1:
ERROR [main] 2019-09-26 23:29:20,814 LogTransaction.java (line 492) Unexpected disk state: failed to read transaction log [mc_txn_compaction_69307590-e080-11e9-9e4d-89535f6a3331.log in /var/local/cassandra/data/0/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41]
Files and contents follow:
/var/local/cassandra/data/0/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc_txn_compaction_69307590-e080-11e9-9e4d-89535f6a3331.log
ADD:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86651-big,0,8][2595327302]
REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86648-big,1569517652000,8][3604952]
REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86647-big,1569517605000,8][1703752985]
REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86650-big,1569517773000,8][717261549]
***Unexpected files detected for sstable [mc-86650-big-]: last update time [17:09:22] should have been [17:09:33]
REMOVE:[/var/local/rangedb/0/cassandra/storagegrid/s3_usage_delta-3fdd6e00b8ad11e9814387d19a1fda41/mc-86649-big,1569517723000,8][1388813582]
COMMIT:[,0,0][2613697770]
Check logs before last shutdown for any errors and ensure transaction log files were not edited manually.
ERROR [main] 2019-09-26 23:29:20,818 CassandraDaemon.java (line 725) Cannot remove temporary or obsoleted files for storagegrid.s3_usage_delta due to a problem with transaction log files. Please check records with problems in the log messages above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a description of transaction log files.
Example 2:
ERROR [main] 2021-08-26 21:32:59,392 LogTransaction.java (line 492) Unexpected disk state: failed to read transaction log [mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log in /var/local/cassandra/data/0/accounts/accounts]
Files and contents follow:
/var/local/cassandra/data/0/accounts/accounts/mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log
ADD:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21121-big,0,8][2511397250]
REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21117-big,1629929198000,8][2020317670]
REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21118-big,1629929326000,8][2017647165]
REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21120-big,1629929684000,8][3438069446]
REMOVE:[/var/local/rangedb/0/cassandra/accounts/accounts/mc-21119-big,1629929608000,8][746831909]
***Unexpected files detected for sstable [mc-21119-big-]: last update time [22:12:38] should have been [22:13:28]
COMMIT:[,0,0][2613697770]
Check logs before last shutdown for any errors, and ensure txn log files were not edited manually.
ERROR [main] 2021-08-26 21:32:59,396 CassandraDaemon.java (line 725) Cannot remove temporary or obsoleted files for accounts.accounts due to a problem with transaction log files. Please check records with problems in the log messages above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a description of transaction log files.
- The servermanager.log shows:
2019-09-26 23:28:37 +0000 | cassandra | starting cassandra
2019-09-26 23:28:51 +0000 | cassandra | cassandra ended
2019-09-26 23:28:54 +0000 | cassandra | starting cassandra
2019-09-26 23:29:05 +0000 | cassandra | cassandra ended
2019-09-26 23:29:08 +0000 | cassandra | starting cassandra
2019-09-26 23:29:22 +0000 | cassandra | cassandra ended
2019-09-26 23:29:25 +0000 | cassandra | Too many failed attempts, entering error state
2019-09-26 23:29:25 +0000 | cassandra | cassandra ended
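The two log signatures above can be checked for mechanically. The following is a minimal sketch, not a NetApp tool: the function names are hypothetical, the patterns are derived only from the excerpts in this article, and the caller supplies the log lines because the on-disk location of the log files is not stated here.

```python
import re

# Pattern for the LogTransaction error line shown in the system.log excerpts.
# Captures the transaction log file name and the directory that contains it.
TXN_ERROR = re.compile(
    r"Unexpected disk state: failed to read transaction log "
    r"\[(?P<log>\S+\.log) in (?P<dir>[^\]]+)\]"
)

# Message that servermanager.log records once it stops retrying Cassandra.
ERROR_STATE = "Too many failed attempts, entering error state"


def find_bad_txn_logs(system_log_lines):
    """Return unique (directory, txn_log_file) pairs named in LogTransaction errors."""
    found = []
    for line in system_log_lines:
        match = TXN_ERROR.search(line)
        if match:
            pair = (match.group("dir"), match.group("log"))
            if pair not in found:
                found.append(pair)
    return found


def cassandra_in_error_state(servermanager_lines):
    """Return True if a cassandra entry reports the error state."""
    return any(
        "| cassandra |" in line and ERROR_STATE in line
        for line in servermanager_lines
    )


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this article.
    system_log = [
        "ERROR [main] 2021-08-26 21:32:59,392 LogTransaction.java (line 492) "
        "Unexpected disk state: failed to read transaction log "
        "[mc_txn_compaction_da1f14d0-05f1-11ec-9de9-7b6fd8319cd5.log in "
        "/var/local/cassandra/data/0/accounts/accounts]",
    ]
    servermanager_log = [
        "2019-09-26 23:29:22 +0000 | cassandra | cassandra ended",
        "2019-09-26 23:29:25 +0000 | cassandra | Too many failed attempts, entering error state",
    ]
    for directory, txn_log in find_bad_txn_logs(system_log):
        print(f"unreadable transaction log: {directory}/{txn_log}")
    print("error state reached:", cassandra_in_error_state(servermanager_log))
```

Both functions take any iterable of lines, so they can be fed an open file object as well as a list. The sketch only identifies the symptom; it does not modify or remove any transaction log files.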