Storage node slow joining Cassandra cluster, failing S3 client requests
Applies to
- StorageGRID 11.3
Issue
- A StorageGRID node is slow joining Cassandra cluster after boot or servermanager restart.
- All services show as 'Up and Running', but node is failing S3 client requests.
- The
/var/local/log/cassandra/system.log
file indicates a problem with prometheus plugin opening keyspaces. - Messages similar to the following are repeating for all (or many) keyspaces:
ERROR [pool-1-thread-1] 2020-07-10 09:23:51,823 Keyspace.java (line 827) Failed to open keyspace system_distributed due to unexpected exception
java.lang.AssertionError: null
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:137)
at org.apache.cassandra.db.Keyspace.lambda$toKeyspaces$1(Keyspace.java:817)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
at org.apache.cassandra.db.Keyspace.toKeyspaces(Keyspace.java:832)
at org.apache.cassandra.db.Keyspace.all(Keyspace.java:783)
at org.apache.cassandra.metrics.TableMetrics$28.getValue(TableMetrics.java:593)
at org.apache.cassandra.metrics.TableMetrics$28.getValue(TableMetrics.java:588)
at org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxGauge.getValue(CassandraMetricsRegistry.java:235)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttributes(MBeanSupport.java:213)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttributes(DefaultMBeanServerInterceptor.java:709)
at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttributes(JmxMBeanServer.java:705)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxScraper.scrapeBean(JmxScraper.java:151)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxScraper.doScrape(JmxScraper.java:117)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxCollector.collect(JmxCollector.java:460)
at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:183)
at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:216)
at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:137)
at io.prometheus.jmx.shaded.io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer$HTTPMetricHandler.handle(HTTPServer.java:59)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Note: the references to
prometheus.jmx
in the trace dump.