
Cassandra Cluster Management: Abnormal Node Restart

Published: 2025-12-03. Author: 千家信息网 editors


Log in to one of the cluster nodes (172.20.101.166) and reboot the server directly. Cassandra is configured to start at boot on this node.

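Note:

This document is one part of a larger series. The related documents are:
Test preparation and decommissioning a healthy node: https://blog.51cto.com/michaelkang/2419518
Abnormal node restart: https://blog.51cto.com/michaelkang/2419524
Adding a new node: https://blog.51cto.com/michaelkang/2419521
Removing a failed node: https://blog.51cto.com/michaelkang/2419525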

Scenario:

A node is restarted abnormally, and we observe how the rest of the cluster reacts.

cassandra.log shows almost no output

tailf /var/log/cassandra/cassandra.log 

system.log

The log clearly reports that 172.20.101.166 is DOWN.

Node 172.20.101.165:
[root@kubm-03 lib]# tailf /var/log/cassandra/system.log
INFO  [GossipStage:1] 2019-07-11 18:19:23,372 Gossiper.java:1026 - InetAddress /172.20.101.166 is now DOWN
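Rather than watching the log by eye, the DOWN events can be pulled out of system.log with a small filter. The following is a minimal sketch, not part of the original test: `down_events` is a hypothetical helper name, and it assumes the log line layout shown above (level, thread name without spaces, date, time, source location, message).

```shell
#!/bin/sh
# down_events (hypothetical helper): reads system.log text on stdin and prints
# "date time address" for every "InetAddress /<addr> is now DOWN" line.
# Assumes the thread name in brackets contains no spaces, as in the sample above.
down_events() {
    awk '/InetAddress \/[^ ]+ is now DOWN/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "InetAddress") {
                addr = $(i + 1)
                sub(/^\//, "", addr)   # drop the leading slash
                print $3, $4, addr     # date, time, node address
            }
        }
    }'
}
```

A possible invocation on any surviving node is `down_events < /var/log/cassandra/system.log`, which lists when each peer was marked DOWN.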

Check for the abnormal node

[root@kubm-01 ~]# nodetool describecluster
Cluster Information:
        Name: pttest
        Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                cfce5a85-19c8-327a-ab19-e1faae2358f7: [172.20.101.164, 172.20.101.165, 172.20.101.167, 172.20.101.160, 172.20.101.157]
                UNREACHABLE: [172.20.101.166]
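The UNREACHABLE entry in this output is what identifies the rebooting node. As a rough sketch of how that check could be scripted, the helper below extracts the addresses from the `UNREACHABLE: [...]` line. The name `unreachable_nodes` is hypothetical, and the parsing assumes the bracketed, comma-separated layout shown above.

```shell
#!/bin/sh
# unreachable_nodes (hypothetical helper): reads `nodetool describecluster`
# output on stdin and prints one unreachable address per line.
# Prints nothing when there is no UNREACHABLE entry.
unreachable_nodes() {
    sed -n 's/^[[:space:]]*UNREACHABLE:[[:space:]]*\[\(.*\)\].*$/\1/p' |
        tr -d ' ' |
        tr ',' '\n'
}
```

It could be used as `nodetool describecluster | unreachable_nodes`; an empty result would suggest that every node is reachable again.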

debug.log

Many entries report that 172.20.101.166 cannot be reached.

Node 172.20.101.164:
tailf /var/log/cassandra/debug.log
DEBUG [GossipStage:1] 2019-07-11 18:19:23,374 OutboundTcpConnection.java:205 - Enqueuing socket close for /172.20.101.166
DEBUG [MessagingService-Outgoing-/172.20.101.166-Small] 2019-07-11 18:19:23,374 OutboundTcpConnection.java:411 - Socket to /172.20.101.166 closed
DEBUG [GossipStage:1] 2019-07-11 18:19:23,374 OutboundTcpConnection.java:205 - Enqueuing socket close for /172.20.101.166
DEBUG [MessagingService-Outgoing-/172.20.101.166-Gossip] 2019-07-11 18:19:23,374 OutboundTcpConnection.java:411 - Socket to /172.20.101.166 closed
DEBUG [GossipStage:1] 2019-07-11 18:19:23,374 FailureDetector.java:313 - Forcing conviction of /172.20.101.166
DEBUG [MessagingService-Outgoing-/172.20.101.166-Gossip] 2019-07-11 18:19:24,740 OutboundTcpConnection.java:425 - Attempting to connect to /172.20.101.166
INFO  [HANDSHAKE-/172.20.101.166] 2019-07-11 18:19:24,741 OutboundTcpConnection.java:561 - Handshaking version with /172.20.101.166
DEBUG [MessagingService-Outgoing-/172.20.101.166-Gossip] 2019-07-11 18:19:24,742 OutboundTcpConnection.java:533 - Done connecting to /172.20.101.166

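Verifying queries

Once the system has booted, the Cassandra service starts automatically and the node rejoins the cluster normally.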

cassandra@cqlsh> SELECT * from kevin_test.t_users;

 user_id | emails                          | first_name | last_name
---------+---------------------------------+------------+-----------
       6 | {'k6-6@gmail.com', 'k6@pt.com'} |     kevin6 |      kang
       7 | {'k7-7@gmail.com', 'k7@pt.com'} |     kevin7 |      kang
       9 | {'k9-9@gmail.com', 'k9@pt.com'} |     kevin9 |      kang
       4 | {'k4-4@gmail.com', 'k4@pt.com'} |     kevin4 |      kang
       3 | {'k3-3@gmail.com', 'k3@pt.com'} |     kevin3 |      kang
       5 | {'k5-5@gmail.com', 'k5@pt.com'} |     kevin5 |      kang
       0 | {'k0-0@gmail.com', 'k0@pt.com'} |     kevin0 |      kang
       8 | {'k8-8@gmail.com', 'k8@pt.com'} |     kevin8 |      kang
       2 | {'k2-2@gmail.com', 'k2@pt.com'} |     kevin2 |      kang
       1 | {'k1-1@gmail.com', 'k1@pt.com'} |     kevin1 |      kang

Test result:

After the node was restarted repeatedly, queries against the table still returned the expected content.
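To make the repeated check less manual, the query output can be reduced to a row count and compared against the expected value (10 rows in the test table above). The sketch below is only an illustration: `count_rows` is a hypothetical helper, and it assumes the cqlsh table layout shown earlier, where each data row starts with an integer `user_id` followed by a `|` separator.

```shell
#!/bin/sh
# count_rows (hypothetical helper): reads cqlsh table output on stdin and
# prints the number of data rows, i.e. lines whose first column is an integer.
# The header and the dashed separator line are not counted.
count_rows() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # helper usable under "set -e" while still printing 0.
    grep -cE '^[[:space:]]*[0-9]+[[:space:]]*\|' || true
}

# Possible usage during each restart cycle (credentials and host are assumptions):
#   cqlsh 172.20.101.166 -u cassandra -e "SELECT * FROM kevin_test.t_users;" | count_rows
```

Comparing the printed count with 10 after every reboot gives a quick signal that the data is still fully readable, although it does not verify the content of each row.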
