Nodetool is the collection of utilities delivered with Cassandra that supports a variety of operational and diagnostic functions. As previously mentioned, the most common nodetool command that you will run is probably nodetool status, which should produce output similar to this:
$ nodetool status
Datacenter: LakesidePark
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                           Rack
UN  192.168.0.100  84.15 MiB  16      100.0%  71700e62-2e28-4974-93e1-a2ad3f...  r40
UN  192.168.0.102  83.27 MiB  16      100.0%  c3e61934-5fc1-4795-a05a-28443e...  r40
UN  192.168.0.101  83.99 MiB  16      100.0%  fd352577-6be5-4d93-8251-15a74f...  r40
Additional information about your node(s) and cluster can be obtained by running commands such as nodetool info or nodetool describecluster:
$ nodetool info
ID                     : 71700e62-2e28-4974-93e1-a2ad3f8a38c1
Gossip active          : true
Thrift active          : false
Native Transport active: true
Load                   : 84.15 MiB
Generation No          : 1505483850
Uptime (seconds)       : 40776
Heap Memory (MB)       : 422.87 / 4016.00
Off Heap Memory (MB)   : 0.00
Data Center            : LakesidePark
Rack                   : r40
Exceptions             : 0
Key Cache              : entries 24, size 2.02 KiB, capacity 100 MiB, 210 hits, 239 requests, 0.879 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Chunk Cache            : entries 19, size 1.19 MiB, capacity 480 MiB, 85 misses, 343 requests, 0.752 recent hit rate, 583.139 microseconds miss latency
Percent Repaired       : 100.0%
Token                  : (invoke with -T/--tokens to see all 16 tokens)
nodetool info is useful for ascertaining things such as heap usage and uptime. nodetool describecluster is helpful when diagnosing issues such as schema disagreement.
$ nodetool describecluster
Cluster Information:
	Name: PermanentWaves
	Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
	Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
	Schema versions:
		22a8db59-e998-3848-bfee-a07feedae0d8: [192.168.0.100, 192.168.0.101, 192.168.0.102]
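When the nodes of a cluster disagree on the schema, the Schema versions section lists more than one version UUID, each followed by the nodes reporting it. A hypothetical sketch of what disagreement might look like (the second UUID here is purely illustrative):

```
Schema versions:
	22a8db59-e998-3848-bfee-a07feedae0d8: [192.168.0.100, 192.168.0.101]
	8f7a2b41-1c3d-4e5f-9a6b-7c8d9e0f1a2b: [192.168.0.102]
```

A healthy cluster should converge back to a single schema version; a node that lingers on its own version is the one to investigate.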
If there is a node in the cluster that needs to be removed, there are three methods to accomplish this: decommission, removenode, and assassinate. The determining factor is whether or not the node is still running.
If the node is still functioning and the Cassandra process is still running, you can decommission it from any node in the cluster by running nodetool decommission against its address. For example, to decommission 192.168.0.102:
nodetool -h 192.168.0.102 decommission
At this point, the data on 192.168.0.102 will be streamed to the other nodes in the cluster, and control will be returned to the command line when the process is complete. If you were to run a nodetool status during this time, the cluster would look like this:
Datacenter: LakesidePark
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                           Rack
UN  192.168.0.100  84.15 MiB  16      100.0%  71700e62-2e28-4974-93e1-a2ad3f...  r40
UL  192.168.0.102  83.27 MiB  16      100.0%  c3e61934-5fc1-4795-a05a-28443e...  r40
UN  192.168.0.101  83.99 MiB  16      100.0%  fd352577-6be5-4d93-8251-15a74f...  r40
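While the decommission is in progress, you can also watch the streaming itself with nodetool netstats, which reports the node's mode (LEAVING, in this case) and its active streaming sessions. A quick sketch, assuming you are targeting the leaving node:

```shell
# Show the mode and active streaming sessions of the leaving node.
# The -h flag can be dropped if you run this on 192.168.0.102 itself.
nodetool -h 192.168.0.102 netstats
```

This is handy for confirming that data is actually moving when a decommission appears to be taking a long time.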
This method is useful for scaling back a cluster that has more computing resources than it needs.
If a node crashes and cannot be restarted, the decommission process will not work. In this case, the node will have to be forcibly removed. First, log into another node in the cluster and run nodetool status:
Datacenter: LakesidePark
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                           Rack
UN  192.168.0.100  84.15 MiB  16      100.0%  71700e62-2e28-4974-93e1-a2ad3f...  r40
DN  192.168.0.102  83.27 MiB  16      100.0%  c3e61934-5fc1-4795-a05a-28443e...  r40
UN  192.168.0.101  83.99 MiB  16      100.0%  fd352577-6be5-4d93-8251-15a74f...  r40
In this case, the only recourse is to remove the down node by its host ID:
nodetool removenode c3e61934-5fc1-4795-a05a-28443e2d51da
Once this process has begun, its progress can be checked on the node coordinating the removal with nodetool removenode status. If the removal is taking too long, it can be forced to complete with nodetool removenode force.
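For example, assuming you are still logged into the node from which removenode was started, the two follow-up commands look like this:

```shell
# Check the progress of the in-flight removal; run this on the same
# node where the removenode command was issued.
nodetool removenode status

# If the removal stalls, force it to finish without waiting for the
# remaining data to restream (at the cost of some re-replication).
nodetool removenode force
```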
If for some reason the node will not disappear from the cluster with either decommission or removenode, then nodetool assassinate should do the trick:
nodetool assassinate 192.168.0.102
Note that assassinate makes no attempt to restream the dead node's data; it simply removes the endpoint from gossip, so treat it as a last resort.