How does Cassandra find the node that contains the data?

Client sends the data to a random node

It might seem that way, but your driver actually picks the node it talks to in a non-random way. This node is called the “coordinator node” and is typically chosen for having the shortest (closest) “network distance.” Client requests can be sent to any node, and at first they will be sent to the nodes your driver knows about. But once the driver connects and learns the topology of your cluster, it may switch to a “closer” coordinator.
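To make the idea of “network distance” concrete, here is a minimal sketch in Python. It is not how any particular driver is implemented; real drivers do this through configurable load-balancing policies. The `Host` class, the three-level ranking (same rack, same data center, remote), and every address other than 192.168.1.22 are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical description of a node as a driver might see it."""
    address: str
    datacenter: str
    rack: str


def distance(client_dc, client_rack, host):
    """Rank a host: 0 = same rack, 1 = same data center, 2 = remote.

    This three-level ranking is an assumption for illustration only.
    """
    if host.datacenter == client_dc:
        return 0 if host.rack == client_rack else 1
    return 2


def pick_coordinator(hosts, client_dc, client_rack):
    """Return the host with the smallest distance (first one wins on ties)."""
    return min(hosts, key=lambda h: distance(client_dc, client_rack, h))


# Made-up topology; only 192.168.1.22 / rack1 appears in the answer itself.
hosts = [
    Host("10.0.0.5", "dc2", "rack1"),
    Host("192.168.1.23", "dc1", "rack2"),
    Host("192.168.1.22", "dc1", "rack1"),
]

coordinator = pick_coordinator(hosts, client_dc="dc1", client_rack="rack1")
print(coordinator.address)
```

The point is only that the choice is a ranking over known topology, not a random draw.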

The nodes in your cluster exchange topology information with each other using the Gossip Protocol. The gossiper runs every second, and ensures that all nodes are kept current with data from whichever Snitch you have configured. The snitch keeps track of which data centers and racks each node belongs to.

In this way, the coordinator node also has data about which nodes are responsible for each token range. You can see this information by running nodetool ring from the command line. If you are using vnodes, though, that will be trickier to ascertain, as the data for all 256 (default) virtual nodes will quickly flash by on the screen.

So let’s say that I have a table that I’m using to keep track of ship crew members by their first name, and let’s assume that I want to look up Malcolm Reynolds. Running this query:

SELECT token(firstname), firstname, id, lastname
FROM usersbyfirstname WHERE firstname='Mal';

…returns this row:

 token(firstname)     | firstname | id | lastname
----------------------+-----------+----+-----------
  4016264465811926804 |       Mal |  2 |  Reynolds

By running nodetool ring, I can see which node is responsible for this token:

192.168.1.22  rack1       Up     Normal  348.31 KB   3976595151390728557                         
192.168.1.22  rack1       Up     Normal  348.31 KB   4142666302960897745                         
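The token for Mal (4016264465811926804) falls between those two entries, and each entry owns the range that ends at its own token, so the second entry is the one responsible. A rough sketch of that lookup in Python, using only the two ring entries shown above (a real ring has many more, and `find_owner` is a name I made up for illustration):

```python
from bisect import bisect_left

# (token, node) pairs copied from the nodetool ring excerpt above, sorted by token.
ring = [
    (3976595151390728557, "192.168.1.22"),
    (4142666302960897745, "192.168.1.22"),
]


def find_owner(ring, token):
    """Return the (token, node) entry whose range contains `token`.

    An entry owns the range (previous token, its own token], so we look for
    the first ring token that is >= the row's token. Past the last entry,
    the ring wraps around to the first one.
    """
    tokens = [t for t, _ in ring]
    idx = bisect_left(tokens, token)
    if idx == len(tokens):
        idx = 0  # wrap around the ring
    return ring[idx]


owner_token, owner_node = find_owner(ring, 4016264465811926804)
print(owner_token, owner_node)
```

This only finds the first replica; where any additional replicas land depends on the keyspace's replication settings, which this sketch does not model.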

Or even easier, I can use nodetool getendpoints to see this data:

$ nodetool getendpoints stackoverflow usersbyfirstname Mal
Picked up JAVA_TOOL_OPTIONS: -javaagent:/usr/share/java/jayatanaag.jar 
192.168.1.22

For more information, check out some of the items linked above, or try running nodetool gossipinfo.
