ScyllaDB Hinted Handoff
A typical write in Scylla works according to the scenarios described in our Fault Tolerance documentation.
But what happens when a write request is sent to a Scylla node that is unresponsive because of heavy write load, network issues, or even hardware failure? To ensure availability and consistency, Scylla implements hinted handoff.
In other words, Scylla saves a copy of the writes intended for down nodes and replays them to those nodes once they are back up. Thus, when a node is down, the write operation flow looks like this:
The co-ordinator determines all the replica nodes;
Based on the replication factor (RF), the co-ordinator attempts to write to RF nodes;
[Image: 1-write_op_RF_3.jpg]
If one node is down, acknowledgments are only returned from two nodes:
[Image: hinted-handoff-3.png]
If the consistency level does not require responses from all replicas, the co-ordinator, V in this case, responds to the client that the write was successful. The co-ordinator then stores a hint for the node that did not respond:
[Image: hinted-handoff-4.png]
Once the down node comes back up, the co-ordinator replays the hint for that node. After the co-ordinator receives an acknowledgment of the write, it deletes the hint.
[Image: hinted-handoff-5.png]
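The flow above can be sketched as a toy simulation. This is only an illustration of the idea, not Scylla's implementation: the `Replica` and `Coordinator` classes, the in-memory hint list, and the `required_acks` parameter (a stand-in for the consistency level) are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica node; `up` models whether it responds to writes."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        if not self.up:
            return False  # no acknowledgment from a down node
        self.data[key] = value
        return True


class Coordinator:
    """Toy co-ordinator that stores a hint for each replica that did not ack."""

    def __init__(self, replicas, required_acks):
        self.replicas = replicas
        self.required_acks = required_acks  # hypothetical stand-in for the consistency level
        self.hints = []  # pending (replica, key, value) entries

    def write(self, key, value):
        acks = 0
        for replica in self.replicas:
            if replica.write(key, value):
                acks += 1
            else:
                # Simplification: a hint is stored for every missing ack.
                self.hints.append((replica, key, value))
        return acks >= self.required_acks

    def replay_hints(self):
        remaining = []
        for replica, key, value in self.hints:
            # A hint is dropped only after the replica acknowledges the write.
            if not replica.write(key, value):
                remaining.append((replica, key, value))
        self.hints = remaining


nodes = [Replica("A"), Replica("B"), Replica("C", up=False)]
coordinator = Coordinator(nodes, required_acks=2)

ok = coordinator.write("k", "v")        # two acks arrive, one hint is stored
pending_before = len(coordinator.hints)

nodes[2].up = True                      # the down node comes back up
coordinator.replay_hints()
pending_after = len(coordinator.hints)
```

With RF = 3 and one node down, the write still succeeds for the client, one hint is held while the node is down, and the hint disappears once the replay is acknowledged.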
A co-ordinator stores hints for a handoff under the following conditions:
For down nodes;
If the replica doesn’t respond within write_request_timeout_in_ms.
The co-ordinator will stop creating any hints for a dead node if the node’s downtime is greater than max_hint_window_in_ms.
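As a rough sketch of these conditions, the decision could be written as a single predicate. The function name `should_store_hint` and its arguments are hypothetical and chosen for this example; the 3-hour window is the default stated for max_hint_window_in_ms on this page, while the 2,000 ms timeout is only an illustrative value.

```python
# 3 hours expressed in milliseconds, the default given for max_hint_window_in_ms.
MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000


def should_store_hint(replica_is_down, response_time_ms, downtime_ms,
                      write_request_timeout_in_ms,
                      max_hint_window_in_ms=MAX_HINT_WINDOW_IN_MS):
    """Return True if this toy co-ordinator would store a hint for the replica.

    `response_time_ms` is None when no response arrived at all.
    """
    if replica_is_down:
        # No new hints once the node has been down longer than the window.
        return downtime_ms <= max_hint_window_in_ms
    # A live replica only gets a hint if it failed to respond in time.
    return response_time_ms is None or response_time_ms > write_request_timeout_in_ms


timeout_ms = 2_000  # illustrative value, not taken from this page

slow_replica = should_store_hint(False, 5_000, 0, timeout_ms)
fast_replica = should_store_hint(False, 10, 0, timeout_ms)
recently_down = should_store_hint(True, None, 60_000, timeout_ms)
long_down = should_store_hint(True, None, 4 * 60 * 60 * 1000, timeout_ms)
```

Here a replica that answers after the timeout, or a node that has been down for one minute, gets a hint; a replica that answers promptly, or a node that has been down for four hours, does not.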
Hinted handoff is enabled and managed by these settings in scylla.yaml:
hinted_handoff_enabled: enables or disables the hinted handoff feature completely or enumerates data centers where hints are allowed. By default, “true” enables hints to all nodes.
max_hint_window_in_ms: do not generate hints if the destination node has been down for more than this value. If a node is down longer than this period, new hints are not created. Hint generation resumes once the destination node is back up. By default, this is set to 3 hours.
hints_directory: the directory where Scylla will store hints. By default this is $SCYLLA_HOME/hints.
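To illustrate how hinted_handoff_enabled can act either as a global switch or as a list of data centers, here is a minimal sketch. It models the setting as a Python bool or a collection of data center names; that representation, the function name `hints_allowed`, and the data center names are assumptions made for this example and say nothing about how scylla.yaml encodes the value.

```python
def hints_allowed(hinted_handoff_enabled, destination_dc):
    """Return True if hints may be stored for a node in `destination_dc`.

    `hinted_handoff_enabled` is modeled as either a bool (global on/off)
    or a collection of data center names; this is an assumption of the sketch.
    """
    if isinstance(hinted_handoff_enabled, bool):
        return hinted_handoff_enabled
    return destination_dc in hinted_handoff_enabled


enabled_everywhere = hints_allowed(True, "dc1")
disabled = hints_allowed(False, "dc1")
only_dc1 = [hints_allowed({"dc1"}, dc) for dc in ("dc1", "dc2")]
```

In this model, a value of True permits hints for every node, False permits none, and a set of names permits hints only for nodes in the listed data centers.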
Storing a hint can also fail. Enabling hinted handoff therefore does not eliminate the need for repair; users must still run a full repair regularly to ensure data consistency across the cluster nodes.
Copyright
© 2016, The Apache Software Foundation.
Apache®, Apache Cassandra®, Cassandra®, the Apache feather logo and the Apache Cassandra® Eye logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.