Extended Raft algorithm with witness support #133
Comments
+1 to witness support. My main concern would be creation of a test plan to ensure correctness. I think the TLC model checker will be crucial here.
Yes, indeed. Besides the algorithm itself, we need tests to ensure the implementation is correct. In our PoC work on etcd, we managed to make all existing tests (e2e, integration, robustness) run on a cluster with a single witness. But additional tests would be needed to cover witness-specific functionality.
Hi @joshuazh-x, great work. What is the latest status of this project? Did you publish your paper? I see that @ZhouJianMS had a draft PR, but it never got merged. I'm interested in distributed systems research, so I wanted to check on the latest status and see whether there are any resources I can follow. Thanks in advance! cc: @serathius in case there is any parallel effort.
We got stuck on merging #226; a review, or help splitting the PR into smaller reviewable chunks, would be appreciated.
The Raft algorithm needs a majority of servers to form a quorum, so tolerating a single server failure takes at least three servers. This isn't an issue for large systems, but it can be challenging for budget-limited customers who need fewer servers [1].
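To make the quorum arithmetic concrete, here is a minimal sketch in Go. The function names `quorumSize` and `tolerableFailures` are illustrative only and are not taken from etcd-io/raft; the sketch assumes plain majority quorums over voting members.

```go
package main

import "fmt"

// quorumSize returns the smallest majority among n voting members.
// (Illustrative helper, not part of any Raft library.)
func quorumSize(n int) int {
	return n/2 + 1
}

// tolerableFailures returns how many voting members can fail
// while a majority of the original n is still available.
func tolerableFailures(n int) int {
	return n - quorumSize(n)
}

func main() {
	for n := 1; n <= 5; n++ {
		fmt.Printf("voters=%d quorum=%d tolerable failures=%d\n",
			n, quorumSize(n), tolerableFailures(n))
	}
}
```

With two voters the quorum is two, so losing either server stalls the cluster; three voters is the smallest configuration that survives one failure.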
Both academia [2, 3] and industry [4, 5, 6] have made efforts to resolve this issue. However, all existing research and implementations for small-scale clusters (with two servers, for example) either depend on another HA solution or require a standalone server as a witness, which adds deployment complexity and a potential performance bottleneck.
I propose the extended Raft algorithm, a variant of Raft designed for clusters with regular servers and a single witness. It minimizes data traffic to the witness and accesses to it while maintaining all key Raft properties. The witness in this algorithm is well suited to implementation as a storage object, with options such as NFS, SMB, or cloud storage.
The extended Raft algorithm is backward compatible with Raft, meaning any cluster running Raft can be seamlessly upgraded to support a witness.
The correctness of the algorithm is established by a formal proof in https://github.com/joshuazh-x/extended-raft-paper. In addition, we validated the formal specification of the algorithm using the TLC model checker.
I look forward to your suggestions and feedback.
@pav-kv @serathius @ahrtr @tbg @Elbehery @erikgrinaker @lemmy
Footnotes
[1] https://github.com/etcd-io/etcd/issues/8934#issuecomment-398175955
[2] Pâris, Jehan-François, and Darrell D. E. Long. "Pirogue, a lighter dynamic version of the Raft distributed consensus algorithm." 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC). IEEE, 2015.
[3] Yadav, Ritwik, and Anirban Rahut. "FlexiRaft: Flexible Quorums with Raft." The Conference on Innovative Data Systems Research (CIDR). 2023.
[4] https://github.com/tikv/tikv
[5] https://platform9.com/blog/transforming-the-edge-platform9-introduces-highly-available-2-node-kubernetes-cluster/
[6] https://www.spectrocloud.com/blog/two-node-edge-kubernetes-clusters-for-ha-and-cost-savings