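Raft is a consensus algorithm that is designed to be easy to understand. It is equivalent to Paxos in fault tolerance and performance. The difference is that it is decomposed into relatively independent subproblems, and it cleanly addresses all the major pieces needed for practical systems. We hope Raft will make consensus accessible to a wider audience, and that this wider audience will be able to build higher-quality consensus-based systems than are available today.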
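Consensus is a fundamental problem in fault-tolerant distributed systems. Consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when a majority of their servers is available. For example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, the cluster stops making progress (but it will never return an incorrect result).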
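The majority arithmetic behind the 5-server example can be sketched as follows. This is a minimal illustration rather than part of any Raft implementation; the function names `majority`, `tolerated_failures`, and `can_make_progress` are made up for this example.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of failed servers that still leaves a majority available."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, failed: int) -> bool:
    """True while the surviving servers still form a majority."""
    return cluster_size - failed >= majority(cluster_size)


if __name__ == "__main__":
    # Print cluster size, majority size, and tolerated failures.
    for n in (3, 4, 5, 7):
        print(n, majority(n), tolerated_failures(n))
```

Under this arithmetic, a 4-server cluster tolerates only one failure, the same as a 3-server cluster, which is one reason odd cluster sizes are a common choice.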
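Consensus typically arises in the context of replicated state machines, a general approach to building fault-tolerant systems. Each server has a state machine and a log. The state machine is the component that we want to make fault-tolerant, such as a hash table. To clients, it appears that they are interacting with a single, reliable state machine, even if a minority of the servers in the cluster fail. Each state machine takes its input commands from its log. In our hash table example, the log would include commands like "set x to 3". A consensus algorithm is used to agree on the commands in the servers' logs. The consensus algorithm must ensure that if any state machine applies "set x to 3" as the nth command, no other state machine will ever apply a different nth command. As a result, each state machine processes the same series of commands and thus produces the same series of results and arrives at the same series of states.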
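The hash table example can be sketched in a few lines. This is only an illustration of the state machine side: it assumes the log has already been agreed on, and it does not implement consensus itself. The names `SetCommand`, `HashTableStateMachine`, and `apply_log` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SetCommand:
    """A log entry of the form 'set key to value'."""
    key: str
    value: int


@dataclass
class HashTableStateMachine:
    """Deterministic state machine: a hash table driven only by its log."""
    table: dict = field(default_factory=dict)
    last_applied: int = 0  # number of log entries applied so far

    def apply_log(self, log):
        """Apply every log entry that has not been applied yet, in order."""
        results = []
        for command in log[self.last_applied:]:
            self.table[command.key] = command.value
            self.last_applied += 1
            results.append((self.last_applied, command.key, command.value))
        return results


# Assumption: a consensus algorithm has already made every server agree on this log.
agreed_log = [SetCommand("x", 3), SetCommand("y", 1), SetCommand("x", 7)]

replicas = [HashTableStateMachine() for _ in range(3)]
for replica in replicas:
    replica.apply_log(agreed_log)

# Same commands in the same order give the same state on every replica.
assert all(replica.table == replicas[0].table for replica in replicas)
```

Because `apply_log` is deterministic and reads nothing but the log, agreeing on the log is enough to keep the replicas identical; that agreement is the part a consensus algorithm such as Raft provides.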
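Below is a Raft cluster running in your browser. You can interact with it to see Raft in action. Five servers are shown on the left, and their logs are shown on the right. We hope to create a screencast soon to explain what is going on. This visualization is still rough around the edges. We suggest getting familiar with the Raft 中文入门原理模拟 (an introductory Raft simulation in Chinese) first, and then coming back to experiment with the simulation below.
Right-click any node in the figure below to stop it, restart it, or send it a request; click a node to view its current state.
The Secret Lives of Data-CN is another visualization, on a Chinese-language site. It is more guided and less interactive, so it may be a gentler introduction.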
This is the Raft paper, which describes Raft in detail: In Search of an Understandable Consensus Algorithm (Extended Version) by Diego Ongaro and John Ousterhout. A slightly shorter version of this paper received a Best Paper Award at the 2014 USENIX Annual Technical Conference.
Diego Ongaro's Ph.D. dissertation expands on the content of the paper in much more detail, and it includes a simpler cluster membership change algorithm.
More Raft-related papers:
Doug Woos, James R. Wilcox, Steve Anton, Zachary Tatlock, Michael D. Ernst, and Thomas Anderson. Planning for Change in a Formal Verification of the Raft Consensus Protocol. Certified Programs and Proofs (CPP), January 2016.
James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. Verdi: A Framework for Implementing and Verifying Distributed Systems. Programming Language Design and Implementation (PLDI), June 2015.
Hugues Evrard and Frédéric Lang. Automatic Distributed Code Generation from Formal Models of Asynchronous Concurrent Processes. Parallel, Distributed, and Network-Based Processing (PDP), March 2015.
Heidi Howard, Malte Schwarzkopf, Anil Madhavapeddy, and Jon Crowcroft. Raft Refloated: Do We Have Consensus? SIGOPS Operating Systems Review, January 2015.
Heidi Howard. ARC: Analysis of Raft Consensus. University of Cambridge, Computer Laboratory, UCAM-CL-TR-857, July 2014.
These talks serve as good introductions to Raft:
Video | YouTube |
Slides | PDF with RaftScope visualization |
Video | YouTube |
Slides | SlideShare |
Video | InfoQ |
Slides | HTML PDF with RaftScope visualization |
Video | Air Mozilla |
Slides | Diego: PDF with RaftScope visualization |
Video | YouTube |
Slides | PDF with RaftScope visualization |
Video | YouTube |
Slides | PDF with RaftScope visualization |
Video | YouTube |
Slides | PDF PPTX with RaftScope visualization |
Video | YouTube (French) |
Slides | Speaker Deck (English) |
Video | USENIX |
Slides | RaftScope visualization |
Video | Ustream |
Slides | PDF PPTX |
Video | YouTube |
Slides | Speaker Deck |
Video | YouTube |
Slides | PDF PPTX |
Video | InfoQ |
Slides | Speaker Deck |
Video | Vimeo |
Slides | Speaker Deck |
Slides | Speaker Deck |
Video (screencast) | YouTube MP4 |
Slides | PDF PPTX |
This is a list of courses that include lectures or programming assignments on Raft. This might be useful for other instructors and for online learners looking for materials. If you know of additional courses, please submit a pull request or an issue to update it.
The best place to ask questions about Raft and its implementations is the raft-dev Google group. Some of the implementations also have their own mailing lists; check their READMEs.
There are many implementations of Raft available in various stages of development. This table lists the implementations we know about with source code available. The most popular and/or recently updated implementations are towards the top. This information will inevitably get out of date; please submit a pull request or an issue to update it.
Stars | Name | Primary Authors | Language | License | Leader Election + Log Replication? | Persistence? | Membership Changes? | Log Compaction? |
---|---|---|---|---|---|---|---|---|