Algebraic geometry and distributed systems were two of my favourite subjects when I was an undergrad. These two worlds don’t usually overlap. But they do share the “same” mathematical structures. I realized this when I was TAing for the distributed systems class at Stanford and finally decided to write it down. The upshot is that ensuring consistency in a consensus protocol amounts to checking a local-to-global principle, which underlies modern algebraic geometry. At the end, I’ll discuss some wishful thinking that might be pertinent to the correctness verification and performance optimization of distributed protocols.
The idea is rather simple once one internalizes the local-to-global principle and recognizes the familiar quorum system in distributed computing as one instance of it. Unfortunately, discussing these ideas formally requires quite a lot of set-up. Readers comfortable with the background can go straight to the punchline in the last section on distributed computing, where I make this observation explicit.
I hope the mathematically sophisticated readers won’t take offense at the “abstract nonsense” that I’m about to unleash, and that the system folks will pardon me for writing with strange symbols like .
Background
In distributed computing, a famous problem is how to achieve consensus, i.e. how to make nodes in a distributed system agree on a single thing despite all the network failures, hardware failures, and possibly malicious nodes themselves. The consensus problem has an (in)famous history due to the first solution (in the partition fault tolerance sense) from Leslie Lamport, who presented the solution in the Paxos paper about an ancient civilization in the Mediterranean Sea and gave talks on it dressed up as Indiana Jones… It took Microsoft a while to figure out its researcher solved a billion dollar problem that plagues their systems, after several ad hoc (i.e. broken, incorrect) engineering attempts.
Algebraic geometry is equally confusing to most folks. One might think of algebraic geometry as the study of polynomial systems in somewhat geometric ways, but that does massive simplification. The subject sounds esoteric but is relevant (sometimes central) to lots of modern sciences, such as cryptography, condensed matter physics, and quantum field theory.
Local-to-global
The key concept that relates the two subjects is the local-to-global principle.
A major part of modern algebraic geometry concerns a somewhat naive question: what is a space? We all have intuitive understanding of what geometry should look like. But like many things in life, a space is something easy to see but very hard to define. For example, most STEM people recognize the interpretation of high dimensional Euclidean space as some kind of coordinate system that you can describe with linear algebra. In this post, we introduce a different notion:
A geometric space is the recipe of how local data “glues” into global data.
A mental picture is worth more than a thousand words. Start with a rectangular piece of paper and draw arrows along opposite sides. If you paste the paper along the arrows pointing in the same direction, you get a normal band. If you reverse the direction, you get a Möbius strip.

These two bands are not the same geometric object. For example, you can trace a connecting loop on the Möbius strip covering both sides without lifting the pen (try it!), but you can’t do that for the cylinder strip. But locally they look the same. If one zooms in on any given point, it all looks the same flat two dimensional space. Imagine some strange tribes living on the Möbius strip yet not advanced enough to venture far on the surface, they can’t distinguish the world they live in from a cylinder band.
Back to algebraic geometry. The conceptual leap starts with two somewhat simple observations:
- Instead of geometry itself, one should study relations built on top of it, such as how the geometric spaces transform under maps.
- One can determine global maps by understanding how they behave locally, as long as there’s a coherent way to reconstruct the global transformation from local pieces.
I always think of point one using linguistic analogies. Concepts such as “lover” and “child” do not exist on their own, but encode presumptions about existing relationships in the subtext. If you fancy modernist literature, this relates to Derrida’s concept of différance, where meanings arise simultaneously via the relationships between words. This sounds abstract but it’s not. In practice, one gets a ton more insight by studying how algebraic structures transform instead of looking at objects alone. A classic example is Grothendieck’s generalization of Riemann-Roch in the relative setting.
The second point is more subtle. It relates to the notion of a “sheaf”, which I’ll discuss in the next section.
Category theory
To define the local-to-global principle requires category theory. Let’s do a brief recap.
A category consists of a collection of objects . For objects and in , there is a morphism set 1. The morphisms satisfy the composition rule: for morphisms and , the composition belongs in the morphism set . There are identity morphisms, which generalize the usual identity homomorphisms in groups, rings, modules, etc.
Category theory is the language that describes how generic mathematical structures are preserved under transformation. Consider two categories and . A functor is a structure preserving map in the sense it maps morphisms to morphisms and preserves identity morphisms and composition. For example, given a morphism in the category , its image under (assumed to be covariant) is a morphism . The functor is contravariant if it reverses the composition, i.e. has the type signature . To make it concrete, the study of algebraic topology assigns algebraic invariants to continuous shapes and studies how continuous transformations induce algebraic maps on the algebraic invariants extractable from the shapes. In programming language, the objects often encapsulate things like mutable states and types. For example, a monad abstracts how states and side effects associated with one data type get transformed to downstream data types. The description of monads below is perhaps more familiar to computer scientists.
-- Monad-- Core operationsreturn :: a -> m a(>>=) :: m a -> (a -> m b) -> m b-- Monad laws-- Left identityreturn a >>= f = f a-- Right identitym >>= return = m-- Associativity: sequencing order of binds is coherent(m >>= f) >>= g = m >>= (\x -> f x >>= g)
1 In general, one does not need to require that the collection of morphisms form a set, as in the case of higher categories. This is far beyond our needs in the post so we work with the more familiar notion. ←
Presheaf and sheaf
Now, we move one step closer to understanding geometry. We start with a prototype, which allows us to redefine a space — presheaf.
Definition (Presheaf). A presheaf on a category is a contravariant functor .
This definition is dense. Let’s unpack it. The category consists of sets as objects 1, and the morphisms are the usual set-theoretic maps. We illustrate this with the standard example of a topological space. Recall that a topological space is endowed with a collection of open subsets. For example, if is a subset of a Euclidean space, then each open set of , in the subspace topology, is the intersection of with the union of a collection of open balls in the standard Euclidean space. The standard set-theoretic inclusion induces a partial order on the open sets. We define the category such that the objects consist of open sets in and the morphism sets consist of the partial ordering from inclusion, i.e. if and only if as a subset.
Given an open set , we associate with the collection of functions defined on it. We can take the functions to be real valued and define it as
This captures the fact that we are trying to understand by understanding how it maps to real numbers, or the Yoneda lemma if you like category theory (or simply consider a local coordinate chart if you are fond of differential geometry). Then, consider , notice that for any , the restriction of on is an element of . This gives us a contravariant functor ! A presheaf on .
Intuitively, a presheaf captures local compatibility by zooming in onto ever smaller neighborhoods. We generalize the usual set-theoretic inclusion relations for open sets by the abstract arrows in the category . The payoff is that one can thus talk about “space” defined on objects that look nothing like a geometric space.
A presheaf allows one to talk about geometry locally. For global geometry, we need a local-to-global compatibility principle. For , the sheaf condition on an open cover is the equalizer exact sequence
This just says that any element of can be described as a product of elements in satisfying compatible conditions on overlaps. Specifically, we can express an element uniquely as . For two indexed open sets and , their intersection is endowed with two inclusion maps and . The sheaf condition says that .
Stalks
One way to think about the sheaf condition is via stalks. A stalk captures local information at a point. Specifically, for and a sheaf (working in the topological category as before), its stalk is a “direct limit” of , where contains . This means that we identify two elements and if and maps to under the morphism induced by . For a concrete example, consider the ReLU function and the identically zero function: they look the same when one zooms in around negative numbers as both are identically zero nearby, but not at zero and positive numbers.
Another way to interpret the sheaf condition is that to specify an element in is to specify a collection of elements at stalks , where . Here, “compatible” means that we can “lift” the elements at stalk to some element of , such that , in a way that satisfies the sheaf condition described above. In general, one works beyond merely the topological category for more flexibility and power (one works with the so-called Grothendieck topology, for example), but we don’t develop that generality here. We shall see that the stalk perspective developed here already generalizes beyond topological spaces and appears naturally in distributed computing.
1 A careful reader will notice that I used the word “collection” in the definition of a category to define its objects. This is standard to avoid set-theoretic issues. For example, there is no set that is the “set” of all sets. Readers interested in logic can look up the definition of a “universe”.←
Distributed systems
One characteristic of distributed systems is that faults that occur on a remote machine can affect the behavior of other machines in the networked system. Reasoning about the correct behavior of a distributed system is no amateur task and is error-prone. The main problem is that there is no notion of centralized time, so in particular different nodes cannot sync with each other based on reading local timestamps. Lamport laid out the right language to discuss patterns in a distributed system in his seminal paper Time, Clocks, and the Ordering of Events in a Distributed System (Lamport 1978). The central observation is that even though there is not a single global time, one can define a partial order, the so-called “event ordering”, to formalize sequences in a distributed system that encapsulate causal relations.
We give an example in the more familiar language of a micro-service architecture. Consider a merchandise ordering service. When a user submits an order, the following (in a somewhat simplified procedure) happens:
- Order creation: the vendor creates the order after performing basic checks, such as inventory availability, etc
- Database update: the relevant inventory and user order database records are updated.
- Backend logging: the backend updates metrics, such as user statistics and merchandise trends, etc.
- User notification: renders front-end updates and notifies the user.
The point is that there are causal relations among these steps. Steps 2 and 3 must happen after step 1, and they may or may not happen before step 4. (Step 2 needs to happen eventually for consistency — the so-called eventual consistency model.) There is no partial ordering relationship in general between step 2 and step 3, which can occur simultaneously as they are logically independent.
In practice, the services actually processing database and backend updates are located on remote machines, and information flows via the message passing model through remote procedure calls (RPC). Concretely, Lamport’s event ordering model says that these messages exchanged by machines in a distributed system decide event sequencing.
Following common practices in avant-garde literature and certain fields of mathematics, Lamport left the most consequential implication of the event ordering model, implicit in his seminal paper, as an exercise to the curious, voluntary reader:
One can use the event ordering to implement a distributed computing protocol as a state machine.
Consensus
Now back to consensus after all the background and detour. At a high level, the consensus problem asks whether it is possible to design a protocol in which remote nodes cooperate in a distributed system to decide on a single value. A celebrated FLP impossibility result (Fischer et al. 1985) shows that there does not exist a deterministic algorithm that achieves all of consistency, partition fault tolerance, and liveness. The proof structure is modeled after state transitions that really illustrate Lamport’s event ordering principle.
In practice, consensus algorithms, such as Paxos and Raft, trade liveness for consistency and partition fault tolerance. One can extend these algorithms to support resistance towards malicious nodes (Byzantine failures). A central idea that underlies all consensus algorithm design is the notion of a quorum system (Malkhi and Reiter 1999). The idea is that in a quorum system, nodes exchange messages until each node has heard back from a quorum slice, which abstracts the notion of trust from a local node’s perspective, and each node in a quorum system has an associated quorum slice contained in it. Intuitively, a quorum system encapsulates the union of all local trust levels assumed by the nodes.
Of course, the precise definitions and implementation in concrete protocols require additional formalism that is too much to fit into the margin of this post. But the idea is that one can think about quorum systems as resembling the open sets in some topology. This is not just an analogy. The notion is formalized via the concept of semitopology (see e.g. this paper (Gabbay 2023)). The upshot is that the notion of stalks from taking direct limits still makes perfect sense, where one thinks of the sets in the sheaf associated with some quorum system as the possible state transitions partially indexed by message passing carried with additional properties, such as whether or not the state transitions imply that the nodes in the quorum will diverge or converge on proposed or committed values, etc. And one can translate the more standard quorum intertwining and consistency conditions into the sheaf condition language. So far we have only discussed the partition fault tolerance setting, but one easily generalizes to Byzantine faults by working with some kind of “subspace topology” from restricting to non-Byzantine nodes and imposing coherence conditions there, similar in standard Byzantine quorum intertwining.
We give a few examples that illustrate how the quorum aspects are implemented in practice in the etcd Raft codebase. The below two code snippets demonstrate the logic of voting for a value after a quorum of nodes ack.
func (p *ProgressTracker) TallyVotes() (granted int, rejected int, _ quorum.VoteResult) { // ... count granted/rejected from voters ... result := p.Voters.VoteResult(p.Votes) return granted, rejected, result}
func (c MajorityConfig) VoteResult(votes map[uint64]bool) VoteResult { // ... q := len(c)/2 + 1 if votedCnt >= q { return VoteWon } ...}
The below two code snippets demonstrate the logic of committing to a value after a quorum of nodes ack.
if pr.MaybeUpdate(m.GetIndex()) || (pr.Match == m.GetIndex() && pr.State == tracker.StateProbe) { switch { case pr.State == tracker.StateProbe: pr.BecomeReplicate() case pr.State == tracker.StateSnapshot && pr.Match+1 >= r.raftLog.firstIndex(): // ... pr.BecomeProbe() pr.BecomeReplicate() case pr.State == tracker.StateReplicate: pr.Inflights.FreeLE(m.GetIndex()) } if r.maybeCommit() { // committed index has progressed for the term, so it is safe // to respond to pending read index requests releasePendingReadIndexMessages(r) } ...}
This code snippet below updates the quorum tracker.
// AckedIndex implements IndexLookuper.func (l matchAckIndexer) AckedIndex(id uint64) (quorum.Index, bool) { pr, ok := l[id] if !ok { return 0, false } return quorum.Index(pr.Match), true}// Committed returns the largest log index known to be committed based on what// the voting members of the group have acknowledged.func (p *ProgressTracker) Committed() uint64 { return uint64(p.Voters.CommittedIndex(matchAckIndexer(p.Progress)))}
Practical relevance?
So far, all the discussion has been on the theory side. I hope this provides an adjacent perspective on thinking about consensus, in addition to digging up scrolls about an ancient civilization (Lamport 1998) in a filing cabinet, or identifying the ambiguity in Paxos Made Simple (Lamport 2001) which Lamport refused to reveal, or trying to parse the TLA+ specs from Paxos and TLA+ notes (Lamport 2019).
In practice, almost all the complexity in optimizing consensus protocols comes from identifying exploitable partial-ordering structures that arise naturally in applications. For example, Google Spanner (Corbett et al. 2012) goes beyond the FLP impossibility setting by assuming an approximately synchronized global clock. The special structure of network ordering (Li et al. 2016) allows one to reduce Paxos overhead in data centers. Special architectures and RDMA also allow one to redesign state machine replication protocols (Aguilera et al. 2020) for consensus at microsecond scale (compare, for example, with some of what Jane Street (State Machine Replication and Why You Should Care 2024) discusses in practice). It is interesting if some of these perspectives simplify and streamline protocol optimization and enable modular methods for dynamic and static verification of distributed systems protocol implementation.
Aguilera, Marcos K., Naama Ben-David, Rachid Guerraoui, Virendra J. Marathe, Athanasios Xygkis, and Igor Zablotchi. 2020. “Microsecond Consensus for Microsecond Applications.”14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), 599–616. https://www.usenix.org/system/files/osdi20-aguilera.pdf.
Corbett, James C., Jeffrey Dean, Michael Epstein, et al. 2012. “Spanner: Google’s Globally-Distributed Database.”10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), 251–64. https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett.
Fischer, Michael J., Nancy A. Lynch, and Michael S. Paterson. 1985. “Impossibility of Distributed Consensus with One Faulty Process.”Journal of the ACM. https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf.
Gabbay, Murdoch J. 2023. Semitopology: A Topological Approach to Decentralised Collaborative Action. https://arxiv.org/pdf/2303.09287.
Lamport, Leslie. 1978. “Time, Clocks, and the Ordering of Events in a Distributed System.”Communications of the ACM 21 (7): 558–65. https://doi.org/10.1145/359545.359563.
Lamport, Leslie. 1998. “The Part-Time Parliament.”ACM Transactions on Computer Systems 16 (2): 133–69. https://doi.org/10.1145/279227.279229.
Lamport, Leslie. 2001. “Paxos Made Simple.”ACM SIGACT News 32 (4): 51–58. https://lamport.azurewebsites.net/pubs/paxos-simple.pdf.
Lamport, Leslie. 2019. The Paxos Algorithm or How to Win a Turing Award. https://lamport.azurewebsites.net/tla/paxos-algorithm.html.
Li, Jialin, Ellis Michael, Naveen Kr. Sharma, Adriana Szekeres, and Dan R. K. Ports. 2016. “Just Say NO to Paxos Overhead: Replacing Consensus with Network Ordering.”12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 467–83. https://www.usenix.org/conference/osdi16/technical-sessions/presentation/li.
Malkhi, Dahlia, and Michael Reiter. 1999. Byzantine Quorum Systems. https://malkhi.com/files/Quorums1999.pdf.
State Machine Replication and Why You Should Care. 2024. Signals and Threads. https://signalsandthreads.com/state-machine-replication-and-why-you-should-care/.
Leave a Reply