Decentralized systems are currently poorly defined and often misunderstood.
Bitcoin and other crypto networks currently form a $1.5 trillion market. Technology markets of that size usually have solid engineering foundations and clearly defined technical concepts. The crypto industry is unusual in this regard: its capital markets are far ahead of its engineering maturity.
In this post, I will:
- iteratively build up a clear definition of decentralized systems,
- take a deeper dive into a key concept of decentralized systems: globally shared state, and
- discuss how modern decentralized systems make design tradeoffs around shared state.
Design tradeoffs involving global state lead to radically different designs for decentralized systems, making it an important concept to understand.
First, let's define the basics.
Definition of Decentralized Systems: In computer science, distributed systems is the subfield that studies networked systems [1]. It's best to think of distributed systems as a broad term with several subsets. Any computer system in which two or more computers interact with each other to accomplish a task is a distributed system.
Some people try to differentiate distributed systems from decentralized systems, which I find confusing. Decentralized systems are a subset of distributed systems. The subset lens gives us an initial definition:
Iteration 1: Basic Definition
"A decentralized system is a type of distributed system where no single entity has majority administrative control."
This definition, however, is not complete. There are other types of distributed systems that lack central points of control, for example, peer-to-peer systems. In peer-to-peer systems, all nodes generally perform similar functions, and they self-organize without relying on centralized controllers.
What's the difference between decentralized systems and peer-to-peer systems then? The answer is globally shared state: the set of information that distrusting nodes on the network need to agree on and can modify independently.
Historically, most peer-to-peer systems do not have global, mutable state, and are in fact designed to operate without it. This is because more often than not, different nodes of the peer network will have different views of the state of the network.
Modern decentralized systems like Bitcoin, Ethereum, and Stacks do have global mutable state.
I'd argue that consensus on globally shared state is implicitly a fundamental property of modern blockchain-based decentralized systems. How useful would Bitcoin be if there were no global consensus on the balance of a Bitcoin address?
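To make the question above concrete, here is a minimal Python sketch. It is not Bitcoin's actual transaction model: the `apply_log` helper, the account names, and the rule of silently skipping overspends are all assumptions made for illustration. It shows that nodes replaying the same ordered log derive the same balances, while a node that sees the same transactions in a different order can disagree about who was paid.

```python
from typing import Dict, List, Tuple

# A transaction is (sender, receiver, amount). Hypothetical format for illustration.
Tx = Tuple[str, str, int]


def apply_log(initial: Dict[str, int], log: List[Tx]) -> Dict[str, int]:
    """Replay an ordered transaction log and return the resulting balances."""
    balances = dict(initial)
    for sender, receiver, amount in log:
        if balances.get(sender, 0) < amount:
            continue  # assumed rule: skip any transaction the sender cannot cover
        balances[sender] -= amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return balances


initial = {"alice": 10, "bob": 0, "carol": 0}
tx_to_bob: Tx = ("alice", "bob", 10)
tx_to_carol: Tx = ("alice", "carol", 10)  # conflicts with tx_to_bob

# Nodes 1 and 3 agree on the order; node 2 saw the transactions reversed.
node_1 = apply_log(initial, [tx_to_bob, tx_to_carol])
node_2 = apply_log(initial, [tx_to_carol, tx_to_bob])
node_3 = apply_log(initial, [tx_to_bob, tx_to_carol])

print("node 1 and node 3 agree:", node_1 == node_3)
print("node 1 and node 2 agree:", node_1 == node_2)
```

Without agreement on the order, node 1 believes Bob was paid and node 2 believes Carol was paid, so neither balance can be relied on.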
Let's add this to the definition.
Iteration 2: With Globally Shared State
"A decentralized system is a type of distributed system that manages globally shared state without any single entity having majority administrative control."
It's important to differentiate between a system or protocol being decentralized (i.e., having the property of decentralization) and a decentralized system as defined here. Consider a message embedded in the Bitcoin blockchain. Anyone in the world can agree on the global state of this message: when it was sent, who sent it, and what it contains.
Now consider an alternative decentralized messaging system, say a peer-to-peer chat where nodes broadcast messages to each other. Participants can receive messages in a decentralized way, but agreeing on globally shared state is out of scope for such a system because it has no global consensus protocol. Questions like "did node A and node B receive the same message?" and "did message M come before message N?" are hard to answer without one.
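The ordering problem can be sketched in a few lines of Python. The `ChatPeer` class and the message strings are hypothetical, and the digest comparison is just one assumed way for peers to notice that their histories differ; it does not resolve the disagreement, which is what a consensus protocol would be for.

```python
import hashlib
from typing import List


class ChatPeer:
    """Hypothetical chat peer that only records what it happens to receive."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: List[str] = []

    def receive(self, message: str) -> None:
        self.inbox.append(message)

    def history_digest(self) -> str:
        # Hash of the local, ordered history. Peers could exchange this
        # value to detect (but not fix) diverging views.
        joined = "\n".join(self.inbox).encode("utf-8")
        return hashlib.sha256(joined).hexdigest()


peer_a = ChatPeer("A")
peer_b = ChatPeer("B")

# Two broadcasts race through the network and arrive in different orders.
for message in ["M: meet at noon", "N: meeting cancelled"]:
    peer_a.receive(message)
for message in ["N: meeting cancelled", "M: meet at noon"]:
    peer_b.receive(message)

same_messages = sorted(peer_a.inbox) == sorted(peer_b.inbox)
same_history = peer_a.history_digest() == peer_b.history_digest()
print("same set of messages:", same_messages)
print("same ordered history:", same_history)
```

Both peers hold the same two messages, yet their ordered histories differ, so they would give different answers to "did M come before N?".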
Decentralization is a property that can exist without consensus protocols, but it's more meaningful in systems that have globally shared state. In the crypto industry at least, the term "decentralized system" implicitly means a decentralized system with globally shared state.
Globally shared state can also be managed by a closed group of nodes, so let's look at another revision that accounts for open versus closed systems. In an open-membership system, any new node can join the consensus algorithm.
Iteration 3: With Open Membership
"A decentralized system is a type of open distributed system that manages globally shared state without any single entity having majority administrative control."
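The open versus closed distinction can be illustrated with a small Python sketch. The class names and node identifiers are hypothetical, and the open variant deliberately omits the Sybil-resistance mechanism (for example proof-of-work or staking) that real open networks pair with open membership.

```python
from typing import FrozenSet, Set


class ClosedMembership:
    """Federated style: only pre-approved nodes may join consensus."""

    def __init__(self, approved: FrozenSet[str]) -> None:
        self.approved = approved
        self.members: Set[str] = set()

    def join(self, node_id: str) -> bool:
        if node_id not in self.approved:
            return False  # some authority controls the approved list
        self.members.add(node_id)
        return True


class OpenMembership:
    """Open style: any node may join without asking permission."""

    def __init__(self) -> None:
        self.members: Set[str] = set()

    def join(self, node_id: str) -> bool:
        # Assumption: Sybil resistance is handled elsewhere and is omitted here.
        self.members.add(node_id)
        return True


federated = ClosedMembership(frozenset({"org-1", "org-2"}))
open_network = OpenMembership()

federated_result = federated.join("newcomer")
open_result = open_network.join("newcomer")
print("federated accepts newcomer:", federated_result)
print("open network accepts newcomer:", open_result)
```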
This gives us a complete, specific definition.
One interesting aspect of this definition is that it excludes federated systems like Diem (formerly Libra) from being called "decentralized systems". I believe that open membership is a fundamental property of modern decentralized systems. Instead of creating confusion around "private blockchains", it's best to use a clear definition of decentralized systems and classify private/federated systems as federated systems.
These definitions are important because without them we cannot meaningfully compare the design space and pros and cons of different approaches.
Design Tradeoffs in Decentralized Systems: Without globally shared state it's not possible to build systems like Bitcoin where distrusting nodes agree on Bitcoin balances of different owners.
There are several design tradeoffs for how a decentralized system handles globally shared state. There can certainly be decentralization without any globally shared state but we're excluding these systems from our definition and categorizing them as peer-to-peer systems instead.
There is a fundamental tradeoff between globally shared state and scalability. The more global state updates a system needs to maintain, the harder it generally is to scale out that system.
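A rough way to see this tradeoff is a back-of-the-envelope model in Python. It assumes that every node must apply every global state update and it ignores networking and consensus overhead; the function names and capacity numbers are hypothetical.

```python
from typing import List


def replicated_throughput(node_capacities: List[int]) -> int:
    """Every node applies every update, so the slowest node sets the ceiling."""
    return min(node_capacities)


def independent_throughput(node_capacities: List[int]) -> int:
    """If updates touched only local state, nodes could work independently."""
    return sum(node_capacities)


# Updates per second that each hypothetical node can apply.
capacities = [100, 100, 100, 20]

global_limit = replicated_throughput(capacities)
local_limit = independent_throughput(capacities)

# Adding ten more fast nodes adds replicas, not capacity, for global state.
grown = capacities + [100] * 10
global_limit_after_growth = replicated_throughput(grown)

print("global state limit:", global_limit)
print("independent state limit:", local_limit)
print("global state limit after adding nodes:", global_limit_after_growth)
```

Under these assumptions, the ceiling for global updates stays at the capacity of the weakest node that the network wants to keep, no matter how many nodes join.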
Maximizing globally shared state: Bitcoin Cash and Ethereum are examples of modern decentralized systems that choose to maintain a large amount of globally shared state and the corresponding state updates. Bitcoin Cash designers generally believe in bigger blocks and more transactions, i.e., more updates to shared state at the base layer. By contrast, Bitcoin Core designers generally believe that various limitations at the base layer put hard limits on the globally shared state that can be maintained, and recommend solutions like Lightning (layer-2) that might be better suited for most state updates.
Both Ethereum's original design and the research on Eth 2.0 are in the camp of high-volume globally shared state. Eth 2.0 aims to solve the scalability issues associated with state updates through "sharding", which hopes to reduce the globally shared state by dividing it into shards, but it introduces additional (global) coordination state.
Maximizing globally shared state is analogous to users depending on a giant mainframe that stores all their data, updates that data, and performs all their computations. That mainframe will need to be very powerful and store a lot of global information. Further, the mainframe can only perform user computations serially (one after the other) vs users performing computations in parallel on their own machines.
Minimizing globally shared state: Modern decentralized systems that opt for less globally shared state include Bitcoin Core (main Bitcoin) and Stacks. These systems generally follow the design principle that shared state at the base layer should be as minimal as possible and should accommodate all kinds of node hardware and low-bandwidth last-mile networks. These systems argue for keeping as much data outside of the blockchain as possible. On Bitcoin Core, Lightning is a scalability solution that minimizes global state. In Stacks, most apps run locally (not at the blockchain layer), and only a subset of programs are written as Clarity smart contracts. Storage generally happens off-chain as well.
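The idea of keeping most updates away from the base layer can be sketched as follows. This is a toy model and not the Lightning protocol: `ToyBaseLayer`, `ToyChannel`, and the balances are hypothetical, and signatures, channel funding, and dispute handling are all left out. It only shows many off-chain payments being netted into a single update to shared state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyBaseLayer:
    """Stand-in for globally shared state that every node would replicate."""
    balances: Dict[str, int]
    state_updates: int = 0

    def settle(self, net_deltas: Dict[str, int]) -> None:
        if sum(net_deltas.values()) != 0:
            raise ValueError("settlement must not create or destroy funds")
        for party, delta in net_deltas.items():
            if self.balances.get(party, 0) + delta < 0:
                raise ValueError(f"{party} cannot cover the settlement")
        for party, delta in net_deltas.items():
            self.balances[party] = self.balances.get(party, 0) + delta
        self.state_updates += 1  # one global update, however many payments


@dataclass
class ToyChannel:
    """Off-chain ledger kept only by the two participants."""
    net_deltas: Dict[str, int] = field(default_factory=dict)
    payments: int = 0

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        self.net_deltas[sender] = self.net_deltas.get(sender, 0) - amount
        self.net_deltas[receiver] = self.net_deltas.get(receiver, 0) + amount
        self.payments += 1


base_layer = ToyBaseLayer({"alice": 100, "bob": 100})
channel = ToyChannel()

for _ in range(50):
    channel.pay("alice", "bob", 1)
for _ in range(20):
    channel.pay("bob", "alice", 2)

base_layer.settle(channel.net_deltas)
print("off-chain payments:", channel.payments)
print("global state updates:", base_layer.state_updates)
print("final balances:", base_layer.balances)
```

In this sketch, seventy payments change the shared state only once; the rest of the activity stays between the two participants.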
Open Dialogue: Technical design discussions in the crypto industry take on a whole new meaning. Given participants' financial stakes, technical arguments get clouded by "tribal" affiliations. The underlying scientific principles often get distorted or ignored in "pseudo-technical" discussions between crypto tribes.
I hope that the crypto industry can agree on standard terminology and collectively better understand the design tradeoffs that various projects make.
Only time will tell which design decisions end up being more successful. Right now, what the crypto industry needs is opening up a friendly dialogue about different design choices and sharing lessons and experiences; together, we can build a better future.
Comments? Tweet them @muneeb.
Thanks to Jude Nelson, Patrick Stanley, and Aaron Blankstein for reading drafts.
Footnote 1: Research universities often organize systems and networking together given their significant overlap (see Princeton and Stanford). Systems research has roots in operating systems while networking has roots in communication protocols. With modern systems/networks the line between them is blurry. Distributed systems is a broad term to capture a wide body of research in systems and networks. SOSP and NSDI are great examples of research venues where high-quality distributed systems research is published.
Disclaimer: This essay reflects my personal views and not the views of my employer.