The Trailing Finality Layer: A stepping stone to proof of stake in Zcash
At Electric Coin Co. (ECC) we’re exploring a transition in Zcash from the current proof-of-work (PoW) consensus to a proof-of-stake (PoS) consensus. We are proposing a step on this path that we call the Trailing Finality Layer (TFL). If deployed, this would be combined with Zcash’s existing consensus; the resulting consensus protocol at that point would be a hybrid of PoW and PoS.
The overall goal is to enable finality and PoS on Zcash in a minimally disruptive manner. Finality is a guarantee that once a block is finalized, that block and the transactions it contains cannot be rolled back. Finality can reduce delays for some use cases (such as centralized exchange deposit wait times) and enable new improvements such as safer cross-chain bridges.
If the TFL approach is adopted by the Zcash community, it could enable some new use cases, such as staking ZEC to earn protocol rewards, while minimizing disruption to existing use cases. Mining is an example of a process that would be impacted in a hybrid model, as mining rewards would be reduced while the rest of mining infrastructure and processes would remain unchanged.
We also aim to minimize disruption to analysis of consensus security, as many of the existing consensus properties remain intact in a hybrid model.
We’ve only begun to define the design of this PoW/PoS hybrid protocol. Many key details remain open questions, as we expand on below. By sharing our approach early in this process, we’re aiming to gather and incorporate feedback as we go, find potential collaborators, and stimulate discussion about this approach.
If you’re interested in providing feedback or collaborating on this project, please get in touch! A good opportunity to learn more and join the conversation is to attend (in person or virtually) the Interactive Design of a Zcash Trailing Finality Layer workshop I’m leading at Zcon4. Also feel free to email me, [email protected], about your interest. We’re looking for contributors from a wide range of backgrounds, including technical, product, community, and any Zcash users who want to weigh in as the proposal evolves.
PoS transition background
ECC previously shared our rationale for why we believe it’s in the best interest of current and future ZEC users to transition the protocol to proof of stake in the blog post Should Zcash transition from Proof of Work to Proof of Stake? and the Zcon3 presentation Motivations of Proof of Stake. In 2022, we published a high-level overview of our Proof-of-Stake Research approach, a more detailed Approach, Focus, and Next Steps companion post, and gave a Zcon3 presentation about high-level design challenges in Proof of Stake.
A proof-of-stake transition path
Our vision for a transition to proof of stake includes at least two major milestones:
- Moving Zcash from its current proof-of-work model to a hybrid PoW/PoS system.
- Moving Zcash from a hybrid PoW/PoS system to pure PoS.
Our primary motivation for proposing (at least) two steps is to minimize disruption of usability, safety, security, and the ecosystem during each step.
Design goals for a hybrid PoW/PoS system
ECC is refining the design of TFL with several goals in mind, and we’ll potentially be adding more as we continue to develop this proposal. Currently:
- We want to minimize disruption to existing wallet use cases and UX. For example, nothing should change in the user flows for storing or transferring funds, the format of addresses, etc.
- We want to minimize complexity of security analyses by preserving existing analysis results where possible.
- We want to enable new PoS use cases that allow mobile shielded wallet users to earn a return on delegated ZEC.
- We want to enable trust-minimized bridges and other benefits by providing a protocol with finality.
- We want to improve the modularity of the consensus protocol. Modularity has several loosely defined and related meanings, e.g., it’s possible to understand some consensus properties only given knowledge of a component of the protocol, and it’s possible to implement consensus rules in modular code components with clean interfaces.
Trailing Finality Layer in a nutshell
The hybrid PoW/PoS protocol we envision at ECC is structured like today’s Zcash NU5 protocol with a new Trailing Finality Layer. We describe it as a layer, because the existing nodes and most of their logic will continue to operate largely as-is with minimal changes, while much of the new functionality will be provided by new, supplementary components and network protocols.
The left diagram shows the current Zcash network, with a detail that illustrates how two nodes are connected to each other within the context of the entire network. The right shows the addition of the TFL after deployment: Each node continues to have its original PoW component, but now has an additional TFL component. The PoW components still connect to each other, as before, and TFL components use distinct connections to other TFL components.
This new layer provides the blockchain with a trailing finality guarantee: after blocks are mined, they can be finalized, meaning they cannot be rolled back. This guarantee extends to all of the transactions within those blocks. It is trailing because this finality property follows the PoW mining system, “trailing behind it.”
Because this hybrid design relies fully on PoW for producing new blocks, this protocol is resistant to halting in the same way Bitcoin or current Zcash is, although the finality guarantee may stall, as we describe next. This design paradigm has both a theoretical and practical track record: It is analyzed in a research paper, Ebb-and-Flow Protocols, and it is the same paradigm used by Ethereum both in the pre-Merge hybrid design of the Beacon chain and in current-day Ethereum.
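To make the “trailing” idea concrete, here is a minimal toy sketch in Python. It is not the TFL design, which is still open. The class and method names (`ToyNode`, `mine`, `finalize`, `try_reorg`) are hypothetical, and chain length stands in for accumulated work. The sketch only illustrates the rule that the PoW chain may still reorganize above the finalized block, but never below it.

```python
class ToyNode:
    """Toy model: a PoW best chain plus a trailing finalized height.

    Hypothetical illustration only; not the actual TFL design.
    """

    def __init__(self, genesis="genesis"):
        self.chain = [genesis]       # best PoW chain; list index == block height
        self.finalized_height = 0    # genesis is treated as final

    def mine(self, block_id):
        """PoW keeps extending the tip regardless of finality progress."""
        self.chain.append(block_id)

    def finalize(self, height):
        """The finality layer marks a block on the best chain as final."""
        if not (self.finalized_height <= height < len(self.chain)):
            raise ValueError("can only finalize forward, on the best chain")
        self.finalized_height = height

    def try_reorg(self, fork_height, new_blocks):
        """Try to replace every block above fork_height with new_blocks.

        Accepted only if the fork point is at or above the finalized block
        and the competing branch is longer (a stand-in for "more work").
        """
        if fork_height < self.finalized_height:
            return False  # would roll back a finalized block
        old_branch_len = len(self.chain) - 1 - fork_height
        if len(new_blocks) <= old_branch_len:
            return False  # competing branch is not better
        self.chain = self.chain[: fork_height + 1] + list(new_blocks)
        return True


node = ToyNode()
for i in range(1, 6):
    node.mine(f"a{i}")
node.finalize(3)

print(node.try_reorg(4, ["b5", "b6"]))              # fork above the finalized block
print(node.try_reorg(2, ["c3", "c4", "c5", "c6"]))  # fork below the finalized block
```

In this toy model the first reorg is accepted and the second is rejected, even though the second branch is longer, because it would revert the finalized block at height 3.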
Why finality matters
Nakamoto PoW consensus, introduced with Bitcoin and inherited by Zcash, provides probabilistic finality. This means the chance that a block can be rolled back falls as more blocks are mined.
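As a rough illustration of how that probability falls, the sketch below implements the simplified attacker model from the Bitcoin whitepaper: an attacker with a fraction q of the hash power tries to overtake a block buried under z confirmations. This is an assumption chosen for illustration, not a security analysis of Zcash, and the function name `rollback_probability` is hypothetical.

```python
import math


def rollback_probability(q, z):
    """Chance that an attacker with hash-power share q eventually overtakes
    a block that has z confirmations (Bitcoin whitepaper's simplified model).
    """
    if not 0.0 < q < 0.5:
        raise ValueError("model assumes a minority attacker: 0 < q < 0.5")
    p = 1.0 - q                      # honest share of hash power
    lam = z * (q / p)                # expected attacker progress (Poisson mean)
    total = 0.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        # Attacker has mined k blocks and must still close a gap of z - k.
        total += poisson * (1.0 - (q / p) ** (z - k))
    return 1.0 - total


for confirmations in (1, 6, 24):
    print(confirmations, rollback_probability(0.10, confirmations))
```

The probability shrinks quickly with more confirmations but never reaches zero, which is why each participant ends up choosing its own threshold.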
In our view, the primary challenge with this kind of finality is that different participants independently react to rollbacks. For example, most participants probably anticipate 1-block rollbacks (which are relatively common), but as rollbacks grow larger, three challenges emerge:
- Larger rollbacks become more rare, so some participants may not have a process or policy for handling that situation.
- Different participants may have different policies, so in the event of a large rollback, the ecosystem may fracture as different participants disagree on how to recover.
- When counterparties have a sufficiently low tolerance for rollbacks, their interaction must incur a substantial delay.
Example: trust-minimized bridge
To drive this point home, consider the valuable use case of a trust-minimized bridge: ZEC sent into a bridge must be locked up while an equivalent number of proxy tokens are issued on another network.
If a rollback reverts a bridge deposit after the proxy tokens are issued elsewhere, those ZEC are no longer locked in the bridge, and the proxy tokens are now unbacked. This breaks the peg of the bridge, and many bridge users will simultaneously lose funds. If the bridge designers decide to require enough PoW blocks to make the probability of this event astronomically small, then issuing the proxy tokens on the other network will have an extremely large delay.
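For comparison, here is a minimal sketch of how a bridge might gate issuance on a finality signal instead of a confirmation count. Everything here is hypothetical: the `ToyBridge` class, its method names, and the idea that the bridge is notified of rollbacks and of the latest finalized height. Real bridge designs involve far more than this; the point is only that nothing is minted until the deposit's block is final.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBridge:
    """Hypothetical bridge that mints proxy tokens only for finalized deposits."""

    pending: dict = field(default_factory=dict)  # deposit_id -> (height, amount)
    locked_zec: int = 0
    minted_proxy: int = 0

    def observe_deposit(self, deposit_id, height, amount):
        """Record a deposit seen in a PoW block; nothing is minted yet."""
        self.pending[deposit_id] = (height, amount)

    def on_rollback(self, fork_height):
        """Forget pending deposits from blocks above the fork point."""
        self.pending = {
            d: (h, a) for d, (h, a) in self.pending.items() if h <= fork_height
        }

    def on_finalized(self, finalized_height):
        """Mint for every pending deposit at or below the finalized height."""
        ready = [d for d, (h, _) in self.pending.items() if h <= finalized_height]
        for d in ready:
            _, amount = self.pending.pop(d)
            self.locked_zec += amount
            self.minted_proxy += amount
        return ready


bridge = ToyBridge()
bridge.observe_deposit("d1", height=10, amount=5)
bridge.observe_deposit("d2", height=12, amount=7)
bridge.on_finalized(10)   # d1 is final, so proxy tokens are minted for it
bridge.on_rollback(11)    # the block holding d2 is rolled back before finality
print(bridge.minted_proxy, bridge.locked_zec)
```

Because the rolled-back deposit was never minted against, the rollback costs the bridge nothing; the price is that minting waits whenever finality stalls.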
Example: exchange deposits
If a user deposits ZEC on a centralized exchange, their account on the exchange is credited the appropriate amount. If a rollback of this deposit occurs, the exchange accounting now has more ZEC liabilities than actual ZEC held.
Exchanges seek to address this by requiring more PoW blocks to reach a sufficiently low probability of this event. However, this is a balancing act: If an exchange imposes a delay of n hours, the probability is still not “astronomically small,” so users are inconvenienced by n hours and the exchange still carries a practical risk of a liability overhang event.
Additionally, because of the second challenge above, different exchanges require different deposit delays, which potentially confuses users and puts exchanges into competition to take on more risk by accepting fewer block confirmations.
In contrast to probabilistic finality, a consensus protocol may provide a finality guarantee. Protocols that do this ensure that all participants agree on which set of blocks and transactions are final. The trade-off is that finality may fail to make progress in the event of network disruptions. Once the network recovers, the trailing finality can “catch up” to the PoW blocks which were produced in the interim.1
Practically, this means if participants are waiting for a transaction to become final, sometimes they may need to wait an arbitrarily long amount of time.
Finality addresses all three challenges to some degree:
- Participants now no longer need to anticipate rollbacks of varying probabilities when designing their procedures and policies. Instead, they must anticipate the risk that sometimes finality fails to progress in a timely fashion.
- All participants agree exactly which blocks and transactions are final, though they may disagree on how to react if finality is stalled for long periods of time.
- Participants may now rely on the finality guarantee to ensure they only react to some transactions when there is zero chance of the transaction being reverted.
In the examples above:
- A trust-minimized bridge can rely on finality for issuing proxy tokens. This ensures the bridge will never be under-collateralized. The trade-off is that as long as finality stalls, cross-bridge transfers will also be stalled.
- All exchanges can use the same finality guarantee, so users can expect the same deposit delay everywhere (and it is likely to be notably lower than the status quo). The trade-off is that if finality stalls, new deposits will also be stalled, though all exchanges will behave consistently in this regard.
The end result for users is that some high-value interactions (such as bridging or exchange deposits) will now be faster and safer most of the time. Sometimes finality may stall. When finality resumes, it will “catch up” to the PoW chain, so users who don’t need the finality guarantee can continue using this hybrid protocol, similarly to how they use Zcash today, and be unaffected if finality stalls.
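The stall-and-catch-up behavior can be sketched with a small deterministic simulation. The model is a deliberate simplification and rests on assumptions not taken from any TFL specification: one PoW block per tick, a fixed trailing gap (`lag`) while finality is healthy, and catch-up in a single step once the finality layer recovers. The function name `simulate` is hypothetical.

```python
def simulate(liveness, lag=2):
    """Return (tip_height, finalized_height) after each tick.

    liveness: iterable of bools, one per tick; True means the finality
    layer makes progress on that tick. lag is an assumed trailing gap.
    """
    tip = 0
    finalized = 0
    history = []
    for live in liveness:
        tip += 1  # PoW keeps producing blocks whether or not finality is live
        if live:
            # Finality never moves backward; when live it trails the tip by lag.
            finalized = max(finalized, tip - lag)
        history.append((tip, finalized))
    return history


# Four healthy ticks, five stalled ticks, then recovery.
ticks = [True] * 4 + [False] * 5 + [True] * 2
for tip, finalized in simulate(ticks):
    print(f"tip={tip:2d} finalized={finalized:2d} gap={tip - finalized}")
```

During the stalled ticks the tip keeps advancing while the finalized height stays put, so the gap grows; on recovery the finalized height jumps forward and the gap returns to the assumed lag.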
Status and open questions
This introductory blog post covers most of our R&D so far on the TFL design. There are still many open issues that need to be resolved with collaboration and input from the Zcash community before a TFL design is ready as a proposal for a Zcash upgrade.
An incomplete list of issues that still need to be resolved includes:
- Is this general approach acceptable to the Zcash community?
- How will new ZEC issuance be distributed between PoW, PoS, and any potential Dev Fund successor? This is a key concern for both miners and potential stakers or delegators.
- How can we integrate any other changes to issuance mechanics, such as the Zcash Sustainability Fund?
- All of the PoS accounting mechanics, such as how bonding works, how delegation works, what kinds of slashing may occur, withdrawal delays and mechanics, etc.
- How would PoS operations interact with all other Zcash ledger activities, such as shielded pools, etc.? This will be a key area for understanding how privacy and staking interact.
- Should the chain tip ever stall in order to bound the gap between the finalized block and the chain tip?
- Detailed security analyses, including economic security, a case analysis of PoW capture and PoS capture, and network security (especially given two separate network protocols/layers).
- How can we ensure mobile shielded wallets are first-class participants? For example, can we ensure they can safely and efficiently delegate stake and manage stake-delegation positions? This includes both UX issues, like delegation UI flows, as well as implementation, such as lightwalletd protocol changes.
- How to integrate any TFL protocol changes with other proposed protocol changes in a safe and timely manner. Examples: Zcash Sustainability Fund, Zcash Shielded Assets, bridging efforts, Namada integrations, etc.
- Selection of a specific finalizing PoS protocol. We are currently focusing on using Tendermint and ABCI as a way to rapidly prototype and validate the design.
- Prototypes, testnets, code architecture, etc.
As we address the open issues above, especially the interaction with other protocol features and coordination with those teams on timing, we can begin to refine a deployment timeline.
Our next steps for this TFL R&D initiative are to get community feedback on this blog, host a workshop at Zcon4, and produce R&D updates on the open issues above.
As we collaborate with other protocol development teams, we’d like to create a tentative deployment timeline that incorporates all of the other protocol feature efforts such as ZSF, ZSAs, and potential cross-chain bridge efforts.
Finally, we’re seeking out other teams or individuals who are interested in collaborating on this TFL project, so if you’re interested, please see the Get involved section above.
1 In a pure finalizing PoS protocol, the equivalent “stalls” are actually network-wide halts. One strength of this hybrid design is that PoW need not halt, and if it doesn’t then users who do not depend on finality can continue using the network.