The State of ZK Rollups
A Thinker, pondering the future of Ethereum scaling and throughput. Courtesy of Dalle-2.

ZK rollups are currently the hot topic in the Ethereum L2 space, promising speed, security, and throughput advantages over L1 and other L2s. But what are they? How do they work? Are all ZK rollups the same? This, and more at 10:00.

First, a high-level summary of ZK rollups and their stated goals. ZK rollups are a class of rollup that verify off-chain computation through the use of zero-knowledge proofs, which allow us to increase the throughput of Ethereum by orders of magnitude while still inheriting the security of the underlying Ethereum network. Doing so is a necessary step in the adoption of Ethereum as the world's state machine, as the network will then be able to support smaller-value, higher-volume applications. The performance difference we're talking about here is on par with switching from integrated graphics to an RTX 3090 Ti, all while remaining effectively as secure as Ethereum L1. Absolutely absurd.

ZK rollups are unique in the rollup space by way of their security guarantees and time to finality. In order to be accepted on L1, every computation must be explicitly proven to be valid. This is in contrast to optimistic rollups, which assume batches are valid unless a fraud proof is submitted within a challenge window. This, by design, forces users to wait out that window before withdrawing their funds from an optimistic rollup.

Now that we've gotten that out of the way, let's take a closer look at what's up with rollups, and then after that, we can dive into the specifics of the ZK rollup solutions that we tried out for ourselves.

What's a zero-knowledge proof?

A zero-knowledge proof is a way of mathematically convincing a verifier that a statement is true (for example, that a computation was carried out correctly) without revealing the information behind it. A simple example of a zero-knowledge proof is as follows:

Let's say your friend claims that they can taste the difference between tap water and spring water, and they want to prove it to you:

They give you 2 bottles of water - one with tap water, and the other with spring water. The main thing here (ie the zero-knowledge part) is that there is no visible difference between the two, and that you can't taste the water yourself.

In order to prove that they can taste the difference, they have you (out of their sight) randomly pour some of the water into a glass and give it to them to taste, and then they tell you whether the water came from the same bottle as the last glass. If they can't actually taste the difference, they have a 50% chance of guessing correctly on each iteration of the experiment. In order to become more and more certain, one can run the experiment multiple times, with the probability of the prover successfully guessing correctly n times consecutively being (1/2)^n. Effectively, what this gets us is strong confidence that something is true without us ever having knowledge of the thing being verified.
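To put a number on that intuition, here's a quick sketch in plain Python (nothing protocol-specific, just the probability math) showing how fast the verifier's confidence grows with repeated rounds:

# Probability that a bluffing prover (one who can't actually taste the
# difference) guesses correctly in every one of n independent rounds.
def cheat_probability(n_rounds: int) -> float:
    return 0.5 ** n_rounds

for n in (1, 5, 10, 20):
    print(f"{n:>2} rounds: chance of being fooled = {cheat_probability(n):.6%}")
# 20 rounds already leaves less than a one-in-a-million chance of being fooled.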

How do they work?

How does this help us pay less money to make number go up?

Rollups

What do rollups do? Well, they roll up a bunch of transactions. See you next week.

Rollups are a way of aggregating transactions off-chain, so that only a single state-update transaction needs to be posted to L1 at some longer cadence, leading to considerable gas savings. Effectively, you can execute orders of magnitude more transactions in the rollup’s execution environment, and only have to execute one transaction on L1 to represent the batch.
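As a back-of-the-envelope illustration of where the savings come from (all numbers below are invented for the example, not measurements of any particular rollup), the fixed L1 cost of posting a batch gets amortized over every transaction in it:

# Hypothetical numbers purely for illustration -- not measured from any network.
L1_TRANSFER_GAS = 21_000        # cost of a simple transfer executed directly on L1
BATCH_OVERHEAD_GAS = 500_000    # assumed fixed cost to verify a proof / post a batch on L1
CALLDATA_GAS_PER_TX = 300       # assumed per-transaction data posted to L1 with the batch

def amortized_l1_gas(txs_in_batch: int) -> float:
    """Effective L1 gas paid per rolled-up transaction."""
    return BATCH_OVERHEAD_GAS / txs_in_batch + CALLDATA_GAS_PER_TX

for n in (100, 1_000, 10_000):
    print(f"{n:>6} txs per batch -> ~{amortized_l1_gas(n):,.0f} gas per tx "
          f"(vs {L1_TRANSFER_GAS:,} for a plain L1 transfer)")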

In the ZK rollup world, a proof for a block of transactions is generated, and then the proof is posted on-chain for verification by a verifier contract. Typically, this flavor of rollup offers different levels of transaction confirmation, based on the progress of this process (for example, Starknet's transaction statuses are submitted, accepted on L2 - Starknet security, and accepted on L1 - Ethereum security). The data posted to L1 generally includes:

  • A Merkle tree representing all transactions in a batch / block
  • Merkle proofs for transactions that prove inclusion in the batch / block (a minimal sketch of this machinery follows the list)
  • Merkle proofs for each (sender, receiver) proving inclusion in the rollup's state tree
  • A set of intermediate state roots representing the state after application of each transaction
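For the curious, here is a minimal, rollup-agnostic Python sketch of the Merkle machinery mentioned above - building a root over a batch of transactions and producing / checking an inclusion proof. It's illustrative only; real implementations differ in hash choice, leaf encoding, and tree shape:

import hashlib

def h(data: bytes) -> bytes:
    # Illustrative only: real rollups use domain-separated and/or circuit-friendly hashes.
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes needed to recompute the root starting from leaves[index]."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append(level[index ^ 1])     # the sibling is the neighbour in the pair
        index //= 2
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return proof

def verify_inclusion(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

txs = [f"tx-{i}".encode() for i in range(8)]
root = merkle_root(txs)
proof = merkle_proof(txs, 3)
assert verify_inclusion(txs[3], 3, proof, root)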

See this explainer for a more in-depth view into how the proof generation and submission/verification process works.

Zero Knowledge Circuits

The basic idea behind ZK rollups is that any computation / state update can have a proof of validity generated and then verified without the computation's inputs being revealed. More importantly, this limits the interaction with L1 to just posting the proof, allowing normally-expensive computation to be run on the L2 and saving us a lot in gas. In order to do this, the approach generally taken is to develop a system of polynomial equations that can be used to efficiently prove that a program ran successfully on a given system architecture. The exact implementation varies with the proof system and VM chosen to carry out operations, but the end result is a means of privately carrying out some arbitrary off-chain computation (within the context of the chosen VM), generating a proof for it, and posting the proof on-chain so that anyone can verify the validity of the computation to some arbitrary level of certainty.
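To make "a system of polynomial equations" slightly less abstract, here is a toy sketch of the flattening step, loosely in the spirit of R1CS-style arithmetization (everything a real proof system adds - commitments, randomness, zero-knowledge blinding, succinctness - is deliberately omitted). The claim "I know x such that x^3 + x + 5 = 35" is broken into multiplication / addition constraints over intermediate values, and the verifier only checks those constraints:

# Toy arithmetization sketch -- NOT a real proof system, just the flattening idea.
# Claim: "I know x such that x**3 + x + 5 == 35" (the answer is x = 3).

def witness(x: int) -> dict:
    """The prover computes every intermediate wire value."""
    return {"x": x, "t1": x * x, "t2": x * x * x, "out": x * x * x + x + 5}

def check_constraints(w: dict) -> bool:
    """The verifier only checks multiplication / addition relations between wires."""
    return (
        w["t1"] == w["x"] * w["x"]        # t1 = x * x
        and w["t2"] == w["t1"] * w["x"]   # t2 = t1 * x
        and w["out"] == w["t2"] + w["x"] + 5
        and w["out"] == 35
    )

print(check_constraints(witness(3)))   # True  -- a valid witness satisfies every constraint
print(check_constraints(witness(4)))   # False -- an invalid witness fails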

EVM Compatibility

EVM compatibility can be thought of as several levels corresponding to how closely the behavior of a given system maps to the actual EVM. Here’s one framework that helps in thinking about this:

  • Language-level compatibility: Someone can write a program in language X (say solidity) and there exists a compiler / transpiler that compiles bytecode for the target rollup that does something.
  • Opcode-level compatibility: The rollup implements all EVM opcodes, but there may be some implementation differences.
  • Full EVM-equivalence: one can take bytecode deployed on Ethereum L1, copypasta it to a given L2, and it runs as if it were operating on L1 with no differences between the behavior of the two. One can break this category down into two sub-categories, as done in this post by our lord and savior Vitalik:
    • true full equivalence - the cost of running any opcode on L1 is equal to the cost on L2 multiplied by a single constant, ie C_L1 = α · C_L2
    • "full" equivalence - the L1/L2 gas cost relationship varies per opcode based on the difficulty of proving it (eg α_keccak != α_add != α_exp, etc); a toy numerical illustration follows this list

Roadblocks to EVM-equivalence

In a ZK proof system using an algebraic intermediate representation, all operations must be represented as a series of multiplications and additions. Because of this, some EVM opcodes are considerably more computationally-expensive to generate proofs for than others. Take for example opcode 0x16 - the bitwise-and operation:

uint256 c = a & b;

Since this must be represented as a series of multiplications and additions, we end up having to represent this as something like:

// a_bits / b_bits hold the 256-bit inputs decomposed into individual bits
// (in a real circuit, proving that decomposition is correct also costs constraints)
bit a_bits[256];
bit b_bits[256];
uint256_t c = 0;

for (unsigned short i = 0 ; i < 256 ; i++) {
	// it should be noted that bit shift operations are also potentially very expensive,
	// so the power-of-two weights (1 << i) could also be expanded to a series of constants
	c = c + (a_bits[i] * b_bits[i]) * (1 << i);
}

So we have to represent a simple bitwise operation with 256 iterations of this addition/multiplication loop to arrive at an equivalent representation of what is typically one of the cheapest operations for a processor. This becomes especially problematic when we realize that many hash functions (including keccak256 and sha256) rely heavily on bitwise operations internally. This has caused many ZK platforms to replace these hash functions with ZK-friendly hashes, such as Starknet's use of the Pedersen hash instead of keccak256 or sha256.

Comparison Methodology

There are several things to consider when evaluating a ZK rollup as a platform for hosting a dApp. Here, we will focus on the technical aspects of projects, but it is also important to consider whether a project has strong business development practices, as this is an extremely important link in the adoption chain.

Transaction Cost (duh)

One of the main reasons to consider a ZK rollup for a dApp is to save on gas fees. Because both the zkVM and a network's fee structure can vary considerably, running the same transaction on different ZK rollup L2s can come with greatly varying costs.

Decentralization

Another important consideration when assessing an L2 solution is decentralization. Currently, most ZK rollup L2s are not decentralized in the sense of distributed provers / sequencers, so thinking of decentralization in this way is not especially helpful here. Instead, it is helpful to consider the guarantees of permissionlessness, censorship resistance, and liveness as proxies for true decentralization, so these attributes were used for comparison, along with each project's roadmap for adopting true decentralization.

Proof System

Integral to the security of the protocol is the proof system used to prove computation. Two of the frontrunners are zkSNARKs and zkSTARKs. These offer different security/performance/trust profiles. SNARKs are generally less expensive, but require a trusted setup at network genesis that produces secret parameters (what amounts to a seed phrase), so this is a potential vulnerability if the participants who generated those parameters are not trusted. STARKs are generally more expensive, but do not require the trusted setup, and are believed to remain secure in the context of quantum computing.

Developer Experience

Developer experience is one of the main driving factors in the proliferation of an ecosystem. The dev experience is especially important because a network is only as secure as the code being deployed to it (and the underlying verification system, of course), and great dev tooling makes it exponentially easier to develop quality, secure code. Specifically for networks with their own language: Solidity has relatively well-understood attack vectors and workflows that help mitigate the associated risks - that is not the case with a language that has not seen deployment outside of a testnet. By virtue of Ethereum being the L1, projects supporting native EVM languages (such as Solidity and Vyper) will inherently have a leg up on their competition, as this gives devs out-of-the-box support for, or at least a shorter path to, tooling already developed for these languages.

Time to Finality

The L1 settle time for a ZK rollup (ie when a batch has been verified by a contract on L1) varies based on the implementation of the protocol. Some common paradigms exist:

Periodic Updates: A state update is committed to L1 according to a constant time interval (eg every 2 hours). With this approach, time to finality is simply a function of when in the interval a transaction is submitted.

Transaction Count Trigger: A threshold number is specified, and when the number of uncommitted transactions reaches the threshold, the process of committing the batch to L1 is triggered. Here, the time to finality is dependent on the transaction submission rate and when in the batch a transaction is submitted. Because of this, time to finality will be quicker during periods of higher network activity.

Volunteer Payer: This is generally used in conjunction with one of the previous approaches - a user can volunteer to pay for the cost of committing the batch to L1, and in exchange the batch commitment process is immediately triggered.
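As a rough sketch of how these three triggers fit together (structure and thresholds are invented for the example, not taken from any particular rollup), a batcher's commit decision might look something like:

import time

MAX_BATCH_AGE_SECONDS = 2 * 60 * 60   # "periodic update" interval (assumed)
MAX_PENDING_TXS = 1_000               # "transaction count" threshold (assumed)

class Batcher:
    def __init__(self):
        self.pending = []
        self.batch_opened_at = time.time()
        self.volunteer_paid = False

    def add_tx(self, tx):
        self.pending.append(tx)

    def volunteer_pay(self):
        # A user covers the L1 commit cost, closing the batch immediately.
        self.volunteer_paid = True

    def should_commit(self) -> bool:
        age = time.time() - self.batch_opened_at
        return (
            age >= MAX_BATCH_AGE_SECONDS             # periodic trigger
            or len(self.pending) >= MAX_PENDING_TXS  # transaction count trigger
            or self.volunteer_paid                   # volunteer payer trigger
        )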

Transaction Throughput

Transaction throughput, the rate at which a network can process transactions, depends on several factors:

  • The underlying proof generation system
  • The compute capability of the prover(s) in the network
  • The performance of the rest of the infrastructure of the network (sequencer, data availability layer, etc)
  • The current state of the L1 network (in this case Ethereum)

The Competitive Landscape

There are many players in the ZK rollup space, most of which are doing some very cool, innovative stuff. The following are a few of the projects that stood out in our research.

The chart above maps out the results of our research into the most competitive ecosystems today. Below, we discuss our research in further detail (in alphabetical order, because we are fair and balanced).

Aztec

Aztec is a privacy-centric ZK rollup which uses encrypted UTXOs to preserve privacy of transactions. Currently most of its traffic is on its privacy-oriented dex network zk.money, which offers yield opportunities on elementFi and lido. Aztec's batch model implements a transaction threshold with the option for a single user to opt to pay for the gas for the whole batch, so one can prioritize either cost or time to finality.

Developer Experience

Aztec's programming language, Noir, is pretty bare-bones at the moment, so building production systems with it would be relatively difficult. On the upside, it is written in and has similar syntax to Rust, so Rust devs would have a leg up in this environment.

Alternatively, one can opt to build using Aztec's payments SDK (see docs), which offers some helpful abstractions on the UTXO paradigm. Again, this is pretty low-level at the moment, but this is an active project.

Ecosystem

Currently, it appears that zk.money is effectively the only way to interact with the Aztec network via a frontend. As previously mentioned, this seems to be the gateway to lido and elementFi, with plans to also integrate AAVE, Compound, and Liquity.

Polygon zkEVM

Polygon has been heavily investing in ZK rollups, and has announced its version of a ZK rollup that uses recursive STARKs as a proof mechanism. This is purported to have considerable performance benefits for traditionally computationally-expensive operations such as ECDSA and keccak. Their decentralization vision is fully decentralized sequencers and provers, operating within the context of a Proof of Efficiency (PoE) market. While there is very little in terms of demonstrated performance so far, it seems likely that Polygon will be able to deliver on their promises. For many of the claimed cost/throughput benefits of this L2, we will have to wait and see if Polygon is able to deliver, as there is not yet a testnet available. To their credit, the Polygon team has done a good job of maintaining open comms about the status of the project, making it clear that open questions remain surrounding some implementation details, but they are actively working to address these loose ends.

Developer Experience

There is no live testnet yet...  ¯\_(ツ)_/¯ ... but we're getting there!

Ecosystem

Polygon has mentioned that they are considering migrating the state of their dPoS chain to their zkEVM platform. If this turns out to be the path they take, the zkEVM platform will inherit a bustling ecosystem.

Scroll

Scroll is a ZK rollup that is promising a true EVM-equivalent execution environment. One of the main features being worked on by the Scroll team is parallelizable proofs, so that the prover market is less winner-take-all and more of an energy-efficient, meritocratic architecture.

The Scroll team has been working very closely (more so than any of the other projects we’ve looked at) with Ethereum’s privacy and scaling group to develop their system, which is a good signal that the Scroll team takes the long-term vision of decentralization and security especially seriously. There is limited information around this product in terms of hard numbers for performance / time to finality, so there isn't much to report here. Because it is intended to be strictly EVM-equivalent, account abstraction is not natively supported by Scroll. It will be interesting to see how this project develops, as they are working on the most difficult form of EVM equivalence, a true zkEVM, which is a notable feature that is currently on the L1 Ethereum roadmap.

Here is most of what is available on Scroll to date, along with their blog.

Starknet

Starknet (Starkware's rollup product) is built on a STARK-based VM, known as Cairo VM.

A transpiler, warp, exists for Solidity -> Starknet transpilation, but many Solidity builtins are not currently supported, which would make it difficult to implement a production system using warp.

Starknet's native language is Cairo, which is built on an immutable (write-at-most-once) memory model. The base datatype is a felt, or 'field element', which is 252 bits in size. This makes direct ports of Solidity code using uint256 types considerably slower than a Cairo-native implementation using felts, as each uint256 must be represented as a struct of two felts, causing a ~15x slowdown in performance relative to simply using a felt.
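As a quick Python sketch of that representation problem (following the common pattern of splitting a 256-bit value into two 128-bit halves, as Starknet's Uint256 struct does; the arithmetic details are simplified here):

MASK_128 = (1 << 128) - 1

def split_uint256(value: int) -> tuple[int, int]:
    """Split a 256-bit integer into (low, high) limbs, one per felt."""
    return value & MASK_128, value >> 128

def join_uint256(low: int, high: int) -> int:
    return (high << 128) | low

v = 2**200 + 12345
low, high = split_uint256(v)
assert join_uint256(low, high) == v
# Every arithmetic op on the pair needs extra limb / carry handling,
# which is where the slowdown relative to a single felt comes from.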

Developer Experience

Writing vanilla low-level Cairo is far from a smooth experience, primarily for the following reasons:

  • The developer is responsible for tracking / updating the allocation pointer and frame pointer (values used to access elements in memory), as well as the program counter (the current instruction)
  • Transitioning to an immutable memory model isn't intuitive
  • Loops aren't implemented natively, so one must use conditional jumps or recursion to mimic this functionality (see the sketch after this list)
  • Function pointers / builtins must be passed as implicit arguments to every function that uses them
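To give a flavor of the loop-via-recursion point above, here is the shape of that pattern sketched in Python rather than Cairo for brevity (a real Cairo version would also thread implicit arguments and work in felt arithmetic):

# The "loop as recursion" shape used in Cairo, sketched in Python.
def sum_array(values: list[int], index: int = 0, acc: int = 0) -> int:
    if index == len(values):   # base case replaces the loop-exit condition
        return acc
    return sum_array(values, index + 1, acc + values[index])

assert sum_array([1, 2, 3, 4]) == 10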

Thankfully, this vanilla Cairo can be likened to writing Yul blocks in Solidity, and the Starknet flavor of Cairo offers a set of abstractions built on top of the base language to streamline the dev process. On top of the base Starknet builtins, OpenZeppelin has implemented a good number of ETH standards in Cairo. One considerable benefit of writing Starknet contracts is that pytest is used as a testing framework, so those coming from a Python background will be able to get started relatively quickly.

Ecosystem

Starknet is one of the few rollups that currently have a live mainnet instance. There are a good number of dApps that make up Starknet's ecosystem, including the wallets Argent X and Braavos. Speaking of wallets - on Starknet, there is no distinction between an EOA and a smart contract wallet. This allows Starknet to natively support account abstraction and the injection of arbitrary logic to streamline the process of sending transactions (security, social recovery, etc), which considerably raises the UX ceiling for wallet interactions. One thing worthy of note is that Starknet mainnet is currently permissioned, and a deploy key must be acquired from the Starkware team in order to post contracts.

zkSync

zkSync is yet another ZK rollup that offers EVM compatibility. zkSync uses PLONK proofs to verify computation due to their lower computational overhead in proof generation. zkSync is currently unique in its fee structure, as it accepts ETH or any ERC20 as a form of fee payment, provided the token meets certain market cap requirements and is listed on CoinGecko.

Developer Experience

zkSync had arguably the best dev experience out of all of the rollups we tried out. There are supported Solidity and Vyper compilers that integrate seamlessly with Hardhat and are currently language-level compatible (meaning some language features, such as keccak256 and sha256, actually call different, circuit-friendly hash functions).

One interesting note on the implementation of zkSync's EVM-compatible VM is that it works by compiling Solidity code to an LLVM intermediate representation, which is then run through an LLVM -> ZK circuit compiler. What makes this interesting is that it means code written in any language that can be compiled to LLVM IR could, in principle, be deployed. This means that in the future, you could write smart contracts in Rust, C++, or even Fortran for that retro vibe.

zkPorter is zkSync's solution for off-chain data availability. This solution offers a ~10x increase in transactions per second over standard zkSync by moving data availability off-chain. zkSync and zkPorter are promised to be seamlessly interoperable (still to be seen), allowing individual users / devs to balance security and performance.

Ecosystem

Currently, there is no zkSync v2 (EVM-compatible) deployment on Ethereum mainnet; however, zkSync v1 has been live for over a year for ETH and ERC20 transfers, and a testnet for v2 is also live.

Closing Thoughts

The main takeaway from investigating ZK rollups is that they’re a really cool innovation in blockchain technology - seems obvious given that they’re made possible by cutting-edge mathematical concepts.

While they are super cool sci-fi tech, after taking some time to interact with the leading projects, it appears that there is still a lot of work to do in terms of transforming them into decentralized, production-grade platforms. There's a ton of research going into getting these production ready (too much to discuss in a simple blog post), but one development that's particularly exciting on Ethereum's short-medium term roadmap is EIP-4844. It's a topic for another time, but it'd be difficult to write a blog post about rollups without mentioning it once.

Nevertheless, I’m excited to see how the development of these projects progresses, and I look forward to seeing the disruptive effect such platforms will likely have in the future.