Vitalik's Blog: How to Make Ethereum as Simple as Bitcoin in 5 Years

Blockbeats

05-06

Original Title: Simplifying the L1

Original Author: Vitalik Buterin

Original Translation: GaryMa, Wu Blockchain

Abstract

Ethereum aims to be a global ledger, which requires scalability and robustness. This article focuses on the importance of protocol simplicity and proposes a significant reduction in complexity by simplifying the consensus layer (3-slot finality, STARK aggregation) and the execution layer (replacing the EVM with RISC-V or a similar VM), in order to reduce development costs, error risks, and attack surface. It suggests a smooth transition through a backward-compatible strategy (such as an on-chain EVM interpreter), and further simplification by unifying erasure codes, serialization formats (SSZ), and tree structures. The goal is to bring Ethereum's consensus-critical code closer to Bitcoin's simplicity, to enhance robustness and participation, and to make simplicity a cultural priority by setting a maximum lines-of-code target.

Ethereum's goal is to become a global ledger: a platform that stores human civilization's assets and records and serves finance, governance, high-value data authentication, and other fields. This requires two things: scalability and robustness. The Fusaka hard fork will increase the data space available to L2s by 10x, and the proposed 2026 roadmap plans a similarly large increase for the L1. Meanwhile, Ethereum has completed the transition to proof of stake (PoS), client diversity is improving rapidly, work on zero-knowledge (ZK) verification and quantum resistance is steadily advancing, and the application ecosystem is becoming more robust.

This article aims to focus on an equally important but often underestimated element of robustness (even scalability): protocol simplicity.

Bitcoin's protocol is most admirable for its elegant simplicity:

1. There exists a chain of blocks, with each block linked to the previous block through a hash.

2. The validity of blocks is verified by proof of work (PoW), i.e., by checking if the hash value has a certain number of leading zeros.

3. Each block contains transactions, where the coins spent in a transaction either come from mining rewards or from previous transaction outputs.
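The first two rules can be sketched in a few lines of Python. This is a minimal illustration, not Bitcoin's actual block format: the `Block` fields, the `zero_bits` difficulty parameter, and the string transactions are assumptions made for the example, and rule 3 (checking that spent coins come from earlier outputs) is left out. Real Bitcoin compares the hash against a numeric target rather than counting zero bits, but the idea is the same.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    prev_hash: bytes    # hash of the previous block (rule 1)
    transactions: list  # opaque payloads; rule 3 is not validated in this sketch
    nonce: int = 0

    def hash(self) -> bytes:
        payload = (
            self.prev_hash
            + repr(self.transactions).encode()
            + self.nonce.to_bytes(8, "big")
        )
        # Double SHA-256, as Bitcoin uses for block hashes.
        return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def meets_difficulty(block_hash: bytes, zero_bits: int) -> bool:
    # "zero_bits leading zero bits" is the same as "hash < 2**(256 - zero_bits)".
    return int.from_bytes(block_hash, "big") < (1 << (256 - zero_bits))


def mine(block: Block, zero_bits: int) -> Block:
    # Proof of work: try nonces until the hash is small enough (rule 2).
    while not meets_difficulty(block.hash(), zero_bits):
        block.nonce += 1
    return block


def chain_is_valid(chain: list, zero_bits: int) -> bool:
    for i, block in enumerate(chain):
        if not meets_difficulty(block.hash(), zero_bits):
            return False
        if i > 0 and block.prev_hash != chain[i - 1].hash():
            return False
    return True


genesis = mine(Block(prev_hash=b"\x00" * 32, transactions=["coinbase -> alice: 50"]), 12)
second = mine(Block(prev_hash=genesis.hash(), transactions=["alice -> bob: 10"]), 12)
chain = [genesis, second]
print(chain_is_valid(chain, 12))
```

Changing any transaction in an earlier block changes that block's hash, which breaks the link stored in the next block, so the whole chain after it fails validation.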

That's it! Even a smart high school student can fully understand how the Bitcoin protocol works, and a programmer could write a client as a hobby project. Keeping the protocol simple brings several advantages that are key to Bitcoin (or Ethereum) serving as a credibly neutral, globally trusted base layer:

1. Easy to Understand: Reduce the complexity of the protocol so that more people can participate in protocol research, development, and governance, reducing the risk of being dominated by a technical elite.

2. Reduce Development Costs: Simplify the protocol significantly to reduce the cost of creating new infrastructure (such as new clients, validators, developer tools, etc.).

3. Reduce Maintenance Burden: Lower the long-term maintenance cost of the protocol.

4. Reduce Error Risk: Decrease the likelihood of catastrophic errors in protocol specifications and implementations, while making it easier to verify the absence of such errors.

5. Narrow the Social Attack Surface: With fewer moving parts, it is easier to guard the protocol against capture by special interests.

In the past, Ethereum (sometimes because of my own decisions) has often failed to remain simple. The result has been high development costs, increased security risks, and a closed research and development culture, while the benefits sought through this complexity have often proved elusive. This article explores how Ethereum, five years from now, can come close to Bitcoin's simplicity.

Simplify the Consensus Layer

The new consensus layer effort (historically referred to as the "beam chain") aims to apply the past decade's lessons in consensus theory, ZK-SNARK development, staking economics, and related areas to build a long-term optimal and simpler consensus layer. Compared with the existing beacon chain, the new design is significantly simpler:

1. 3-slot Finality Design: Removes the distinction between slots and epochs, committee shuffling, and the parts of the specification that exist to handle these mechanisms efficiently (such as sync committees). A basic implementation of 3-slot finality requires only about 200 lines of code and, unlike Gasper, achieves near-optimal security.

2. Fewer Active Validators: With fewer validators active at any one time, it becomes safer to use simpler implementations of the fork choice rule.

3. STARK-Based Aggregation Protocol: Anyone can be an aggregator, with no need to trust aggregators or to overpay for repeated bitfields. Aggregation cryptography is complex in itself, but that complexity is highly encapsulated and therefore poses low systemic risk.

4. Simplify P2P Architecture: The above factors may support a simpler and more robust peer-to-peer network architecture.

5. Redesign Validator Mechanism: This includes entry, exit, withdrawal, key rotation, inactivity leak, and other mechanisms, simplifying the codebase and providing clearer guarantees (such as the weak subjectivity period).

The advantage of the consensus layer is its relative independence from the EVM execution layer, allowing for significant ongoing improvements. The greater challenge is how to achieve similar simplifications at the execution layer.

Simplify Execution Layer

The complexity of the EVM has kept growing, and much of it has proven unnecessary (partly because of my own decisions): a 256-bit virtual machine over-optimized for specific forms of cryptography that are now becoming less relevant, and precompiles over-optimized for single use cases that are rarely used.

Addressing these issues one by one has had limited effectiveness. For example, the removal of the SELFDESTRUCT opcode required significant effort but brought only a small benefit. The recent debate about EOF (EVM Object Format) also demonstrates similar challenges.

I recently proposed a more radical approach: instead of making moderate-scale (yet still disruptive) changes to the EVM in exchange for a 1.5x benefit, we should transition to a superior, simpler virtual machine and gain a 100x benefit. As with "The Merge," we would make fewer disruptive changes but make each one count for more. Specifically, I suggest replacing the EVM with RISC-V, or with whichever other virtual machine Ethereum's ZK provers are written in. This would bring:

1. Substantial Efficiency Gains: Inside provers, smart contract execution runs directly, with no interpreter overhead. Data from Succinct suggests a potential performance improvement of more than 100x in many scenarios.

2. Significantly Improved Simplicity: The RISC-V specification is much simpler compared to the EVM, and alternative solutions (such as Cairo) are equally concise.

3. All the Benefits That Motivated EOF: Such as code sections, friendlier static analysis, and larger code size limits.

4. More Developer Choices: Solidity and Vyper can add backends to compile to the new virtual machine. If RISC-V is chosen, mainstream language developers can also easily port their code to this virtual machine.

5. Removal of Most Precompiles: Possibly retaining only highly optimized elliptic curve operations (even these will disappear after the widespread adoption of quantum computers).

The main drawback is that, unlike EOF, which is ready to ship today, the benefits of a new virtual machine will take longer to reach developers. We can mitigate this by implementing high-value EVM improvements in the short term (such as raising the contract code size limit and supporting DUP/SWAP17–32).

This will bring about a simpler virtual machine. The core challenge is: how to handle the existing EVM?

Backward Compatibility Strategy for Virtual Machine Transition

The greatest challenge in simplifying the EVM (or improving it without adding complexity) is balancing the intended goals against backward compatibility for existing applications.

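· First, it is important to note that there is no single way to delineate what counts as the "Ethereum codebase" (even within a single client). The color-coded areas below refer to a diagram in the original post that divides this code into green, orange, and yellow regions.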

· The goal is to minimize the green area: the logic that nodes must run to participate in Ethereum's consensus, including computing the current state, proving, verification, FOCIL (fork-choice enforced inclusion lists), and "regular" block construction.

· The orange area cannot be reduced: if a protocol specification removes or changes a functionality at the execution layer (such as the virtual machine, precompiles, etc.), clients processing historical blocks will still need to retain the relevant code. However, new clients, ZK-EVM, or formal verifiers can completely ignore the orange area.

· A new category, the yellow area: code that is valuable for understanding the current chain or for optimal block construction but is not part of the consensus logic. For example, Etherscan and some block builders support ERC-4337 user operations. If we replace certain Ethereum features (such as EOAs and the legacy transaction types they support) with an on-chain RISC-V implementation, the consensus code will be significantly simplified, but specialized nodes may still use their existing code to parse them.

· The complexity of the orange and yellow areas is encapsulated complexity: anyone trying to understand the protocol can skip these parts, Ethereum implementations are free to omit them, and bugs in these areas do not pose a consensus risk. Code complexity in the orange and yellow areas is therefore far less harmful than complexity in the green area.

The idea of moving code from the green area to the yellow area is similar to Apple's strategy of ensuring long-term backward compatibility through the Rosetta translation layer. Inspired by a recent article from the Ipsilon team, I propose the following virtual machine transition process (using EVM to RISC-V as the example, although it applies equally to EVM to Cairo, or to RISC-V to an even better virtual machine should one emerge):

1. Require new precompiles to ship with a canonical on-chain RISC-V implementation: This lets the ecosystem gradually get used to working with the RISC-V virtual machine.

2. Introduce RISC-V as a developer option: The protocol simultaneously supports RISC-V and EVM, allowing contracts from both virtual machines to freely interact.

3. Replace most precompiles: Replace all precompiles with RISC-V implementations, except for elliptic curve operations and KECCAK (which need extreme speed). Concretely, a hard fork removes each precompile and at the same time changes the code at its address from empty to a RISC-V implementation, in the style of the DAO fork. The RISC-V virtual machine is so simple that the protocol would become simpler on net even if the transition stopped here.

4. Implement an EVM interpreter in RISC-V: Deploy it on-chain as a smart contract (this is already needed for ZK provers). A few years after the initial release, existing EVM contracts will be executed by running them through this interpreter.

After completing step 4, many "EVM implementations" will still be used for optimizing block construction, developer tools, and chain analysis but will no longer be part of the critical consensus specification. The Ethereum consensus will "natively" only understand RISC-V.
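The end state of step 4 can be pictured with a toy model. The sketch below is an illustration under heavy assumptions, not a proposed design: Python callables stand in for RISC-V programs, `INTERPRETER_ADDRESS` and the `Chain` class are hypothetical, and the "EVM" supports only PUSH1, ADD, and STOP and returns the top of the stack instead of using RETURN. What it shows is the dispatch rule: the consensus logic knows how to run one kind of code, and legacy EVM bytecode is simply input data for an interpreter contract.

```python
from typing import Callable, Dict

# Stand-in for a RISC-V program: bytes in, bytes out (assumption for this sketch).
NativeProgram = Callable[[bytes], bytes]

INTERPRETER_ADDRESS = "0xEVM"  # hypothetical address of the on-chain interpreter


def toy_evm_interpreter(payload: bytes) -> bytes:
    """Toy EVM subset. payload = 2-byte code length + EVM code + calldata."""
    code_len = int.from_bytes(payload[:2], "big")
    code = payload[2:2 + code_len]
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == 0x60:    # PUSH1 <byte>
            stack.append(code[pc + 1])
            pc += 2
        elif op == 0x01:  # ADD, modulo 2**256 as in the EVM
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) % 2**256)
            pc += 1
        elif op == 0x00:  # STOP
            break
        else:
            raise ValueError(f"opcode {op:#x} is not supported in this toy")
    top = stack[-1] if stack else 0
    return top.to_bytes(32, "big")  # simplification: return the top of the stack


class Chain:
    def __init__(self) -> None:
        # Code the consensus logic can execute directly.
        self.native_code: Dict[str, NativeProgram] = {
            INTERPRETER_ADDRESS: toy_evm_interpreter
        }
        # Legacy contracts: their bytecode is just data from consensus' point of view.
        self.legacy_evm_code: Dict[str, bytes] = {}

    def call(self, address: str, calldata: bytes) -> bytes:
        if address in self.legacy_evm_code:
            code = self.legacy_evm_code[address]
            payload = len(code).to_bytes(2, "big") + code + calldata
            return self.native_code[INTERPRETER_ADDRESS](payload)
        return self.native_code[address](calldata)


chain = Chain()
chain.legacy_evm_code["0xOLD"] = bytes([0x60, 2, 0x60, 3, 0x01, 0x00])  # PUSH1 2, PUSH1 3, ADD, STOP
chain.native_code["0xNEW"] = lambda data: data[::-1]  # a "native" contract
print(int.from_bytes(chain.call("0xOLD", b""), "big"))  # 5
```

In this picture, optimized standalone EVM implementations can still exist for block builders and tooling, but a bug in them cannot split consensus, because consensus is defined by the interpreter contract's behavior.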

Simplifying Through Shared Protocol Components

The third way to reduce total protocol complexity (and often the most underestimated) is to share a common standard across different parts of the protocol stack as much as possible. There is usually little or no benefit in having different protocols do the same thing in different places, yet this pattern still emerges frequently, mainly because different parts of the protocol roadmap do not communicate with one another. Here are a few specific examples of how sharing components can simplify Ethereum.

Unified Erasure Code

We need an erasure code in three scenarios:

1. Data availability sampling: Clients validate that a block has been published.

2. Faster P2P broadcasting: Nodes can accept a block once they have received n/2 of its n fragments, which balances latency against redundancy.

3. Distributed history storage: Each piece of Ethereum's history is stored as many fragments, such that any group of n/2 fragments can recover the rest, greatly reducing the risk that any single piece is lost.

If the same erasure code (whether Reed-Solomon, a random linear code, or something else) is used in all three scenarios, the following advantages are obtained:

1. Code Size Minimization: Reduce the total number of lines of code.

2. Increased Efficiency: If a node downloads partial chunks for one scenario, this data can be used for other scenarios.

3. Ensuring Verifiability: Fragments for all scenarios can be verified against the root.

If different erasure codes are used, they should at least be compatible: for example, a horizontal Reed-Solomon code and a vertical random linear code for data availability sampling chunks, both operating over the same field.
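The property all three scenarios rely on, that any n/2 of n fragments recover the rest, can be demonstrated with a small Reed-Solomon sketch. This is illustrative only: the prime `P`, the function names, and the naive Lagrange interpolation are assumptions made for readability, and a production implementation would use a different field and much faster algorithms. The data is treated as evaluations of a polynomial at x = 0..k-1, and the code extends it to 2k evaluations.

```python
P = 2**31 - 1  # a prime modulus chosen for the example (assumption)


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data):
    """Extend k data symbols (each < P) to n = 2k fragments; the first k equal the data."""
    k = len(data)
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(2 * k)]


def recover(fragments, k):
    """Rebuild the k data symbols from any k fragments given as {index: value}."""
    points = list(fragments.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [10, 20, 30, 40]
fragments = encode(data)
# Keep only 4 of the 8 fragments, none of them chosen specially.
subset = {i: fragments[i] for i in (1, 4, 6, 7)}
print(recover(subset, k=4))  # [10, 20, 30, 40]
```

Because a node in one scenario ends up holding fragments of exactly this form, sharing one code across scenarios means the same fragments and the same recovery routine can be reused.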

Unified Serialization Format

Ethereum's serialization format is currently only partially enshrined in the protocol, because data can be re-serialized and broadcast in any format. The one exception is the transaction signature hash, which requires a canonical format for hashing. In the future, serialization formats will become more firmly enshrined, for two reasons:

1. Full Account Abstraction (EIP-7701): The complete content of transactions is visible to the virtual machine.

2. Higher Gas Limits: As the gas limit rises, execution layer block data will need to go into blobs.

When that happens, we will have the opportunity to unify serialization across three layers of Ethereum: the execution layer, the consensus layer, and the smart contract call ABI.

I propose the use of SSZ because SSZ:

1. Is Easy to Decode: Including within smart contracts (due to its 4-byte-based design and fewer edge cases).

2. Is Widely Used in the Consensus Layer.

3. Is Highly Similar to the Existing ABI: Tool adaptation is relatively straightforward.

Efforts have already been made for a comprehensive migration to SSZ, and we should consider and continue these efforts when planning for future upgrades.
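To make the "4-byte-based design" concrete, here is a minimal sketch of how SSZ lays out a container: fixed-size fields are written in place as little-endian integers, and each variable-size field is replaced by a 4-byte offset that points at its data, which is appended after the fixed part. The helper names and the example container (a `nonce`, a `payload`, and a `chain_id`) are hypothetical and chosen only for illustration. This covers basic serialization, not SSZ Merkleization.

```python
OFFSET_SIZE = 4  # SSZ offsets are 4-byte little-endian integers


def ssz_uint(value: int, byte_length: int) -> bytes:
    # SSZ encodes unsigned integers in little-endian byte order.
    return value.to_bytes(byte_length, "little")


def ssz_container(fields) -> bytes:
    """fields: list of (already_serialized_bytes, is_variable_size) in declaration order."""
    fixed_len = sum(OFFSET_SIZE if var else len(data) for data, var in fields)
    fixed, variable = b"", b""
    for data, var in fields:
        if var:
            # The offset counts from the start of the container to this field's data.
            fixed += ssz_uint(fixed_len + len(variable), OFFSET_SIZE)
            variable += data
        else:
            fixed += data
    return fixed + variable


def decode_example(blob: bytes):
    """Decode the hypothetical container (uint64 nonce, byte list payload, uint32 chain_id)."""
    nonce = int.from_bytes(blob[0:8], "little")
    offset = int.from_bytes(blob[8:12], "little")
    chain_id = int.from_bytes(blob[12:16], "little")
    payload = blob[offset:]  # the only variable-size field runs to the end
    return nonce, payload, chain_id


encoded = ssz_container([
    (ssz_uint(7, 8), False),  # nonce: uint64
    (b"hi", True),            # payload: variable-length byte list
    (ssz_uint(1, 4), False),  # chain_id: uint32
])
print(encoded.hex())
print(decode_example(encoded))
```

Decoding needs nothing more than fixed positions and one offset lookup, which is the kind of simplicity that makes the format practical to parse inside a smart contract.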

Unified Tree Structure

Once we move from the EVM to RISC-V (or another minimal virtual machine), the hexary Merkle Patricia tree will become the main bottleneck in proving block execution, even in the average case. Moving to a binary tree based on a better hash function will significantly improve prover efficiency and reduce data costs for light clients and similar use cases. When we make this change, we should also use the same tree structure for the consensus layer, so that all of Ethereum, both the consensus layer and the execution layer, can be accessed and parsed with the same code.
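A binary Merkle tree and its proofs fit in a few functions, which is part of the appeal. The sketch below is a simplified illustration, not the proposed state tree: it uses SHA-256 as a stand-in for whatever "better hash function" is eventually chosen, assumes a power-of-two number of 32-byte leaves, and the function names are made up for the example.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    # SHA-256 is a placeholder hash for this sketch (assumption).
    return hashlib.sha256(left + right).digest()


def merkle_root(leaves):
    """Root of a binary tree over a power-of-two number of leaves."""
    layer = list(leaves)
    while len(layer) > 1:
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def merkle_branch(leaves, index):
    """Sibling hashes from the leaf at `index` up to the root, one per level."""
    layer, branch = list(leaves), []
    while len(layer) > 1:
        branch.append(layer[index ^ 1])  # the sibling differs only in the lowest bit
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return branch


def verify_branch(leaf, index, branch, root) -> bool:
    node = leaf
    for sibling in branch:
        # An even index means the node is a left child at this level.
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
root = merkle_root(leaves)
branch = merkle_branch(leaves, 5)
print(len(branch), verify_branch(leaves[5], 5, branch, root))  # 3 True
```

A proof here carries one sibling hash per level, whereas a 16-ary tree can need up to 15 sibling hashes at each level it passes through, which is one reason a binary layout reduces proof size for light clients.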

From Present to Future

In many ways, simplicity is like decentralization: both are upstream of the goal of resilience. Explicitly valuing simplicity requires a cultural shift. Its benefits are often hard to quantify, while the costs of the extra effort and of forgoing some flashy features are immediately apparent. Over time, however, the benefits become increasingly significant, and Bitcoin itself is a prime example.

I propose following the example of tinygrad and setting an explicit maximum lines-of-code target for Ethereum's long-term specification, bringing Ethereum's consensus-critical code closer to Bitcoin's simplicity. The code that handles Ethereum's historical rules will continue to exist but should sit outside the consensus-critical path. In addition, we should adopt a general ethos of choosing the simpler solution, preferring encapsulated complexity over systemic complexity, and making design choices that provide clearly legible properties and guarantees.
