The concept of scaling blockchains is often associated with the idea that blockchains are too slow and inefficient to be widely adopted. The maximum number of transactions that Bitcoin or Ethereum can process each second is often compared to that of global payment networks like Visa or Mastercard to emphasize the latter’s lead.
Like any network, whether physical or virtual, a blockchain can get congested as the amount of information transferred over the network surges. In the case of Ethereum, the network is limited to one block every 12 to 14 seconds and about 175 transactions per block, or roughly 12 transactions per second. Everyone posting a transaction on Ethereum competes to have it processed in the next block, paying gas fees to win that competition. This explains why gas fees surge as usage grows, and why a solution was needed to scale Ethereum for transaction-intensive use cases.
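As a rough check of these figures, throughput is simply the number of transactions per block divided by the block interval. The short sketch below uses only the numbers quoted above; the function name is illustrative.

```python
def transactions_per_second(tx_per_block: int, block_time_s: float) -> float:
    """Average throughput implied by a block capacity and a block interval."""
    return tx_per_block / block_time_s

# Figures quoted in the text: about 175 transactions per block,
# one block every 12 to 14 seconds.
fast = transactions_per_second(175, 12)  # about 14.6 TPS
slow = transactions_per_second(175, 14)  # 12.5 TPS
print(f"{slow:.1f} to {fast:.1f} TPS")
```

The lower bound matches the "roughly 12 transactions per second" mentioned above.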
That image of inefficient blockchains comes from the technical challenges faced by public blockchains that attempt to unify global financial services and payment networks into an all-purpose blockchain able to connect any type of application, such as games, to this economic layer. The media also readily presents the “trilemma”, which states that decentralization, security and speed are incompatible, as if it were an axiomatic law. However, many specific-purpose blockchains already exist and serve well the use cases they were designed for, in fields like supply chain for instance.
Despite the challenge, Ethereum is a good example for studying blockchain scaling, and it shows that most all-purpose public blockchains today are on a path to becoming technologically advanced enough to achieve their objectives. They promise to function at a global scale, thereby reconciling the three sides of the aforementioned trilemma and becoming significant challengers to the current status quo of financial networks, while also offering the specialization features needed to serve other use cases like gaming.
This article will be split into several parts. Part 1 is a general overview of blockchain technology, and why it needs scaling.
Part 2 will then address each scaling solution separately with a focus on rollups, their characteristics, specialities, limitations and how Atlendis can benefit layer 2 scaling solutions. This discussion will conclude with a Part 3 on the current state of scaling and a view of what the future of scalability and blockchain specialization may look like.
Why are scaling solutions needed and what are so-called “Layer 2s”?
Layer 2 is a collective term for solutions designed to help scale blockchain applications. This concept cannot be dissociated from the concept of Layer 1, or Mainnet, which in this article refers to Ethereum, and serves as the foundation for Layer 2s.
To understand why and how to scale Ethereum, one must first understand the general concept of a blockchain, what blocks and transactions are, and which part of the transaction life cycle can be acted on to improve the blockchain’s efficiency.
a. Brief reminder on the blockchain and transactions
Ethereum is a decentralized digital ledger, or blockchain, which can indeed be seen as a chain of blocks. It can also be described as a giant transactional database that is decentralized in terms of both governance and data storage. It is run by a set of nodes (called validators or miners depending on the network), which act as hosting servers. Anyone can set up a node and take part in the governance of the blockchain, and running a node is the only way to take part in updating it.
Each block is a bundle of transactions picked from the mempool (sometimes called the transaction pool or transaction queue), a sort of waiting line hosted on each node where newly created transactions sit before being added to a block. Transactions in the mempool are thus in a pending state. Once bundled up and processed by the block producers of the network (miners, validators or operators; the terminology varies depending on the type of chain), transactions are published along with the new blockchain state.
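The selection of pending transactions can be illustrated with a deliberately simplified sketch. It assumes a block producer that greedily picks the best-paying transactions until the block's gas budget is used up, which reflects the fee competition described earlier. The `PendingTx` class, the `build_block` function and all numbers are hypothetical; real clients apply additional rules (per-sender nonce ordering, for example) that are ignored here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PendingTx:
    sender: str
    gas_price: int  # Wei offered per unit of gas
    gas_limit: int  # maximum gas the transaction may consume

def build_block(mempool: list[PendingTx], block_gas_limit: int) -> list[PendingTx]:
    """Greedily pick the best-paying pending transactions that still fit in the block."""
    block: list[PendingTx] = []
    gas_used = 0
    for tx in sorted(mempool, key=lambda t: t.gas_price, reverse=True):
        if gas_used + tx.gas_limit <= block_gas_limit:
            block.append(tx)
            gas_used += tx.gas_limit
    return block

# Illustrative mempool: three simple transfers competing for two slots.
mempool = [
    PendingTx("alice", gas_price=30, gas_limit=21_000),
    PendingTx("bob", gas_price=50, gas_limit=21_000),
    PendingTx("carol", gas_price=10, gas_limit=21_000),
]
block = build_block(mempool, block_gas_limit=42_000)
print([tx.sender for tx in block])  # bob and alice fit; carol stays pending
```

In this toy example the lowest bidder is left in the mempool and has to wait for a later block or raise its gas price.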
Transactions themselves are the atoms of blockchains: the smallest units, which cannot be split up or interrupted. Transactions are cryptographically signed messages sent by accounts on Ethereum. An account can be either an Externally Owned Account (EOA) or a Contract Account (CA). An EOA is typically a wallet, and it can create new transactions. In contrast, a CA is ruled by code: it cannot start a transaction on its own, but responds to a transaction sent to it by executing its code, which may in turn call other accounts.
An account initiating a transaction broadcasts a message on the blockchain that will contain some or all of the following information:
|Field|Description|
|---|---|
|Nonce|a scalar value equal to the number of transactions sent from this address|
|GasPrice|the amount of Wei the sender is willing to pay per unit of gas used to complete the transaction|
|GasLimit|the maximum amount of gas the sender allows this transaction to consume|
|To|the recipient’s address|
|Value|the quantity of ETH to transfer from the sending to the receiving account|
|v, r, s|the components of the signature that identifies the sender of the transaction|
|Init (optional)|only filled, with the smart contract’s code, for a contract-creating transaction|
|Data (optional)|used for message calls when interacting with a contract that expects input fields, for instance a name for a character creation in-game|
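To make these fields more concrete, here is a minimal sketch of a transaction as a plain data structure. The class and method names are assumptions made for illustration, not the layout used by any Ethereum client, and the signature values are placeholders rather than a real signature.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Transaction:
    nonce: int          # number of transactions already sent from this address
    gas_price: int      # Wei paid per unit of gas
    gas_limit: int      # maximum gas this transaction may consume
    to: Optional[str]   # recipient address (None here stands for contract creation)
    value: int          # amount to transfer, in Wei
    v: int              # signature components (placeholders in this sketch)
    r: int
    s: int
    init: bytes = b""   # contract code, only for contract creation
    data: bytes = b""   # input for a message call to a contract

    def max_fee(self) -> int:
        """Upper bound on the fee: the sender never pays more than price * limit."""
        return self.gas_price * self.gas_limit

GWEI = 10**9
ETHER = 10**18

tx = Transaction(
    nonce=0,
    gas_price=20 * GWEI,
    gas_limit=21_000,
    to="0xRecipientAddressPlaceholder",  # hypothetical address
    value=1 * ETHER,
    v=0, r=0, s=0,
)
print(tx.max_fee())  # 420000000000000 Wei, i.e. 0.00042 ETH
```

The fee cap follows directly from the GasPrice and GasLimit rows of the table: 20 Gwei per unit of gas times 21,000 units of gas.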
As the transaction is validated and published on the blockchain, it triggers a new state for the sending and the receiving accounts and therefore, the blockchain as a whole.
- Blocks and state
Blocks are characterized by two parameters that are set at the protocol level as fundamentals of the blockchain: the block production frequency and the block size, that is, the amount of gas per block.
If transactions are the vectors, blocks are the groups of vectors that alter the system as a bundled update to the blockchain’s state.
Transactions in one block are processed one after the other and so are blocks on the chain, which gives Ethereum a sequential structure. As such, Ethereum can be seen as a chain of blocks as much as a chain of states.
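The "chain of states" idea can be sketched as follows. This is a toy model under strong assumptions: the state is just a dictionary of balances, a transaction is a plain transfer, and gas, nonces and signatures are ignored. The function names are illustrative.

```python
def apply_transfer(state: dict[str, int], sender: str, recipient: str, value: int) -> dict[str, int]:
    """Return a new state in which `value` has moved from sender to recipient."""
    if state.get(sender, 0) < value:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # the previous state is left untouched
    new_state[sender] = new_state.get(sender, 0) - value
    new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state

def apply_block(state: dict[str, int], transfers: list[tuple[str, str, int]]) -> dict[str, int]:
    """Apply the transfers of one block strictly in order."""
    for sender, recipient, value in transfers:
        state = apply_transfer(state, sender, recipient, value)
    return state

genesis = {"alice": 10, "bob": 0}
block_1 = [("alice", "bob", 4), ("bob", "carol", 3)]
state_1 = apply_block(genesis, block_1)
print(state_1)  # {'alice': 6, 'bob': 1, 'carol': 3}
```

The second transfer only succeeds because the first one was applied before it, which is why the order of transactions within a block, and of blocks within the chain, matters.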
In the context of Ethereum, the state is a data structure that links all blocks together. The state keeps all accounts’ states linked by hashes (a hash being the string of characters produced by a hash function) to guarantee the integrity of the entire system and ensure consensus on the network state. In Ethereum, the state structure takes the form of modified Merkle Patricia Tries, a kind of hash tree.
That tree structure is present in many places in the blockchain, because it ensures that any change made to any entry of the database changes the hash of the branch above it, and so on up to the top of the data tree, thus changing the root hash that summarizes the whole state.
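The propagation of a change up to the root can be demonstrated with a much simpler structure than the one Ethereum uses. The sketch below builds a plain binary Merkle tree with SHA-256 as a stand-in; Ethereum's actual tries and hash function differ, and the account encoding is made up for the example.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree (an odd node is paired with itself)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical "account:balance" entries.
accounts = [b"alice:10", b"bob:0", b"carol:5", b"dave:7"]
root_before = merkle_root(accounts)

accounts[1] = b"bob:1"  # change a single entry
root_after = merkle_root(accounts)

print(root_before.hex() != root_after.hex())  # True
```

Changing one balance is enough to change the root, so comparing a single 32-byte value is sufficient to detect that two copies of the data differ.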
To simplify, each block has a header at the top, followed by the list of transactions included in that block. The header summarizes the current state of Ethereum; once this block is processed, the next block carries a new header that reflects the new transactions and the new state of Ethereum.
Headers are therefore a lightweight way to obtain a summary of Ethereum’s state.
It is worth noting that, since externally owned accounts (EOAs) are the only accounts that can initiate transactions, the state of Ethereum is the direct reflection of their actions.
In theory, the chain is infinite and all the data of a past state of the blockchain is transparent, available and accessible to the public at any time using a blockchain explorer such as Etherscan.
However, because accessing the history of transactions since Ethereum’s genesis is rarely necessary, different types of nodes run the blockchain: full nodes, archival nodes and light nodes.
To simplify, a full node is sufficient to run the blockchain, and all it needs is the latest block and the head state (see section 3) of Ethereum. The data needed to run such a node is already large, currently around 500 GB. As a general rule, full nodes can prune old blockchain data and retain only what is necessary. Light nodes, by contrast, only download the chain of headers from the genesis block to the current head block, without executing any transaction.
Archival nodes, on the other hand, download all the blocks since Ethereum’s genesis and are therefore much larger, currently sitting at around 10 TB.
Light nodes are one way to lighten the data burden by keeping only the critical data available, for as long as it is relevant.
This is where the complexity of scaling starts to reveal itself, though. Transaction throughput can be raised by increasing block production frequency and block size, but this would increase each node’s bandwidth and storage requirements. Considering the already substantial amount of data stored on Ethereum’s full nodes (at some point these 500 GB must be downloaded, along with all data added afterwards), it is unrealistic to go down that route to scale Ethereum if we want anyone to be able to run a node.
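A back-of-the-envelope calculation shows why. The sketch below estimates how fast the chain grows for a given block size and block interval. The 100 kB average block size is a purely hypothetical figure chosen for the example, not a measured Ethereum value; only the ratio between the two scenarios matters.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def yearly_chain_growth_gb(block_size_bytes: int, block_time_s: float) -> float:
    """Data added to the chain in one year, in gigabytes."""
    blocks_per_year = SECONDS_PER_YEAR / block_time_s
    return blocks_per_year * block_size_bytes / 1e9

# Hypothetical baseline: 100 kB blocks every 12 seconds.
baseline = yearly_chain_growth_gb(block_size_bytes=100_000, block_time_s=12)

# "Scaled" chain: blocks twice as large, produced twice as often.
scaled = yearly_chain_growth_gb(block_size_bytes=200_000, block_time_s=6)

print(f"baseline: {baseline:.1f} GB/year, scaled: {scaled:.1f} GB/year")
print(f"growth factor: {scaled / baseline:.1f}x")
```

Doubling both parameters quadruples throughput, but it also quadruples the data every node has to download and store.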
One can also see below that data availability becomes challenging as soon as the scaling solution involves additional blockchains, and this is where scaling solutions often diverge.
- Blockchain fundamental mechanisms
To understand how scaling has been undertaken and why scaling solutions are what they are, one more aspect of blockchains must be briefly reviewed.
Most public blockchains process transactions relying on three main layers or lego blocks:
- The data availability layer is where the state and history of transactions are stored. It is the database of the chain, which makes the states publicly available in an immutable way. One could see it as the storage or the bookkeeping system.
- The consensus layer ensures that blocks are distributed to validators on the network and validated according to the rules. One could see it as the anti-corruption system.
- The execution layer manages the computation and executes smart contracts. Layer 2s excel at this, because they were designed to take that heavy execution part off mainnet.
As a general rule, scaling consists either in working on each of these layers independently, or in sticking to a monolithic architecture and scaling all of them together.
Most of the scaling solutions presented in this article introduce additional blockchains that are each composed of these layers, but the layers are intertwined between blockchains to leverage each other’s assets. For instance, rollups have a consensus layer with nodes run by operators, and they work in harmony with validators from Ethereum mainnet to leverage its consensus layer.
b. The concept of monolithic scaling
Some blockchains are labeled as monolithic because they choose to keep scaling the nodes themselves, keeping all three components (data availability, consensus and execution) in one chain. Monolithic blockchains form large, all-in-one structures that can often handle a much higher transaction throughput.
Solana, for instance, claims a maximum of over 50,000 transactions per second (TPS), but scaling nodes comes at the cost of increasingly expensive hardware, fewer validating nodes and consequently, more centralization. Scaling the three layers together indeed comes with trade-offs. Solana is run by about 1,500 validators, but the 20 largest ones control more than 33% of the network’s total stake, potentially giving them the power to halt the network if they unite.
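Stake concentration of this kind can be measured by counting how few validators are needed to exceed one third of the total stake. The sketch below uses a made-up stake distribution, not Solana's actual figures, and the function name is illustrative.

```python
def validators_needed_to_halt(stakes: list[float], threshold: float = 1 / 3) -> int:
    """Smallest number of validators whose combined stake exceeds `threshold` of the total."""
    total = sum(stakes)
    if total <= 0:
        raise ValueError("total stake must be positive")
    running = 0.0
    # Start with the largest validators, since they reach the threshold fastest.
    for count, stake in enumerate(sorted(stakes, reverse=True), start=1):
        running += stake
        if running > threshold * total:
            return count
    raise ValueError("threshold cannot be exceeded")

# Hypothetical network: four large validators and fifty small ones (total stake 100).
stakes = [20, 15, 10, 5] + [1] * 50
print(validators_needed_to_halt(stakes))  # 2
```

In this toy distribution two validators out of fifty-four are enough to cross the one-third mark. The lower that number, the more centralized the network is in practice, regardless of how many validators it has in total.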
With that said, projects like Neon Labs aim to make Solana EVM-compatible and thus allow Ethereum-based applications to scale on Solana, so it would be unfair to leave it at that and call it monolithic forever. Solana might have chosen a different path, but it is still an important and useful block in the general blockchain ecosystem.
Scaling up validating nodes is costly, work-intensive, and not accessible to the public. Hardware cost is one impediment, but so are the technological challenges and R&D costs that come with increasing processing power, as described by Rock’s law. This is therefore an effective way to increase throughput in the short term, but it is likely to be unsustainable over the long term and, above all, it goes against the Ethereum community’s emphasis on the idea that anyone should be able to run a node on their laptop.
However, when scaling with the help of extra blockchains, whether sub-chains, side-chains or rollups, the plurality of states (each chain has its own state) must be consolidated, and the general state must preserve the integrity of each sub-state.
This is why involving additional chains poses two major challenges:
- How to ensure security or correctness of transactions?
- How to ensure data availability?
Although ZK-Rollups (ZKR) and Optimistic Rollups (OPR) are secured by Ethereum, using ZK-Proofs (also called Validity Proofs) and Fraud Proofs respectively (these terms will be explained shortly) to ensure the correctness of transactions posted on mainnet, they are still separate chains, which poses the problem of data availability.
What if a malicious operator (an actor running a verification node) shuts down their server so that the data of the rollup disappears? Users of the rollup would lose access to the data and could not withdraw to Layer 1, or mainnet, without the detailed rollup state and the state of their account.
Part 1 conclusion:
Ethereum is a blockchain, that is, a decentralized database where users interact with the chain through accounts and create transactions, the atoms of the blockchain, to exchange data from one account to another. Each account is linked to its account state, and that data is stored in the form of a data tree that aggregates into a general state, also called the world state. Transactions are bundled into blocks, hence the name blockchain, and blocks are published by validators of the network who run nodes.
To guarantee the security and decentralization of the network, the Ethereum community emphasized that anyone should be allowed to run a node, which limits possibilities to scale the nodes themselves to increase block frequency and size.
Scaling a blockchain means scaling one or all of the three fundamental mechanisms it relies on: the consensus layer that ensures that block validation rules are respected, the data layer that ensures the storage and availability of the blockchain’s state data, and the execution layer that handles smart contracts’ computation and execution.
Part 2 will show that layer 2 scaling solutions rely on intertwined blockchains that share and leverage each other’s fundamental mechanisms to reach a general consensus across layers, aggregate available data throughout the entire ecosystem to ensure the correctness of states, and optimize smart contract execution to improve performance.