Introduction
Layer 2 rollup data compression is the technical mechanism by which rollups batch transactions off-chain, compress transaction data, and submit a compact summary to the Ethereum main chain, thereby reducing congestion and lowering gas fees for end users.
The Core Problem: Ethereum's Data Bottleneck
Ethereum’s base layer processes roughly 15–30 transactions per second (TPS), a limit imposed by the block gas limit and propagation constraints. Every transaction must be stored in perpetuity, consuming block space that costs gas. As decentralized finance (DeFi) and non-fungible token (NFT) activity surged, users faced exorbitant costs, often exceeding $50 per simple swap during peak times. Layer 2 rollups emerged to solve this by moving execution off-chain while inheriting Ethereum’s security guarantees. The crucial enabler is data compression: rollups pack hundreds of transactions into a single compressed batch, then publish only the essential data to L1. Without compression, the L1 calldata cost alone would negate much of the efficiency gain. Optimistic and zero-knowledge (ZK) rollups both rely on compression, though they differ in how they validate the compressed data. Industry estimates suggest compression reduces L1 data per transaction from roughly 500 bytes to fewer than 20 bytes in some implementations, a 25x improvement.
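To make the cost argument concrete, here is a minimal Python sketch of how L1 calldata gas scales with payload size. It assumes the EIP-2028 prices of 16 gas per non-zero byte and 4 gas per zero byte, ignores the 21,000-gas base transaction cost and any later repricing, and treats every byte as non-zero (a worst case). The name `calldata_gas` is illustrative, not taken from any rollup codebase.

```python
# Calldata pricing assumed from EIP-2028: 16 gas per non-zero byte, 4 per zero byte.
GAS_PER_NONZERO_BYTE = 16
GAS_PER_ZERO_BYTE = 4


def calldata_gas(payload: bytes) -> int:
    """Return the calldata gas charged for publishing `payload` on L1."""
    zero_bytes = payload.count(0)
    nonzero_bytes = len(payload) - zero_bytes
    return zero_bytes * GAS_PER_ZERO_BYTE + nonzero_bytes * GAS_PER_NONZERO_BYTE


# Worst case (all bytes non-zero) for the two per-transaction sizes quoted above.
uncompressed_gas = calldata_gas(b"\x01" * 500)  # 500 * 16 = 8000
compressed_gas = calldata_gas(b"\x01" * 20)     # 20 * 16 = 320

print(uncompressed_gas, compressed_gas, uncompressed_gas / compressed_gas)
# prints: 8000 320 25.0
```

Under these assumptions the gas saving tracks the byte saving exactly; real payloads contain zero bytes, so the actual ratio depends on the data.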
How Data Compression Works in Rollup Architectures
Compression Strategies in Optimistic Rollups
Optimistic rollups, such as Arbitrum and Optimism, assume transactions are valid by default and rely on fraud proofs. Data compression here focuses on minimizing calldata sent to L1. The process begins off-chain: a sequencer collects user transactions and constructs a batch. Instead of sending full transaction data (sender, receiver, value, signature, nonce, and raw input), the rollup applies several compression techniques:
- State diffs instead of raw inputs: Rather than transmitting the entire transaction, the rollup sends only the net changes to accounts and storage slots. For example, a token transfer might be represented as “address A: balance -100; address B: balance +100” rather than the full transfer instruction. This works only where verifiers do not need the original inputs; a design whose fraud proofs re-execute transactions must still publish those inputs.
- Differential encoding: Consecutive transactions often share fields, such as the same nonce prefix or gas price. Differential encoding stores only the change from the previous transaction, slashing redundancy.
- Variable-length integer encoding: Small numbers (common in balances and nonces) are encoded in fewer bytes using techniques like protobuf-style varint encoding.
- Signature batching: Where the design supports it, an aggregated signature (e.g., a BLS multi-signature) replaces the individual signatures, saving roughly 65 bytes per transaction (the r, s, and v values of an ECDSA signature).
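Of the techniques listed above, variable-length integer encoding is the easiest to show in a few lines. The following is a sketch of the unsigned varint scheme used by Protocol Buffers (7 payload bits per byte, high bit set when more bytes follow). The function names are illustrative, and no particular rollup is claimed to use exactly this format.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    if value < 0:
        raise ValueError("this sketch only supports non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)         # final byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at `offset`; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


# A small nonce fits in one byte instead of a fixed 8- or 32-byte field.
print(encode_varint(5).hex())    # prints: 05
print(encode_varint(300).hex())  # prints: ac02
```

Values below 128 take a single byte, which is why the scheme suits nonces and small balances; very large values cost slightly more than a fixed-width field.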
After compression, the batch is published as calldata in an L1 transaction. The sequencer also posts a state root, which is used by verifiers to detect fraud. A typical Optimism batch might contain 10,000 transactions but occupy only 50–100 kilobytes, yielding effective throughput of 2,000–4,000 TPS. For a deep dive on alternative approaches, research on Gas Fee Reduction Methods compares rollup compression with other on-chain scaling solutions.
Compression in ZK-Rollups
Zero-knowledge rollups, such as zkSync Era and StarkNet, use validity proofs (zk-SNARKs or zk-STARKs) to prove off-chain that a batch was executed correctly and submit only a succinct proof to L1. Data compression occurs at two stages. First, transaction data is compressed similarly to optimistic rollups. Second, the proof stands in for re-execution: a SNARK proof is typically a few hundred bytes regardless of how many transactions are batched (STARK proofs are larger, but still grow slowly with batch size). This means ZK-rollups can achieve a lower on-chain footprint per transaction. For instance, zkSync Era encodes tokens using a 16-bit index internally, mapping each token ID to a compact reference. Account addresses are also indexed: instead of sending a 20-byte Ethereum address each time, the rollup uses a local 4-byte alias. Standard transactions in zkSync Era can be as small as 12 bytes after compression, compared to roughly 100 bytes for a raw Ethereum transaction. The compressed batch is submitted to L1 along with the proof; the proof validates every transition. Because the L1 contract only needs to verify the proof, not replay every transaction, the gas cost stays low. Users ultimately pay for the aggregated proof verification plus the compressed calldata, which together often amount to less than $0.10 per transaction.
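The address-aliasing idea can be sketched as a small registry that hands out 4-byte indices on first sight of an address. This is an illustration only: the class `AddressRegistry`, the helper `encode_transfer_header`, and the byte layout are hypothetical and are not zkSync Era's actual encoding.

```python
from typing import Dict, List

ADDRESS_LENGTH = 20      # bytes in an Ethereum address
ALIAS_LENGTH = 4         # bytes in the local alias described above
TOKEN_INDEX_LENGTH = 2   # a 16-bit token index


class AddressRegistry:
    """Hypothetical registry mapping 20-byte addresses to 4-byte aliases."""

    def __init__(self) -> None:
        self._index_by_address: Dict[bytes, int] = {}
        self._address_by_index: List[bytes] = []

    def alias_for(self, address: bytes) -> bytes:
        if len(address) != ADDRESS_LENGTH:
            raise ValueError("expected a 20-byte address")
        index = self._index_by_address.get(address)
        if index is None:
            # First sighting: assign the next free index.
            index = len(self._address_by_index)
            self._index_by_address[address] = index
            self._address_by_index.append(address)
        return index.to_bytes(ALIAS_LENGTH, "big")

    def resolve(self, alias: bytes) -> bytes:
        return self._address_by_index[int.from_bytes(alias, "big")]


def encode_transfer_header(registry: AddressRegistry, sender: bytes,
                           recipient: bytes, token_index: int) -> bytes:
    """Pack sender alias, recipient alias, and token index (illustrative layout)."""
    return (registry.alias_for(sender)
            + registry.alias_for(recipient)
            + token_index.to_bytes(TOKEN_INDEX_LENGTH, "big"))


registry = AddressRegistry()
alice = bytes.fromhex("11" * 20)
bob = bytes.fromhex("22" * 20)
header = encode_transfer_header(registry, alice, bob, token_index=7)
print(len(header))  # prints: 10
```

Two addresses and a token reference occupy 10 bytes here instead of 40 or more. The saving depends on the registry itself being recoverable, so the mapping has to live in rollup state or be published once when an address first appears.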
Technical Details of Compression Algorithms
Rollups employ a mix of general-purpose and domain-specific compression algorithms. LZ4, Deflate (gzip), and Brotli are common general-purpose compressors applied at the batch level. These algorithms exploit patterns within raw transaction bytes, such as repeated opcodes, similar token transfers, or identical recipient addresses. However, general-purpose compressors alone are insufficient because fields like signatures and nonces vary from one transaction to the next. Hence, rollups pre-process data before applying standard compression. Pre-processing steps include:
- Delta encoding for numerical fields: Convert absolute values (e.g., nonce=1023, nonce=1024) to deltas (nonce increment=1). Deltas are smaller on average.
- Dictionary encoding for addresses: Map frequent addresses (e.g., major DEX contracts) to small integers. A mapping table is included in the batch header, then each occurrence uses the 2-byte index instead of 20 bytes.
- Run-length encoding (RLE): If the same field repeats across multiple transactions (e.g., same gas price), RLE stores the value once and a count.
- Canonical ordering: Sorting transactions by recipient address or token ID groups repeated patterns, improving compression ratios.
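Three of the pre-processing steps above (delta, dictionary, and run-length encoding) can be sketched in plain Python. The function names and sample values are illustrative. A production encoder would also serialize the results compactly (for example as varints) before handing them to a general-purpose compressor.

```python
from typing import Dict, List, Tuple


def delta_encode(values: List[int]) -> List[int]:
    """Replace each value with its difference from the previous one."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by accumulating a running total."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def dictionary_encode(addresses: List[bytes]) -> Tuple[List[bytes], List[int]]:
    """Return (table, indices): each distinct address is stored once."""
    table: List[bytes] = []
    index_of: Dict[bytes, int] = {}
    indices = []
    for address in addresses:
        if address not in index_of:
            index_of[address] = len(table)
            table.append(address)
        indices.append(index_of[address])
    return table, indices


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (value, count) pairs."""
    runs: List[Tuple[int, int]] = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


dex = bytes.fromhex("aa" * 20)     # stand-in for a frequently used contract
wallet = bytes.fromhex("bb" * 20)

print(delta_encode([1023, 1024, 1025]))           # prints: [1023, 1, 1]
print(dictionary_encode([dex, wallet, dex])[1])   # prints: [0, 1, 0]
print(run_length_encode([30, 30, 30, 45]))        # prints: [(30, 3), (45, 1)]
```

Each transform is lossless and cheap. The gain comes from turning large or repeated fields into small integers that the later compression stage handles well.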
Ethereum Research publications note that ZK-rollup compression can reduce calldata from 68 bytes per transaction (raw) to 12–16 bytes, a saving of more than 75%. StarkWare’s SHARP (Shared Prover) reports further gains by combining multiple batches into a single proof. Proto-danksharding (EIP-4844) goes further by introducing dedicated blob space for rollup data, which lowers the per-byte cost and removes the need to post batch data as calldata. Compression remains critical even with blobs, because each blob holds only about 128 KiB (4096 field elements of 32 bytes, slightly less once encoding overhead is counted) and the number of blobs per block is capped. For perspective on how compression enables interoperability across these systems, the topic of Layer 2 Cross Rollup Communication explores bridging compressed state between different rollup architectures.
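The blob capacity limit can be turned into a rough transactions-per-blob estimate. EIP-4844 defines a blob as 4096 field elements of 32 bytes each. The sketch below additionally assumes a simple packing that uses only 31 bytes of each element (so every element stays below the field modulus) and ignores batch headers and framing. Real rollup encodings differ, so treat the results as upper bounds under those assumptions.

```python
FIELD_ELEMENTS_PER_BLOB = 4096   # from EIP-4844
BYTES_PER_FIELD_ELEMENT = 32     # from EIP-4844
USABLE_BYTES_PER_ELEMENT = 31    # assumption: simple packing, top byte unused

raw_blob_bytes = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT
usable_blob_bytes = FIELD_ELEMENTS_PER_BLOB * USABLE_BYTES_PER_ELEMENT


def transactions_per_blob(bytes_per_transaction: int) -> int:
    """Upper bound on transactions in one blob, ignoring any framing overhead."""
    return usable_blob_bytes // bytes_per_transaction


print(raw_blob_bytes, usable_blob_bytes)  # prints: 131072 126976
print(transactions_per_blob(68))          # prints: 1867
print(transactions_per_blob(16))          # prints: 7936
print(transactions_per_blob(12))          # prints: 10581
```

Moving from 68 to 12 bytes per transaction raises the number that fit in one blob by more than five times, which is why compression still matters once blobs are available.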
The Role of Data Availability and Calldata
Data compression does not eliminate the need for data availability. Even with compressed batches, the entire compressed dataset must be published on-chain, or otherwise made available, so that anyone can reconstruct the L2 state. In optimistic rollups, data availability is mandatory because honest nodes must be able to compute the state from the published data to challenge fraudulent claims. In ZK-rollups, the validity proof alone ensures correctness, but the compressed data is still posted so that users can reconstruct the state and exit without relying on the operator. Some related designs (e.g., validiums) move data off-chain entirely, using external data availability committees. However, this trade-off weakens trustlessness. Permanent storage costs on L1 motivate aggressive compression, yet rollups must balance compression ratio with computational overhead. Heavier compression (e.g., full dictionary encoding) increases off-chain CPU time and may delay batch submission. Most production rollups use lightweight compression to keep batch windows to a few minutes. For instance, Arbitrum’s team states that compression reduces their L1 data footprint by roughly 60–70% compared to sending raw EVM call data. The saved gas is passed to users as lower fees. As of early 2025, typical L2 fees on major rollups remain below $0.01 for a simple transfer when traffic is low, thanks largely to data compression and the blob space introduced by proto-danksharding, which went live in March 2024.
Comparison: Optimistic vs. ZK-Rollup Compression
Both rollup types compress transaction data, but the economics differ. Optimistic rollups rely on total data availability: every byte of compressed data must be accessible for a challenge period (often 7 days). This limits how aggressively they can compress, because they must preserve enough information for fraud provers to reconstruct the state. Compression ratios for optimistic rollups typically range from 5x to 15x. In contrast, ZK-rollups can aim for higher ratios (up to 80x for transfers) because the validity proof eliminates the need for full replay. The proof itself encapsulates integrity. However, generating ZK-proofs is computationally heavy, orders of magnitude more expensive than fraud proof preparation, so the compression gain must offset the cost of proof generation. StarkWare reports that, after compression and proof generation, the total batch cost on L1 can be as low as 5,000 gas per transaction, versus roughly 21,000 gas for a base-layer ETH transfer. Both types offer enormous improvements over L1, but ZK-rollups achieve superior compression at the expense of higher off-chain hardware requirements.
Challenges and Emerging Solutions
Compression is not free. Encoding and decoding overhead can slow sequencer performance. Non-transfer transactions, such as complex DeFi swaps or contract interactions, contain non-repetitive calldata (function signatures, arbitrary inputs) that resists compression. Rollups handle this by falling back to generic compression for complex calls, sometimes achieving only 2x–3x ratios. Another limitation is the interaction between compressed state and cross-rollup bridges: a bridged message must be decompressed and re-compressed on the target layer, adding latency. zkSync’s recent upgrades introduce “state compression” for storage slots, using Merkle tree pruning to reduce the amount of data sent across layers. Meanwhile, new proposals like EIP-7623 aim to increase calldata cost to discourage bloat, further incentivizing rollups to compress aggressively. Developers are also experimenting with arithmetic coding for tuple payloads (sequences of addresses, amounts, and tokens), which can achieve near-optimal compression for structured transactions. While today’s rollups already process thousands of TPS, next-gen compression techniques could push TPS beyond 100,000 without sacrificing security.
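The gap between compressible and incompressible payloads is easy to observe with a general-purpose compressor. The sketch below uses Python's standard `zlib` (Deflate) on two synthetic payloads of equal length: 100 identical ERC-20 `transfer` calls, and a hash-derived byte stream standing in for high-entropy data such as signatures. Both payloads and the helper names are made up for illustration, and the ratios say nothing about any production rollup.

```python
import hashlib
import zlib


def pseudo_random_bytes(length: int, seed: bytes = b"demo") -> bytes:
    """Deterministic high-entropy bytes built from chained SHA-256 digests."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def compression_ratio(payload: bytes) -> float:
    """Original size divided by Deflate-compressed size (higher is better)."""
    return len(payload) / len(zlib.compress(payload, 9))


# ABI-encoded transfer(address,uint256): 4-byte selector + two 32-byte words.
transfer_selector = bytes.fromhex("a9059cbb")
recipient_word = bytes(12) + bytes.fromhex("11" * 20)
amount_word = (100).to_bytes(32, "big")
one_call = transfer_selector + recipient_word + amount_word  # 68 bytes

repetitive_payload = one_call * 100
high_entropy_payload = pseudo_random_bytes(len(repetitive_payload))

print(f"repetitive:   {compression_ratio(repetitive_payload):.1f}x")
print(f"high entropy: {compression_ratio(high_entropy_payload):.2f}x")
```

The repetitive payload shrinks many times over, while the high-entropy one does not shrink at all. Real calldata sits somewhere between the two, which is consistent with the modest ratios reported for complex calls.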
Conclusion
Layer 2 rollup data compression is the unsung hero of Ethereum scaling, enabling cost reductions of 10–100x through a combination of state diffs, dictionary encoding, delta encoding, and general-purpose compressors. As rollups mature, the focus is shifting to cross-layer solutions that further optimize data packaging and availability. Whether via optimistic or ZK-rollup routes, compression will remain the bedrock technology for making decentralized applications accessible to a global user base.