The Evolution of Data Availability in Blockchain and What’s Next

0g

Aug 11, 2024

Data availability (DA) refers to the availability of relevant transaction data required to verify a block’s validity. Given its direct relevance to Ethereum’s scalability, performance, and potential use cases, it has become one of Web3’s most pressing infrastructure concerns.

Understanding how the DA space has evolved offers insight into where it is heading, including major use cases such as truly on-chain AI that are now becoming a reality.

In this article, we will cover:

  • Data availability and its significance

  • The emergence of data availability layers

  • Key limitations affecting industry advancement

  • New architectures for high-performance tasks

Early Days: 2017-2021

Even in Ethereum’s early days it was clear that scalability was a major concern, with the network hitting ~1M transactions per day in 2017 alone. 

Various solutions were proposed, including:

  1. Plasma: A framework for using child chains that periodically commit data to the main Ethereum chain.

  2. Sharding: Splitting the network into parallel shards so that transactions are processed horizontally rather than by every node.

  3. State channels: Off-chain transaction channels that settle on-chain when participants agree to close a channel. 

Layer 2 networks were ultimately chosen as the go-to solution, handling transactions off-chain and then posting compressed transaction data and state commitments to Ethereum so that honest execution can be verified. Optimism and Arbitrum are two examples that both launched in 2021 and maintain data availability by publishing their transaction data to Ethereum, where any full node may verify it.

Daily Layer 2 Transactions. Source: https://www.growthepie.xyz/fundamentals/transaction-count 

However, the number of daily transactions executed by Layer 2s has since grown immensely (see the chart above). Ethereum’s full nodes, responsible for storing the entirety of Ethereum’s transaction history, have therefore been receiving more and more data.

At this time, Ethereum’s EIP-4844 upgrade had not yet occurred, meaning that publishing Layer 2 transaction data to Ethereum was very expensive. This necessitated solutions that could efficiently prove data availability, known as data availability layers.

The Emergence of Data Availability Layers: 2021-2023

Between 2021 and 2023, various projects began developing these data availability layers (DALs), significantly reducing associated data availability costs.

To understand DALs, one must note that networks have four different layers:

  1. The Execution Layer: Executes valid transactions and smart contract operations.

  2. The Settlement Layer: Finalizes and records transactions to ensure immutability.

  3. The Data Layer: Stores and provides access to transaction data and blockchain state.

  4. The Consensus Layer: Ensures all nodes in the network agree on the current state of the blockchain and the order of transactions.

A monolithic blockchain is one where the network's full nodes handle all four tasks. Ethereum is an example.

In contrast, modular blockchains separate the data layer and consensus layer from execution and settlement, providing a more cost-effective means of storing and verifying data. Celestia launched in late 2023 as an early pioneer in this space, using data availability sampling for cost-effective data verification that anyone can contribute to without needing a full node.

In short, this works by having light clients (lightweight alternatives to full nodes) each download small, randomly chosen pieces of a block's data. If enough of these samples succeed, the clients can conclude with high probability that the full block data has been published.
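
The sampling argument can be made concrete with a little probability. The sketch below is illustrative only and is not a description of Celestia's actual implementation: it assumes each light client requests chunks chosen independently and uniformly at random, and that a block producer has withheld a fixed fraction of the chunks. The function names and the 25% figure used in the usage note are hypothetical choices for the example.

```python
import math


def detection_probability(withheld: float, samples: int) -> float:
    """Chance that at least one of `samples` random requests hits a withheld chunk.

    Assumes independent, uniform sampling (an assumption of this sketch).
    """
    if not 0.0 <= withheld <= 1.0:
        raise ValueError("withheld must be a fraction between 0 and 1")
    if samples < 0:
        raise ValueError("samples must be non-negative")
    # Each sample misses the withheld region with probability (1 - withheld).
    return 1.0 - (1.0 - withheld) ** samples


def samples_needed(withheld: float, confidence: float) -> int:
    """Smallest sample count whose detection probability reaches `confidence`."""
    if not 0.0 < withheld < 1.0:
        raise ValueError("withheld must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - withheld)^k >= confidence for the integer k.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - withheld))
```

Under these assumptions, a single client checking for a block with 25% of its chunks withheld would need `samples_needed(0.25, 0.99)` = 17 samples to reach 99% confidence. Many clients sampling independently push the combined confidence higher still, which is why no individual client needs to download the whole block.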

Other DAL solutions such as NEAR, EigenDA, and Avail have also arrived. Each DAL takes a unique approach, such as NEAR relying upon sharding and EigenDA using data availability committees. 

As we’ll see, each has limitations that severely restrict both performance and potential use cases.

DAL Limitations: 2023 Onwards

We are now at an interesting point in time where various DALs have launched and have others using them as infrastructure, but their technology still falls well short of what future use cases require.

Consider Celestia’s performance bottlenecks. These include a 12-second block time (meaning high latency) and throughput limited to 6.67 MB per second. And although NEAR’s throughput is over double that of Celestia, both projects are underwhelming and best suited for Layer 2s with limited performance needs.

As the number of Web3 use cases expands, so too do the requirements to adequately process them on-chain in a timely, secure, and cost-effective manner. The reality is that existing DA solutions are still inadequate for this, and cannot support use cases such as:

  • On-chain AI: AI models must quickly access vast amounts of data at a low cost for model training, inference, and more.

  • High-Frequency Trading: Requires ultra-fast querying of order book prices, as well as fast transaction settlement.

  • Gaming: Needs fast access to data containing game states, assets, player interactions, and more.

  • Interoperability: Fast sharing of data between different blockchain networks.

  • Data Markets: Where users can store, update, and query data on-chain efficiently, unlocking new business revenues around data sharing and monetization.

Fortunately, given the importance of the DA space, it has received more widespread attention and many are racing to build the truly performant solutions needed.

It is also a vast market, meaning there will be no shortage of data availability needs and room for many solutions.

Infinitely Scalable DA: 2024 Onwards

The DA space is racing to address the use cases above, and more. The winners in the space will provide a solution that’s:

  1. Fast: Data should be queryable or verifiable in under a second, similar to Web2 data infrastructure solutions like AWS.

  2. Interoperable: Capable of supporting EVM and non-EVM networks.

  3. Highly Scalable: Able to interact at a global scale with billions (or more) of transactions at any given moment in time. This is especially so for AI or high-frequency trading.

None of the early DA solutions can support this at present, though they have bright teams working hard on it.

Meanwhile, 0G has raised $35M in a pre-seed funding round for a new data availability architecture designed to address the above.

We covered this in depth in our blog post here, but at a high level, 0G has a built-in data storage system that can be quickly queried for fast data availability at ~10-30 MB per second, or 24x faster than Celestia. It is also infinitely scalable, as more consensus networks can be added, with each handling transactions in parallel. The priority has been AI projects, as they in particular require a DA solution like 0G, and we are focused on making AI a public good.

The tech is thus far highly promising, which is why investors have been so attracted to the project. However, the team also realizes that this is a major market with room for many, and hopes to see other promising solutions proposed so that we can advance the space together.

All in all, we hope that the above provides useful context around the data availability landscape.

If you’d like to join our ecosystem, shoot us a message! 

And follow us on Twitter for up-to-date announcements.


© 2024 0G. All rights reserved.
