0G
Sep 5, 2024
0G recently announced its whitepaper, a comprehensive document designed to address the complex challenges surrounding high-performance data availability in decentralized networks.
While the whitepaper is viewable here, this article serves as a simplified overview.
For those wishing to navigate the whitepaper on their own, we recommend:
Begin with the Abstract and Introduction: These sections provide a concise summary of the data availability problem and 0G’s approach to solving it.
Explore the System Design: The “Overview of the System Design” section provides a high-level view of how 0G’s architecture is structured, including the relationships between 0G Storage, 0G DA, and 0G Consensus.
Dive into the Technical Details: For those interested in the technical implementation, the sections on “0G Storage” and “Log System Protocol” break down how data is stored, accessed, and validated.
Review the Incentive Mechanism: The “Incentive Mechanism” section is particularly important for potential node operators and those interested in the economic aspects of the network, covering 0G’s Proof-of-Random-Access consensus and associated information.
By following this roadmap, you can efficiently navigate the 0G Whitepaper and focus on what’s most relevant to you, whether you’re a developer, researcher, or potential partner.
And if you’re interested in building with 0G, reach out here.
Context: Why 0G?
The need for 0G arises from the rapidly growing volume of on-chain data and the increasing demand for advanced applications, such as decentralized AI and high-frequency trading. Current data availability solutions offer some relief but come with significant limitations, including high latency, constrained throughput, poor interoperability, and cost inefficiencies. These limitations hinder their ability to support the next generation of decentralized applications.
For instance, Celestia's approach, which broadcasts data to all consensus nodes, caps throughput at 10 MBps. Additionally, Celestia's light client model raises concerns about profitability, and the existing codebase may require substantial updates to remain competitive.
0G addresses these challenges by offering an infinitely scalable AI operating system built on a modular, layered architecture.
This architecture not only provides robust data storage and availability solutions but also scales to meet the demands of advanced use cases, many of which we covered here.
Now, with this context, let’s begin with 0G’s system design.
Section 1: System Design
The design of 0G revolves around a modular and layered architecture, which forms the backbone of its scalable and secure data availability layer, referred to as “0G DA.” This architecture is built on top of a decentralized storage network known as “0G Storage.” The modular approach allows each component to function independently yet cohesively, enabling efficient querying and verification of data directly from 0G’s internal system, ensuring high performance and reliability.
Before 0G Storage stores data, the data is first erasure-coded, meaning that it’s fragmented into smaller pieces with redundant elements and distributed across multiple storage locations. This enables fast recovery if any storage node fails. A Merkle tree is then constructed over the data, and its root is submitted to 0G’s consensus layer (“0G Consensus”), which cryptographically records the order in which data enters the system and ensures fast, accurate retrieval. The chunks of data are then sent to various Storage Nodes, which are covered below.
The interaction between 0G Storage and 0G Consensus operates as follows:
Data is erasure-coded and stored in 0G Storage.
0G Consensus manages the communication and synchronization of data through a specialized data publishing lane, which ensures that only essential information flows through the consensus process, optimizing performance and scalability.
Data is erasure-coded and then stored in 0G Storage, communicating with 0G Consensus via the data publishing lane.
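To make this pipeline concrete, here is a minimal Python sketch. The single XOR parity chunk and the SHA-256 Merkle tree are illustrative stand-ins, not the actual erasure code or commitment scheme 0G Storage uses, and all names here are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, purely for illustration

def erasure_code(data: bytes) -> list[bytes]:
    """Fragment data into fixed-size chunks plus one XOR parity chunk,
    so any single lost chunk can be rebuilt from the others."""
    chunks = [data[i:i + CHUNK_SIZE].ljust(CHUNK_SIZE, b"\0")
              for i in range(0, len(data), CHUNK_SIZE)]
    parity = bytes(CHUNK_SIZE)
    for c in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, c))
    return chunks + [parity]

def merkle_root(chunks: list[bytes]) -> bytes:
    """Hash chunks pairwise up to a single root; the root (not the data
    itself) is what the consensus layer records."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

chunks = erasure_code(b"hello 0G storage")
print(f"{len(chunks)} chunks, Merkle root {merkle_root(chunks).hex()[:16]}...")
```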
The architecture is highly scalable: 0G Consensus can comprise an arbitrary number of independent consensus networks (e.g., 1, 100, or even 1,000). These networks operate in parallel, scaling with demand, and each is responsible for managing data commitments and ensuring the integrity of stored data.
A major risk is that malicious Storage Nodes may collude when interacting with 0G Consensus. To mitigate this, a quorum design is used in which a verifiable random function (VRF) randomly selects nodes, making collusion impractical to coordinate.
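A rough sketch of what VRF-based quorum sampling looks like, using a keyed hash (HMAC) as a stand-in for a real VRF; a true VRF additionally produces a proof that anyone can verify against the node’s public key.

```python
import hashlib
import hmac

def select_quorum(nodes: list[str], seed: bytes, size: int) -> list[str]:
    """Rank nodes by a pseudo-random score derived from the round seed;
    no node can predict or bias its score before the seed is revealed."""
    scored = sorted(
        nodes,
        key=lambda n: hmac.new(seed, n.encode(), hashlib.sha256).digest(),
    )
    return scored[:size]

nodes = [f"storage-node-{i}" for i in range(10)]
print(select_quorum(nodes, seed=b"round-42-randomness", size=4))
```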
By distributing data across various partitions and allowing these partitions to be processed simultaneously by different consensus networks, 0G achieves virtually infinite scalability. This design ensures that the system can handle the growing demands of decentralized AI platforms, such as providing massive volumes of data for model training in real time.
Furthermore, 0G incorporates a shared staking mechanism across all its consensus networks. Validators stake funds on the Ethereum mainnet and subsequently validate across the various 0G Consensus networks. This shared staking approach ensures a consistent level of security across all consensus networks while leveraging Ethereum’s security guarantees.
This is illustrated below.
Validators stake $0G on mainnet, simultaneously validate various 0G Chains, and burn $0G on these networks in exchange for $0G on mainnet.
Section 2: The Log Layer
Next, we’ll cover 0G’s log system, which is an append-only system for unstructured data, responsible for organizing and storing data efficiently within 0G Storage.
Each data entry in the log system corresponds to a transaction and is stored in a sequential, append-only manner. This is akin to a traditional filesystem, where every log entry functions like a file that can be easily accessed and verified by any participant in the network. As a result, data submitted to 0G Storage remains readily available.
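A toy model of the append-only idea, assuming a simple hash chain for tamper-evidence; the actual protocol commits entries via Merkle structures recorded on 0G Consensus rather than a linear chain.

```python
import hashlib

class AppendOnlyLog:
    """Entries are only ever appended, and each entry commits to the
    previous one, so history cannot be silently rewritten."""
    def __init__(self) -> None:
        self.entries: list[tuple[bytes, bytes]] = []  # (payload, chained hash)

    def append(self, payload: bytes) -> int:
        prev = self.entries[-1][1] if self.entries else b"\0" * 32
        digest = hashlib.sha256(prev + payload).digest()
        self.entries.append((payload, digest))
        return len(self.entries) - 1  # the position acts like a file handle

    def read(self, index: int) -> bytes:
        return self.entries[index][0]

log = AppendOnlyLog()
i = log.append(b"model-weights-part-1")
assert log.read(i) == b"model-weights-part-1"
```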
There is also a Key-Value (KV) runtime that operates on top of the log system, handling structured data that can be updated by appending new log entries. We’ll cover this further in Section 4.
A Key-Value (KV) runtime operates on top of the Log Layer, which sends append-only updates to 0G Storage.
Section 3: Incentivization
To effectively incentivize miners to store data within the 0G network, the Proof of Random Access (PoRA) protocol is employed.
PoRA requires miners to respond to random queries related to specific chunks of archived data. The process involves miners loading the requested data, computing a hash, and determining whether the resulting output meets the target difficulty, defined by having a sufficient number of leading zeros (similar to PoW systems like Bitcoin). Since the challenge queries are random, miners can increase their chances of successfully mining rewards by dedicating more computing power and storing more data. This approach ensures that the most active and well-resourced nodes are appropriately rewarded for contributing to the network's data availability.
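A loose sketch of that mining loop follows. The chunk sizes, difficulty target, and hash construction are illustrative assumptions, not PoRA’s actual parameters; the key property shown is that the challenged chunk is derived from the randomness, so a miner can only respond if it actually stores that data.

```python
import hashlib
import os

DIFFICULTY_ZERO_BITS = 16  # illustrative target, far easier than production

def meets_target(digest: bytes, zero_bits: int) -> bool:
    """A digest 'wins' if its top bits are all zero, PoW-style."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - zero_bits) == 0

def mine(stored_chunks: list[bytes], seed: bytes, max_tries: int = 1_000_000):
    """Repeatedly answer random-access challenges until one clears the target."""
    for _ in range(max_tries):
        nonce = os.urandom(8)
        # The challenged position comes from the randomness, not the miner.
        pos = int.from_bytes(
            hashlib.sha256(seed + nonce).digest(), "big"
        ) % len(stored_chunks)
        digest = hashlib.sha256(seed + nonce + stored_chunks[pos]).digest()
        if meets_target(digest, DIFFICULTY_ZERO_BITS):
            return nonce, pos, digest
    return None

archive = [os.urandom(256) for _ in range(32)]  # stand-in for archived data
proof = mine(archive, seed=b"epoch-randomness")
if proof:
    print(f"chunk {proof[1]} mined, digest {proof[2].hex()[:16]}...")
```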
To maintain fairness among miners, particularly those with fewer resources, the mining range is capped at 8 TB of data. This means that even single-machine miners can compete effectively within a designated 8 TB data range, while larger operations with multiple machines can mine across several data ranges simultaneously. This design ensures that smaller miners have a fair opportunity to earn rewards, preventing larger operations from monopolizing the mining process.
Incentivizing nodes to share data is inherently challenging, especially since much of this activity occurs outside the direct oversight of the 0G Consensus. To address this, 0G has a built-in royalty mechanism that rewards nodes for data sharing. When a new PoRA mining proof is generated based on shared data, the original data provider receives a royalty. For example, if Node A shares data with Node B, and Node B uses that data to create a valid mining proof, Node A would be rewarded with a portion of the mining rewards.
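The royalty split might look like the following sketch, where the 10% rate is a hypothetical value chosen purely for illustration (the whitepaper’s actual parameters may differ).

```python
ROYALTY_RATE = 0.10  # hypothetical rate, for illustration only

def settle_reward(total_reward: float, prover: str, data_provider: str = "") -> dict:
    """Split a mining reward, diverting a royalty to the node that
    originally shared the data behind the proof (if any)."""
    if not data_provider:
        return {prover: total_reward}
    royalty = total_reward * ROYALTY_RATE
    return {data_provider: royalty, prover: total_reward - royalty}

# Node B mines with data shared by Node A; A earns a cut of the reward.
print(settle_reward(100.0, prover="node-B", data_provider="node-A"))
# -> {'node-A': 10.0, 'node-B': 90.0}
```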
Section 4: Key-Value Runtime
The Key-Value store provides a way to handle structured data that requires mutability (i.e., the ability to change or update data over time). This is essential for applications whose data changes, a property shared by virtually all use cases that 0G serves.
The KV store is built on top of the log layer, which is responsible for archiving unstructured data in a decentralized and append-only manner. The log layer ensures that the data remains consistent and immutable once written, while the KV store allows for the management of mutable data within this structure.
The KV store also supports transactional processing, which ensures that operations on the data are executed in a consistent manner, even when multiple users are making updates simultaneously. This is crucial for applications like collaborative editing (e.g. on-chain versions of Google Docs, Notion, and more). All KV store nodes in the network synchronize their states by processing the log entries, ensuring that each node has an up-to-date and consistent view of the data.
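A toy sketch of the KV-over-log pattern, assuming JSON-encoded updates for readability; the point it illustrates is that state is derived purely by replaying the log in order, so every node that processes the same entries converges on the same view.

```python
import json

class KVStore:
    """Mutability comes from replaying append-only updates in log order,
    never from rewriting history."""
    def __init__(self, log: list[bytes]) -> None:
        self.log = log

    def put(self, key: str, value: str) -> None:
        self.log.append(json.dumps({"k": key, "v": value}).encode())

    def state(self) -> dict[str, str]:
        # Every node replaying the same log derives the same state.
        view: dict[str, str] = {}
        for entry in self.log:
            update = json.loads(entry)
            view[update["k"]] = update["v"]
        return view

shared_log: list[bytes] = []
kv = KVStore(shared_log)
kv.put("doc-title", "draft")
kv.put("doc-title", "final")  # a later entry supersedes the earlier one
print(kv.state())             # {'doc-title': 'final'}
```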
Conclusion
Throughout this overview of the whitepaper, we've explored key aspects of 0G’s architecture, including the system design, the log layer, the incentive mechanisms, and the Key-Value runtime. Each of these components plays a critical role in ensuring that 0G can deliver a decentralized storage solution that is both efficient and reliable.
We encourage developers, researchers, and industry partners to dive deeper into the whitepaper and to consider the opportunities for collaboration. As 0G continues to evolve, ample opportunities exist to work together in making AI a public good.
If you would like to get involved, please reach out here.