
How Web3 and AI can work together


0g

Sep 16, 2024

Web3 x AI has turned into quite a hot topic recently; my inbox has been flooded with newsletters, event invitations, and exciting updates from projects building in the space. New visionaries are hosting podcasts promising us that this trend is real and here to stay.

All of this hype raises the question: what is Crypto x AI? 

This post is an attempt to answer that question, while also covering three key trends we're bullish on: 

  • Tokenization for improved data, 

  • Decentralized compute networks, and 

  • Anti-monopolization via decentralization. 

The way we see it, Crypto x AI is the set of places where Web3 can enable smarter, better, and faster AI: the places where we in the blockchain world can work with the AI world to build things that couldn’t be achieved without Web3.

To see where this overlap exists, we should focus on what Web3 enables: aligning financial incentives among distributed and untrusted parties. With this, we can refine our question: where does the AI world require the trustless, transparent, or more aligned incentive structures that we in Web3 can make possible?

Well, to get an AI model to work, you need three things:

  • A model architecture, 

  • Data to feed into that model, and 

  • Compute (GPUs) to run that model on. 

So, when we discuss enabling superior incentive systems, we should aim to improve these three categories. This is already becoming a reality in three areas:

  1. Tokenization for improved data, 

  2. Decentralized compute networks, and 

  3. Anti-monopolization via decentralization. 

We’ll explain why you should be working towards improving these areas as well.

Tokenization For Improved Data

When we think about tokenization and AI, one of the things we’re most bullish on is using tokenization to improve the financial incentives for acquiring quality data. One of the most important ways to improve model quality is by gathering more high-quality data. This is where we believe decentralized data-gathering protocols have a decisive edge. Collection and labeling tasks scale very well to decentralized labor pools, and we see a world where the data to train foundation models could come from a collection of individuals rather than corporate data vaults. When you think about what it takes to label data, it’s fairly simple: anyone with a cell phone and an internet connection can start labeling.

Additionally, trustless incentive systems make it possible to reward data providers in proportion to the value of the data they contribute. This means you can build datasets that are not just large but consistently high-quality, which leads to better models.

Several decentralized protocols are building out the tech to have decentralized workforces solve these problems while correlating their economic incentives directly to the value added by their contributed data. At 0G, we’re excited about this trend, and we’re helping to accelerate it with our next-generation data storage layer. This infrastructure ensures that decentralized data protocols can seamlessly deliver high-quality data to AI companies, offering the speed and performance required to build and train AI models effectively.
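As a toy illustration of the value-correlated rewards described above, consider a pro-rata payout from a shared reward pool. All names here are hypothetical, and a real protocol would compute quality scores trustlessly (e.g. via cross-labeler agreement) and settle payouts on-chain; this sketch only shows the incentive math:

```python
# Hypothetical sketch: split a reward pool among data contributors in
# proportion to the assessed value of the data each one provided.
# "value_scores" is a stand-in for a trustless quality metric, e.g.
# agreement with other labelers or measured downstream model improvement.

def distribute_rewards(value_scores: dict[str, float], pool: float) -> dict[str, float]:
    """Pay out `pool` tokens pro-rata by each contributor's value score."""
    total = sum(value_scores.values())
    if total == 0:
        # No measurable value contributed: nobody gets paid.
        return {who: 0.0 for who in value_scores}
    return {who: pool * score / total for who, score in value_scores.items()}

# Example: three labelers whose contributions were scored 60/30/10.
scores = {"alice": 60.0, "bob": 30.0, "carol": 10.0}
payouts = distribute_rewards(scores, pool=1000.0)
# → {"alice": 600.0, "bob": 300.0, "carol": 100.0}
```

Because payouts track the value score rather than raw volume, flooding the pool with low-quality labels earns little, which is exactly the alignment property the paragraph above describes.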

Decentralized Compute: Costs Down, Capabilities Up

One of the highest costs of AI is the immense computing power and energy required to train and serve models. 

For example, Meta’s Llama 3 took 16,000 H100 GPUs 30 days to train. This represents a $1B investment in computing hardware (GPUs plus networking chips), and every month of training costs $20M in energy, burning 1.6 billion kWh. That’s equivalent to running a quarter of the most powerful nuclear reactor ever built for a month at a time to train a single model. This is representative of a growing problem. The International Energy Agency (IEA) estimates that meeting the booming AI demand will require about 10% of all energy produced in the U.S. This is a massive increase in our electricity consumption, and we’re not ready for it.

This is a massive problem, but it is also one where the decentralization of AI compute can help, in terms of both a) electricity and b) AI chips. In fact, much of the additional energy we need is already out there: the University of Cambridge estimates that annualized Bitcoin mining uses 148 TWh, substantially more than the existing major clouds consume. Best of all, it’s cheap: Bitcoin miners typically pay 4-10x less for electricity than their centralized counterparts.

On the chip supply side, there is room for growth: experts estimate that NVIDIA made between one and ten million enterprise-grade A100/H100 chips in the last two years, which is not enough to meet existing demand, let alone what is coming down the pipe. However, the consumer cards miners favor are much more abundant, with 8.1M produced in Q1 2024 alone. These cards generally have fewer matrix-multiply units, but they can still contribute meaningfully to assembling enough hardware to run AI at scale.

This is where Web3 infrastructure can help with AI’s massive computing problem: there are many compute arbitrage opportunities that are too small for large companies to exploit but well suited to crypto-miner-style operations. Many smaller strategies can scale to $200k-$10M/year of compute spend, but not beyond that. This is perfect for protocols that construct better incentive systems to bridge the gap between Bitcoin miners and AI companies. We’re also bullish on Proof of Useful Work as a possible solution, which is what our inference serving platform is building towards.
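To make the Proof of Useful Work idea concrete, here is a deliberately simplified sketch of one common building block: randomly spot-checking a worker’s claimed results. This is an illustration of the general technique, not 0G’s actual protocol; real schemes layer cryptographic commitments and economic penalties (slashing) on top of it.

```python
# Simplified illustration of spot-check verification for outsourced compute.
# A worker claims results for a batch of tasks; the verifier recomputes a
# random sample and accepts the batch only if every sampled result matches.
import random

def run_task(x: int) -> int:
    """Stand-in for a deterministic, verifiable unit of useful work."""
    return x * x

def spot_check(tasks, claimed, sample_size=3, seed=0):
    """Recompute a random subset of tasks and compare with claimed results."""
    rng = random.Random(seed)
    sample = rng.sample(range(len(tasks)), k=min(sample_size, len(tasks)))
    return all(claimed[i] == run_task(tasks[i]) for i in sample)

tasks = list(range(10))
honest = [run_task(t) for t in tasks]  # correct results always pass
cheating = honest[:]
cheating[4] = -1                       # one falsified result

# An honest batch always passes; a cheater is caught with probability that
# grows with sample_size (checking all 10 tasks here guarantees detection).
```

The design point is that verification is much cheaper than the work itself: the verifier recomputes only a sample, and the threat of losing a staked deposit makes even a small detection probability economically decisive.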

Better AI-Societal Alignment

There is a structural misalignment between what is best for our society and the status quo in AI, where a few centralized, monopolistic players (OpenAI, Anthropic, Google, Microsoft) control the flow of state-of-the-art models. Their stated goal is to create Artificial Superintelligence (ASI), meaning a system that outperforms humans on arbitrary tasks. If this centralized ASI becomes a reality, it’s a problem for the rest of us: the system’s economic output would quickly outstrip our own, leaving the people who run these organizations in a position of incredible power and influence with few checks and balances.

The fundamental barrier to competing with these centralized players is economic: they can afford multi-billion-dollar investments to drive this future, while other players cannot. In previous waves of the internet, a good idea was a good idea; the fixed costs of entry were not tied to investment in physical infrastructure. Google famously started on a handful of commodity servers. In current AI systems, infrastructure investment is by far the dominant cost, which makes it very hard to break into the market and easy for incumbents to conspire to suppress innovation.

Decentralization can offer a path out of this. We want a future where many different users and ecosystem players have aligned interests and a way to contribute to training and deploying models, supplying the data, expertise, and compute power needed so that the best foundation models and inference platforms exist on crowdsourced, fairly distributed infrastructure.


© 2024 0G. All rights reserved.
