TiDB and the rise of the AI-native database

When enterprises talk about artificial intelligence, attention usually centers on the models: larger parameter counts, faster inference, cheaper tokens. But we at TiDB contend that this framing misses the most consequential change now underway in the industry.

In the AI era, data infrastructure — not AI models — will define the new competitive advantage. This parallels the development of databases themselves, which are undergoing their own fundamental transformation.

That belief is shaping the evolution of TiDB, our distributed open-source SQL database, also described as an AI-native database. It is designed not only to store data at scale but to support millions (yes, millions) of autonomous AI agents creating, querying, and discarding databases at machine speed.

The top-tier large language models will eventually be available to everyone. What will differentiate companies is not the model; it’s the data, and how efficiently you and your AI agents can use it.

For enterprise applications using agentic AI, “the real bottleneck is not the raw size of the underlying model but its ability to remember, retrieve, and reuse the right information at the right time,” co-author Blaize Stewart and I wrote in our recent report, “Agentic AI data architectures: How distributed SQL unifies enterprise scale and AI-native application design,” published by O’Reilly Media.

Persistent memory is the key to all of this. “If memory is what grants agents the capacity to act with persistence and purpose, then infrastructure is what determines whether that capacity can endure,” we wrote. “Memory cannot be left to chance or patched together from tools that were never intended to sustain it … Just as no chef can serve reliably in a kitchen without order, equipment, and flow, no agent can serve effectively without an environment built for memory.”

From big data to personal data at massive scale

For more than 20 years, enterprise scalability meant consolidating data into centralized systems: private storage arrays, Hadoop systems, then cloud data warehouses such as Snowflake or Databricks. That model assumed humans were the primary consumers of data and that schemas, queries, and workloads were carefully designed.

AI changes that assumption entirely. In the agent-driven world we describe, every person — and every enterprise customer — can have their own AI agents operating over their own data context. These agents don’t wait for centralized pipelines or curated datasets. They operate continuously, dynamically, and independently.

In the past, we didn’t have the computing power to extract value from everyone’s personal data. Now we do. Everyone will have an AI agent sitting on top of their own data.

That change has enormous implications for databases. For example, data is never discarded; it’s retained indefinitely. Schemas diverge rather than converge, and scale is measured not in terabytes but in the number of databases.

When AI agents become the primary database users

One of our most striking observations is that humans are no longer the primary users of databases.

In newer AI-native startups — including Manus, an emerging AI agent platform recently acquired by Meta and built with TiDB — databases are created, queried, and destroyed entirely by AI agents, often without any direct human interaction. These are no longer developers writing carefully crafted SQL. Agents are goal-driven: they try many approaches, write fragments of queries, test possibilities, and discard databases when the task is done.

That behavioral change creates workloads databases were never designed to handle, such as millions of tiny, short-lived database instances; extremely high metadata churn; bursty, unpredictable access patterns; and cost sensitivity at unprecedented scale.

In Manus’ case, agents, not humans, created more than 90% of new database instances, and roughly 99% were used only once.

Why conventional databases break under agent scale

Conventional databases, even cloud-hosted MySQL or PostgreSQL, struggle under this model. The economics alone are prohibitive.

If every agent task spins up a $5-per-month database instance, the costs become untenable: within three months, Manus created close to 1 million database tenants.

The real bottleneck, however, isn’t storage; it’s metadata management. Most of these databases are tiny. The challenge is not storing data; it’s tracking and managing millions of database instances. This is where TiDB’s architecture becomes so important.

TiDB as a virtualized, massively multi-tenant database layer

Rather than spinning up a physical database for each agent task, TiDB is evolving into a virtualized database layer in which logical tenants are isolated yet share the underlying infrastructure. Extreme multi-tenancy is no longer a nice-to-have — it’s the core requirement. We’re not talking about thousands of tenants anymore. We’re talking about millions, or hundreds of millions, at very low cost.

TiDB’s distributed architecture, SQL compatibility, and separation of compute and storage make this possible. It enables fast creation and teardown of logical databases, efficient metadata handling at scale, elastic resource sharing across workloads, and SQL as a stable interface for AI agents. The stable interface is the key.

Despite decades of predictions to the contrary, we remain convinced that SQL will remain the dominant interface for AI-driven data access. SQL is still the language for AI; agents use it because it works.
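As an illustration of that stable interface, here is a minimal sketch of the lifecycle an agent-driven tenant might follow. Everything here is hypothetical: the `agent_` naming, the scratch table, and the statement sequence are illustrative, not TiDB-specific. The point is that creating and discarding a logical database is plain SQL that any MySQL-compatible client can issue:

```python
import uuid

def ephemeral_tenant_statements(task_id: str) -> list[str]:
    """SQL an agent might issue for one short-lived logical tenant.

    The tenant is just a logical database: creating and dropping it is
    a metadata operation, not the provisioning of a physical instance.
    """
    db = f"agent_{task_id}"
    return [
        f"CREATE DATABASE {db}",
        f"CREATE TABLE {db}.scratch (k VARCHAR(64) PRIMARY KEY, v JSON)",
        f"INSERT INTO {db}.scratch VALUES ('goal', '\"summarize sales\"')",
        f"SELECT v FROM {db}.scratch WHERE k = 'goal'",
        f"DROP DATABASE {db}",  # tenant discarded once the task is done
    ]

# In a real deployment, each statement would be sent over a
# MySQL-compatible connection (e.g. PyMySQL or mysql.connector).
task_id = uuid.uuid4().hex[:8]
for stmt in ephemeral_tenant_statements(task_id):
    print(stmt)
```

Because each tenant is a logical database inside a shared cluster, the CREATE and DROP steps are catalog updates rather than instance provisioning, which is what makes millions of one-shot tenants feasible.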

The cost model has to change — radically

Agent-driven infrastructure doesn’t just break databases — it breaks pricing models. TiDB had to invent an entirely new pricing approach for Manus, abandoning per-instance or per-tenant pricing in favor of a usage-based, aggregate-consumption model. We cannot price per database or per agent. Most of them exist for minutes or hours, then disappear.

This reality reflects a broader truth about AI-native systems: Efficiency at scale matters more than individual optimization. AI agents are vastly more productive than humans, but that productivity can turn into runaway costs without architectural discipline.
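To make the economics concrete, here is a back-of-the-envelope sketch comparing the two pricing models. All rates and per-tenant numbers are hypothetical, chosen only to show the shape of the difference; they are not TiDB's actual prices:

```python
# Hypothetical rates -- illustrative only, not TiDB's actual pricing.
PER_INSTANCE_MONTHLY = 5.00   # $ per provisioned instance per month
COMPUTE_RATE = 0.000002       # $ per request served
STORAGE_RATE = 0.10           # $ per GiB-month stored

def per_instance_cost(tenants: int) -> float:
    """Every tenant pays for a full instance, however briefly it lives."""
    return tenants * PER_INSTANCE_MONTHLY

def usage_based_cost(tenants: int, requests_per_tenant: int,
                     gib_month_per_tenant: float) -> float:
    """Tenants share infrastructure; billing follows aggregate consumption."""
    compute = tenants * requests_per_tenant * COMPUTE_RATE
    storage = tenants * gib_month_per_tenant * STORAGE_RATE
    return compute + storage

# 1 million mostly one-shot tenants, each tiny and short-lived.
tenants = 1_000_000
instance_bill = per_instance_cost(tenants)   # 1,000,000 * $5 = $5,000,000
usage_bill = usage_based_cost(tenants,
                              requests_per_tenant=500,
                              gib_month_per_tenant=0.001)
print(f"per-instance: ${instance_bill:,.0f}")
print(f"usage-based:  ${usage_bill:,.0f}")
```

Under per-instance pricing, a million mostly idle tenants bill like a million instances; under aggregate usage-based pricing, they bill like the small amount of compute and storage they actually consume.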

Full-stack AI experiences without technical users

One reason we find Manus compelling is that it pushes the database entirely out of sight. In Manus’ latest iteration, a user can ask by voice to create a website, analyze data, or build an application. The AI agent then interprets the request, writes the code as needed, provisions all the infrastructure, spins up databases, executes tasks, and tears everything down when finished.

No technical background is required. Users don’t manage databases or servers. The AI handles it all.

This is the future TiDB must support: databases as invisible, ephemeral infrastructure, operating at LLM scale.

The database market is entering a new era

We believe the database sector is at the start of its next major transition, on par with the rise of the cloud itself. In three years, TiDB won’t be a single-product company. You’ll see MySQL-compatible and PostgreSQL-compatible systems, maybe even file-system-based storage for agents.

What ties them together is the same core idea: databases must be designed for AI agents first, humans second. In that world, TiDB’s evolution — from distributed SQL database to AI-scale data platform — offers a preview of what enterprise data infrastructure may soon look like.

If you want to compete in the AI era, the best strategy is simple: Store everything. And make it usable at machine speed.

The post TiDB and the rise of the AI-native database appeared first on The New Stack.