Nvidia brings together AI labs to build the next generation of open base models

Nvidia on Monday announced the Nemotron Coalition at its GTC conference. This new coalition of AI labs will pool expertise, data, and evaluations to build shared base foundation models, with Nvidia handling training on its DGX Cloud infrastructure. The coalition’s first project, a new base model currently in training, will form the foundation for the upcoming Nemotron 4 family.

The founding members are Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab. Several of these already have existing Nvidia partnerships. For example, Black Forest Labs, best known for its Flux image models, has been collaborating with Nvidia on model optimization, while Perplexity and LangChain are already integrating Nvidia’s Nemotron models into their platforms.

Why pool base model training?

The core message here seems to be that these base models are becoming table stakes that, on their own, don’t let these companies differentiate. Instead, it’s the post-training and other downstream work that lets them make these models their own.

Kari Briski, Nvidia’s VP of generative AI software for the enterprise, said as much in a press briefing ahead of the announcement. “Building frontier models demands significant time, expertise, and compute — a major investment most organizations can’t make alone,” she said. “While many want open models, few have the resources to build them independently […]. Instead of every group duplicating the effort on the same base models, we’re building a shared open foundation together.”

Developing frontier open models, after all, demands enormous compute resources that only companies like OpenAI, Anthropic, Google, and Nvidia can afford on their own. Rather than every lab duplicating the same base-model training, coalition members contribute domain expertise, data, and evaluations, while Nvidia provides DGX Cloud compute. The resulting base models are open, and participants, or anyone else, can then tune them for their own use cases.

Nemotron 3 Ultra and Super

Alongside the coalition and its plans for building the Nemotron 4 models, Nvidia also announced the newest member of the Nemotron 3 family: Nemotron 3 Ultra.

Nvidia first announced its plans for an Ultra model last year. At the time, the company said it would feature 500 billion parameters with 50 billion active parameters.

Sadly, the model isn’t available yet. Nvidia says it has finished training and calls it “the best open base model in the world,” but we’ll have to see how that claim holds up in practice.

Nvidia also highlighted Nemotron 3 Super, a 120 billion-parameter hybrid Mamba-Transformer model with 12 billion active parameters. This smaller model shipped on March 11 and scores 85.6 percent on PinchBench, the benchmark that evaluates how well LLMs perform as the brain of an OpenClaw agent. That makes it the top-scoring open model on the benchmark and the fourth-ranked model overall, according to Nvidia. It has a native 1-million token context window designed for long-running agent workflows.

The post Nvidia brings together AI labs to build the next generation of open base models appeared first on The New Stack.
