NVIDIA Introduces an Efficient Family of Open Models for Building Agentic AI Applications
NVIDIA is launching the NVIDIA Nemotron 3 family of open models, data, and libraries designed to power transparent, efficient, and specialized agentic AI development across industries.
The Nemotron 3 models—with Nano, Super, and Ultra sizes—introduce a breakthrough hybrid latent mixture-of-experts (MoE) architecture that helps developers build and deploy reliable multi-agent systems at scale, according to NVIDIA.
“Open innovation is the foundation of AI progress,” said Jensen Huang, founder and CEO of NVIDIA. “With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”
NVIDIA Nemotron supports NVIDIA’s broader sovereign AI efforts, with organizations from Europe to South Korea adopting open, transparent, and efficient models that allow them to build AI systems aligned to their own data, regulations, and values, the company said.
Early adopters, including Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys, and Zoom, are integrating models from the Nemotron family to power AI workflows across manufacturing, cybersecurity, software development, media, communications, and other industries.
“NVIDIA and ServiceNow have been shaping the future of AI for years, and the best is yet to come,” said Bill McDermott, chairman and CEO of ServiceNow. “Today, we’re taking a major step forward in empowering leaders across all industries to fast-track their agentic AI strategy. ServiceNow’s intelligent workflow automation combined with NVIDIA Nemotron 3 will continue to define the standard with unmatched efficiency, speed and accuracy.”
As multi-agent AI systems expand, developers are increasingly relying on proprietary models for state-of-the-art reasoning while using more efficient and customizable open models to drive down costs. Routing tasks between frontier-level models and Nemotron in a single workflow gives agents the most intelligence while optimizing tokenomics.
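The routing pattern described above can be illustrated with a minimal sketch. It is not NVIDIA's implementation: the model identifiers, the `Task` type, the `difficulty` score, and the 0.7 threshold are all hypothetical, chosen only to show how one workflow might send routine tasks to an open model and reserve a proprietary frontier model for the hardest ones.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical model identifiers, used only for illustration.
FRONTIER_MODEL = "frontier-model"
OPEN_MODEL = "nemotron-3-nano"


@dataclass
class Task:
    prompt: str
    difficulty: float  # assumed score in [0.0, 1.0] from an upstream classifier


def route(task: Task, threshold: float = 0.7) -> str:
    """Pick the frontier model only when difficulty reaches the threshold."""
    return FRONTIER_MODEL if task.difficulty >= threshold else OPEN_MODEL


def plan(tasks: list[Task], threshold: float = 0.7) -> dict[str, list[str]]:
    """Group prompts by the model each one would be sent to."""
    assignments: dict[str, list[str]] = defaultdict(list)
    for task in tasks:
        assignments[route(task, threshold)].append(task.prompt)
    return dict(assignments)


if __name__ == "__main__":
    workflow = [
        Task("Summarize this support ticket", difficulty=0.2),
        Task("Find the bug in this stack trace", difficulty=0.5),
        Task("Draft a multi-quarter migration strategy", difficulty=0.9),
    ]
    for model, prompts in plan(workflow).items():
        print(model, prompts)
```

In a sketch like this, raising the threshold shifts more traffic to the cheaper open model, which is the cost lever the routing approach relies on.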
The open Nemotron 3 models enable startups to build and iterate faster on AI agents and accelerate innovation from prototype to enterprise deployment.
Available now, Nemotron 3 Nano is the most compute-cost-efficient model in the family, optimized for tasks such as software debugging, content summarization, AI assistant workflows, and information retrieval at low inference costs, NVIDIA said. The model uses a unique hybrid MoE architecture to deliver gains in efficiency and scalability.
Nemotron 3 Super excels at applications that require many collaborating agents to achieve complex tasks with low latency. Nemotron 3 Ultra serves as an advanced reasoning engine for AI workflows that demand deep research and strategic planning.
Nemotron 3 Super and Ultra use NVIDIA’s ultraefficient 4-bit NVFP4 training format on the NVIDIA Blackwell architecture, significantly cutting memory requirements and speeding up training. This efficiency allows larger models to be trained on existing infrastructure without compromising accuracy relative to higher-precision formats, the company said.
With the Nemotron 3 family of models, developers can choose the open model that is right-sized for their specific workloads, scaling from dozens to hundreds of agents while benefiting from faster, more accurate long-horizon reasoning for complex workflows, the company said.
NVIDIA also released a collection of training datasets and state-of-the-art reinforcement learning libraries available to anyone building specialized AI agents.
The Nemotron Agentic Safety Dataset provides real-world telemetry to help teams evaluate and strengthen the safety of complex agent systems.
Nemotron 3 Nano is available now on Hugging Face and through inference service providers including Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter, and Together AI.
Nemotron is offered on enterprise AI and data infrastructure platforms, including Couchbase, DataRobot, H2O.ai, JFrog, Lambda and UiPath.
For customers on public clouds, Nemotron 3 Nano will soon be available on AWS via Amazon Bedrock (serverless) and supported on Google Cloud, CoreWeave, Crusoe, Microsoft Foundry, Nebius, Nscale, and Yotta.
Nemotron 3 Nano is also available as an NVIDIA NIM microservice, which allows secure, scalable deployment on any NVIDIA-accelerated infrastructure with maximum privacy and control.
Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.
For more information about this news, visit www.nvidia.com.