
NVIDIA NeMo Microservices Accelerate Accurate, Secure AI Agent Development and Management at Scale

NVIDIA has just announced the general availability of NVIDIA NeMo microservices, a modular platform for building and deploying AI workflows across on-premises or cloud Kubernetes environments. Designed in part to jumpstart the creation of state-of-the-art agentic AI systems, NVIDIA’s latest microservices framework offers an end-to-end developer platform for accelerating AI agent development.

These microservices employ data flywheels (defined by NVIDIA as feedback loops in which data collected from interactions or processes is used to continuously refine AI models) to build effective AI agents, or "teammates," that tap into user interactions and data. According to NVIDIA, this improves model performance over time: usage generates insight, and insight drives action.

These data flywheels help the model tap into three crucial areas of data: inference data, to gather insights and adapt to changing data patterns; up-to-date business data, to provide current intelligence; and user feedback data, to indicate whether the model and application are performing as expected. Custom data flywheels can be built to enhance AI agent accuracy and efficiency at scale, aligning agents more closely with business objectives.
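The flywheel idea above can be sketched in a few lines of Python. This is a purely illustrative toy, assuming a hypothetical agent and feedback pipeline; none of the names below come from NVIDIA's NeMo APIs.

```python
# Hypothetical data flywheel sketch; all names are illustrative,
# not part of any NVIDIA NeMo interface.

def collect_feedback(interactions):
    """Score each interaction: 1.0 if the user accepted the answer, else 0.0."""
    return [1.0 if i["accepted"] else 0.0 for i in interactions]

def fine_tune(model, examples):
    """Stand-in for a fine-tuning step: fold curated examples back into the model."""
    updated = dict(model)
    updated["training_examples"] = updated.get("training_examples", 0) + len(examples)
    return updated

def flywheel_iteration(model, interactions):
    """One turn of the flywheel: usage -> feedback -> curation -> refinement."""
    scores = collect_feedback(interactions)
    # Curate: keep only the interactions users accepted.
    accepted = [i for i, s in zip(interactions, scores) if s > 0]
    return fine_tune(model, accepted)

model = {"name": "agent-v1", "training_examples": 0}
interactions = [
    {"prompt": "summarize Q3 report", "accepted": True},
    {"prompt": "draft outreach email", "accepted": False},
    {"prompt": "extract invoice total", "accepted": True},
]
model = flywheel_iteration(model, interactions)
print(model["training_examples"])  # 2 accepted examples folded back in
```

Each pass through the loop turns usage data into curated training signal, which is the "usage generates insight" cycle NVIDIA describes.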

With the help of data flywheels, the NVIDIA NeMo microservices accelerate AI agent development through three key services:

  • NeMo Customizer, a high-performance, scalable microservice that expedites large language model (LLM) fine-tuning, capable of delivering up to 1.8x higher training throughput with popular post-training techniques such as supervised fine-tuning and low-rank adaptation
  • NeMo Evaluator, which simplifies the end-to-end evaluation of generative AI (GenAI) applications, including both retrieval-augmented generation (RAG) and agentic AI, through an intuitive API based on custom and industry benchmarks
  • NeMo Guardrails, which streamlines scalable AI guardrail orchestration for safeguarding GenAI apps, helping to improve compliance by up to 1.4x with only half a second of additional latency
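To make the guardrail concept concrete, here is a minimal toy sketch of input and output rails around a generation function. This is not the NeMo Guardrails API; the policy, function names, and blocked-topic list are all invented for illustration.

```python
# Toy guardrail illustration; NOT the NeMo Guardrails API.
# A real system would use richer policies than substring matching.

BLOCKED_TOPICS = {"credit card number", "password"}

def check_input(text: str) -> bool:
    """Return True if the text passes the (toy) content policy."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def guarded_generate(prompt: str, generate) -> str:
    """Wrap a generation function with an input rail and an output rail."""
    if not check_input(prompt):
        return "Sorry, I can't help with that request."
    response = generate(prompt)
    # Output rail: withhold responses that violate the same policy.
    if not check_input(response):
        return "[response withheld by guardrail]"
    return response

echo = lambda p: f"You asked: {p}"
print(guarded_generate("What's the weather today?", echo))
print(guarded_generate("Tell me her password", echo))
```

The point of orchestration services like NeMo Guardrails is to apply such checks consistently and at scale, with far lower latency overhead than naive per-request filtering pipelines.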

These microservices empower enterprises to build a range of AI agents—as well as scale highly specialized, multi-agent systems—without increasing operational complexity. Deployable through the NVIDIA AI Enterprise software platform, NeMo microservices are simple to operate and can run on any accelerated computing infrastructure, on premises or in the cloud.

The “enterprise-wide impact [of NeMo microservices] positions AI agents as a trillion-dollar opportunity—with applications spanning automated fraud detection, shopping assistants, predictive machine maintenance, and document review—and underscores the critical role data flywheels play in transforming business data into actionable insights,” explained Joey Conway, senior director of generative AI software for enterprise at NVIDIA.

The NeMo microservices are accompanied by broad support for a range of popular open models, such as Llama, the Microsoft Phi family of small language models, Google Gemma, Mistral, and Llama Nemotron Ultra.

To learn more about NVIDIA NeMo microservices, please visit https://www.nvidia.com/en-us/.
