Nscale’s Serverless Inference Makes Cost-Efficient AI Accessible to All
Nscale, the hyperscaler engineered for AI, is debuting its Serverless Inference platform, enabling enterprises to instantly access generative AI (GenAI) models without managing the underlying infrastructure. Nscale Serverless Inference makes AI accessible to organizations of all sizes, delivering cost-efficient, scalable deployment through a pay-as-you-go model.
With Serverless Inference, users can immediately deploy an array of GenAI models—including Meta's Llama, Alibaba's Qwen, and DeepSeek—without the operational burden of provisioning infrastructure. Users pay only for what they consume, avoiding the idle-capacity costs that, according to Nscale, inhibit enterprises from deploying and experimenting with GenAI models.
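As a minimal sketch of what pay-as-you-go serverless model access typically looks like, the snippet below calls a hypothetical OpenAI-compatible chat-completions endpoint. The base URL, model identifier, and environment variable name are illustrative assumptions, not confirmed details of Nscale's API; consult Nscale's documentation for the actual values.

```python
import os
import requests

# Hypothetical endpoint and model id -- illustrative only; substitute the
# actual base URL and model names from Nscale's documentation.
BASE_URL = "https://inference.example-nscale-endpoint.com/v1"
API_KEY = os.environ["NSCALE_API_KEY"]  # assumed env var holding the key

def chat(prompt: str) -> str:
    """Send one chat message to an OpenAI-compatible completions endpoint."""
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "meta-llama/Llama-3.1-8B-Instruct",  # example model id
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Summarize serverless inference in one sentence."))
```

Billing in this model is metered per token consumed, so the call above incurs cost only while it runs; no instance sits idle between requests.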
When paired with the broader Nscale platform, Serverless Inference users also gain a variety of other capabilities, including Slurm and Kubernetes orchestration, observability, and multi-tenant security.
“Launching our Serverless Inference platform marks Nscale’s expansion into public, on-demand AI services, making AI model deployment simple and cost-effective,” said Daniel Bathurst, chief product officer at Nscale. “While our private cloud remains ideal for large-scale enterprise workloads, this new serverless option enables more developers to experiment with and scale inference workloads. With upcoming features set to include dedicated endpoints, fine-tuning capabilities and the ability to support custom model hosting, we're proud to offer sovereign, European AI infrastructure to meet rapidly growing inference demand.”
To learn more about Nscale, please visit https://www.nscale.com/.