-->


Amazon Nova Sonic Model Radically Streamlines the Development of Voice-Enabled AI Apps

AWS is unveiling Amazon Nova Sonic, the latest addition to the Amazon Nova family of foundation models (FMs) available in Amazon Bedrock, designed to address the various challenges of developing voice-enabled applications.

Traditional approaches to building voice-enabled applications are complex because they require orchestrating multiple models: speech recognition to convert speech to text, language models to understand and generate responses, and text-to-speech to convert text back to audio, explained Danilo Poccia, chief evangelist (EMEA) at AWS.

“This fragmented approach not only increases development complexity but also fails to preserve crucial linguistic context such as tone, prosody, and speaking style that are essential for natural conversations. This can affect conversational AI applications that need low latency and nuanced understanding of verbal and non-verbal cues for fluid dialog handling and natural turn-taking,” Poccia continued.
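The cascaded pipeline Poccia describes can be illustrated with a minimal Python sketch. Every name here (`Utterance`, `speech_to_text`, `generate_reply`, `text_to_speech`, `cascaded_pipeline`) is a hypothetical stand-in written for illustration; none of them is an AWS or Amazon Bedrock API, and the "models" are trivial stubs. The sketch only shows why chaining three stages through plain text drops information such as tone.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Audio reduced to the two properties this sketch cares about."""
    words: str  # what was said
    tone: str   # how it was said, e.g. "frustrated" or "calm"


def speech_to_text(utterance: Utterance) -> str:
    # Stage 1 (stand-in for a speech recognition model):
    # only the words survive the conversion to text.
    return utterance.words


def generate_reply(text: str) -> str:
    # Stage 2 (stand-in for a language model):
    # it receives text only, so it never sees the speaker's tone.
    return f"You said: {text}"


def text_to_speech(text: str) -> Utterance:
    # Stage 3 (stand-in for a text-to-speech model):
    # with no input tone to match, it falls back to a fixed default.
    return Utterance(words=text, tone="neutral")


def cascaded_pipeline(utterance: Utterance) -> Utterance:
    # Three separate models chained through plain text.
    return text_to_speech(generate_reply(speech_to_text(utterance)))


reply = cascaded_pipeline(Utterance(words="my order is late", tone="frustrated"))
print(reply.tone)  # the caller's tone was lost at the text boundary
```

In this toy version the caller's frustration never reaches the reply, because the only thing passed between stages is a string. A unified speech model avoids that hand-off by working on the audio directly.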

Amazon Nova Sonic is a single, unified model that handles both speech comprehension and speech generation, enabling developers to create natural, human-like conversational experiences. According to the company, the model pairs expressive speech generation with real-time text transcription without requiring a separate model, and it adapts its spoken responses to the pace or timbre of the speech input.

With its unified architecture, Amazon Nova Sonic delivers low latency and what AWS describes as industry-leading performance while reducing the complexity of developing voice-enabled apps. The model can also interact with other systems and use agentic retrieval-augmented generation (RAG) with Amazon Bedrock Knowledge Bases, giving it access to a variety of customer-specific data.

Currently, Amazon Nova Sonic can understand American and British English across different speaking styles, with more languages coming soon. The model was developed in accordance with AWS’ commitment to responsible AI and includes built-in safeguards for content moderation and watermarking.

To learn more about Amazon Nova Sonic, please visit https://aws.amazon.com/.
