Day 6 of 12 Days of OpenAI Brings Real-Time Video to Advanced Voice
Returning with Day 6 of the 12 Days of OpenAI, a 12-day stretch in December during which the company is launching new product innovations for the AI space, OpenAI introduced video in Advanced Voice mode.
Video in Advanced Voice mode enables users to bring live video and live screen sharing into ChatGPT interactions. When entering Advanced Voice mode, users can chat with ChatGPT through video and voice in real time. The feature leverages ChatGPT’s natively multimodal 4o model to maintain a natural conversational pace, and users can share real-time visual context with ChatGPT to make conversations richer and more useful.
Rowan Zellers, a member of technical staff at OpenAI, showcased an exciting example of OpenAI’s new capability in action. Armed with a kettle, a coffee cup, a dripper, and ChatGPT’s video in Advanced Voice mode, Zellers prompted ChatGPT to walk him through the steps of making pour-over coffee.
Guiding Zellers through each step—paired with some informative commentary about the process—ChatGPT demonstrated the way its latest feature can amplify human processes, all in real time. During the interaction, Zellers was able to ask ChatGPT additional questions about the process as it unfolded—including how his technique looked. The full video of this demo can be found here.
With live screen sharing, ChatGPT was further able to aid Zellers, recognizing the app he was using on his phone and responding to his query in real time. Most interesting about this interaction was ChatGPT’s capacity to deliver a response in the tone Zellers requested: in this case, a polite way of telling Kevin Weil, CPO at OpenAI, not to quit his day job in pursuit of becoming a mall Santa (with a festive photo attached).
For more information about OpenAI’s latest innovations, please visit https://openai.com/. To stay up to date with the 12 Days of OpenAI, please visit https://www.enterpriseaiworld.com/.