OpenAI Debuts Hyperrealistic, Highly Accurate AI Image Generator for GPT-4o
OpenAI is unveiling its most advanced image generator yet, built into GPT-4o, the company’s flagship model that can reason across audio, vision, and text in real time. Capable of producing hyperrealistic images as well as more artistic, 2D outputs, the natively multimodal model is designed to make image generation accurate and practically useful, according to OpenAI.
According to OpenAI, GPT-4o’s image generation rivals traditional generative models in its capacity to accurately render text and precisely follow prompts, drawing on 4o’s knowledge base and chat context. The model was trained on paired online images and text, learning not only how images relate to language, but also how images relate to one another.
The new image generator in GPT-4o offers the following capabilities:
- Advanced text rendering that blends precise symbols with imagery, enhancing visual communication
- Multi-turn generation that lets users refine images through natural conversation while the model maintains chat context
- Detailed instruction following that can handle roughly 10 to 20 different objects in a single prompt (where other systems struggle with 5 to 8 objects, according to OpenAI)
- In-context learning from user-uploaded images, where the model draws on details in the uploaded content to inform the image it generates
- Photorealistic or stylistic generation, based on a variety of image styles
Though highly advanced, OpenAI emphasizes that its image generator “isn’t perfect” and says it plans to address several limitations after the initial launch. Currently, the image generator can occasionally crop images too tightly, hallucinate information, and struggle to render more than 20 objects accurately, to name a few.
Regarding safety, OpenAI intends to block requests that violate its safety standards. For example, when images of real people are used in context, OpenAI applies heightened restrictions on what sort of image can be created, with particularly stringent safeguards around nudity and graphic violence.
4o image generation is currently available to Plus, Pro, Team, and Free users as the default image generator in ChatGPT, with access coming soon to Enterprise and Edu. It’s also available in Sora. Users who prefer DALL·E can still reach it through a dedicated DALL·E GPT.
To learn more about OpenAI’s 4o image generator, please visit https://openai.com/.