Microsoft Debuts Advanced Image Generation Model, GPT-image-1
Microsoft is unveiling GPT-image-1, its latest and most advanced image generation model, which the company says sets the standard for generating high-quality images and handling complex prompts. Building on the success of its predecessor, DALL-E, GPT-image-1 adds multimodal capabilities and stronger performance.
Compared with DALL-E, GPT-image-1 responds to instructions at a more granular level. It is better equipped to understand and execute detailed instructions for precise, accurate image generation, and it can reliably render text within images.
Beyond these enhancements, GPT-image-1 supports several modalities and new features, such as:
- Text-to-image for generating images from text prompts
- Image-to-image for creating new images from user-uploaded images and text prompts
- Text transformation for editing images using text prompts
- Inpainting, which allows users to edit images with text prompts and user-drawn bounding boxes
Thanks to its multimodal design and improved performance, GPT-image-1 opens up many new use cases, such as generating material for educational purposes or developing video game assets with a consistent style and design. GPT-image-1 is also designed to integrate with APIs, expanding what’s possible with image generation.
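As a rough illustration of what API integration could look like for the text-to-image case, the sketch below builds a request body and decodes the images from a response. It is not taken from Microsoft's documentation: the field names (`model`, `prompt`, `size`, `n`) and the response shape (a `data` list carrying `b64_json` strings) are assumptions modeled on the OpenAI Images API, and the helper names are hypothetical. No endpoint or credentials are shown, because they depend on the service being used.

```python
import base64
import json


def build_generation_request(prompt: str, size: str = "1024x1024", n: int = 1) -> str:
    """Serialize a text-to-image request body.

    Field names are assumptions modeled on the OpenAI Images API.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    body = {"model": "gpt-image-1", "prompt": prompt, "size": size, "n": n}
    return json.dumps(body)


def extract_images(response_text: str) -> list[bytes]:
    """Decode every base64-encoded image in a response of the assumed shape."""
    payload = json.loads(response_text)
    return [base64.b64decode(item["b64_json"]) for item in payload["data"]]


if __name__ == "__main__":
    # Build a request; sending it is left to whichever HTTP client and endpoint apply.
    print(build_generation_request("A watercolor map of a fictional island"))

    # Simulated response with placeholder bytes standing in for real image data.
    fake_response = json.dumps(
        {"data": [{"b64_json": base64.b64encode(b"placeholder").decode("ascii")}]}
    )
    images = extract_images(fake_response)
    print(len(images), "image(s) decoded")
```

Keeping request construction and response decoding as separate pure functions makes both easy to check without network access. The other modalities listed above would need additional inputs (a source image, a mask) that this sketch does not cover.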
GPT-image-1 adheres to Microsoft’s safety standards, providing C2PA support and input/output moderation, as well as content safety and abuse monitoring.
To learn more about GPT-image-1, please visit https://www.microsoft.com/en-us/.