Final Day of the 12 Days of OpenAI Brings o3 and o3-mini
As “Shipmas” draws to a close, Day 12 of the 12 Days of OpenAI event brought the announcement of o3 and o3-mini, which are now open for safety testing by external researchers.
“o3 is a really strong model at very hard technical benchmarks,” explained Mark Chen, SVP of research at OpenAI. On software engineering (SWE-bench Verified), “we’re seeing that o3 performs at about 71.7% accuracy, which is over 20% better than our o1 models. This really signifies that we’re climbing the frontier of utility.”
o3’s benchmark results define OpenAI’s next frontier of AI models: the model consistently outperformed o1 in competition coding (Codeforces), competition math (AIME 2024), and PhD-level science questions.
o3-mini’s cost efficiency, paired with its reasoning power, marks a notable milestone for OpenAI. It achieves better performance than o1 at a fraction of the cost, and it adds flexibility for coding workloads: the mini model will support low, medium, and high reasoning effort, letting users trade response time for deeper reasoning depending on the use case. In its coding evaluation, at medium and high reasoning effort, o3-mini outperformed both o1-mini and o1.
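The announcement did not include API details, but a setting like this would plausibly surface as a request parameter. The following is a minimal sketch of how selectable reasoning effort might look through the OpenAI Python SDK; the "o3-mini" model name and the "reasoning_effort" parameter are assumptions for illustration, not confirmed API names.

# Minimal sketch: choosing a reasoning-effort level for o3-mini.
# "o3-mini" and "reasoning_effort" are assumed names based on this
# announcement; check OpenAI's API reference for the shipped interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3-mini",          # assumed model identifier
    reasoning_effort="high",  # "low" | "medium" | "high"; higher buys more thinking time
    messages=[
        {"role": "user", "content": "Write a function that merges two sorted lists."}
    ],
)
print(response.choices[0].message.content)

In this sketch, dropping the effort level to "low" would favor faster, cheaper responses, while "high" would let the model reason longer on harder coding problems.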
To learn more about o3 and view a demo of o3-mini, see OpenAI’s Day 12 announcement.
As part of this announcement, OpenAI is inviting safety researchers to apply for early access to its o3 family of models. The early access program complements OpenAI’s existing frontier model testing process, which includes rigorous internal safety testing, external red teaming, and collaborations with third-party testing organizations as well as the U.S. AI Safety Institute and the U.K. AI Safety Institute. Interested researchers can apply through OpenAI’s early access safety testing application.
For more information about OpenAI’s latest innovations, please visit https://openai.com/. To look back at previous days of the 12 Days of OpenAI, please visit https://www.enterpriseaiworld.com/.