OpenAI launches ChatGPT-o1
On Sept. 12, 2024, OpenAI launched the first version of its new 'Strawberry' series of AI models: ChatGPT-o1. This makes it the first company to move from AI Level 1 to Level 2. The name o1 can be read as a reset to the beginning of a new phase, away from the 'old' GPT-4 model, about which Sam Altman previously said 'it sucks'.
What levels of AI are there?
In OpenAI's vision, the development toward artificial general intelligence (AGI) happens in 5 stages:
- Conversational AI (such as ChatGPT-4 and current models of Claude, Gemini, Grok and Llama): Conducts natural conversations.
- Reasoning AI (ChatGPT-o1): Solves basic problems at the level of a PhD student or someone who has earned a doctorate. AI outperforms 50% of humans on the tasks they are skilled at.
- Autonomous AI: Acts independently for extended periods of time without human intervention. AI outperforms 90% of humans on the tasks they are skilled at.
- Innovating AI: Develops new ideas and solutions. AI outperforms 99% of humans on the tasks they are skilled at.
- Organizational AI: The ultimate level AI researchers are aiming for, where AI performs the tasks of an entire organization and outperforms 100% of its people.
OpenAI expects to reach level 5 (AGI) around 2030, a little sooner or a little later!
The difference between ChatGPT-o1 and ChatGPT-4o
What makes ChatGPT-o1 different from ChatGPT-4o? Whereas ChatGPT-4o (level 1) gives you an answer almost immediately, the new level 2 models can think longer and step by step, which lets them give better answers and follow instructions more closely. Along with the answer, you can also see which steps the model followed and how long it took. Here you can see ChatGPT-o1 in action. I asked the infamous question "how many r's are in the word 'strawberry'?" You can see how ChatGPT-4o and o1 handle this simple-seeming question that AI models have so far failed to answer correctly.
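For reference, the correct answer is easy to check yourself with a few lines of Python. This is only a sanity check of the expected answer, not a picture of how the models work internally; the function name `count_letter` is my own choice for this illustration.

```python
def count_letter(word: str, letter: str) -> int:
    """Count how often a single letter occurs in a word, ignoring case."""
    return word.lower().count(letter.lower())


# The question put to ChatGPT-4o and o1
print(count_letter("strawberry", "r"))  # prints 3
```

A program counts individual characters, which is why it gets this right trivially. Language models process text in larger chunks, which is a commonly given explanation for why the question trips them up.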
The performance of ChatGPT-o1
In various tests, ChatGPT-o1 showed impressive results: it solved 83% of the problems in a qualifying exam for the International Mathematics Olympiad, while GPT-4o solved only 13%. In coding benchmarks, o1 outperformed 89% of participants in Codeforces competitions.
The o1 models use advanced reinforcement learning techniques. This allows them to refine their thinking process, try different strategies and even learn from their mistakes. OpenAI has also made great strides in the area of safety. The o1-preview model scored 84 on one of OpenAI's most difficult jailbreaking tests, compared with just 22 for GPT-4o.
Getting started with ChatGPT-o1
If you have a Plus or Team subscription, you can now use ChatGPT-o1 by selecting 'o1-preview' or 'o1-mini' in the model menu at the top of ChatGPT. Don't go all out right away: o1-preview has a limit of 30 messages per week, while o1-mini has a limit of 50 messages per week.
Try o1 on complex tasks or multi-step problems in particular, to experience the difference and get an idea of where this development is going. This first preview version still lacks several features of ChatGPT-4o, such as the voice feature, Internet access, invoking GPTs and uploading files and images.
OpenAI indicates that these features will be added soon. For now, it is still best to use ChatGPT-4o for the tasks where you need them. With the introduction of these "self-thinking" AIs, we are entering a new phase in which we can use AI for more complex tasks and get much better outcomes. This gives us very interesting opportunities, but also the challenge of adapting our roles and ways of working.
Invitation to monthly AI inspiration sessions
We would love to look at the most important AI developments with you, discuss the latest news and updates, share experiences and give you concrete tools. You can also ask questions and spar with us.
Would you like to be part of this? Then sign up.