The strawberry is now ripe! OpenAI has launched its mystery ‘Strawberry’ model, named ‘o1’, and has already rolled it out in ChatGPT to all Plus and Team users. Developers are trying their hand at it, and the company has also shared some interesting things it can do. Here are some of them!
OpenAI’s o1 Model is Better for Coding
The biggest difference between o1 and other LLMs is that it ‘THINKS’: it spends time reasoning through the problem before answering the prompt, instead of responding immediately. CEO Sam Altman has said that “o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it,” but developers are already testing its powers.
Coding is one of the main areas where it is said to work better than other LLMs: it is better at writing code and solving multistep problems. In Codeforces competitions, o1 reached the 89th percentile of participants, while GPT-4o was at the 11th percentile.
Here are 7 insightful findings to look at:
1) Video Game Coding
First, let’s start with one of the official demos shared by OpenAI. The model was asked to write the code for a video game called Squirrel Finder. The user prompted it with the game instructions and additional details. The model thought for 21 seconds, describing along the way what it was considering, such as the game layout and setting up the screens.
Watch the full demo here:
OpenAI o1 codes a video game from a prompt. pic.twitter.com/aBEcehP0j8
— OpenAI (@OpenAI) September 12, 2024
This shows that the ‘thinking’ step may take some time before the answer arrives, but in this demo the model got everything right.
2) Space Shooter Game
Continuing the video game example, a user asked the model to create a space shooter game in HTML and JavaScript. However, he didn’t include any further details (like instructions) about the project, just adding ‘Make it interesting’ to the prompt. But that didn’t stop o1. While it took about 81 seconds in the preview, it eventually produced working code:
OpenAI o1 creates a fully interactive space shooter game in less than 2 minutes and Replit lets me run it in seconds.
AI and coding has changed forever. pic.twitter.com/toRmzaHock
— Shubham Saboo (@Saboo_Shubham_) September 13, 2024
Along with the code, it also provided instructions for how to play the game. This is a model that every coder should try for themselves.
3) Creating a Blog Page
This is one of the simpler coding problems compared to making video games, but let’s check it as well. The user prompted: “Create a personalized blog page with all the coding”. In just 14 seconds, the output was there. It started with the HTML code, then the CSS, then code to add blog posts, and finally instructions on how to deploy the page.
Coding with OpenAI o1 🍓😳 pic.twitter.com/zN3QZPlvC1
— Mustafa Ergisi (@mustafaergisi) September 12, 2024
It also provided some optional JavaScript code.
4) Visualizing Transformers
Coming back to the official demos by OpenAI, let’s see how good the reasoning capabilities of the o1 model are. The user wanted a visualization of how Transformers work and gave a detailed prompt describing what he wanted.
Reasoning helped o1 here: it followed all the instructions. Other models can miss one of the instructions in a long prompt, but since this model takes its time, it was able to do everything it was asked. There were some minor CSS issues, but nothing more than that.
5) Full-Stack React Native App
Moving on to more everyday programming work, a user wanted to see whether o1 could develop a React Native app. The prompt was: “Create a React Native app that allows me to connect to a local network IP from a dropdown list, and in short stream videos from my Pi 4 just like Netflix. I want this app to run on my Fire TV cube and have voice commands. The pi 4 has 2 HDDs (USB) that hold folders with the movie files to be played, enable search by title and an auto-play next for TV shows.”
Here’s the output:
It’s official…programming as a career is over. GPT o1-preview creating a fullstack react native app…I didn’t make a company fast enough, and now I’m obsolete…fullstack for 16 years pic.twitter.com/key853iXCj
— Dallas Lones (@dallaslones) September 12, 2024
In just 5 seconds, it started putting out the code along with all the steps. After that, the user asked it to make a Node.js API for the Pi 4, and it did that as well. So it handled both the front end and the back end of the project.
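To make the “search by title” and “auto-play next” requirements in the prompt more concrete, here is a minimal Python sketch of that logic. It is not the code o1 generated (which was React Native and Node.js). It assumes all files belong to a single show folder and that episode filenames contain a marker such as `S01E02`. The function names and the filename convention are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed convention: episode files carry a marker like "S01E02" in their name.
EPISODE_PATTERN = re.compile(r"[Ss](\d+)[Ee](\d+)")


def episode_key(filename: str) -> Optional[Tuple[int, int]]:
    """Return (season, episode) parsed from the filename, or None if absent."""
    match = EPISODE_PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def search_by_title(filenames: List[str], query: str) -> List[str]:
    """Case-insensitive substring search over the file names."""
    needle = query.lower()
    return [name for name in filenames if needle in name.lower()]


def next_episode(filenames: List[str], current: str) -> Optional[str]:
    """Pick the episode that follows `current` in (season, episode) order."""
    current_key = episode_key(current)
    if current_key is None:
        return None  # not a TV episode, so there is nothing to auto-play
    later = []
    for name in filenames:
        key = episode_key(name)
        if key is not None and key > current_key:
            later.append((key, name))
    if not later:
        return None  # `current` was the last episode in the folder
    return min(later)[1]
```

A real media server would also need to group files by show and handle other naming schemes. The sketch only shows that the ordering step reduces to comparing (season, episode) pairs.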
6) Full Weather App for iOS
Another notable achievement: it can help build a full weather app for iOS. The user combined o1 with Cursor Composer and completed the task in under 10 minutes.
Just combined @OpenAI o1 and Cursor Composer to create an iOS app in under 10 mins!
o1 mini kicks off the project (o1 was taking too long to think), then switch to o1 to finish off the details.
And boom—full Weather app for iOS with animations, in under 10 🌤️
Video sped up! pic.twitter.com/hc9SCZ52Ti
— Ammaar Reshi (@ammaar) September 12, 2024
In this case, o1 was taking too long to think, so the user kicked off the project with o1 mini and then switched to o1 to finish off the details.
7) Create a PDF Tool
The user asked it to create a tool that helps people improve their reading speed using PDFs they upload. Here’s the prompt: “Create a program in a single HTML page that lets the user upload a PDF, choose which page to start on, and then trains the user to read faster. The goal is to make something that people can use to read their PDFs comfortably, and learn to read faster in the process.”
The output is here:
My first attempt using o1-preview, I asked it to create a tool to help people improve their reading speed via PDFs they can upload.
It spat out a 400 line program that worked on the first try, let me upload a pdf, control the reading speed, and select the page number. pic.twitter.com/veAtwBGEj7
— Ippi (@Coolzippity) September 12, 2024
The user was impressed because the code worked on the first try.
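For readers curious what the core of such a reading trainer might look like, here is a minimal Python sketch of the pacing logic only. It is not the 400-line HTML program from the tweet. It assumes the PDF text has already been extracted into a string, and it assumes a simple training scheme in which the words-per-minute rate rises after every few chunks. All function names and default values are hypothetical.

```python
from typing import List, Tuple


def chunk_words(text: str, chunk_size: int = 3) -> List[str]:
    """Split text into display chunks of up to `chunk_size` words each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def chunk_duration(chunk: str, wpm: int) -> float:
    """Seconds a chunk stays on screen at a given words-per-minute rate."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return len(chunk.split()) * 60.0 / wpm


def training_schedule(
    text: str,
    start_wpm: int = 200,
    step_wpm: int = 25,
    chunk_size: int = 3,
    chunks_per_step: int = 4,
) -> List[Tuple[str, float]]:
    """Return (chunk, seconds) pairs, raising the pace every few chunks."""
    schedule = []
    for index, chunk in enumerate(chunk_words(text, chunk_size)):
        # Assumed training rule: speed up by `step_wpm` per completed step.
        wpm = start_wpm + step_wpm * (index // chunks_per_step)
        schedule.append((chunk, chunk_duration(chunk, wpm)))
    return schedule
```

A front end would then show each chunk for its computed duration. Starting from a chosen page, as the prompt asks, would amount to passing only that page's text onward into `training_schedule`.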
Conclusion
While the o1 model is still in its nascent stage and takes some time to ‘think’, it is a big step forward in the world of AI. It is not a competitor to ChatGPT but a sibling that strengthens OpenAI’s toolset. ChatGPT is still useful for common reasoning, while o1 is here for maths and coding!