OpenAI just announced their latest model, GPT-4o, in their spring update. This was clearly the gpt2-chatbot that was doing the rounds of the internet these past weeks. The oversimplified summary is that ChatGPT now has native multimodality built in. Their blog does a very good job of covering the key points, and Sam Altman’s blog post adds important focus to the announcement. We’ll try to break down the key parts:
Here’s a quick preview of its ‘Vision’ capabilities: a game of Rock, Paper, Scissors. The emotion is really off the scale!
While OpenAI are saying this is their best model yet, folks who have been testing it are finding that on specific hard coding tasks, it does a little worse than GPT-4T. The thread below gets into specifics if you’re interested:
But aside from that, the final verdict, which we currently believe to be true, is that GPT-4o is the best language model out there and does best on knowledge-first tasks. For complex tasks, GPT-4T or Claude Opus are still the best models. Of course, since we like open source, Llama 3 isn’t that far behind, and given that it’s technically free, it deserves a mention here.
Customer Service and Support: We already knew that customer support was going to move to AI. Many companies and start-ups have built their AI-based customer support solutions as add-ons to foundational models. With the GPT-4o update, the foundational model itself will now be able to do most of these tasks. The scope of customer support will also move beyond just voice to video. Having trouble with your IKEA furniture? No problem, have the IKEA-GPT-4o bot take a look. You can build alongside the bot. It’s also much more human-like, so hopefully irate customers won’t be as annoyed that they’ve been palmed off to an incompetent bot. (Competence yet to be proved…)
Education & Coaching: Imagine having an AI tutor who can see how you’re behaving, understand your emotions, decode your responses and see where you’re going wrong, all in real time. This is a much more palatable AI teacher than what we’ve had so far. Again, all this comes in a foundational model, right out of the box. Homework will be a lot easier. But beyond that, a competent multimodal AI can be a great asset for vocational training or other skills that require a tutor to give feedback based on physical progress. While OpenAI showed off some interview prep use cases, those could already be achieved by patching a few different tools together. The real genius is in the ability of the AI to understand emotion and tailor its responses to a very nuanced degree. Just imagine a sweet and loving singing teacher, rather than the generally grumpy, irate Russian teacher that we’re used to. (Or maybe that’s just us 😅)
Gaming and Entertainment: While we hate to admit this, this is just the beginning for AI Waifus. Beyond that though, having realistic AI within games that can respond to sight and sound will add more depth to the games themselves. Of course, having your own personal AI entertainer can also be a thing. We’re hoping someone will build a cool AI entertainer for my 2-year-old to enjoy.
Of course, this is just the start. Sam Altman has been saying that they like to iterate fast and ship quickly. We also know that Google I/O is about to start, and OpenAI wanted to ensure they were the first to strike with the best assistant on the market.
What are you most excited to see in this update and how do you think this can be used in your industry? Let us know!