It seems unbelievable that just over two years ago ChatGPT barely existed. This week, two giants in artificial intelligence, Google and OpenAI, have presented their vision of the future regarding the application of this technology, and both are looking towards integration through audio and video.
At its Google I/O event, Google presented Gemini Live, among many other things. Just a day earlier, OpenAI had revealed ChatGPT Voice with GPT-4o, a new model that is faster and more focused on voice responses.
The most obvious analogy is with the movie Her, in which a person falls in love with their voice assistant, an artificial intelligence.
But what do these innovations mean for us, and how could they transform our daily interaction with technology?
Google vs. OpenAI: A race for AI supremacy
At the forefront of artificial intelligence technology, Google and OpenAI have been setting the pace since the beginning, in a sort of relentless cold war. In fact, OpenAI deliberately scheduled its presentation to precede Google’s.
However, both share a common goal: to integrate their new AI technologies into everyone’s daily lives.
ChatGPT Voice vs. Gemini Live
Both products promise to revolutionize the way we interact with our devices through natural voice interfaces and real-time video analysis.
Both companies have shown footage of their employees interacting with their surroundings while the respective AI interprets the video and audio captured by a smartphone.
ChatGPT Voice, in particular, has been praised for its ability to sound extremely natural and adapt in real-time to the emotional tones of the conversation.
Google, for its part, has introduced Project Astra, which is still a prototype but will reach all users under the name Gemini Live. It also promises innovations, although it seemingly still depends on other models to generate visual and audio content, such as Imagen 3 for images and Veo for video.
The new future of voice and video assistants
Looking towards the future, it is clear that voice interaction is becoming a crucial part of the digital experience.
The release of these technologies not only reinforces the idea of a significant change in the human-computer interface but also raises questions about the future direction of these developments.
Will we see OpenAI venturing into hardware with its own smart glasses? Or will Google try to revive and dominate this market with a new version of Google Glass?
It seems that the revolution in how we interact with artificial intelligence never quite reaches its peak. Just as the mouse and the touch screen changed the paradigm in their time, these new voice assistants promise to open new frontiers in accessibility and functionality. As always, the real impact of these technologies will depend on how they are adopted and adapted in our daily lives.