The big Google I/O keynote was streamed live to the world today. Given the state of the tech field and the investor class, Google's Gemini AI received the vast majority of attention — even on the Android side of things.
Gemini AI services
Google's messaging focused on two aspects of their AI play: multimodality and long context. In effect, Google wants to give their AI tools as much information as possible about each of their users so that those tools can interface intelligently with text, photos, videos, audio and anything else you end up collecting in your digital life.
In the paid user-facing version of Gemini AI, you can now use up to one million tokens worth of context (PDFs, videos, photos and the like). So if you're trying to solve a specific problem, you can provide loads of data unique to your particular needs for Gemini to make connections with. And if you're a developer, you can access Gemini Pro with two million tokens.
The goal, essentially, is to create so-called AI "agents" that can make reasonable assumptions about your needs and activities, plan ahead and integrate with all of your productivity apps. It sounds fine, but it's hard to buy into the hype when so much existing AI tech has been so disappointing when the rubber hits the road.
Of course, they're not content with just making a fancy digital assistant. Not only did we get to see how Music AI Sandbox works for musicians, we also got to enjoy an extended set ahead of the event played by none other than Marc Rebillet.
Google is also tackling AI-generated video with the Veo tool, and they partnered with Donald Glover to make a weird short film. We're promised that it's "Coming soon." If you want to fiddle with Veo yourself, you can sign up now.
Their work with AI and video isn't all about generation though. Their tech can also be applied to search when you add in your own video clip for context. The example they showed was a broken record player arm: giving the search engine that visual context helped it find answers about the repair process.
Gemini for businesses
As for their business offerings, the new "Gemini for Workspace" features are arriving in the coming weeks and months. Enjoy live captions for Google Meet, email summaries and AI-driven organization tools.
Looking ahead to 2025, they're working on implementing "AI teammates" to accomplish tasks. As if the goal of replacing workers wasn't obvious enough, they're just straight-up demonstrating their intent live on stage. Surely, this won't end poorly at all!
Sundar Pichai spent a long time pitching how strong their server-side AI infrastructure is too. On the surface, it's aimed at the developers in the audience to entice them to stick with Google as a business partner, but it reads a lot more like an attempt to reassure investors that Google isn't losing the AI race to the likes of OpenAI and Microsoft.
Gemini on Android
Shocker, Gemini is the new digital assistant on Android, and they're leaning very heavily into making AI-driven features the main way you interface with your phone.
Circle to Search is on a hundred million devices today, and their goal is to double that within the year and make it easier to provide answers and context for complex content.
Also, the Gemini app is integrated into Android, so you can have it generate any ol' nonsense you want in any other app. If your friends spam you with memes, get ready for even less coherent images to clog up your chats.
On-device AI is also being harnessed so that their accessibility and anti-fraud tools work even without access to Google's servers. We're supposed to believe Google won't listen to our phone calls, but let's stay skeptical.
We'll have more info about Android 15 starting tomorrow.
AI safety
Of course, misuse of these AI tools remains a huge problem. Google promises that they're working hard to prevent that by, you guessed it, wielding AI against the core problem with "AI-assisted red teaming."
Real humans, both inside and outside of Google, are trying to identify and prevent bad outcomes as well, but don't be surprised when scammers, professional liars and spurned colleagues continue to use all of these tools for evil.
However, there is one nice thing that they have going for them: SynthID. It already watermarks AI-generated images and audio, but they're working on making their generated text and video easy to detect as well. How will that work with text? No idea!
Watch the entire keynote
Via Google.