This is Google’s response to OpenAI’s GPT-4o, a way to talk to the assistant naturally, much like a normal voice conversation between two humans (or at least that’s the goal). It’s rolling out in English to Gemini Advanced subscribers ($20 per month), and you can access it by tapping on the little Live button at the bottom right of the Gemini app. It will come to the iOS app and more languages in the coming weeks.
Sissie Hsiao, Google’s vice president of Gemini experiences, tells WIRED this chatbot isn’t just a reheated Google Assistant. Instead, it’s an interface that’s been completely rebuilt using generative AI. “Over the years of building Assistant, there are two things users have asked us for repeatedly,” Hsiao says. “Number one is they’ve asked for a more fluid and natural assistant—they want to be able to talk to it naturally without having to change the way they speak. The second is more capable; to help them solve their life problems, not just simple tasks.”
Live, From Google
Launch Gemini and you’ll see a blank screen with an ethereal light glowing up from the bottom. You can start talking to the assistant and have a conversation even if your phone is locked and the screen is off, and it’s also accessible through Google’s new Pixel Buds Pro 2 wireless earbuds so you can talk hands-free while your phone is in your bag. There are 10 voices you can choose from of varying tones, accents, and styles. When you end the session, you’ll see a transcription of the entire conversation, and that’s something you can access at any time in the Gemini app.
Unlike voice assistants of old, Gemini Live lets you interrupt the conversation without disrupting the entire experience. (This is especially useful as Gemini tends to talk … a lot.) And the idea is to connect it with other apps via extensions, though many of these aren’t available yet. For example, you’ll be able to ask Gemini Live to pull up a party invitation in your Gmail and ask about the time and location instead of digging it up yourself. Or hunt for a recipe and ask it to add the ingredients to a shopping list in Google Keep. Google says these extensions to its apps like Keep, Tasks, Utilities, Calendar, and YouTube Music will launch in the coming weeks.
Later in the year, Google will imbue Gemini Live with Project Astra, the computer vision tech it teased at its developer conference in May. This will allow you to use your phone’s camera app and, in real time, ask Gemini about the objects you are looking at in the real world. Imagine walking past a concert poster and asking it to store the dates in your calendar and to set up a reminder to buy tickets.
Talk to Me
Our experiences using voice assistants until this point have largely been transactional, so when I chatted with Gemini Live, I found initiating a conversation with the bot to be a little awkward. It’s a big step beyond asking Google Assistant or Alexa for the weather report, to open your blinds, or whether your dog can eat celery. You might have a follow-up here and there, but those assistants were not built around the flow of a conversation the way Gemini Live is.
Hsiao tells me she enjoys using Gemini Live in the car on her drive home from work. She started a conversation about the Paris Olympics and about Celine Dion singing at the opening ceremony. “Can you tell me a little bit about the song she sang?” Hsiao asked. The AI responded with the song’s origin, writer, and what it meant, and after some back and forth, Hsiao discovered Celine Dion could sing in Chinese.
“I was so surprised,” she says. “But that just gives you an example of how you can find out stuff; it’s an interaction with technology that people couldn’t have before this kind of curiosity and exploration through conversation. This is just the beginning of where we’re headed with the Gemini assistant.”
In my demo, I asked Gemini what I should eat for dinner. It asked if I wanted something light and refreshing or a hearty meal. We went on, back and forth, and when Gemini suggested a shrimp dish I lied and said I was allergic to shrimp, to which it then recommended salmon. I said I didn’t have salmon. “You could always grill up some chicken breasts and toss them in a salad with a light vinaigrette dressing.” I asked for a recipe, and it started going through the instructions step by step. I interrupted it, but I can go back into the Gemini app to find the recipe later.
I can imagine following this approach now when I want to learn about anything, and just continuing the conversation even after Gemini answers my initial query. I still have many concerns: Why is there no direct attribution or sourcing for the information it surfaces? Can I trust that everything it says is accurate? Hsiao says when you exit Gemini Live, you can click on the little “G” icon underneath transcribed text to check its work and run your own Google searches.
But more and more, I find myself thinking that this is the future of search. You just ask, get the answers, and keep talking to learn more. The problem is that Gemini tends to talk a lot. Its responses are verbose, so you’re often waiting a while before you can follow up. Yes, you can interrupt it to move on, but it’s awkward interrupting a voice assistant. I don’t want to be rude!
Where in the World Is Google Assistant?
With all this focus on Gemini and Gemini Live, you’re probably wondering: Where’s Google Assistant? If you tap on your profile icon in the Gemini app, you’ll see an option to Switch to Google Assistant if you want to go back to the old experience, but it’s hard to say how long that option will be available. Currently, there are a few things Assistant can do that Gemini can’t, so there’s a hand-off from one to the other. “Increasingly, Gemini will be able to do those actions on its own,” Hsiao says.
But earlier this month, Google announced new Nest products, which also brought word that Google Assistant will soon be getting a more natural voice, and some of its features will be upgraded with Gemini’s large language models. You’d be able to ask it if a FedEx delivery person showed up at your doorstep, for example, and it’d be able to parse this from your video doorbell’s feed. Motion alerts could be far more descriptive rather than just saying “person detected.”
That means we now have two assistants, and it sounds like Google is completely OK with this at the moment. Hsiao says Gemini will be your personal assistant, the one you can ask about calendar appointments and email invites, all grounded in your personal data. In the home, Google Assistant is your “communal” assistant, because it’s more of a family device. “People don’t want their personal emails to be accessible through voice on a home speaker in their living room where a guest can ask, ‘Hey Google, what’s in Julian’s email?’”
It sounds like a recipe for a branding disaster. It’s already hard to keep track of all the variations of Gemini out there (and don’t forget, Gemini was “Bard” when it launched in preview last year). It also might mean certain functions will be limited based on the device you’re using, to prevent a guest from snooping on your emails. If you get used to asking Gemini on your phone to handle a task, but then you leave your phone in the other room and the Assistant on your Nest speaker refuses to follow through, isn’t that frustrating?
“We’re still exploring the branding of that, and we’re still in the early development phases,” Hsiao says. “Branding aside, we need to make sure that people get what they want from their most helpful assistant, whether it’s on their personal phone or in the home, and it solves their use cases.”
Read the full article at: www.wired.com