Gemini Live First Look: More Engaging Than Siri

Google made Gemini Live available at its Made by Google event on Tuesday.

The feature lets users hold a semi-natural spoken conversation, rather than a typed one, with an AI chatbot powered by Google’s most recent large language model. TechCrunch was present to try it firsthand.

Gemini Live is Google’s response to OpenAI’s Advanced Voice Mode, a virtually identical ChatGPT feature that is currently in a limited alpha test. Although OpenAI demonstrated its feature before Google did, Google was the first to release a finished version.

I found that these low-latency voice features feel significantly more natural than texting with ChatGPT or talking to Siri or Alexa. Gemini Live responded to questions in less than two seconds and changed course quickly when interrupted. Gemini Live is not flawless; however, it is the most effective way of using a phone hands-free that I have encountered so far.

How Gemini Live Works


Before talking to Gemini Live, users can choose from 10 voices, compared with OpenAI’s three. Google worked with voice actors to create each one. I found each of them very humanlike and appreciated the variety.

In one instance, a Google product manager verbally asked Gemini Live to find family-friendly wineries near Mountain View with outdoor areas and playgrounds, so that children could come along. Gemini successfully recommended a location that met the criteria: Cooper-Garrod Vineyards in Saratoga. This is a considerably more complex task than I would ask Siri or Google Search to handle.

Nevertheless, Gemini Live is not without its shortcomings. It appeared to hallucinate a Henry Elementary School Playground, purportedly “10 minutes away” from the vineyard. Although Saratoga has other playgrounds nearby, the nearest Henry Elementary School is over a two-hour drive away. There is a Henry Ford Elementary School in Redwood City, but it is 30 minutes away.

Google was keen to demonstrate that users can interrupt Gemini Live mid-sentence and that the AI will promptly change direction. According to the company, this lets users steer the conversation. In practice, the feature does not work perfectly. Occasionally, when Gemini Live and Google’s project supervisors were speaking at the same time, the AI did not seem to pick up on what was said.

According to product manager Leland Rechis, Google does not allow Gemini Live to sing or to imitate any voices beyond the 10 it offers. The company is probably doing this to avoid potential conflicts with copyright law. Additionally, Rechis stated that Google is not currently focused on enabling Gemini Live to understand the emotional intonation in a user’s voice, a capability that OpenAI promoted during its demonstration.

The feature appears to be an excellent way to explore a topic in more depth than is possible with a simple Google Search. Google notes that Gemini Live is a preliminary step toward Project Astra, the fully multimodal AI model introduced at Google I/O. Currently, Gemini Live can only be used for voice conversations; however, Google intends to add real-time video understanding in the future.

James Emmanuel

James is a Computer Science student and a skilled DevOps engineer with a robust foundation in tech. His technical expertise extends to his role as a news reporter at Protechbro, where he specializes in crafting well-informed, technical content that highlights the latest trends and innovations in technology.
