OpenAI Debuts ChatGPT’s Realistic Voice for Paying Users

On Tuesday, OpenAI began rolling out ChatGPT’s Advanced Voice Mode, offering a select group of ChatGPT Plus users access to GPT-4o’s hyper-realistic audio responses, with a full rollout to all Plus users expected by fall 2024.

When OpenAI first showed GPT-4o’s voice to the public in May, the feature astonished audiences with its rapid responses and its striking resemblance to a real human voice, one in particular.

That voice, Sky, resembled Scarlett Johansson, the actress who voiced the artificial assistant in the film “Her.” Shortly after OpenAI’s demonstration, Johansson said she had declined multiple requests from CEO Sam Altman to use her voice, and that after seeing the GPT-4o demo she retained legal counsel to defend her likeness.

OpenAI denied using Johansson’s voice but later removed the voice shown in its demo. In June, OpenAI said it would delay the release of Advanced Voice Mode to improve its safety measures.

One month later, the wait is partly over. OpenAI says the video and screen-sharing capabilities demonstrated during its Spring Update will not be part of this alpha and will launch at a “later date.”

For now, the GPT-4o demo that captivated audiences remains just a demo, but some premium users can now access the ChatGPT voice feature shown in it.

ChatGPT can now talk and listen

You may have already tried the Voice Mode currently available in ChatGPT, but OpenAI says Advanced Voice Mode is different.

ChatGPT’s previous audio solution comprised three distinct models: one to convert your voice to text, GPT-4 to process your prompt, and a third to convert ChatGPT’s text into voice.

GPT-4o, by contrast, is multimodal and can handle all three tasks itself without auxiliary models, which significantly reduces latency in conversation. OpenAI also says GPT-4o can sense emotional intonations in your voice, including sadness, excitement, or singing.
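The difference between the two designs can be illustrated with a toy sketch. It is not OpenAI’s code or API: every name here (`AudioClip`, `transcribe`, `text_model`, `synthesize`, and the two `*_voice_mode` functions) is hypothetical, and the "models" are stand-in functions. It only shows why a chain that passes text between stages loses information such as tone, while a single audio-in, audio-out model can keep it.

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Toy stand-in for recorded speech: the words plus how they were said."""
    words: str
    tone: str  # e.g. "excited" or "sad"


def transcribe(clip: AudioClip) -> str:
    # Stage 1 of the cascade: speech-to-text keeps the words and drops the tone.
    return clip.words


def text_model(prompt: str) -> str:
    # Stage 2: a text-only model sees nothing but the transcript.
    return f"reply to '{prompt}'"


def synthesize(text: str) -> AudioClip:
    # Stage 3: text-to-speech reads the reply in a fixed, neutral tone.
    return AudioClip(words=text, tone="neutral")


def cascaded_voice_mode(clip: AudioClip) -> AudioClip:
    """Three separate stages chained together, passing only text between them."""
    return synthesize(text_model(transcribe(clip)))


def single_model_voice_mode(clip: AudioClip) -> AudioClip:
    """One hypothetical model: audio in, audio out, so the tone is still visible."""
    return AudioClip(words=f"reply to '{clip.words}'", tone=clip.tone)


if __name__ == "__main__":
    question = AudioClip(words="guess what happened today", tone="excited")
    print(cascaded_voice_mode(question))
    print(single_model_voice_mode(question))
```

In the cascaded version, each hand-off is also a point where the user waits for one model to finish before the next can start, which is the latency cost the article describes.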

ChatGPT Plus users in this alpha can hear for themselves how hyper-realistic OpenAI’s Advanced Voice Mode is. TechCrunch was unable to evaluate the feature before the publication of this article; however, we will assess it upon obtaining access.

OpenAI says it is releasing ChatGPT’s new voice gradually so that it can monitor usage closely. People in the alpha group will receive an alert in the ChatGPT app, followed by an email with instructions on how to use it.

In the months since its demonstration, OpenAI says it has tested GPT-4o’s voice capabilities with more than 100 external red teamers who collectively speak 45 languages. According to OpenAI, a report on these safety efforts will be released in early August.

The company says Advanced Voice Mode will be limited to ChatGPT’s four preset voices – Juniper, Breeze, Cove, and Ember – which were developed in collaboration with professional voice actors.

The Sky voice featured in OpenAI’s May demo is no longer accessible in ChatGPT. According to Lindsay McCallum, spokesperson for OpenAI, “ChatGPT is unable to imitate the voices of other individuals and public figures, and it will suppress outputs that diverge from one of these preset voices.”

OpenAI is trying to avoid deepfake controversies. In January, AI startup ElevenLabs’ voice cloning technology was used to impersonate President Biden and deceive primary voters in New Hampshire.

Additionally, OpenAI claims that it has implemented new filters to prevent specific requests for the generation of music or other copyrighted audio.

AI companies have faced a series of copyright infringement lawsuits over the past year, and audio models such as GPT-4o add a new category of companies that could file a complaint.

Record labels in particular have a history of litigation and have already sued AI song generators Suno and Udio.

Hillary Ondulohi

Hillary is a media creator with a background in mechanical engineering. He leverages his technical expertise to craft informative pieces on protechbro.com, making complex concepts accessible to a wider audience.
