Subscribe for notification
Tech

OpenAI DevDay Debuts Realtime API, Tools for AI Devs

OpenAI faced a turbulent week with executive exits and major fundraising but is now focused on rallying developers to create tools with its AI models at the 2024 DevDay

The company also announced a public beta of its “Realtime API” on Tuesday. This API is designed to facilitate the development of applications that feature AI-generated voice responses with low latency. Although it is not quite ChatGPT’s Advanced Voice Mode, it is a close second.

Kevin Weil, the chief product officer of OpenAI, stated in a press briefing before the event that the recent departures of chief technology officer Mira Murati and chief research officer Bob McGrew would not impede the company’s advancement.

OpenAI names ex-Twitter and Instagram VP Kevin Weil as CPO – Mobile Marketing Magazine.

“I will commence by stating that Bob and Mira have been exceptional leaders.” Weil stated, “I have acquired a wealth of knowledge from them, and they have played a significant role in our current position.” “In addition, we will not decelerate.”

OpenAI is reorganizing its C-suite, which serves as a reminder of the turmoil following last year’s DevDay. The company is endeavoring to persuade developers that it remains the most optimal platform for developing AI applications.

Even though OpenAI is operating in an increasingly competitive space, its leaders claim that the startup has over 3 million developers who are building with its AI models.

In the past two years, OpenAI has reduced the cost of accessing its API for developers by 99%. However, this reduction was probably necessitated by competitors like Meta and Google, who have consistently undercut their prices.

The Realtime API, one of OpenAI’s newest features, will enable developers to create speech-to-speech experiences in their applications that are virtually realtime. Developers will have the option of utilizing six voices provided by OpenAI.

To prevent copyright issues, developers are prohibited from utilizing third-party voices, as these voices are distinct from those available for ChatGPT. (The voice is ambiguous based on Scarlett Johansson’s and is unavailable in any location.)

OpenAI’s director of developer experience, Romain Huet, presented a demonstration of a trip planning application developed using tRealtimeime API during the briefing.

The application enabled users to communicate verbally with an AI assistant regarding an impending excursion to London, and they received responses with minimal latency.

The app was able to annotate a map with restaurant locations as it answered, as tRealtimeime API has access to various tools.

Huet demonstrated trealtimeime API’s ability to communicate with a human over the phone to query about ordering food for an event at additional points.

Unlike Google’s notorious Duo, OpenAI’s API cannot directly contact restaurants or stores. Nevertheless, it can integrate with calling APIs such as Twilio.

It is important to note that OpenAI is not incorporating disclosures to ensure that its AI models automatically identify themselves during calls like this, even though the AI-generated voices sound genuine.

Currently, lopers are responsible for incorporating this disclosure, a requirement that a recent California law may mandate.

OpenAI also announced vision fine-tuning in its API as part of its DevDay announcements.

This feature will enable developers to fine-tune their GPT-4o applications using images and text. In principle, this should assist developers in enhancing the performance of GPT-4o for tasks that require visual comprehension.

Olivier Godement, OpenAI’s director of product API, informs TechCrunch that developers will be unable to upload copyrighted imagery (such as a photograph of Donald Duck), images that depict violence, or any other imagery that contravenes OpenAI’s safety policies.

Olivier Godement, OpenAI’s director of product API | Forbes Daily

OpenAI is in a race to rival its competitors’ offerings in the AI model licensing sector.

Its prompt caching feature is comparable to the feature that Anthropic introduced several months ago, which enables developers to cache frequently used context between API calls, thereby reducing costs and improving latency.

OpenAI says developers can save 50% by utilizing this feature, while Anthropic guarantees a 90% discount.

Finally, OpenAI provides a model distillation feature that enables developers to fine-tune smaller models, such as GPT-4o mini, by utilizing larger AI models, such as o1-preview and GPT-4o.

This feature should enable developers to enhance the performance of those small AI models even though operating smaller models typically results in cost savings compared to running larger ones.

OpenAI is introducing a beta evaluation tool as part of model distillation. This utility will enable developers to assess the performance of their fine-tuning within OpenAI’s API.

For example, the absence of any information regarding the GPT Store during the previous year’s DevDay may generate greater interest. OpenAI has been conducting a revenue share program with several of the most prominent creators of GPTs, but there has been little news since then.

Additionally, OpenAI has announced that it will not be publishing any new AI models during this year’s DevDay. Developers anticipating the release of OpenAI o1 (not the preview or mini version) or the startup’s video generation model, Sora, will be required to endure an additional period of anticipation.

Tags: AIAPiOpenAI
Hillary Ondulohi

Hillary is a media creator with a background in mechanical engineering. He leverages his technical expertise to craft informative pieces on protechbro.com, making complex concepts accessible to a wider audience.

Disqus Comments Loading...

Recent Posts

Coinbase Prepares to Delist Stablecoins in December

MiCA rules require stablecoin issuers to obtain e-money authorization, so Coinbase will delist stablecoins that have not been authorized by…

3 hours ago

Truflation to Power Sphinx DeFi Market Using RWA

Through the formation of a new partnership, Truflation, and Sphinx will work to improve RWA, as Truflation will now power…

3 hours ago

Vietnamese Police Arrest 5 in Crypto Scam Ring

Vietnamese police have broken up an international crypto scam network and arrested several people they think stole billions of VND…

6 hours ago

Gmail Users on iOS Can Ask Gemini About Emails

The company said this week that some iOS Gmail users can now talk to Google's Gemini about their inbox in…

7 hours ago

Browser Company launches Arc Search on Android

Arc, an alternative browser by The Browser Company, will release its Arc Search browser in open beta on Android for…

8 hours ago

Meta Movie Gen Produces Realistic Video, Sound

Few understand generative video models, but Meta’s Movie Gen produces realistic video and sound, turning text into visuals It's called…

8 hours ago