Runware Uses Custom Hardware for Fast AI Inference

Runware is a new startup in AI inference, the part of the generative AI stack that actually runs trained models.

The company builds its own servers and optimizes the software layer on them to remove bottlenecks and speed up inference for image-generation models. It has already raised $3 million from Andreessen Horowitz’s Speedrun, LakeStar’s Halo II, and Lunar Ventures.

Sometimes a demo is all you need to understand a product, and that is the case with Runware. Go to the company's website, type in a prompt, and hit enter, and you may be surprised by how quickly Runware generates an image: it takes less than a second.

The company is not trying to reinvent the wheel; it just wants to make it spin faster. Behind the scenes, Runware builds its own servers, putting as many GPUs as possible on the same motherboard. It runs its own data centers with a custom-built cooling system.

On the software side, Runware has tuned the BIOS and operating system underneath its orchestration layer to improve cold start times, so that AI models start running sooner on its servers. It has also developed its own algorithms for allocating inference workloads.

The demo is impressive on its own. The company now wants to turn all of this R&D work into a business.

Unlike many GPU hosting companies, Runware will not rent out its GPUs based on GPU time. It believes that companies should instead be encouraged to speed up their workloads. That’s why Runware offers an image generation API with a conventional fee structure based on cost per API call. The API is built on popular AI models from the Flux and Stable Diffusion families.
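To make the difference between the two billing models concrete, here is a minimal sketch of the arithmetic. Every number in it is a made-up placeholder chosen for illustration, not Runware's or any competitor's actual pricing, and the function names are hypothetical.

```python
def cost_per_image_gpu_time(hourly_rate_usd: float, seconds_per_image: float) -> float:
    """Cost of one image when the provider bills for GPU time."""
    return hourly_rate_usd * seconds_per_image / 3600.0


def cost_per_image_flat(price_per_call_usd: float) -> float:
    """Cost of one image when the provider bills a flat fee per API call."""
    return price_per_call_usd


# Hypothetical prices, for illustration only.
HOURLY_RATE_USD = 3.60      # assumed price of one GPU-hour
PRICE_PER_CALL_USD = 0.002  # assumed flat price of one API call

for seconds in (0.5, 2.0, 8.0):
    billed = cost_per_image_gpu_time(HOURLY_RATE_USD, seconds)
    flat = cost_per_image_flat(PRICE_PER_CALL_USD)
    print(f"{seconds:4.1f} s/image   GPU-time: ${billed:.4f}   per-call: ${flat:.4f}")
```

Under GPU-time billing the customer's cost per image grows with generation time, so a slow stack costs the customer more. Under per-call billing the price stays flat and the provider absorbs the cost of slow inference, which gives the provider the incentive to be fast.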

Speaking with TechCrunch, co-founder and CEO Flaviu Radulescu pointed to Together AI, Replicate, and Hugging Face. “They are all selling compute based on GPU time,” Radulescu said. Compare how long it takes Runware to generate an image with how long it takes those competitors, then compare the prices, he argued, and Runware comes out both much cheaper and much faster.

“They will not be able to match this performance,” he said. “Especially if you’re a cloud provider, you have to run in a virtualized environment, which adds extra delays.”

Because Runware looks at the whole inference pipeline and optimizes both hardware and software, the company hopes to be able to use GPUs from more than one vendor soon. That has been an important goal for several startups: Nvidia is the clear leader in the GPU space, which means that Nvidia GPUs tend to be quite expensive.

“Right now, we only use Nvidia GPUs. But this should be an abstraction at the software layer,” Radulescu said. “We can swap a model in and out of GPU memory very quickly, so we can put more than one customer on the same GPUs.”

“So we’re not like our rivals. They just load a model onto the GPU, and the GPU then does one specific kind of work. In our case, we built software that lets us switch models in GPU memory while we do inference.”
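The idea Radulescu describes, several customers sharing one GPU because models can be swapped in and out of its memory, can be sketched as a small cache. This is a toy illustration, not Runware's software: the class name, the model names and sizes, and the least-recently-used eviction policy are all assumptions, and no real GPU is involved.

```python
from collections import OrderedDict


class GpuModelCache:
    """Toy model of one GPU's memory shared by several customers' models."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        # Maps model name -> size in GB, ordered from least to most recently used.
        self.resident: "OrderedDict[str, float]" = OrderedDict()
        self.swaps = 0  # number of times a model had to be loaded

    def used_gb(self) -> float:
        return sum(self.resident.values())

    def ensure_loaded(self, model: str, size_gb: float) -> None:
        """Make sure `model` is resident, evicting older models if needed."""
        if size_gb > self.capacity_gb:
            raise ValueError(f"{model} does not fit on this GPU")
        if model in self.resident:
            self.resident.move_to_end(model)  # mark as most recently used
            return
        # Assumed policy: evict least recently used models until there is room.
        while self.used_gb() + size_gb > self.capacity_gb:
            self.resident.popitem(last=False)
        self.resident[model] = size_gb
        self.swaps += 1

    def infer(self, customer: str, model: str, size_gb: float, prompt: str) -> str:
        """Stand-in for real inference: swap the model in, then 'run' it."""
        self.ensure_loaded(model, size_gb)
        return f"[{model}] image for {customer}: {prompt}"


# Hypothetical usage: three customers share one 24 GB GPU.
gpu = GpuModelCache(capacity_gb=24.0)
print(gpu.infer("alice", "model-a", 16.0, "a red bicycle"))
print(gpu.infer("bob", "model-b", 8.0, "a snowy mountain"))
print(gpu.infer("carol", "model-c", 10.0, "a lighthouse"))
print("resident:", list(gpu.resident), "swaps:", gpu.swaps)
```

The faster a real system can perform the swap, the less each customer notices that the GPU is shared, which is why swap speed matters to the business model described here.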

If AMD and other GPU makers can build compatibility layers that work with common AI workloads, Runware would be well placed to build a hybrid cloud that uses GPUs from several vendors. That would help the company stay cheaper than its rivals.

James Emmanuel

James is a Computer Science student with a robust foundation in tech and a skilled DevOps engineer. His technical expertise extends to his role as a news reporter at Protechbro, where he specializes in crafting well-informed, technical content that highlights the latest trends and innovations in technology.
