
Runware Uses Custom Hardware for Fast AI Inference

Runware is a newcomer in the AI inference, or generative AI, startup landscape.

The company is building its own servers and optimizing the software layer on them to remove bottlenecks and speed up inference for image-generation models. Andreessen Horowitz’s Speedrun, LakeStar’s Halo II, and Lunar Ventures have already invested $3 million in the company.

Sometimes a demo is all you need to understand a product, and that is true of Runware. If you go to its website, type in a prompt, and hit enter, you will be surprised by how quickly it generates an image for you: it takes less than a second.

The company is not trying to do something new; it just wants to go faster. Behind the scenes, Runware builds its own servers, fitting as many GPUs as possible onto a single motherboard. It runs its own data centers and has a custom-designed cooling system.

To make AI models run faster on its machines, Runware has optimized the BIOS and operating system to improve cold-start times, and it has developed its own algorithms in the orchestration layer to distribute inference workloads.

On its own, the demo is impressive. Now the company wants to turn all of that R&D work into a business.

Runware will not rent out its GPUs by GPU time, as other GPU hosting companies do. Instead, it believes companies should be incentivized to speed up their workloads. That is why Runware offers an image-generation API with a flat fee structure per API call, built on popular AI models such as Flux and Stable Diffusion.
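The difference between the two billing models can be sketched with a few lines of arithmetic. This is purely illustrative: the function names and all dollar figures below are assumptions for the sake of the example, not Runware’s actual prices.

```python
# Illustrative only: contrasts GPU-time billing with a flat per-call fee.
# All prices and timings below are made-up assumptions, not real figures.

def gpu_time_cost(seconds_per_image: float, price_per_gpu_second: float) -> float:
    """Cost of one image when the provider bills by GPU time."""
    return seconds_per_image * price_per_gpu_second

def flat_call_cost(price_per_call: float) -> float:
    """Cost of one image under a flat fee per API call."""
    return price_per_call

# A provider billing by GPU time: 4 seconds per image at $0.001/s.
per_image_gpu_billed = gpu_time_cost(4.0, 0.001)

# A flat-fee API: the customer pays the same per call whether the image
# takes 4 seconds or under 1 second, so provider-side speedups lower the
# provider's cost without raising the customer's bill.
per_image_flat = flat_call_cost(0.002)

print(f"GPU-time billing:  ${per_image_gpu_billed:.4f} per image")
print(f"Flat per-call fee: ${per_image_flat:.4f} per image")
```

Under GPU-time billing, a faster stack simply earns the provider less per image; under a flat per-call fee, speed becomes margin instead.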

TechCrunch spoke with co-founder and CEO Flaviu Radulescu about competitors Together AI, Replicate, and Hugging Face. “They are all selling compute based on GPU time,” Radulescu said. “If you compare how fast we generate an image against them, and then you compare our prices, you will see that we are much cheaper and much faster.”

“They will not be able to match this performance,” he said. “You have to run in a virtualized environment, which adds extra delays, especially if you are a cloud provider.”

Runware optimizes both hardware and software across the entire inference pipeline. The company soon hopes to support GPUs from more than one vendor. Nvidia is the clear leader in the GPU space, which means its GPUs tend to be quite expensive, so supporting alternatives has been an important goal for several startups.

“Right now, we only use Nvidia GPUs, but this should be an abstraction layer between the GPU and the software,” Radulescu said. “We can quickly move a model from one GPU’s memory to another, so we can use the same GPUs for more than one customer.”

“So we are not like our rivals,” he said. “They just load a model into the GPU, and then the GPU performs a very specific kind of work. In our case, we have developed software that lets us change models in GPU memory while inference is going on.”
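The model-swapping idea Radulescu describes can be pictured as a small cache with least-recently-used eviction: models stay resident in a GPU’s memory until space is needed for another customer’s model. The sketch below is a toy under my own assumptions; Runware’s actual orchestration software is not public, and the class and method names here are hypothetical.

```python
from collections import OrderedDict

class GpuModelCache:
    """Toy LRU cache standing in for models resident in one GPU's memory.

    Hypothetical sketch: a real system would stream actual weights and
    overlap transfers with compute; here a string stands in for a model.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity       # how many models fit in GPU memory
        self.resident = OrderedDict()  # model_id -> placeholder "weights"

    def request(self, model_id: str) -> str:
        """Serve a request for model_id, swapping models as needed."""
        if model_id in self.resident:
            self.resident.move_to_end(model_id)  # already hot: reuse it
            return "hit"
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)    # evict least-recently-used
        self.resident[model_id] = f"weights:{model_id}"  # swap the model in
        return "miss"

# One GPU with room for two models, serving three customers' models:
cache = GpuModelCache(capacity=2)
print(cache.request("flux"))   # first use: loaded into GPU memory
print(cache.request("sdxl"))   # second model loaded alongside it
print(cache.request("flux"))   # still resident: no reload needed
print(cache.request("sd3"))    # evicts "sdxl", the least recently used
```

The point of the design is the last two calls: a hot model is reused with no reload, and a cold one displaces only the model least likely to be requested next.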

If AMD and other GPU makers can build compatibility layers that work with common AI workloads, Runware will be well positioned to run a hybrid cloud that mixes GPUs from multiple vendors, which would help it stay cheaper than its rivals.
