Alibaba Cloud says it cut Nvidia AI GPU use by 82% with new pooling system— up to 9x increase in output lets 213 GPUs perform like 1,192

Source: Tom's Hardware

TL;DR

AI Generated

Alibaba Cloud's new Aegaeon pooling system reportedly cut the number of Nvidia GPUs needed by 82% during a beta test, allowing 213 GPUs to perform like 1,192. Aegaeon is an inference-time scheduler that maximizes GPU utilization across multiple models with varying demand patterns. By virtualizing GPU access at the token level, it raised system-wide output by up to 9x and outperformed other serverless systems in benchmarks. The result may matter to cloud providers facing limited GPU supply, particularly in regions such as China.
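
The headline percentages can be checked with a few lines of arithmetic. The sketch below uses only the two GPU counts quoted above. The function names are illustrative and not taken from Alibaba's work. Note that the "up to 9x" output figure in the headline is a separate benchmark claim and cannot be derived from these two counts.

```python
# Sanity check of the reported figures: 213 GPUs doing the work of 1,192.
# Function names are illustrative assumptions, not part of Aegaeon.

def reduction_pct(before: int, after: int) -> float:
    """Percentage of GPUs no longer needed."""
    return (before - after) / before * 100


def consolidation_ratio(before: int, after: int) -> float:
    """How many of the original GPUs each remaining GPU stands in for."""
    return before / after


gpus_before = 1192  # GPU count the pooled system is compared against
gpus_after = 213    # GPU count reported with pooling

print(f"GPU reduction: {reduction_pct(gpus_before, gpus_after):.1f}%")
print(f"Consolidation ratio: {consolidation_ratio(gpus_before, gpus_after):.1f}x")
```

The reduction rounds to the reported 82%, and each pooled GPU stands in for between five and six of the original ones.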

Similar Articles

Talent over tokens: AI models are becoming more expensive to run, and productivity gains are limited — efficient workers might be the solution to strained budgets

AI models are becoming more expensive to run, in some cases costing more than the workers they assist, and company budgets are under strain. Despite the promise of productivity gains from AI deployment, many firms are not seeing the expected returns. High usage costs are exhausting budgets quickly; Uber, for example, spent its annual AI budget in a few weeks. As AI spending continues to rise, companies may need to reconsider their reliance on AI and invest in efficient human workers instead.

Tom's Hardware
Pirate RPG game is secretly looting your SSD lifespan — new Windrose patch promises smoother sailing and addresses excessive disk writing

The pirate RPG Windrose has been criticized for writing excessively to SSDs, potentially shortening their lifespan. A new patch aims to cut disk usage significantly, addressing writes of up to 108GB per hour. Comparisons with other games such as Enshrouded and Valheim show that Windrose's write volume is disproportionate. The heavy writing is attributed to the design of the game's save system, which the latest patch adjusts to improve write speeds by 60-75%. Players are advised to update to the latest version to limit SSD wear.

Tom's Hardware
Framework's new RTX 5070 12GB graphics module costs a whopping $1,199 — 72% more expensive than $699 8GB version, says pricing is beyond its control

Nvidia released a new 12GB version of the RTX 5070 mobile GPU with upgraded memory chips, increasing memory throughput. Framework introduced a new graphics module for its Framework Laptop 16 featuring this GPU, priced at $1,199, a significant increase from the $699 8GB version. The high cost is attributed to the expensive GDDR7 memory and the ongoing global memory shortage. Framework clarified that the pricing is influenced by external factors and not within its control, highlighting the challenges faced by consumers due to the current market conditions.

Tom's Hardware
Palit Group says Galax GPU brand will continue to operate following restructure — Galax management centralized under Palit Group in 'pre-planned' shakeup

Palit Group has confirmed that the Galax GPU brand will continue to operate and release hardware, despite recent reports to the contrary that followed an internal restructuring. Galax's management will now be centralized at Palit Group headquarters, with the brand managed directly from Taiwan, to improve operational efficiency and synergy between the brands. Galax reassured customers that it remains committed to developing and supporting high-performance hardware, and said the goal of the change is to strengthen its global presence.

Tom's Hardware
