We use cookies

We use cookies to ensure you get the best experience on our website. For more information on how we use cookies, please see our cookie policy.

Back to home

Alibaba Cloud says it cut Nvidia AI GPU use by 82% with new pooling system— up to 9x increase in output lets 213 GPUs perform like 1,192

Source

Tom's Hardware

Published

TL;DR

AI Generated

Alibaba Cloud's new Aegaeon pooling system reportedly reduced the need for Nvidia GPUs by 82% during a beta test, allowing 213 GPUs to perform like 1,192. This system, designed for inference-time scheduling, maximizes GPU utilization across multiple models with varying demand patterns. By virtualizing GPU access at the token level, Aegaeon significantly increased system-wide output, outperforming other serverless systems in benchmarks. The success of this system in optimizing GPU usage may have implications for cloud providers facing limited GPU supply, particularly in regions like China.