Nvidia Groq 3 LPU and Groq LPX racks join Rubin platform at GTC — SRAM-packed accelerator boosts 'every layer of the AI model on every token'
Source: Tom's Hardware
TL;DR
(AI-generated) At GTC, Nvidia introduced the Groq 3 LPU, an inference accelerator joining the Rubin platform for AI data centers. The chip is packed with 500 MB of SRAM, providing the high bandwidth needed for inference workloads. Nvidia plans to combine 256 Groq 3 LPUs into Groq 3 LPX racks to boost decode performance across every layer of an AI model on every token. The addition is aimed at strengthening Rubin's low-latency inference capabilities against competing platforms such as Cerebras.