Technology

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

Source

SemiEngineering

TL;DR

AI Generated

Researchers from the University of Cambridge, Imperial College London, and the University of Edinburgh have published a technical paper on optimizing long-context agentic LLM inference. They introduce PLENA, a hardware-software co-designed system with three core optimization pathways aimed at the memory walls that arise in long-context LLM scenarios. PLENA combines an efficient hardware implementation, a novel flattened systolic array architecture, and support for FlashAttention. In simulation, PLENA achieves significantly higher utilization and throughput than existing accelerators such as the A100 GPU and TPU v6e. The full PLENA system will be open-sourced.

Similar Articles

Panel-Level Packaging’s Second Wave Meets Engineering Reality

Panel-level packaging is gaining traction because of economic pressure and the growing size of AI accelerator and HPC packages. Glass substrates are being explored to address warpage and dimensional stability, but they introduce new failure modes that require materials solutions. The challenges of panel-level processing extend beyond packaging to materials and process integration. Economic and technological shifts are pushing the industry toward panels, but solving these challenges requires a holistic approach.

SemiEngineering

Inside the AI Accelerator: Essential IP Design Solutions: eBook

The eBook examines how advanced IP, high-speed interconnects, memory interfaces, and multi-die architectures are used in next-gen AI accelerators to surpass single-chip limitations. It highlights how optical links increase bandwidth and how security IP safeguards AI data without compromising performance. The eBook also covers how technologies such as UALink, PCIe, CXL, and Ultra Ethernet support scaling AI architectures by integrating compute, memory, and accelerators, and how optical I/O improves bandwidth density. The focus is on unlocking AI performance at scale and ensuring data security across accelerators.

SemiEngineering

Industry's first TSMC COUPE-based optical connectivity solution for next-gen AI chips displayed — Alchip and Ayar Labs show future silicon photonics device

Alchip and Ayar Labs showcased a TSMC COUPE-based optical connectivity solution at TSMC's European OIP forum, designed for next-gen AI accelerators. This solution integrates Ayar's silicon-photonics TeraPHY IC with Alchip's electrical interface die and detachable fiber connector, offering up to 100 Tb/s of bandwidth per accelerator. The system targets hardware developers seeking optical connectivity without the need to build their own optical subsystem. The three-chiplet co-packaged optical I/O subsystem includes a protocol-converter chiplet, EIC, and TeraPHY PIC, enabling high-speed optical modulation and detection. This production-ready solution allows smaller chip designers to incorporate optical connectivity affordably and efficiently.

Tom's Hardware

Meta reportedly buying RISC-V AI GPU firm Rivos — acquisition to bolster dev team and possibly replace Nvidia internally

Meta is in talks to acquire RISC-V chip startup Rivos to enhance its internal chip development teams and potentially reduce reliance on Nvidia GPUs. Rivos specializes in designing GPUs and AI accelerators on the RISC-V open standard. If the deal goes through, Meta could become a major player in RISC-V chip production. The acquisition may help Meta advance its AI initiatives and address CEO Mark Zuckerberg's concerns about slow progress in chip development. Apple previously sued Rivos for alleged theft of insider information, with the companies settling in 2024.

Tom's Hardware
