Reliability Extension Architecture For Cost-Effective HBM (RPI, ScaleFlux, IBM TJ Watson)

Source

SemiEngineering

TL;DR

AI Generated

Researchers from Rensselaer Polytechnic Institute, ScaleFlux, and IBM T.J. Watson Research Center have published a technical paper titled "Making Strong Error-Correcting Codes Work Effectively for HBM in AI Inference." The paper introduces REACH, a controller-managed ECC design that aims to maintain end-to-end correctness and throughput for HBM while tolerating higher raw bit error rates. By implementing a two-level Reed-Solomon scheme, REACH significantly extends device error rate tolerances while reducing ECC area and power consumption compared to traditional methods. This innovation could lead to lower-cost HBM implementations without changing the standard interface.

Similar Articles

Dr. L.C. Lu on TSMC Advanced Technology Design Solutions

Dr. L.C. Lu, a key figure at TSMC, focuses on design-technology co-optimization, packaging innovations, and AI-driven methodologies for next-gen semiconductor systems. TSMC emphasizes DTCO and DDCL innovations for scaling from N5 to A14 nodes, with NanoFlex and NanoFlex Pro architectures offering efficiency gains. N2P and N2U nodes incorporate advanced DTCO and power delivery optimizations, with hybrid dual-rail architectures achieving significant energy savings. TSMC collaborates with EDA partners for AI integration, enhancing productivity and design quality. Advanced packaging technologies like CoWoS and SoIC play a crucial role in enabling AI scaling, with memory bandwidth and interconnect performance scaling aggressively. TSMC addresses power delivery and thermal management challenges in AI systems through advanced solutions. TSMC's advancements in design methodologies and AI-driven automation promise improved productivity and scalability in chip-package co-design.

SemiWiki

The Download: a new Christian phone network, and debugging LLMs

A new US phone network for Christians is launching, blocking porn and gender-related content with network-level controls. Goodfire, a San Francisco startup, released Silico, a tool for debugging AI models by allowing users to adjust parameters during training. The National Science Foundation faced mass firings, impacting US science funding and governance. China's AI labs are releasing open-source models, challenging the traditional Silicon Valley approach. Elon Musk admitted using OpenAI models for xAI training, sparking debate on AI ethics and practices.

MIT Technology Review
Mark Zuckerberg says Meta is cutting 8,000 jobs to pay for AI infrastructure — insatiable compute demand means the company can't rule out further headcount reductions

Meta CEO Mark Zuckerberg announced at a town hall that the company plans to cut 8,000 jobs due to the increasing costs of AI infrastructure. These layoffs will affect about 10% of Meta's workforce and are linked to the company's expanding AI budget. Zuckerberg mentioned that the focus on AI hardware is diverting funds from employee-related expenses, hinting at possible future reductions. Despite the layoffs, Meta reported strong Q1 earnings, raising questions about the necessity of the job cuts. The move highlights a broader debate about the role of AI in driving layoffs in the tech industry.

Tom's Hardware
Talent over tokens: AI models are becoming more expensive to run, and productivity gains are limited — efficient workers might be the solution to strained budgets

As AI models become more expensive to run, with costs exceeding those of actual workers, companies are facing strained budgets. Despite the promise of productivity gains through AI deployment, many firms are not seeing the expected returns. High costs associated with AI usage are leading to budget exhaustion, with examples like Uber spending its annual AI budget in a few weeks. As AI spending continues to rise, companies may need to reconsider their reliance on AI and potentially invest in efficient human workers instead.

Tom's Hardware
