Reliability Extension Architecture For Cost-Effective HBM (RPI, ScaleFlux, IBM TJ Watson)

Source

SemiEngineering

TL;DR

AI Generated

Researchers from Rensselaer Polytechnic Institute, ScaleFlux, and IBM T.J. Watson Research Center have published a technical paper titled "Making Strong Error-Correcting Codes Work Effectively for HBM in AI Inference." The paper introduces REACH (Reliability Extension Architecture for Cost-Effective HBM), a controller-managed ECC design intended to preserve end-to-end correctness and throughput for HBM while tolerating higher raw bit error rates. REACH uses a two-level Reed-Solomon scheme that is reported to significantly raise the device error rate the system can tolerate while reducing ECC area and power consumption relative to conventional approaches. Because the standard HBM interface is left unchanged, the design could enable lower-cost HBM implementations.
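As a rough illustration of why a stronger Reed-Solomon code tolerates a higher raw bit error rate, the sketch below estimates how often a single codeword exceeds its correction capability. It is not the REACH design: the code parameters (36, 32) and (40, 32), the 8-bit symbol size, and the assumption of independent bit errors are all hypothetical choices made for the example. The only fact it relies on is that an RS(n, k) code corrects up to floor((n - k) / 2) symbol errors.

```python
from math import comb


def rs_correctable_symbols(n: int, k: int) -> int:
    """Symbol errors an RS(n, k) code can correct: t = floor((n - k) / 2)."""
    return (n - k) // 2


def symbol_error_rate(raw_ber: float, bits_per_symbol: int = 8) -> float:
    """Probability that a symbol contains at least one flipped bit.

    Assumes independent bit errors, which is a simplification.
    """
    return 1.0 - (1.0 - raw_ber) ** bits_per_symbol


def codeword_failure_probability(
    n: int, k: int, raw_ber: float, bits_per_symbol: int = 8
) -> float:
    """Probability that more than t of the n symbols are in error.

    Sums the binomial tail directly so small probabilities are not lost
    to cancellation.
    """
    t = rs_correctable_symbols(n, k)
    p = symbol_error_rate(raw_ber, bits_per_symbol)
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(t + 1, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical parameters: same payload (32 symbols), different parity.
    for raw_ber in (1e-6, 1e-4, 1e-3):
        weak = codeword_failure_probability(36, 32, raw_ber)
        strong = codeword_failure_probability(40, 32, raw_ber)
        print(f"raw BER {raw_ber:.0e}: RS(36,32) {weak:.3e}, RS(40,32) {strong:.3e}")
```

Under these assumptions, adding parity symbols lowers the codeword failure probability at a given raw bit error rate, at the price of extra storage and decoding work. That trade-off is the general one a scheme like REACH has to manage; how the paper's two-level scheme does so is not covered by this sketch.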