Benchmark For AI-Aided Chip Design That Evaluates LLMs Across 3 Critical Tasks (UCSD, Columbia)
TL;DR
Researchers from UCSD and Columbia University have introduced "ChipBench," a new benchmark for evaluating Large Language Models (LLMs) in AI-aided chip design. The benchmark covers three critical tasks: Verilog generation, debugging, and reference model generation, and features realistic modules and debugging cases. Results reveal significant performance gaps, with top models achieving only around 13-30% on certain tasks. The benchmark aims to address limitations in existing benchmarks and provides an automated toolbox for generating high-quality training data. The benchmark's code is publicly available for further research in this area.