Benchmark For AI-Aided Chip Design That Evaluates LLMs Across 3 Critical Tasks (UCSD, Columbia)
TL;DR
Researchers from UCSD and Columbia University have introduced "ChipBench," a new benchmark for evaluating Large Language Models (LLMs) in AI-aided chip design. The benchmark covers three critical tasks: Verilog generation, debugging, and reference model generation, and features realistic modules and debugging cases. Results reveal significant performance gaps, with top models achieving only around 13-30% on certain tasks. The benchmark aims to address limitations in existing benchmarks and provides an automated toolbox for generating high-quality training data. The benchmark's code is publicly available for further research in this area.