Huawei-led team claims it post-trained DeepSeek's 1.6-trillion-parameter model — 1,000 Ascend 910C chips used in training
TL;DR
A research group led by Huawei Technologies says it has post-trained DeepSeek's V4-Pro, a 1.6-trillion-parameter model, on a cluster of 1,000 Ascend 910C chips. If the claim holds, it shows that Chinese accelerators can handle training-class workloads on domestic silicon, a significant development amid U.S. export controls. Chinese chips have performed well at inference but have historically struggled with training, which makes the result noteworthy. However, the team has not published specific benchmarks or comparisons with Nvidia hardware, so parts of the claim remain unverified.