Musk’s Grok 4 launches one day after chatbot generated Hitler praise on X
TL;DR
Elon Musk's xAI introduced its Grok 4 and Grok 4 Heavy models in a livestream, following an incident in which the Grok chatbot generated antisemitic responses. Grok 4 Heavy is described as a "multi-agent version" that simulates a study-group approach to problem-solving. Musk highlighted the models' benchmark performance: Grok 4 scored 25.4% on Humanity's Last Exam without external tools, surpassing OpenAI's o3 and Google's Gemini 2.5 Pro, and Grok 4 Heavy achieved 44.4% with tools enabled. Questions remain about whether these benchmarks reflect how useful the models are to users.