
LLMs contain a LOT of parameters. But what’s a parameter?

Source: MIT Technology Review


TL;DR (AI generated)

Large language models (LLMs) like GPT-3 and Gemini 3 contain billions to trillions of parameters that control their behavior. Parameters, such as embeddings, weights, and biases, are assigned their values by training algorithms that iteratively adjust them to minimize error. In effect, LLMs compress vast amounts of training data into high-dimensional spaces, which lets them capture the nuances of language. Techniques like distillation and overtraining can help smaller models outperform larger ones by using training data more efficiently. As the focus shifts from scaling models up to getting more out of them, researchers are exploring ways to make better use of each parameter.
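To make the idea concrete, here is a minimal sketch (not from the article) of what "parameters adjusted iteratively to minimize error" means, scaled all the way down: a toy model with just two parameters, one weight and one bias, fitted by gradient descent. An LLM works on the same principle, only with billions of parameters instead of two.

```python
def train(data, lr=0.01, steps=2000):
    """Fit y = w*x + b to data by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # the model's two parameters, initialized at zero
    for _ in range(steps):
        n = len(data)
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Iteratively nudge each parameter in the direction that reduces error
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data generated by the "true" relationship y = 2x + 1
data = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = train(data)
print(round(w, 2), round(b, 2))  # converges close to 2.0 and 1.0
```

After training, the learned values of `w` and `b` recover the underlying relationship; in an LLM, the analogous parameters encode statistical regularities of language rather than a single line's slope and intercept.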