What is an LLM? Definition and business implications
An LLM (Large Language Model) is a type of artificial intelligence trained on text corpora of several hundred billion words, which produces natural language by predicting, word by word, the most probable continuation of a given text.
An LLM is a very large neural network, typically built on the transformer architecture (Vaswani et al., 2017), trained to predict the next word in a sequence from the preceding words (more precisely, the next token, which is a word or a fragment of a word). This simple objective, repeated over tens of trillions of tokens, is enough to produce models capable of answering questions, drafting texts, translating, reasoning, and coding. The LLM family spans very different sizes, from lightweight 7-billion-parameter models (Mistral 7B) to latest-generation models with more than a trillion parameters (GPT-4, estimated at 1.76 trillion according to unconfirmed architecture leaks). Size is no longer the sole quality criterion: since 2024, well-trained 70-billion-parameter models have rivaled, on common benchmarks, models five to twenty times larger, at a much lower inference cost.
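The next-word objective described above can be illustrated with a deliberately tiny stand-in. The sketch below is not a transformer and not how any production LLM is implemented: it assumes a simple bigram frequency table in place of a neural network, and the names train_bigram, generate, and the sample corpus are hypothetical. It only shows the generation loop, in which the most probable continuation is appended one word at a time.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_words=5):
    """Greedy generation: repeatedly append the most frequent continuation."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            # The last word was never followed by anything in the corpus.
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus; a real LLM trains on trillions of tokens.
corpus = "the model predicts the next word and the next word follows the context"
model = train_bigram(corpus)
print(generate(model, "model", max_words=3))  # prints "model predicts the next"
```

A real LLM replaces the frequency table with a neural network that conditions on the whole preceding context rather than on the last word only, and it usually samples from the predicted distribution instead of always taking the top candidate.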
Concrete example
The base model of the original transformer, published by Google in 2017, contained 65 million parameters. GPT-3, unveiled by OpenAI in 2020, had 175 billion, roughly 2,700 times more in three years. Since then, the growth has continued: Llama 3.1 (Meta) reaches 405 billion parameters with openly available weights, and the mixture-of-experts architecture of GPT-4 totals about 1.76 trillion parameters according to public estimates. Yet in 2026, the leader in quality-to-cost ratio according to public MMLU benchmarks is Llama 3.3 with 70 billion parameters, which rivals models ten times larger at a far lower inference cost.
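The size ratios quoted in this example can be checked with a few lines of arithmetic. The sketch below uses only the parameter counts given in the text; the GPT-4 figure is the unconfirmed public estimate, and the dictionary name PARAMS and helper ratio are hypothetical.

```python
# Parameter counts as quoted in the text (GPT-4 is an unconfirmed estimate).
PARAMS = {
    "Transformer (2017, base)": 65_000_000,
    "GPT-3 (2020)": 175_000_000_000,
    "Llama 3.3 70B": 70_000_000_000,
    "Llama 3.1 405B": 405_000_000_000,
    "GPT-4 (estimate)": 1_760_000_000_000,
}


def ratio(larger, smaller):
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


growth = ratio("GPT-3 (2020)", "Transformer (2017, base)")
print(f"GPT-3 vs. original transformer: {growth:,.0f}x")  # about 2,700x

# Size of larger models relative to the 70-billion-parameter model.
for name in ("Llama 3.1 405B", "GPT-4 (estimate)"):
    print(f"{name} vs. Llama 3.3 70B: {ratio(name, 'Llama 3.3 70B'):.1f}x")
```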
Sources
- Attention Is All You Need, Vaswani et al., NeurIPS 2017. https://arxiv.org/abs/1706.03762
- Language Models are Few-Shot Learners, Brown et al., NeurIPS 2020. https://arxiv.org/abs/2005.14165