What is AI training? Definition and business implications

Training is the construction phase of an AI model, during which the model ingests a massive corpus and adjusts its billions of internal parameters to learn the statistical regularities of that corpus. It is typically the most expensive single operation in a model's life cycle.
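To make "adjusting parameters" concrete, here is a minimal sketch of gradient descent on a model with a single parameter. It is a toy under stated assumptions: the function name `train`, the data, and the learning rate are all illustrative, and a real language model repeats the same kind of update over billions of parameters and a far larger corpus.

```python
def train(examples, learning_rate=0.01, epochs=200):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0  # the single "internal parameter", starting from an arbitrary value
    n = len(examples)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / n
        # Nudge the parameter in the direction that reduces the error.
        w -= learning_rate * grad
    return w


# Illustrative "corpus": every example follows the regularity y = 2 * x.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = train(examples)
print(f"learned parameter: {w:.4f}")
```

The loop never sees the rule "multiply by two"; it only sees examples, and the parameter drifts towards the value that best explains them.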

Training an AI model breaks down into two distinct phases. Pre-training exposes the model to a very large, general-purpose corpus (Wikipedia, digitised books, source code, web archives) so that it learns the statistical structure of language. This phase lasts weeks to months on clusters of thousands of GPUs. At the end of it, the model can predict a plausible continuation of a text, but it knows nothing about your business use case.

Post-training covers the alignment and adaptation stages: supervised fine-tuning on annotated examples, reinforcement learning from human feedback (RLHF), and fine-tuning on domain-specific data. These steps steer the model's behaviour towards what is expected of a useful assistant (safety, tone, format).

Training is a high fixed-cost operation. Once it is done, the model's parameters are frozen: to change what the model knows, you must either retrain it (expensive) or supply the relevant context with each query through retrieval-augmented generation, or RAG (economical).
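The idea of supplying context with each query, without touching the frozen model, can be sketched as follows. This is a simplified illustration: the helper names (`tokenize`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, and plain keyword overlap stands in for the embedding-based search that production RAG systems usually rely on. The resulting prompt would then be sent to whichever pre-trained model you have chosen.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Rank documents by the number of words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Assemble a prompt that carries the retrieved context with the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical internal knowledge base, unknown to the pre-trained model.
documents = [
    "Refund requests must be filed within 30 days of delivery.",
    "The warehouse in Lyon ships orders on weekdays only.",
    "Support is available by email from 9:00 to 18:00.",
]
question = "How many days do customers have to request a refund?"
prompt = build_prompt(question, documents)
print(prompt)
```

Updating the company's knowledge then means editing `documents`, not retraining the model, which is where the cost difference comes from.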

Concrete example

According to the Stanford AI Index 2025 report, estimated training costs for frontier models have risen steeply: 670 dollars for the original transformer in 2017, 4.6 million dollars for GPT-3 in 2020, 78 million dollars for GPT-4 in 2023, and 192 million dollars for Google Gemini Ultra 1.0 in 2024. Meta's Llama 3.1 405B cost about 170 million dollars. This growth, estimated at a factor of 2.4 per year since 2016, puts frontier-model training out of reach of any organisation other than the GAFAM, Anthropic, and the few competing labs backed by billions of dollars in funding. For a mid-cap company, the question is not whether to train a model but which pre-trained model to choose.
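As a rough sanity check on what a factor of 2.4 per year means, the short sketch below compounds that rate and compares it with the growth implied by two of the figures quoted above. The function names are illustrative, and the comparison is only indicative: the quoted rate is a trend fitted across many models, whereas the second calculation uses just two data points.

```python
def compound_factor(annual_growth, years):
    """Total multiplication of cost after compounding for a number of years."""
    return annual_growth ** years


def implied_annual_growth(cost_start, cost_end, years):
    """Constant yearly factor that turns cost_start into cost_end."""
    return (cost_end / cost_start) ** (1 / years)


# Estimated training costs in dollars, as quoted in the text above.
CITED_COSTS_USD = {
    2017: 670,     # original transformer
    2020: 4.6e6,   # GPT-3
    2023: 78e6,    # GPT-4
}

factor = compound_factor(2.4, 8)
rate = implied_annual_growth(CITED_COSTS_USD[2020], CITED_COSTS_USD[2023], 2023 - 2020)

print(f"2.4x per year compounds to about {factor:,.0f}x over 8 years")
print(f"GPT-3 to GPT-4 implies about {rate:.2f}x per year")
```

Under these assumptions, eight years of growth at 2.4 times per year multiplies the cost by roughly a thousand, which is why budgets that were realistic for a research team in 2017 no longer apply at the frontier.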

Sources

  1. Artificial Intelligence Index Report 2025, Stanford HAI, chapter 1. https://hai.stanford.edu/ai-index/2025-ai-index-report (accessed 2026-05-24)
  2. The Rising Costs of Training Frontier AI Models, Cottier et al., arXiv:2405.21015, 2024. https://arxiv.org/abs/2405.21015 (accessed 2026-05-24)
