What is AI training? Definition and business implications
Training is the construction phase of an AI model, during which the model ingests a massive corpus and adjusts its billions of internal parameters to learn the statistical regularities of that corpus. It is typically the most expensive single operation in a model's life cycle.
Training an AI model breaks down into two distinct phases: pre-training and post-training.
Pre-training exposes the model to a very large, generalist corpus (Wikipedia, digitised books, source code, web archives) so that it learns the statistical structures of language. This phase lasts weeks to months on clusters of thousands of GPUs. At the end, the model can predict a plausible continuation of a text, but it knows nothing about your business use case.
Post-training covers the alignment and adaptation stages: supervised fine-tuning on annotated examples, reinforcement learning from human feedback (RLHF), and fine-tuning on domain-specific data. These steps steer the model's behaviour towards what is expected of a useful assistant (safety, tone, format).
Training is a high fixed-cost operation. Once it is done, the model's parameters are frozen: to change what the model knows, you must either retrain it (expensive) or supply the relevant context with each query through retrieval-augmented generation, RAG (economical).
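To make the idea of "learning statistical regularities, then freezing" concrete, here is a deliberately tiny sketch. It is not how large language models are trained: it swaps gradient-based training of a neural network for a simple count-based bigram model, and the three-sentence corpus and the function names `train_bigram` and `predict_next` are hypothetical, chosen only for illustration. It shows the two properties described above: the model predicts a continuation from what it saw during training, and after training it knows nothing about words it never saw.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """'Training' stand-in: count which word follows which in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Return plain dicts: from here on the "parameters" are never updated.
    return {word: dict(followers) for word, followers in counts.items()}


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical toy corpus, standing in for the generalist pre-training data.
corpus = [
    "the model learns the statistics of the corpus",
    "the model predicts the next word",
    "the model is frozen after training",
]

model = train_bigram(corpus)
print(predict_next(model, "the"))      # "model" follows "the" most often
print(predict_next(model, "invoice"))  # None: never seen during training
```

The second call illustrates the limitation discussed above: a word that belongs to your business vocabulary but was absent from the training corpus yields nothing, which is the gap that retraining or per-query context is meant to fill.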
Concrete example
According to the Stanford AI Index 2025 report, estimated training costs for frontier models have exploded: 670 dollars for the original Transformer in 2017, 4.6 million dollars for GPT-3 in 2020, 78 million dollars for GPT-4 in 2023, and 192 million dollars for Google Gemini Ultra 1.0 in 2024. Meta's Llama 3.1 405B cost an estimated 170 million dollars. Cottier et al. estimate this inflation at 2.4 times per year since 2016, which puts frontier-model training out of reach of almost any organisation other than the GAFAM, Anthropic, and the few competing labs funded with billions of dollars. For a mid-cap company, the question is not whether to train a model but which pre-trained model to choose.
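The growth rate quoted above is easier to grasp with a little arithmetic. The sketch below assumes simple compound growth at a constant 2.4 times per year; the function names are hypothetical. It also computes the annual growth implied by two of the point estimates in the text (GPT-3 and GPT-4). Those figures come from different estimates than the 2.4 times trend, so the comparison is illustrative only, not a check of either source.

```python
import math

GROWTH_PER_YEAR = 2.4  # growth factor quoted in the text


def cost_multiplier(years, growth=GROWTH_PER_YEAR):
    """Factor by which cost grows after `years` of compound growth."""
    return growth ** years


def doubling_time_months(growth=GROWTH_PER_YEAR):
    """Months needed for cost to double at a constant annual growth factor."""
    return 12 * math.log(2) / math.log(growth)


def implied_annual_growth(cost_start, cost_end, years):
    """Constant annual growth factor that links two cost estimates."""
    return (cost_end / cost_start) ** (1 / years)


# Three years at 2.4x per year multiplies the cost by about 13.8.
print(f"3-year multiplier: {cost_multiplier(3):.1f}x")

# At that rate, cost doubles in a little under ten months.
print(f"Doubling time: {doubling_time_months():.1f} months")

# GPT-3 (2020, 4.6 million dollars) to GPT-4 (2023, 78 million dollars),
# using the figures quoted in the text.
rate = implied_annual_growth(4.6e6, 78e6, years=3)
print(f"Implied growth GPT-3 -> GPT-4: {rate:.2f}x per year")
```

For a budget discussion, the doubling time is the practical number: a training budget that is adequate today covers roughly half of a comparable frontier run less than a year later, if the trend holds.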
Sources
- Artificial Intelligence Index Report 2025, Stanford HAI, chapter 1. https://hai.stanford.edu/ai-index/2025-ai-index-report
- The Rising Costs of Training Frontier AI Models, Cottier et al., arXiv:2405.21015, 2024. https://arxiv.org/abs/2405.21015