What is an AI token? Definition and business implications
A token is the elementary unit of text that an AI model processes, generally a word fragment of about 3 or 4 characters in English. It is both the model's unit of computation and the AI providers' unit of billing, counted separately for input and output.
To split text into tokens, models use an algorithm (generally Byte Pair Encoding, BPE) that assigns a single token to the most frequent character sequences in their training corpus. As a result, a common word like “hello” typically counts as 1 token, whereas a rare term like “disintermediation” is split into several pieces.
On the operational side, each generated token requires a complete pass through the model's neural network, which means billions of mathematical operations per token. Hence a dual cost: energy (electricity consumed by servers) and economic (billed by providers). Input tokens can be processed together in one pass, while output tokens are produced one at a time, which helps explain why output is billed at a higher rate.
Anthropic prices in May 2026 illustrate the orders of magnitude: Claude Haiku 4.5 at $1 input and $5 output per million tokens, Sonnet 4.6 at $3/$15, Opus 4.7 at $5/$25. The output/input price ratio is constant at 5:1 across the line-up.
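
The BPE idea described above can be illustrated with a toy merge loop. This is a minimal sketch, not the tokenizer of any real model: production tokenizers work on bytes, use much larger corpora and vocabularies, and handle word boundaries. The function names (`learn_bpe_merges`, `merge_pair`, `tokenize`) and the tiny corpus are assumptions made for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(corpus_words, num_merges):
    """Learn up to `num_merges` merge rules, most frequent adjacent pair first."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in corpus_words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def tokenize(word, merges):
    """Split `word` into tokens by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus: "hello" is frequent, "held" is rare.
corpus = ["hello"] * 10 + ["held"] * 2
merges = learn_bpe_merges(corpus, num_merges=4)

print(tokenize("hello", merges))   # frequent word
print(tokenize("helium", merges))  # word never seen in the corpus
```

With this toy corpus, the frequent word ends up as a single token after four merges, while an unseen word such as "helium" stays split into several pieces, which is the behaviour the paragraph above describes.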
Concrete example
A 50-employee SME uses an AI assistant via the Claude Sonnet 4.6 API to draft its commercial responses. A typical exchange consumes 500 input tokens and 800 output tokens, about $0.0135 (roughly 1.4 cents) with May 2026 pricing. Over 1,000 monthly exchanges, the bill remains negligible ($13.50). But double the length of exchanges (prompts and responses alike) through lack of method, and it rises to $27. Note that doubling the prompts alone would only raise it to $15, since output tokens dominate the cost. At a mid-cap with 50,000 monthly exchanges, the gap becomes material: $675 vs $1,350 per month, about $8,100 annual difference for the same business usage.
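
The arithmetic of this scenario can be checked with a short script. This is a sketch under stated assumptions: the prices are the May 2026 figures quoted in this article, the dictionary keys are informal labels rather than official API model identifiers, `exchange_cost` is a hypothetical helper, and the "doubled" case assumes that both input and output tokens double. Figures are computed unrounded.

```python
# USD per million tokens as (input, output), taken from the figures quoted in this article.
# The keys are informal labels, not official API model identifiers.
PRICES_PER_MILLION = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.7": (5.00, 25.00),
}


def exchange_cost(model, input_tokens, output_tokens):
    """Return the cost in USD of one exchange, billing input and output separately."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


baseline = exchange_cost("sonnet-4.6", 500, 800)
longer_prompts = exchange_cost("sonnet-4.6", 1_000, 800)  # only the prompts double
doubled = exchange_cost("sonnet-4.6", 1_000, 1_600)       # assumption: input and output both double

for exchanges_per_month in (1_000, 50_000):
    monthly_base = baseline * exchanges_per_month
    monthly_doubled = doubled * exchanges_per_month
    annual_gap = (monthly_doubled - monthly_base) * 12
    print(
        f"{exchanges_per_month:>6} exchanges/month: "
        f"${monthly_base:,.2f} vs ${monthly_doubled:,.2f} "
        f"(annual gap ${annual_gap:,.2f})"
    )
```

Because output is priced at five times the input rate, the 800 output tokens account for most of the baseline cost, so trimming verbose responses saves more than trimming prompts of the same length.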
Sources
- Anthropic API pricing page, Claude Haiku 4.5, Sonnet 4.6, Opus 4.7 rates, May 2026. https://www.anthropic.com/pricing
- Neural Machine Translation of Rare Words with Subword Units, Sennrich, Haddow & Birch, ACL 2016 (BPE algorithm). https://arxiv.org/abs/1508.07909