Tokens
The basic units that LLMs process, roughly corresponding to word fragments. A token is typically 3–4 characters of English text, so 1,000 tokens works out to about 750 words. Token count matters for two hardware-related reasons: (1) the context window size caps the number of tokens per conversation, which in turn determines KV cache VRAM usage, and (2) tokens per second (tok/s) is the primary speed benchmark for comparing AI hardware.
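The rules of thumb above can be sketched in a few lines of Python. This is only a heuristic estimator: the 4-characters-per-token and 0.75-words-per-token ratios are rough averages for English prose (real tokenizers such as BPE or SentencePiece vary by content), and the function and constant names are illustrative, not from any library.

```python
# Rule-of-thumb conversions between characters, tokens, and generation time.
# Both constants are rough averages for English text, not tokenizer output.
CHARS_PER_TOKEN = 4      # assumption: ~3-4 characters per token
WORDS_PER_TOKEN = 0.75   # assumption: 1,000 tokens ~ 750 words

def estimate_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))

def generation_seconds(num_tokens: int, tok_per_s: float) -> float:
    """Time to generate num_tokens at a given tokens-per-second rate."""
    return num_tokens / tok_per_s

text = "Tokens are the basic units that LLMs process."
print(estimate_tokens(text))           # ~11 tokens for this 45-character string
print(generation_seconds(1000, 50.0))  # 20.0 seconds for 1,000 tokens at 50 tok/s
```

This also shows why tok/s is a meaningful hardware benchmark: at 50 tok/s, a 1,000-token reply (about 750 words) takes roughly 20 seconds to stream.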