Tokens
The basic units that LLMs process, roughly corresponding to word fragments. A token is typically 3–4 characters of English text, so 1,000 tokens works out to about 750 words. Token count matters for two hardware-related reasons: (1) the context window size caps the number of tokens per conversation, which in turn determines KV cache VRAM usage, and (2) tokens per second (tok/s) is the primary speed benchmark for comparing AI hardware.
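The rules of thumb above can be sketched in a few lines of Python. This is only a heuristic estimator: the 4-characters-per-token and 0.75-words-per-token ratios are rough averages for English prose (real tokenizers such as BPE or SentencePiece vary by content), and the function and constant names are illustrative, not from any library.

```python
# Rule-of-thumb conversions between characters, tokens, and generation time.
# Both constants are rough averages for English text, not tokenizer output.
CHARS_PER_TOKEN = 4      # assumption: ~3-4 characters per token
WORDS_PER_TOKEN = 0.75   # assumption: 1,000 tokens ~ 750 words

def estimate_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))

def generation_seconds(num_tokens: int, tok_per_s: float) -> float:
    """Time to generate num_tokens at a given tokens-per-second rate."""
    return num_tokens / tok_per_s

text = "Tokens are the basic units that LLMs process."
print(estimate_tokens(text))           # ~11 tokens for this 45-character string
print(generation_seconds(1000, 50.0))  # 20.0 seconds for 1,000 tokens at 50 tok/s
```

This also shows why tok/s is a meaningful hardware benchmark: at 50 tok/s, a 1,000-token reply (about 750 words) takes roughly 20 seconds to stream.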