Transformer
The neural network architecture underlying virtually all modern LLMs and most current image-generation and multimodal AI models. Transformers use self-attention to process an entire input sequence in parallel rather than token by token, which maps well to GPU hardware. The key hardware implication: transformer workloads are dominated by large matrix multiplications, which is why GPUs with tensor cores (NVIDIA) or specialized matrix engines (Apple Silicon) vastly outperform general-purpose CPUs for AI.
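To see why matrix multiplications dominate, here is a minimal sketch of single-head self-attention in NumPy; the weight names and dimensions are illustrative, not from any particular model. Every heavy operation is a matmul (`@`): three input projections, the query-key score matrix, and the weighted sum of values.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative)."""
    # Project the whole sequence at once: three matrix multiplications.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Attention scores for every pair of positions: another matmul.
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted sum of values: one final matmul.
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16          # toy sizes for illustration
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # one output vector per input position
```

Note that every position attends to every other position in the same set of matmuls, with no sequential loop over tokens; that parallelism is what tensor cores and matrix engines accelerate.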