Transformer
The neural network architecture underlying virtually all modern LLMs and most current image-generation and multimodal AI models. Transformers use self-attention to process an entire input sequence in parallel rather than token by token, which maps well to GPU hardware. The key hardware implication: transformer workloads are dominated by large matrix multiplications, which is why GPUs with tensor cores (NVIDIA) or specialized matrix engines (Apple Silicon) vastly outperform general-purpose CPUs for AI.
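To see why matrix multiplications dominate, here is a minimal sketch of single-head self-attention in NumPy; the weight names and dimensions are illustrative, not from any particular model. Every heavy operation is a matmul (`@`): three input projections, the query-key score matrix, and the weighted sum of values.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative)."""
    # Project the whole sequence at once: three matrix multiplications.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Attention scores for every pair of positions: another matmul.
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted sum of values: one final matmul.
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16          # toy sizes for illustration
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # one output vector per input position
```

Note that every position attends to every other position in the same set of matmuls, with no sequential loop over tokens; that parallelism is what tensor cores and matrix engines accelerate.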