Tensor Cores
Specialized processing units inside NVIDIA GPUs designed for matrix multiplication, the fundamental math operation in neural networks. Each tensor core multiplies small matrix tiles and accumulates the result in a single fused operation, and can run mixed-precision formats (FP16, INT8, FP8) at many times the throughput that standard CUDA cores reach on the same math. They’re a major reason an RTX 4090 is dramatically faster at AI workloads than a GPU with a similar CUDA core count but no tensor cores.
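To make "mixed precision" concrete, here is a minimal CPU-side sketch in plain Python of the tile operation D = A × B + C, with inputs rounded to FP16 and the running sum kept in FP32. It only illustrates the arithmetic and says nothing about speed or about how the hardware is built. The names `to_fp16`, `to_fp32`, and `mma_tile` are hypothetical helpers made up for this example, not part of any NVIDIA API.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mma_tile(a, b, c):
    """Emulate D = A x B + C with FP16 inputs and an FP32 accumulator.

    a is n x k, b is k x m, c is n x m, all given as lists of lists.
    (Hypothetical helper for illustration only.)
    """
    n, k, m = len(a), len(b), len(b[0])
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = to_fp32(c[i][j])              # accumulator starts from C in FP32
            for p in range(k):
                prod = to_fp16(a[i][p]) * to_fp16(b[p][j])  # low-precision inputs
                acc = to_fp32(acc + prod)       # higher-precision running sum
            d[i][j] = acc
    return d


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[0.5, 0.0], [0.0, 0.5]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(mma_tile(a, b, c))   # [[1.5, 2.0], [2.5, 3.0]]
    print(to_fp16(0.1))        # not exactly 0.1: FP16 keeps only about 3 decimal digits
```

Storing inputs in FP16 halves their memory footprint compared with FP32, while accumulating in FP32 limits the rounding error that would otherwise build up across the many additions in a long dot product.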