TOPS / TFLOPS
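TOPS (Tera Operations Per Second) and TFLOPS (Tera Floating-Point Operations Per Second) measure a chip’s theoretical peak compute throughput, in trillions of operations per second. NPUs are typically rated in TOPS, while GPUs are typically rated in TFLOPS. These numbers are useful for rough comparisons within the same architecture but misleading across different hardware: TOPS figures usually count low-precision integer operations (often INT8), TFLOPS figures count floating-point operations at a stated precision, and neither reflects memory bandwidth or software support. An Apple M4’s 38 TOPS NPU and an NVIDIA RTX 4090’s 83 TFLOPS therefore aren’t directly comparable. For AI workloads, measured tokens-per-second (tok/s) benchmarks matter far more than theoretical peak figures.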
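As a rough illustration of where a peak figure comes from, a theoretical rating is commonly estimated as execution units × operations per unit per cycle × clock speed. The Python sketch below applies that formula and contrasts it with a measured tok/s rate. The function names are hypothetical. The RTX 4090 inputs (16,384 CUDA cores, a boost clock of about 2.52 GHz, 2 FP32 operations per core per cycle for a fused multiply-add) are assumed from NVIDIA's published specifications, not taken from the text above. The token count and timing are made-up example values.

```python
def peak_tera_ops(units: int, ops_per_unit_per_cycle: int, clock_ghz: float) -> float:
    """Theoretical peak throughput in tera-operations per second."""
    ops_per_second = units * ops_per_unit_per_cycle * clock_ghz * 1e9
    return ops_per_second / 1e12


def tokens_per_second(tokens_generated: int, elapsed_seconds: float) -> float:
    """Measured generation rate from an actual benchmark run."""
    return tokens_generated / elapsed_seconds


# Assumed RTX 4090 specs: 16,384 cores, 2 FP32 ops per core per cycle (FMA), ~2.52 GHz boost.
rtx_4090_fp32 = peak_tera_ops(units=16384, ops_per_unit_per_cycle=2, clock_ghz=2.52)
print(f"Theoretical peak: {rtx_4090_fp32:.1f} TFLOPS")

# Hypothetical benchmark: 512 tokens generated in 8.0 seconds.
measured = tokens_per_second(tokens_generated=512, elapsed_seconds=8.0)
print(f"Measured rate: {measured:.1f} tok/s")
```

The first number depends only on the spec sheet. The second depends on the model, its quantization, memory bandwidth, and the software stack, which is why it is the more useful figure for AI workloads.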