AWQ (Activation-aware Weight Quantization)
A quantization method that identifies the most salient weight channels by analyzing activation magnitudes on a small calibration set, then protects those channels during quantization. Rather than keeping them in higher precision, AWQ scales the salient channels up before round-to-nearest quantization and folds the inverse scale into the activations, which leaves the layer mathematically equivalent while shrinking the relative quantization error where it matters most. AWQ typically delivers better accuracy than naive round-to-nearest quantization at INT4, making it a popular choice for deploying large models on consumer GPUs. If you see an AWQ model variant on Hugging Face, it has been optimized to run well on hardware with limited VRAM.
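The core trick can be sketched in a few lines of NumPy. This is a toy illustration, not the AWQ implementation: the layer shapes, calibration data, scaling exponent, and per-column round-to-nearest quantizer below are all made up for the demo, but they show the equivalence y = (x / s) @ Q(diag(s) @ W) and why scaling salient input channels before quantization tends to reduce output error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer y = x @ W, with one "salient" input channel whose
# activations are much larger than the rest (the pattern AWQ exploits).
in_dim, out_dim = 8, 4
W = rng.normal(size=(in_dim, out_dim)).astype(np.float32)
x = rng.normal(size=(256, in_dim)).astype(np.float32)
x[:, 0] *= 20.0  # channel 0 dominates the activations

def quantize_rtn(w, bits=4):
    """Symmetric per-column round-to-nearest quantization (a simple stand-in)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=0) / qmax
    return np.round(w / scale) * scale

y_ref = x @ W

# Naive round-to-nearest: quantize W directly.
err_rtn = np.abs(y_ref - x @ quantize_rtn(W)).mean()

# AWQ-style protection: scale salient input channels up before quantizing,
# and divide the activations by the same per-channel scale, so the layer
# output is unchanged in exact arithmetic: y = (x / s) @ Q(diag(s) @ W).
act_mag = np.abs(x).mean(axis=0)        # per-channel activation magnitude
s = (act_mag / act_mag.mean()) ** 0.5   # mild scaling (exponent 0.5 is arbitrary here)
W_scaled_q = quantize_rtn(W * s[:, None])
err_awq = np.abs(y_ref - (x / s) @ W_scaled_q).mean()

print(f"RTN error: {err_rtn:.4f}  AWQ-style error: {err_awq:.4f}")
```

The scaled quantization spends less relative error on the channel whose large activations would otherwise amplify it, so the mean output error drops compared to plain round-to-nearest on this toy example.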