2024-12-03AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration目录概AWQ代码LinJ.,TangJ.,TangH.,YangS.,ChenW.,WangW.,XiaoG.,DangX.,GanC.andHanS.AWQ:Activation-awareweightquantizationforllmcompressionandacceleration.MLSys,2024.概随着模型的参数量的增加,推理成本也在显著增加,本文提出一种