gptq

Post-training 4-bit quantization for LLMs with minimal accuracy loss

Description

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.
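The memory and accuracy trade-off described above comes from storing each weight as a 4-bit integer plus a shared per-group scale. Below is a minimal round-to-nearest sketch of that group quantization in plain Python. It is illustrative only: real GPTQ additionally corrects rounding error using second-order (Hessian) information, which is what keeps perplexity degradation low, and in practice you would quantize through a library integration (for example the GPTQ support in Hugging Face transformers) rather than by hand.

```python
import random

def quantize_4bit(weights, group_size=128):
    """Round-to-nearest 4-bit quantization with one scale per group.

    Simplified sketch of GPTQ-style group quantization; real GPTQ also
    compensates rounding error with second-order information.
    """
    q_groups, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Symmetric scale: map the group's largest magnitude onto the int4 max (7).
        scale = max(abs(w) for w in group) / 7.0 or 1.0
        q_groups.append([max(-8, min(7, round(w / scale))) for w in group])
        scales.append(scale)
    return q_groups, scales

def dequantize_4bit(q_groups, scales):
    # Reconstruct approximate float weights from int4 values and group scales.
    return [q * s for qs, s in zip(q_groups, scales) for q in qs]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]
q, s = quantize_4bit(weights)
restored = dequantize_4bit(q, s)
# Worst-case error per weight is half a quantization step (scale / 2).
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Packing two 4-bit values per byte is what yields the roughly 4× reduction versus FP16 (16 bits per weight down to 4, plus a small per-group overhead for the scales).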

Skill File

SKILL.md

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.

Tags

AI

Information

Developer: davila7
Category: AI & Machine Learning
Created: Jan 15, 2026
Updated: Jan 15, 2026
