AI & Machine Learning
tensorrt-llm
Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency.
Install Command
claude skill add davila7/claude-code-templates
Description
Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.
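As a minimal sketch of the deployment workflow this skill targets (assuming a CUDA-enabled host with NVIDIA drivers, a recent TensorRT-LLM release that ships the `trtllm-serve` entry point, and a placeholder model name):

```shell
# Install TensorRT-LLM into a Python environment on an NVIDIA GPU machine
pip install tensorrt-llm

# Launch an OpenAI-compatible inference server; in-flight batching is
# handled by the TensorRT-LLM runtime. The model ID below is a placeholder.
trtllm-serve "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --host 0.0.0.0 --port 8000
```

This is a sketch under the stated assumptions, not a verified recipe; quantization (FP8/INT4) and multi-GPU scaling require additional build or serve options documented by NVIDIA.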
Tags
NVIDIA TensorRT, LLM, Inference, AI, Deep Learning, GPU
Information
Developer: davila7
Category: AI & Machine Learning
Created: Jan 15, 2026
Updated: Jan 15, 2026
You Might Also Like
art-master
Automatically generates art-style prompts; supports ink wash, oil painting, surrealist, illustration, and other artistic styles.
openscad
Create and render OpenSCAD 3D models.
creating-financial-models
Advanced financial modeling suite with DCF analysis and more.
cli-e2e-testing
Guide for writing Aspire CLI end-to-end tests using Hex1b terminal automation.
blucli
BluOS CLI (blu) for discovery, playback, grouping, and volume.
async-repl-protocol
Agent Skill by parcadei