OpenCatalog, curated by FLOSSK
AI & Machine Learning

TensorRT-LLM

NVIDIA TensorRT-based library for optimized LLM inference on GPUs, with multi-GPU execution and speculative decoding support.

Why it is included

An open-source (Apache-2.0) serving path for teams standardized on NVIDIA datacenter GPUs.

Best for

Production LLM serving on NVIDIA hardware with maximum kernel optimization.
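To make the serving workflow concrete, here is a minimal sketch using TensorRT-LLM's high-level Python interface. The `LLM`/`SamplingParams` names, the model ID, and the `pick_tensor_parallel_size` helper are illustrative assumptions, not a verified recipe; consult the project documentation for the exact API, and note that running it requires the NVIDIA GPU stack.

```python
# Hypothetical sketch of multi-GPU serving with TensorRT-LLM's
# Python LLM API. Names and parameters are assumptions; check the
# official docs before relying on them.

def pick_tensor_parallel_size(num_gpus: int, max_tp: int = 8) -> int:
    """Pick the largest power-of-two GPU count <= num_gpus, a common
    constraint for tensor-parallel sharding of LLM weights."""
    tp = 1
    while tp * 2 <= min(num_gpus, max_tp):
        tp *= 2
    return tp

def main() -> None:
    # Assumed API surface; requires an NVIDIA datacenter GPU and the
    # tensorrt_llm package to actually run.
    from tensorrt_llm import LLM, SamplingParams
    llm = LLM(
        model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model ID
        tensor_parallel_size=pick_tensor_parallel_size(4),
    )
    params = SamplingParams(max_tokens=64, temperature=0.7)
    for output in llm.generate(["Explain speculative decoding briefly."], params):
        print(output.outputs[0].text)

if __name__ == "__main__":
    main()
```

The power-of-two helper reflects the usual restriction that tensor parallelism splits attention heads and weight matrices evenly across GPUs.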

Strengths

  • NVIDIA kernels
  • Multi-GPU
  • Broad model recipes

Limitations

  • Tied to NVIDIA hardware; notable build complexity

Good alternatives

vLLM · SGLang
