
NVIDIA/TensorRT-LLM
Releases: 88
Frequency: 1 week 3 days
Last Release:
Stars: 13.8K
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
Stability is determined by the version string and may be inaccurate.

| Version | Date | Stability |
|---|---|---|
| v1.0.0rc6 | | RC |
| v1.0.0rc5 | | RC |
| v0.21.0 | | Stable |
| v1.0.0rc4 | | RC |
| v1.0.0rc3 | | RC |
| v1.0.0rc2 | | RC |
| v1.0.0rc1 | | RC |
| v1.0.0rc0 | | RC |
| v0.20.0 | | Stable |
| v0.21.0rc2 | | RC |
| v0.21.0rc1 | | RC |
| v0.21.0rc0 | | RC |
| v0.20.0rc3 | | RC |
| v0.19.0 | | Stable |
| v0.20.0rc2 | | RC |
| v0.20.0rc1 | | RC |
| v0.20.0rc0 | | RC |
| v0.19.0rc0 | | RC |
| v0.18.2 | | Stable |
| v0.18.1 | | Stable |
| v0.18.0 | | Stable |
| v0.17.0 | | Stable |
| v0.16.0 | | Stable |
| v0.15.0 | | Stable |
| v0.14.0 | | Stable |
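
The stability labels in the table are derived from the version string alone. The exact rule this page uses is not stated, so the following is only a minimal sketch under one assumption: a tag ending in an `rcN` suffix is treated as a release candidate, and anything else as stable. The function name `classify_stability` is hypothetical and chosen for illustration.

```python
import re

# Assumption: a release candidate is any tag that ends in "rc" followed by digits.
_RC_SUFFIX = re.compile(r"rc\d+$")


def classify_stability(version: str) -> str:
    """Return "RC" for tags such as v1.0.0rc6, otherwise "Stable"."""
    return "RC" if _RC_SUFFIX.search(version) else "Stable"


# A few tags taken from the table above.
tags = ["v1.0.0rc6", "v0.21.0", "v0.20.0rc3", "v0.18.2"]
labels = {tag: classify_stability(tag) for tag in tags}
```

A suffix check like this reproduces the labels listed for these tags, but it would mislabel other pre-release schemes (for example `-beta` or `.dev` tags), which is one way a string-based stability label can be inaccurate.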