
NVIDIA/TensorRT-LLM
Releases: 88
Frequency: 1 week 3 days
Last Release
Stars: 13.8K
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
Stability is determined by the version string and may be inaccurate.

| Version | Date | Stability |
|---|---|---|
| v1.3.0rc1 | | RC |
| v1.2.0rc6.post2 | | RC |
| v1.3.0rc0 | | RC |
| v1.2.0rc8 | | RC |
| v1.2.0rc6.post1 | | RC |
| v1.2.0rc2.post1 | | RC |
| v1.2.0rc7 | | RC |
| v1.2.0rc6 | | RC |
| v1.1.0 | | Stable |
| v1.2.0rc5 | | RC |
| v1.2.0rc4 | | RC |
| v1.2.0rc3 | | RC |
| v1.2.0rc2 | | RC |
| v1.2.0rc1 | | RC |
| v1.2.0rc0.post1 | | RC |
| v1.2.0rc0 | | RC |
| v1.0.0 | | Stable |
| v1.1.0rc5 | | RC |
| v1.1.0rc4 | | RC |
| v1.1.0rc2.post2 | | RC |
| v1.1.0rc2.post1 | | RC |
| v1.1.0rc3 | | RC |
| v1.1.0rc2 | | RC |
| v1.1.0rc1 | | RC |
| v1.1.0rc0 | | RC |
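
The stability labels in the table are derived from the version string alone. A minimal sketch of one way such a rule could work is shown here; the function name `classify_stability` and the regular expression are assumptions made for illustration, not the tracker's actual implementation.

```python
import re

# Assumed rule: a version whose numeric release part is immediately
# followed by "rc<N>" is a release candidate; everything else is
# treated as stable. Suffixes such as ".post1" after the rc marker
# do not change the label.
_RC_PATTERN = re.compile(r"^v?\d+(?:\.\d+)*rc\d+", re.IGNORECASE)


def classify_stability(version: str) -> str:
    """Return "RC" for release candidates and "Stable" otherwise."""
    return "RC" if _RC_PATTERN.match(version.strip()) else "Stable"


# Versions taken from the table above.
versions = ["v1.3.0rc1", "v1.2.0rc6.post2", "v1.1.0", "v1.0.0"]
labels = {v: classify_stability(v) for v in versions}
```

Because this kind of rule only looks for a marker in the string, any tag that follows a different naming scheme would be labeled "Stable" by default, which is one reason the labels may be inaccurate.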