Releases: 88
Frequency: 1 week 3 days
Last Release
Stars: 13.8K
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
Version Date Stability
Stability is determined by the version string and may be inaccurate.
v1.0.0rc6 RC
v1.0.0rc5 RC
v0.21.0 Stable
v1.0.0rc4 RC
v1.0.0rc3 RC
v1.0.0rc2 RC
v1.0.0rc1 RC
v1.0.0rc0 RC
v0.20.0 Stable
v0.21.0rc2 RC
v0.21.0rc1 RC
v0.21.0rc0 RC
v0.20.0rc3 RC
v0.19.0 Stable
v0.20.0rc2 RC
v0.20.0rc1 RC
v0.20.0rc0 RC
v0.19.0rc0 RC
v0.18.2 Stable
v0.18.1 Stable
v0.18.0 Stable
v0.17.0 Stable
v0.16.0 Stable
v0.15.0 Stable
v0.14.0 Stable
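
The stability labels in this listing are derived from the version string alone. The rule this page actually uses is not stated, so the following is only a minimal sketch under one assumption: a version ending in `rc` plus a number is a release candidate, and anything else is treated as stable. The function name `classify` and the regular expression are hypothetical and chosen for illustration.

```python
import re

# Assumed rule (not confirmed by the page): a trailing "rc<N>" suffix
# marks a release candidate; every other version string is "Stable".
_RC_SUFFIX = re.compile(r"rc\d+$")


def classify(version: str) -> str:
    """Return "RC" for release candidates and "Stable" otherwise."""
    return "RC" if _RC_SUFFIX.search(version) else "Stable"


if __name__ == "__main__":
    # A few version strings taken from the listing above.
    for version in ("v1.0.0rc6", "v0.21.0", "v0.20.0rc3", "v0.18.2"):
        print(version, classify(version))
```

A string-based rule like this cannot tell whether a release was later withdrawn or superseded, which is one reason the label may be inaccurate.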