Releases: 83
Frequency: 1 week 4 days
Last Release
Stars: 13.6K
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
Version Date Stability
Stability is determined by the version string and may be inaccurate.
v1.3.0rc14 RC
v1.3.0rc13 RC
v1.3.0rc5.post2 RC
v1.3.0rc12 RC
v1.2.1 Stable
v1.3.0rc11 RC
v1.3.0rc10 RC
v1.3.0rc9 RC
v1.3.0rc8 RC
v1.2.0 Stable
v1.3.0rc7 RC
v1.3.0rc5.post1 RC
v1.3.0rc6 RC
latest-ci-stable-commit-main Stable
v1.3.0rc5 RC
v1.3.0rc4 RC
v1.3.0rc3 RC
v1.2.0rc6.post3 RC
v1.2.0rc2.post2 RC
v1.3.0rc2 RC
v1.3.0rc1 RC
v1.2.0rc6.post2 RC
v1.3.0rc0 RC
v1.2.0rc8 RC
v1.2.0rc6.post1 RC
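The stability labels above are derived from the version string alone, which is why a tag such as latest-ci-stable-commit-main is listed as Stable. The page does not state the exact rule, so the following is only a minimal sketch under an assumption: a tag is treated as RC when an "rc" marker followed by digits comes directly after a numeric version component, and as Stable otherwise. The function name classify_stability and the pattern are hypothetical and are not taken from the site or from TensorRT LLM.

```python
import re

# Assumed rule (not confirmed by the source): "rc<digits>" directly after a
# numeric version component marks a release candidate.
_RC_PATTERN = re.compile(r"\d[.\-_]?rc\d+", re.IGNORECASE)


def classify_stability(tag: str) -> str:
    """Return "RC" or "Stable" for a release tag, using the assumed rule."""
    return "RC" if _RC_PATTERN.search(tag) else "Stable"


if __name__ == "__main__":
    # Tags copied from the list above.
    for tag in ("v1.3.0rc14", "v1.3.0rc5.post2", "v1.2.1",
                "latest-ci-stable-commit-main"):
        print(tag, classify_stability(tag))
```

Under this assumed rule, a tag with no recognizable version number falls through to Stable, which matches the caveat that the label may be inaccurate.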