vllm-project/vllm

Releases: 175

Frequency: 6 days 19 hours

Last Release: 2 days ago

Stars: 86.2K

A high-throughput and memory-efficient inference and serving engine for LLMs

