Releases: 89
Frequency: 1 week 5 days
Last Release
A high-throughput and memory-efficient inference and serving engine for LLMs

Linked projects

A high-throughput and memory-efficient inference and serving engine for LLMs