shimmy

Crates.io

github.com

Releases: 22

Frequency: 1 week 6 days

Last Release: about 2 months ago

Downloads: 12.5K

Lightweight Ollama-compatible inference server with native SafeTensors support. No Python dependencies, cross-platform WebGPU acceleration via Airframe.

Linked projects

Michael-A-Kuykendall/shimmy

GitHub

⚡ Pure-Rust WebGPU inference engine — OpenAI-API compatible, GGUF native, runs on any GPU. No Python. No llama.cpp. Single binary.