Michael-A-Kuykendall/shimmy

Releases: 39
Frequency: 1 week 2 hours
Stars: 5.32K
⚡ Pure-Rust WebGPU inference engine — OpenAI-API compatible, GGUF native, runs on any GPU. No Python. No llama.cpp. Single binary.

Linked projects

Lightweight Ollama-compatible inference server with native SafeTensors support. No Python dependencies, cross-platform WebGPU acceleration via Airframe.