Self-hosted AI inference.
CPU-optimized. One price. Forever.

Enclave runs LLMs on your hardware. OpenAI-compatible API, Ollama backend, zero cloud dependencies.

Download v0.1.0 View on GitHub

What it does

Drop-in replacement. Point your existing code at localhost:8000.

GGUF quantized models via Ollama. 7B at 40-50 tok/s, 13B at 25-30 tok/s.

No data leaves your machine. No internet required for inference.

18+ models pre-catalogued. Download, configure, and switch with one command.

YAML-defined step pipelines with role-based model selection.

Desktop wrapper with setup wizard. Linux support via direct install.

Commercial software. Not open source. Source-available with commercial license.

One seat. One-time purchase.
All updates included.
Personal and commercial use.

Download DMG

Volume discount per seat.
Shared infrastructure.
Priority support channel.

Contact for pricing