Self-hosted AI inference.
CPU-optimized. One price. Forever.
Enclave runs LLMs on your hardware. OpenAI-compatible API, Ollama backend, zero cloud dependencies.
What it does
OpenAI-compatible API
Drop-in replacement. Point your existing code at localhost:8000.
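Because the API speaks the OpenAI wire format, any OpenAI client can talk to it; a minimal stdlib-only sketch (the model name `llama3:8b` and the `/v1` path are assumptions, match them to whatever model Enclave has loaded):

```python
import json
import urllib.request

def chat(prompt, base_url="http://localhost:8000/v1", model="llama3:8b"):
    # model is a placeholder; substitute the model your Enclave install serves
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # standard OpenAI-style response: choices[0].message.content
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Code that already uses the official OpenAI SDK should only need its `base_url` pointed at `http://localhost:8000/v1`.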
CPU-optimized
GGUF quantized models via Ollama. 7B at 40-50 tok/s, 13B at 25-30 tok/s.
Zero telemetry
No data leaves your machine. No internet required for inference.
Model registry
18+ models pre-catalogued. Download, configure, and switch with one command.
Multi-agent workflows
YAML-defined step pipelines with role-based model selection.
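The fragment below is purely illustrative, not Enclave's actual schema (every key name here is an assumption); it sketches what a role-based, YAML-defined step pipeline generally looks like:

```yaml
# Hypothetical pipeline sketch -- key names are illustrative, not Enclave's format
name: summarize-and-review
steps:
  - id: draft
    role: writer          # role mapped to a fast 7B model
    model: llama3:8b
    prompt: "Summarize the input document."
  - id: review
    role: critic          # role mapped to a stronger 13B model
    model: llama2:13b
    input: draft          # consumes the previous step's output
    prompt: "Check the summary for factual errors."
```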
Native macOS app
Desktop wrapper with setup wizard. Linux support via direct install.
Get Enclave
Commercial software: source-available under a commercial license, not open source.
Individual
One seat. One-time purchase.
All updates included.
Personal and commercial use.