# Integrations

Olla supports a range of backend (endpoint) and frontend integrations, powered by Olla Profiles.
## Backend Endpoints
Olla natively supports the following backends:
| Backend | Type | Description |
|---|---|---|
| Ollama | ollama | Native support for Ollama, including model unification |
| LM Studio | lm-studio | Native support for LM Studio, including model unification |
| llama.cpp | llamacpp | Native support for llama.cpp lightweight C++ inference server with GGUF models, including slot management, code infill, and CPU-first design for edge deployment |
| vLLM | vllm | Native support for vLLM, including model unification |
| SGLang | sglang | Native support for SGLang with RadixAttention and Frontend Language, including model unification and vision support |
| Lemonade SDK | lemonade | Native support for Lemonade SDK, AMD's local inference solution with Ryzen AI optimisation, including model unification |
| LiteLLM | litellm | Native support for LiteLLM, providing unified gateway to 100+ LLM providers |
| OpenAI Compatible | openai | Generic support for any OpenAI-compatible API |
Use the `type` value shown above in your Endpoint Configuration when adding new endpoints.
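As an illustration, a static endpoint entry might look like the following. The field names and layout here are a sketch, not the authoritative schema — check the Endpoint Configuration reference for the exact structure:

```yaml
# Illustrative sketch of static endpoint entries using the `type` values above.
# Field names are assumptions; consult the Endpoint Configuration reference.
discovery:
  type: static
  static:
    endpoints:
      - name: local-ollama
        url: http://localhost:11434
        type: ollama        # matches the "Type" column in the table above
      - name: workstation-vllm
        url: http://192.168.1.20:8000
        type: vllm
```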
## Frontend Support

### OpenWebUI

Olla provides native support for OpenWebUI.
### Claude-Compatible Clients
Olla provides Anthropic Messages API translation, enabling Claude-compatible clients to work with any OpenAI-compatible backend:
| Client | Description | API Support |
|---|---|---|
| Claude Code | Anthropic's official CLI coding assistant | Full Anthropic API support |
| OpenCode | Open-source AI coding assistant (SST fork) | OpenAI or Anthropic API |
| Crush CLI | Modern terminal AI assistant by Charmbracelet | Dual OpenAI/Anthropic support |
These clients can use local models (Ollama, LM Studio, vLLM, llama.cpp) through Olla's API translation layer.
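For example, Claude Code reads the standard `ANTHROPIC_BASE_URL` environment variable, so it can be pointed at Olla instead of Anthropic's API. The route shown below is an assumption for illustration — substitute the Anthropic-compatible path from your own Olla configuration:

```shell
# Point Claude Code at Olla's Anthropic-compatible endpoint.
# The host, port, and /olla/anthropic path are illustrative assumptions --
# use the route exposed by your Olla instance.
export ANTHROPIC_BASE_URL="http://localhost:40114/olla/anthropic"
claude
```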
## API Translation
Olla can translate between different LLM API formats:
| Translation | Status | Use Case |
|---|---|---|
| Anthropic → OpenAI | ✅ Available | Use Claude Code with local models |
See the API Translation concept for details on how this works.
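To make the idea concrete, the sketch below shows the kind of request mapping an Anthropic → OpenAI translation layer performs: the top-level `system` field becomes the first chat message, and shared parameters carry across. This is a simplified illustration, not Olla's actual implementation.

```python
# Simplified sketch of Anthropic Messages -> OpenAI Chat Completions
# request translation. Field mappings are illustrative assumptions,
# not Olla's actual translation code.

def anthropic_to_openai(req: dict) -> dict:
    """Translate an Anthropic /v1/messages request body into an
    OpenAI /v1/chat/completions request body."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req.get("messages", []))

    out = {
        "model": req["model"],
        "messages": messages,
        # max_tokens is required by Anthropic but optional for OpenAI,
        # so it is always present on the incoming request.
        "max_tokens": req["max_tokens"],
    }
    # Pass through optional sampling parameters when present.
    if "temperature" in req:
        out["temperature"] = req["temperature"]
    return out


request = {
    "model": "llama3.1",
    "system": "You are a coding assistant.",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Write a haiku about Go."}],
}
translated = anthropic_to_openai(request)
print(translated["messages"][0])  # system prompt moved into the message list
```

The reverse direction (translating the OpenAI response back into Anthropic's response shape) follows the same pattern in the other direction.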
## Profiles

Profiles provide an easy way to customise the behaviour of supported integrations without writing and compiling Go code:

- Customise existing behaviours
    - Remove prefixes you don't use
    - Add prefixes you would like to use instead
- Extend existing functionality
    - Proxy through paths that are not supported by default
    - Change the model capability detection patterns

You can also create a custom profile to add new capabilities or backend support until native support is added.
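A custom profile might look roughly like the following. Every field name here is hypothetical — this only conveys the shape of the idea; consult the Profiles reference for the real schema:

```yaml
# Hypothetical custom profile sketch. All field names are illustrative
# assumptions -- see the Profiles reference for the actual schema.
name: my-backend
routing:
  prefixes:
    - my-backend          # route /olla/my-backend/... to this profile
api:
  paths:                  # backend paths allowed through the proxy
    - /v1/models
    - /v1/chat/completions
```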