Circuit Breaker¶
The circuit breaker pattern protects your LLM infrastructure by automatically isolating failing endpoints and allowing them to recover.
Default Configuration
Key Settings:# Circuit breaker settings (hardcoded for reliability) # Failure threshold: 3 consecutive failures # Timeout: 30 seconds # Half-open test limit: 1 request
- Opens after 3 consecutive failures
- Waits 30 seconds before testing recovery
- Tests with 1 request in half-open state
Note: These values are currently hardcoded and not configurable via YAML or environment variables.
How It Works¶
Olla implements a circuit breaker for each endpoint to prevent cascading failures:
stateDiagram-v2
[*] --> Closed
Closed --> Open: Failure threshold reached
Open --> HalfOpen: Timeout expires
HalfOpen --> Closed: Success
HalfOpen --> Open: Failure
Closed --> Closed: Success
States¶
Closed (Normal Operation)
- Requests flow normally to the endpoint
- Failures are tracked
- Transitions to Open when failure threshold exceeded
Open (Circuit Tripped)
- All requests immediately fail without contacting endpoint
- Allows the endpoint time to recover
- Automatically transitions to Half-Open after timeout
Half-Open (Testing Recovery)
- Limited requests allowed through to test recovery
- Success transitions back to Closed
- Failure returns to Open state
Implementation Details¶
The circuit breaker is implemented in internal/adapter/health/circuit_breaker.go
:
type CircuitBreaker struct {
state State
failureCount int64
successCount int64
lastFailureTime time.Time
halfOpenTests int64
}
Configuration¶
Circuit breaker parameters are currently hardcoded in the implementation:
Parameter | Value | Description |
---|---|---|
Failure Threshold | 3 consecutive failures | Trips the circuit |
Timeout | 30 seconds | Time before testing recovery |
Half-Open Tests | 3 successful requests | Required to close circuit |
Failure Detection¶
The circuit breaker tracks these failure conditions:
- Connection timeouts
- HTTP 5xx responses
- Request timeouts
- Connection refused errors
- DNS resolution failures
Recovery Process¶
- After 30 seconds in Open state, transitions to Half-Open
- Next request is allowed through as a test
- If successful, allows more test requests
- After 3 successful requests, circuit closes
- Any failure returns to Open state
Integration with Health Checking¶
The circuit breaker works alongside health checking:
graph LR
HC[Health Checker] --> CB[Circuit Breaker]
CB --> EP[Endpoint]
CB --> Status[Update Status]
Status --> LB[Load Balancer]
- Health checks respect circuit breaker state
- Open circuits mark endpoints as unhealthy
- Load balancer avoids unhealthy endpoints
- Closed circuits allow normal health checking
Observability¶
Metrics¶
Circuit breaker state is tracked internally and affects:
- Endpoint health status in
/internal/status
- Request routing decisions
- Structured logging
Status Endpoint¶
The /internal/status
endpoint shows endpoint health which reflects circuit breaker state:
{
"endpoints": [
{
"name": "local-ollama",
"status": "unhealthy",
"issues": "circuit breaker open"
}
]
}
Best Practices¶
Tuning for Your Environment¶
While parameters are hardcoded, you can influence behaviour through:
- Health Check Intervals: More frequent checks detect issues faster
- Request Timeouts: Shorter timeouts trigger circuit breaker sooner
- Endpoint Priorities: Route away from flaky endpoints
Monitoring¶
Watch for these patterns:
- Frequent circuit trips indicate endpoint instability
- Long recovery times suggest capacity issues
- Cascading trips may indicate broader problems
Testing Circuit Breakers¶
Test your circuit breaker behaviour:
# Simulate endpoint failure
docker stop ollama-instance
# Watch circuit breaker activate
curl http://localhost:40114/internal/status
# Restart endpoint
docker start ollama-instance
# Monitor recovery
watch -n 1 curl http://localhost:40114/internal/status
Limitations¶
Current implementation limitations:
- Parameters not configurable via YAML
- No per-endpoint customisation
- No circuit breaker events or webhooks
- Half-open state testing is conservative
Future Enhancements¶
Planned improvements:
- Configurable thresholds and timeouts
- Per-endpoint circuit breaker settings
- Circuit breaker events for monitoring
- Adaptive thresholds based on load
- Manual circuit breaker control API
Related Documentation¶
- Health Checking - Endpoint monitoring
- Load Balancing - Request distribution
- Architecture Overview - System design
Technical Reference¶
For implementation details, see:
internal/adapter/health/circuit_breaker.go
- Core implementationinternal/adapter/health/checker.go
- Integration with health checkinginternal/adapter/balancer/
- Load balancer integration