LLM config (LiteLLM)

KubeClaw deploys a LiteLLM proxy by default and injects OPENAI_API_BASE into the Gateway container so all LLM SDK calls route through the proxy. This gives you per-agent virtual keys, budget caps, model fallback routing, and semantic caching behind a single endpoint.
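As a rough illustration of the injection described above, the Gateway container's environment would contain an entry along these lines. The Service hostname kubeclaw-litellm is a hypothetical placeholder, not taken from the chart. Port 4000 is LiteLLM's default proxy port, and the chart may expose a different one.

```yaml
# Sketch of the env entry injected into the Gateway container.
# "kubeclaw-litellm" is a hypothetical Service name; check the rendered
# manifests for the actual value.
env:
  - name: OPENAI_API_BASE
    value: "http://kubeclaw-litellm:4000"
```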


Core settings

Key                           Default                  Description
litellm.enabled               true                     Deploy LiteLLM proxy
litellm.image.tag             main-v1.61.0             LiteLLM container image tag
litellm.masterkey             ""                       Required. Must start with sk-
litellm.masterkeySecretName   ""                       Reference an existing Secret instead
litellm.masterkeySecretKey    masterkey                Key within the referenced Secret
litellm.replicaCount          1                        Number of proxy replicas
litellm.environmentSecrets    [kubeclaw-litellm-env]   Secrets mounted as env vars on the proxy pod
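
To keep the master key out of values.yaml, the two Secret-related keys above can be combined as in this sketch. The Secret name litellm-master is a hypothetical example and is assumed to already exist in the release namespace.

```yaml
litellm:
  enabled: true
  # Leave masterkey unset and point at a pre-created Secret instead.
  masterkeySecretName: "litellm-master"   # hypothetical Secret name
  masterkeySecretKey: "masterkey"         # chart default key name
  replicaCount: 2
```

The value stored in that Secret presumably has to follow the same sk- prefix rule as litellm.masterkey.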

Proxy config

The litellm.proxy_config value maps directly to LiteLLM's config.yaml. This is where you define models, routing, and general settings.

Default proxy_config:

```yaml
litellm:
  proxy_config:
    model_list:
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
    litellm_settings:
      drop_params: true
    router_settings:
      routing_strategy: "simple-shuffle"
      num_retries: 2
      timeout: 120
    general_settings:
      master_key: "os.environ/PROXY_MASTER_KEY"
```

Adding models

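Add more models by extending model_list. Each api_key uses the os.environ/<NAME> form, so the proxy reads the key from an environment variable at runtime. Provider API keys set under secret.data reach the proxy pod through the Secrets listed in litellm.environmentSecrets.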

```yaml
litellm:
  proxy_config:
    model_list:
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
      - model_name: "claude-sonnet"
        litellm_params:
          model: "anthropic/claude-sonnet-4-20250514"
          api_key: "os.environ/ANTHROPIC_API_KEY"
```

Then add the key to your secret:

```yaml
secret:
  data:
    ANTHROPIC_API_KEY: "sk-ant-..."
```
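
If provider keys are managed outside the chart, one option is to list an additional Secret under litellm.environmentSecrets. This is only a sketch: provider-keys is a hypothetical Secret name, and the example assumes that Secret already exists and contains an ANTHROPIC_API_KEY entry.

```yaml
litellm:
  environmentSecrets:
    - kubeclaw-litellm-env   # chart default, restated on purpose
    - provider-keys          # hypothetical, externally managed Secret
```

Helm replaces list values rather than merging them, so the default entry has to be repeated when the list is overridden.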

Routing strategies

LiteLLM supports several routing strategies via router_settings.routing_strategy:

  • simple-shuffle (default): random selection across healthy endpoints
  • least-busy: routes to the model instance with the fewest in-flight requests
  • latency-based-routing: picks the fastest responding endpoint
  • cost-based-routing: picks the cheapest available model
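
A routing strategy only has an effect once several deployments share a model_name, because the router then has more than one endpoint to choose from. The sketch below assumes a second OpenAI key exposed as OPENAI_API_KEY_2 (a hypothetical variable name) and switches to latency-based routing.

```yaml
litellm:
  proxy_config:
    model_list:
      # Two deployments behind the same public name "gpt-4o".
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY_2"   # hypothetical second key
    router_settings:
      routing_strategy: "latency-based-routing"
```

Because Helm deep-merges maps, num_retries and timeout keep their default values here. model_list is a list, so it replaces the default list wholesale and must name every model you want to serve.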

Subcharts

Key                            Default   Description
litellm.db.deployStandalone    false     Deploy PostgreSQL for virtual keys / budget tracking
litellm.redis.enabled          true      Deploy Redis for semantic caching
litellm.migrationJob.enabled   false     Database migration job (requires PostgreSQL)

Enable PostgreSQL when you need virtual keys, spend tracking, or team-based access controls. Redis is enabled by default for response caching.
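
A minimal values sketch for turning on the database-backed features, using only the keys listed in the table above:

```yaml
litellm:
  db:
    deployStandalone: true   # deploy the bundled PostgreSQL
  migrationJob:
    enabled: true            # migration job requires PostgreSQL
  redis:
    enabled: true            # already the default; shown for completeness
```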