LLM Gateway

One API, any provider. Use the OpenAI SDK you already know — Backbone routes to the right backend.

Overview

The LLM Gateway exposes OpenAI-compatible endpoints (/v1/chat/completions, /v1/audio/transcriptions, /v1/models) and routes requests to whichever provider you've configured. Swap between OpenAI, Anthropic, Azure, Vertex AI, or Ollama without touching your code.

Why use it?

  • No vendor lock-in — switch providers by changing the model string, not your codebase
  • Standard API — any OpenAI-compatible SDK, tool, or framework just works
  • Streaming — full SSE support for real-time responses
  • Centralized credentials — manage API keys and routing per-organization

Models

Platform Models

Backbone comes with pre-configured models available on every tier. Use them by name — no provider prefix required.

Bring Your Own Key (BYOK)

On the Scale tier and above, you can connect your own provider accounts and use the provider/model format:

openai/gpt-4o
anthropic/claude-sonnet-4-5-20250929
azure-openai/gpt-4
vertex-ai/gemini-pro
ollama/llama3

The prefix tells the gateway where to route. The model name after the slash is passed directly to the provider. BYOK usage is billed through your own provider account and bypasses platform token limits.
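The routing rule above can be sketched in a few lines. This is an illustration of the provider/model convention, not gateway internals; the route helper and the "platform" label for un-prefixed models are hypothetical names for this sketch.

```python
def route(model: str) -> tuple[str, str]:
    """Split a model string into (provider, model name).

    Strings without a slash are platform models ("platform" is just
    this sketch's label for them); "provider/model" strings route to
    your own BYOK account, and everything after the first slash is
    passed to the provider verbatim.
    """
    if "/" in model:
        provider, name = model.split("/", 1)
        return provider, name
    return "platform", model

# Platform model: no prefix, served by Backbone's built-in credentials.
assert route("gpt-4o") == ("platform", "gpt-4o")
# BYOK: the prefix picks the backend; the rest is the provider's model name.
assert route("anthropic/claude-sonnet-4-5-20250929") == (
    "anthropic", "claude-sonnet-4-5-20250929"
)
```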

Chat Completions

POST /api/v1/chat/completions

The main endpoint. Drop-in replacement for https://api.openai.com/v1/chat/completions.

Request

curl -X POST https://backbone.manfred-kunze.dev/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_your_api_key" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is Spring Boot?"}
    ]
  }'

Request Parameters

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Platform model name (e.g., gpt-4o) or provider/model for BYOK |
| messages | array | Yes | Chat messages (role + content) |
| stream | boolean | No | Enable SSE streaming (default: false) |
| temperature | number | No | Sampling temperature (0-2) |
| max_tokens | number | No | Max tokens in response |
| top_p | number | No | Nucleus sampling (0-1) |
| frequency_penalty | number | No | Frequency penalty (-2 to 2) |
| presence_penalty | number | No | Presence penalty (-2 to 2) |
| stop | array | No | Stop sequences |
| tools | array | No | Tool definitions for function calling |
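Putting the table together, a request body is just the required fields plus any optional parameters you need. A minimal sketch of assembling one in Python, with a tools entry in the OpenAI function-calling schema (the build_chat_request helper and the get_weather tool are illustrative, not part of the gateway):

```python
import json

def build_chat_request(model, messages, **options):
    """Assemble a chat completions request body; optional parameters
    from the table above are passed as keyword arguments."""
    return {"model": model, "messages": messages, **options}

# A request combining sampling controls with a tool definition.
payload = build_chat_request(
    "gpt-4o",
    [{"role": "user", "content": "What's the weather in Berlin?"}],
    temperature=0.2,
    max_tokens=200,
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)

# Serialize and send as the POST body, as in the curl example above.
body = json.dumps(payload)
```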

Response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Spring Boot is a Java-based framework..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
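Reading the reply works exactly as with OpenAI: the text lives in choices[0].message.content and token counts in usage. Using the response body above as a literal dict (i.e., what response.json() would give you):

```python
# The example response from above, parsed into a Python dict.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Spring Boot is a Java-based framework...",
        },
        "finish_reason": "stop",
    }],
    "usage": {
        "prompt_tokens": 25,
        "completion_tokens": 150,
        "total_tokens": 175,
    },
}

answer = response["choices"][0]["message"]["content"]
used = response["usage"]["total_tokens"]
print(f"{answer!r} ({used} tokens)")
```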

Streaming

Set "stream": true and you'll get Server-Sent Events:

Request

from openai import OpenAI

client = OpenAI(
    api_key="sk_your_api_key",
    base_url="https://backbone.manfred-kunze.dev/api/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about code"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
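If you're not using an SDK, the raw stream is ordinary SSE: each event is a `data:` line carrying a JSON chunk whose delta holds an incremental piece of the message, and the stream terminates with `data: [DONE]` (the standard OpenAI streaming format). A minimal parser sketch; the sample lines are illustrative, not captured gateway output:

```python
import json

def collect_stream(lines):
    """Reassemble the assistant's text from raw SSE 'data:' lines."""
    text = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            text.append(delta["content"])
    return "".join(text)

# Illustrative sample of the wire format:
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}',
    'data: {"choices":[{"delta":{" content" if 0 else "content":" world"},"index":0}]}'.replace('" content" if 0 else "content"', '"content"'),
    "data: [DONE]",
]
print(collect_stream(sample))
```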

List Models

GET /api/v1/models

Lists all models available to your organization — both platform models and BYOK providers you've configured:

curl https://backbone.manfred-kunze.dev/api/v1/models \
  -H "Authorization: Bearer sk_your_api_key"

Framework Integrations

The gateway works with anything that speaks OpenAI — just point base_url at your Backbone instance.

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    openai_api_key="sk_your_api_key",
    openai_api_base="https://backbone.manfred-kunze.dev/api/v1",
    model_name="gpt-4o"
)

response = llm.invoke("What is the capital of France?")
print(response.content)

Supported BYOK Providers

Connect your own accounts from any of these providers (Scale tier and above):

| Provider | Prefix | Example Models |
|---|---|---|
| OpenAI | openai | gpt-4o, gpt-4o-mini, o1 |
| Azure OpenAI | azure-openai | Your Azure deployment names |
| Anthropic | anthropic | claude-sonnet-4-5-20250929, claude-3-haiku |
| Google Vertex AI | vertex-ai | gemini-pro, gemini-ultra |
| Ollama | ollama | llama3, mistral, codellama |

Configure providers in the AI Providers section of the sidebar.
