AI Gateway Docs

Build with Confidence.

The AI Gateway provides a production-ready, OpenAI-compatible API layer for your local Ollama models. Scale your AI applications with security, rate limiting, and detailed usage tracking.

Authentication

All requests to the AI Gateway must include your API key in the Authorization header as a Bearer token.

Authorization: Bearer sk_your_api_key_here
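The same header can be built programmatically. A minimal Python sketch, reading the key from the AI_GATEWAY_KEY environment variable used in the curl examples below (the helper name is illustrative, not part of the gateway):

```python
import os

def auth_headers(api_key=None):
    """Build the headers every gateway request needs.

    Reads the key from the AI_GATEWAY_KEY environment variable
    unless one is passed in explicitly.
    """
    key = api_key or os.environ["AI_GATEWAY_KEY"]
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```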

Endpoints

POST /v1/chat/completions

Generate chat completions using supported models.

curl http://localhost:3000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_GATEWAY_KEY" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
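With "stream": true, an OpenAI-compatible endpoint typically responds with server-sent events: one "data: {json}" line per token chunk, ending with "data: [DONE]". Assuming the gateway follows that standard format, a sketch of extracting the assistant text from a streamed body:

```python
import json

def extract_stream_text(sse_body: str) -> str:
    """Collect assistant text from an OpenAI-style SSE stream body.

    Each event line looks like `data: {...}`; the stream ends with
    `data: [DONE]`. Token text lives at choices[0].delta.content.
    """
    text = []
    for line in sse_body.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)
```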

GET /v1/models

List all models currently available in the Ollama instance.

curl http://localhost:3000/api/v1/models \
  -H "Authorization: Bearer $AI_GATEWAY_KEY"
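Assuming the gateway returns the standard OpenAI list shape ({"object": "list", "data": [{"id": ...}, ...]}), the available model identifiers can be pulled out of the parsed response like this (helper name illustrative):

```python
def model_ids(models_response: dict) -> list[str]:
    """Extract model identifiers from a /v1/models response.

    Assumes the OpenAI list shape:
    {"object": "list", "data": [{"id": "llama3", ...}, ...]}
    """
    return [m["id"] for m in models_response.get("data", [])]
```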

SDK Integration

JavaScript (OpenAI SDK)

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'YOUR_GATEWAY_KEY',
  baseURL: 'http://localhost:3000/api/v1',
});

const response = await openai.chat.completions.create({
  model: 'llama3',
  messages: [{ role: 'user', content: 'Hi!' }],
});

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GATEWAY_KEY",
    base_url="http://localhost:3000/api/v1"
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Hi!"}]
)

IDE Integrations

Continue.dev

Configure Continue.dev to use the AI Gateway by adding the following to your config.json:

{
  "models": [
    {
      "title": "Ollama Gateway",
      "model": "llama3",
      "apiBase": "http://localhost:3000/api/v1",
      "apiKey": "YOUR_GATEWAY_KEY",
      "provider": "openai"
    }
  ]
}

VS Code Extensions

For any extension supporting OpenAI-compatible endpoints (e.g., Roo Code, Cline), use these settings:

Base URL: http://localhost:3000/api/v1
API Key:  YOUR_GATEWAY_KEY
Model ID: llama3 (or other)