
LLM Gateway

The LLM Gateway is an internal tool for managing LLM prompts and testing them across multiple providers. It runs on Cloudflare Workers and routes all model calls through Cloudflare AI Gateway.

  • Prompt management — create, version, and organize prompts with template variables
  • Multi-model testing — run the same prompt against multiple providers side-by-side
  • Production invocation — stable API endpoint for calling prompts from application code
  • AI-powered comparison — use an LLM to analyze differences between provider responses
  • Automatic versioning — every content change creates a snapshot you can revert to
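Template-variable substitution for prompt management could be sketched as below. This is a hypothetical illustration: the `{{name}}` delimiter syntax and the `renderPrompt` helper are assumptions, not the tool's documented behavior.

```typescript
// Hypothetical sketch of prompt template rendering; the {{var}}
// delimiter syntax is an assumption, not the gateway's actual format.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    if (!(name in vars)) {
      // Fail loudly rather than sending a half-filled prompt to a provider.
      throw new Error(`Missing template variable: ${name}`);
    }
    return vars[name];
  });
}

// Example: fill a prompt before sending it to a provider.
const prompt = renderPrompt("Summarize {{doc}} in {{lang}}.", {
  doc: "the Q3 report",
  lang: "French",
});
console.log(prompt); // "Summarize the Q3 report in French."
```

Throwing on a missing variable keeps a typo in application code from silently producing a malformed prompt.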
Provider           Models
OpenAI             gpt-4.1, gpt-4o, gpt-4o-mini, o3, o4-mini, o3-mini
Anthropic          claude-sonnet-4-6, claude-opus-4-6, claude-sonnet-4-5, claude-haiku-4-5
Google AI Studio   gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
Workers AI         llama-3.3-70b, llama-4-scout, qwen3-30b, deepseek-r1, and more
Client App
  → POST /api/v1/prompts/{slug}/invoke
  → Cloudflare Worker
  → AI Gateway (BYOK)
  → Provider API (OpenAI, Anthropic, Google, etc.)
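A call from application code into this flow might be assembled as in the sketch below. The JSON body shape (a `variables` object) is an assumption about the API, not its documented schema.

```typescript
// Hypothetical request builder for the production invoke endpoint.
// The body shape ({ variables }) is an assumption, not a documented schema.
interface InvokeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildInvokeRequest(
  baseUrl: string,
  slug: string,
  variables: Record<string, string>,
): InvokeRequest {
  return {
    url: `${baseUrl}/api/v1/prompts/${encodeURIComponent(slug)}/invoke`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ variables }),
  };
}

// Usage (base URL is a placeholder):
// const req = buildInvokeRequest("https://llm-gateway.internal", "welcome-email", { user: "Ada" });
// const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```

Encoding the slug keeps prompts with unusual names from breaking the URL path.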

All provider API keys are managed through Cloudflare AI Gateway's BYOK (Bring Your Own Key) feature, so the Worker itself holds no provider credentials. The Worker authenticates to the gateway with the CF_AIG_TOKEN secret.
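Inside the Worker, the hop to AI Gateway might attach that token as sketched below. The URL shape and the `cf-aig-authorization` header follow Cloudflare's authenticated-gateway pattern, but the account and gateway path segments here are placeholders, and this is an illustration rather than this project's actual code.

```typescript
// Sketch: constructing an authenticated AI Gateway request.
// With BYOK, the provider key lives in the gateway, so only the
// gateway token (CF_AIG_TOKEN) is sent from the Worker.
function gatewayUrl(accountId: string, gatewayName: string, provider: string): string {
  // e.g. https://gateway.ai.cloudflare.com/v1/<account>/<gateway>/openai
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayName}/${provider}`;
}

function gatewayHeaders(cfAigToken: string): Record<string, string> {
  return {
    "Content-Type": "application/json",
    "cf-aig-authorization": `Bearer ${cfAigToken}`,
  };
}

// Usage inside a Worker handler (ACCOUNT_ID / GATEWAY_NAME are placeholders):
// const res = await fetch(gatewayUrl(env.ACCOUNT_ID, env.GATEWAY_NAME, "anthropic"),
//   { method: "POST", headers: gatewayHeaders(env.CF_AIG_TOKEN), body });
```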