ModelSwitch gives you access to 300+ models from every major provider through a single OpenAI-compatible API. The only thing that changes between calls is the model parameter.

Providers

| Provider | Example models | Strengths |
|---|---|---|
| OpenAI | gpt-4o, o3, gpt-4.1 | General-purpose, vision, reasoning |
| Anthropic | claude-sonnet-4, claude-opus-4 | Coding, analysis, long context |
| Google | gemini-2.5-pro, gemini-2.0-flash | Multimodal, huge context windows |
| Meta | llama-4-maverick, llama-3.1-405b | Open weights, cost-effective |
| Mistral | mistral-large, codestral | European languages, code |
| DeepSeek | deepseek-v3, deepseek-r1 | Reasoning, very low cost |

GPT-4o

gpt-4o — Flagship multimodal model. Text, images, and audio. $2.50 input / $10.00 output per 1M tokens.

GPT-4.1

gpt-4.1 — Latest GPT-4 generation with 1M context and improved coding. $2.00 input / $8.00 output per 1M tokens.

o3

o3 — Most powerful OpenAI reasoning model for complex scientific and coding tasks. $10.00 input / $40.00 output per 1M tokens.

o4-mini

o4-mini — Compact reasoning model with vision and tool use. $1.10 input / $4.40 output per 1M tokens.

Claude Sonnet 4

claude-sonnet-4 — Top-tier coding, analysis, and reasoning. 200K context. $3.00 input / $15.00 output per 1M tokens.

Claude Opus 4

claude-opus-4 — Most powerful Claude model for sustained, complex tasks. $15.00 input / $75.00 output per 1M tokens.

Claude 3.5 Sonnet

claude-3.5-sonnet — Strong coding and analysis performance. 200K context. $3.00 input / $15.00 output per 1M tokens.

Gemini 2.5 Pro

gemini-2.5-pro — Google’s advanced thinking model. 1M token context with vision and video. $1.25 input / $10.00 output per 1M tokens.

Gemini 2.0 Flash

gemini-2.0-flash — Fast and affordable. 1M context, multimodal. $0.075 input / $0.30 output per 1M tokens.

Llama 4 Maverick

llama-4-maverick — Large MoE model from Meta competing with flagship models. 1M context. $0.50 input / $2.00 output per 1M tokens.

Llama 3.1 70B

llama-3.1-70b — Balanced open-weights model. Excellent cost-to-quality ratio. $0.60 input / $0.60 output per 1M tokens.

Mistral Large

mistral-large — Mistral’s flagship. Strong European language support. $2.00 input / $6.00 output per 1M tokens.

DeepSeek-R1

deepseek-r1 — Chain-of-thought reasoning model competitive with o1 at a fraction of the cost. $0.55 input / $2.19 output per 1M tokens.

Pricing

Prices are per 1M tokens. Input (prompt) and output (completion) tokens are billed separately at different rates.
Cost = (prompt_tokens / 1,000,000) × prompt_price
     + (completion_tokens / 1,000,000) × completion_price
ModelSwitch adds a small markup over provider prices, and the model list is updated continuously as new models become available. Call GET /v1/models at any time to get live prices and the latest additions.
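The cost formula above can be sketched in Python. The prices here are the gpt-4o list prices from this page; in practice you would look up your model's live rate via GET /v1/models:

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 prompt_price: float, completion_price: float) -> float:
    """Cost in USD for one request, with prices quoted per 1M tokens."""
    return ((prompt_tokens / 1_000_000) * prompt_price
            + (completion_tokens / 1_000_000) * completion_price)

# A gpt-4o call with 1,200 prompt tokens and 300 completion tokens
# at $2.50 input / $10.00 output per 1M tokens:
cost = request_cost(1200, 300, 2.50, 10.00)
print(f"${cost:.6f}")  # $0.006000
```

Note that input and output rates can differ by 4x or more, so completion-heavy workloads are dominated by the output price.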

Listing models via API

Fetch the full model list — including live pricing — with a single request. No authentication required.
curl https://modelswitch.io/v1/models
Response:
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "owned_by": "openai",
      "pricing": {
        "prompt": "0.0000025",
        "completion": "0.00001"
      }
    }
  ]
}
The pricing fields are in USD per token; multiply by 1,000,000 to get the per-1M-token rate. Where present, pricing_rub fields give the equivalent cost in Russian rubles per 100K tokens.
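A small helper for that conversion, assuming a parsed /v1/models response shaped like the example above (the entry below is a hard-coded sample using gpt-4o's list price, not a live fetch):

```python
def per_million(models_response: dict) -> dict:
    """Map model id -> (prompt, completion) price in USD per 1M tokens."""
    return {
        m["id"]: (round(float(m["pricing"]["prompt"]) * 1_000_000, 6),
                  round(float(m["pricing"]["completion"]) * 1_000_000, 6))
        for m in models_response["data"]
    }

# Sample entry: per-token strings as returned by the API
sample = {"data": [{"id": "gpt-4o",
                    "pricing": {"prompt": "0.0000025",
                                "completion": "0.00001"}}]}
print(per_million(sample))  # {'gpt-4o': (2.5, 10.0)}
```

The prices arrive as strings, so parse them with float (or Decimal if you are accumulating billing totals) before scaling.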

Switching models

To switch models, change only the model field in your request. Everything else — endpoint, headers, request body structure — stays the same.
from openai import OpenAI

client = OpenAI(
    base_url="https://modelswitch.io/v1",
    api_key="ms-...",
)

# Switch between any model by changing this one value
response = client.chat.completions.create(
    model="claude-sonnet-4",  # or "gpt-4o", "gemini-2.5-pro", etc.
    messages=[{"role": "user", "content": "Hello"}],
)
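The same point at the raw-HTTP level: a sketch using only the standard library, showing that two requests to different providers differ in nothing but the model field. The endpoint path and header names follow the standard OpenAI-compatible convention; building a Request object does not send anything over the network:

```python
import json
from urllib import request

API_URL = "https://modelswitch.io/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> request.Request:
    """Identical endpoint, headers, and body shape for every model."""
    body = {"model": model,
            "messages": [{"role": "user", "content": prompt}]}
    return request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

# Two providers, one request shape — only the model value changes:
a = build_request("gpt-4o", "Hello", "ms-...")
b = build_request("claude-sonnet-4", "Hello", "ms-...")
```

To send one, pass it to urllib.request.urlopen (or use the OpenAI client shown above, which does the same thing under the hood).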