ModelSwitch uses standard HTTP status codes. All error responses include a JSON body with an error object.

Error response format

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
Field     Description
message   Human-readable description of the error
type      Error category (e.g., "invalid_request_error", "authentication_error")
code      Machine-readable error code
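
Because every error body follows this shape, a client can pull the fields out in a few lines. A minimal sketch (not part of any SDK; the helper name is ours):

```python
import json

# Sample body matching the format above.
body = '''{"error": {"message": "Invalid API key",
           "type": "invalid_request_error", "code": "invalid_api_key"}}'''

def parse_error(response_text):
    """Return the error object from an error response, or {} if the body isn't one."""
    try:
        return json.loads(response_text).get("error", {})
    except json.JSONDecodeError:
        return {}

err = parse_error(body)
print(err["code"])     # invalid_api_key
print(err["message"])  # Invalid API key
```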

Status codes

Code  Status                  Description
400   Bad Request             Invalid request body or missing required parameters
401   Unauthorized            Missing or invalid API key
403   Forbidden               Key deactivated, insufficient balance, or access denied
404   Not Found               Model not found or resource does not exist
429   Too Many Requests       Rate limit exceeded; retry after a delay
500   Internal Server Error   Server error; retry the request
502   Bad Gateway             Provider error; try a different model or retry
503   Service Unavailable     Temporary outage; usually resolves in minutes

Handling specific errors

401 Unauthorized

Your API key is missing, malformed, or invalid.
  • Confirm the Authorization header is present and formatted as Bearer ms-YOUR_KEY
  • Check that your key starts with ms- for inference requests, or mk- for account management endpoints
  • Verify the key hasn’t been deleted in Dashboard → API Keys
  • Keys are case-sensitive — copy the key exactly as shown at creation
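The checks above can be run locally before sending a request. A small pre-flight sketch (the helper names and the "inference"/"account" labels are ours, not part of the API):

```python
# Hypothetical pre-flight checks for a ModelSwitch key.
def auth_header(key):
    """Format the Authorization header as the API expects: Bearer <key>."""
    return {"Authorization": f"Bearer {key}"}

def key_matches_endpoint(key, endpoint):
    """Inference endpoints expect an ms- key; account management expects mk-."""
    expected = "ms-" if endpoint == "inference" else "mk-"
    return key.startswith(expected)

print(auth_header("ms-YOUR_KEY"))                   # {'Authorization': 'Bearer ms-YOUR_KEY'}
print(key_matches_endpoint("mk-abc", "inference"))  # False (wrong key type)
```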
403 Forbidden

Access is denied despite a valid key.
  • Insufficient balance: Check your balance with GET /api/balance. If it’s at or near 0, top up your account.
  • Deactivated key: Verify the key is active via GET /api/keys.
  • Wrong key type: Ensure you’re using a proxy key (ms-) for inference endpoints and a billing key (mk-) for account management endpoints.
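The balance check can be automated. A standard-library sketch of calling GET /api/balance — the response field name "balance" is our assumption, so adjust it to the actual payload:

```python
import json
import urllib.request

BASE_URL = "https://modelswitch.io"

def build_balance_request(billing_key):
    """Build the GET /api/balance request; billing endpoints take the mk- key."""
    return urllib.request.Request(
        f"{BASE_URL}/api/balance",
        headers={"Authorization": f"Bearer {billing_key}"},
    )

def get_balance(billing_key):
    """Fetch the current balance. The 'balance' field name is an assumption."""
    with urllib.request.urlopen(build_balance_request(billing_key)) as resp:
        return json.load(resp)["balance"]
```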
429 Too Many Requests

You’ve exceeded the request rate limit. Implement exponential backoff: wait before retrying, and double the wait time on each subsequent failure, starting with 1 second.
import time
import openai

client = openai.OpenAI(
    api_key="ms-YOUR_KEY",
    base_url="https://modelswitch.io/v1"
)

def call_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}]
            )
        except openai.RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # exponential backoff
            else:
                raise
502 Bad Gateway

The upstream model provider returned an error.
  • This is usually transient — retry the request after a short delay
  • If the error persists for a specific model, try a different model with similar capabilities
  • Check the error.message field for details from the provider
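The fallback advice can be sketched as a small wrapper. ProviderError here is a stand-in for the SDK's exception; with the OpenAI Python SDK you would catch openai.APIStatusError and check for status_code == 502 instead:

```python
class ProviderError(Exception):
    """Stand-in for a 502 Bad Gateway raised by the client library."""

def complete_with_fallback(create_fn, models, messages):
    """Try each model in order, falling back only on provider (502) errors."""
    last_error = None
    for model in models:
        try:
            return create_fn(model=model, messages=messages)
        except ProviderError as exc:
            last_error = exc  # this provider failed; try the next model
    raise last_error

# Usage sketch with a fake create_fn: the first model fails, the second succeeds.
def fake_create(model, messages):
    if model == "model-a":
        raise ProviderError("upstream error")
    return f"response from {model}"

print(complete_with_fallback(fake_create, ["model-a", "model-b"], []))
# response from model-b
```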
500 and 503 server errors

A server-side error occurred.
  • 500: Retry the request. If it fails consistently, contact support.
  • 503: A temporary outage is in progress. Retry after a short wait — these usually resolve within minutes.
For both, use exponential backoff (same pattern as 429 above).
The OpenAI Python and TypeScript SDKs include built-in retry logic with exponential backoff. Configure max_retries when initializing the client to handle transient errors automatically.