ModelSwitch uses standard HTTP status codes. All error responses include a JSON body with an error object.

Error response format

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
Field     Description
message   Human-readable description of the error
type      Error category (e.g., "invalid_request_error", "authentication_error")
code      Machine-readable error code
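
Because every error body follows this shape, a client can pull the fields out in a few lines. A minimal sketch (not part of any SDK; the helper name is ours):

```python
import json

# Sample body matching the format above.
body = '''{"error": {"message": "Invalid API key",
           "type": "invalid_request_error", "code": "invalid_api_key"}}'''

def parse_error(response_text):
    """Return the error object from an error response, or {} if the body isn't one."""
    try:
        return json.loads(response_text).get("error", {})
    except json.JSONDecodeError:
        return {}

err = parse_error(body)
print(err["code"])     # invalid_api_key
print(err["message"])  # Invalid API key
```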

Status codes

Code  Status                  Description
400   Bad Request             Invalid request body or missing required parameters
401   Unauthorized            Missing or invalid API key
403   Forbidden               Key deactivated, insufficient balance, or access denied
404   Not Found               Model not found or resource does not exist
429   Too Many Requests       Rate limit exceeded; retry after a delay
500   Internal Server Error   Server error; retry the request
502   Bad Gateway             Provider error; try a different model or retry
503   Service Unavailable     Temporary outage; usually resolves in minutes

Handling specific errors

401 Unauthorized

Your API key is missing, malformed, or invalid.
  • Confirm the Authorization header is present and formatted as Bearer ms-YOUR_KEY
  • Check that your key starts with ms- for inference requests, or mk- for account management endpoints
  • Verify the key hasn’t been deleted in Dashboard → API Keys
  • Keys are case-sensitive — copy the key exactly as shown at creation
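The checks above can be run locally before sending a request. A small pre-flight sketch (the helper names and the "inference"/"account" labels are ours, not part of the API):

```python
# Hypothetical pre-flight checks for a ModelSwitch key.
def auth_header(key):
    """Format the Authorization header as the API expects: Bearer <key>."""
    return {"Authorization": f"Bearer {key}"}

def key_matches_endpoint(key, endpoint):
    """Inference endpoints expect an ms- key; account management expects mk-."""
    expected = "ms-" if endpoint == "inference" else "mk-"
    return key.startswith(expected)

print(auth_header("ms-YOUR_KEY"))                   # {'Authorization': 'Bearer ms-YOUR_KEY'}
print(key_matches_endpoint("mk-abc", "inference"))  # False (wrong key type)
```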
403 Forbidden

Access is denied despite a valid key.
  • Insufficient balance: Check your balance with GET /api/balance. If it’s at or near 0, top up your account.
  • Deactivated key: Verify the key is active via GET /api/keys.
  • Wrong key type: Ensure you’re using a proxy key (ms-) for inference endpoints and a billing key (mk-) for account management endpoints.
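The balance check can be automated. A standard-library sketch of calling GET /api/balance — the response field name "balance" is our assumption, so adjust it to the actual payload:

```python
import json
import urllib.request

BASE_URL = "https://modelswitch.io"

def build_balance_request(billing_key):
    """Build the GET /api/balance request; billing endpoints take the mk- key."""
    return urllib.request.Request(
        f"{BASE_URL}/api/balance",
        headers={"Authorization": f"Bearer {billing_key}"},
    )

def get_balance(billing_key):
    """Fetch the current balance. The 'balance' field name is an assumption."""
    with urllib.request.urlopen(build_balance_request(billing_key)) as resp:
        return json.load(resp)["balance"]
```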
429 Too Many Requests

You’ve exceeded the request rate limit. Implement exponential backoff: wait before retrying, and double the wait time on each subsequent failure, starting with 1 second.
import time
import openai

client = openai.OpenAI(
    api_key="ms-YOUR_KEY",
    base_url="https://modelswitch.io/v1"
)

def call_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}]
            )
        except openai.RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # exponential backoff
            else:
                raise
502 Bad Gateway

The upstream model provider returned an error.
  • This is usually transient — retry the request after a short delay
  • If the error persists for a specific model, try a different model with similar capabilities
  • Check the error.message field for details from the provider
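The fallback advice can be sketched as a small wrapper. ProviderError here is a stand-in for the SDK's exception; with the OpenAI Python SDK you would catch openai.APIStatusError and check for status_code == 502 instead:

```python
class ProviderError(Exception):
    """Stand-in for a 502 Bad Gateway raised by the client library."""

def complete_with_fallback(create_fn, models, messages):
    """Try each model in order, falling back only on provider (502) errors."""
    last_error = None
    for model in models:
        try:
            return create_fn(model=model, messages=messages)
        except ProviderError as exc:
            last_error = exc  # this provider failed; try the next model
    raise last_error

# Usage sketch with a fake create_fn: the first model fails, the second succeeds.
def fake_create(model, messages):
    if model == "model-a":
        raise ProviderError("upstream error")
    return f"response from {model}"

print(complete_with_fallback(fake_create, ["model-a", "model-b"], []))
# response from model-b
```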
500 and 503 server errors

A server-side error occurred.
  • 500: Retry the request. If it fails consistently, contact support.
  • 503: A temporary outage is in progress. Retry after a short wait — these usually resolve within minutes.
For both, use exponential backoff (same pattern as 429 above).
The OpenAI Python and TypeScript SDKs include built-in retry logic with exponential backoff. Configure max_retries when initializing the client to handle transient errors automatically.