Completions

POST /v1/completions

curl --request POST \
  --url https://modelswitch.io/v1/completions \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer ms-YOUR_KEY' \
  --data '
{
  "model": "<string>",
  "prompt": "<string>",
  "max_tokens": 123,
  "temperature": 123,
  "stream": true
}
'

{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "text": "<string>",
      "index": 123,
      "finish_reason": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

The Completions API is the legacy text completion format. For most use cases, use the Chat Completions API instead. Only a small set of models (e.g., gpt-3.5-turbo-instruct) support this format.

Generate a text completion from a single prompt string. Unlike chat completions, this endpoint takes a raw prompt rather than a structured message array.

Authentication: send your API key in the request header Authorization: Bearer ms-YOUR_KEY

Parameters

model

string

required

Model ID. This endpoint requires a model that supports the legacy completions format, such as gpt-3.5-turbo-instruct.

prompt

string

required

The prompt text to complete. The model continues from this input.

max_tokens

integer

default:"16"

Maximum number of tokens to generate in the completion.

temperature

number

default:"1"

Sampling temperature between 0 and 2. Lower values produce more deterministic output.

stream

boolean

default:"false"

When true, the response is delivered as server-sent events (SSE), ending with data: [DONE].
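The streaming behaviour described for stream can be sketched in Python as follows. This is a minimal sketch, not the service's official client: the function name iter_completion_text and the sample lines are hypothetical, and it assumes each SSE event carries a JSON chunk with the same choices array as the non-streaming response, which this page does not confirm.

```python
import json
from typing import Iterable, Iterator


def iter_completion_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blank lines and other fields.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream marker documented for this endpoint
        chunk = json.loads(data)
        # Assumption: chunks reuse the "choices" shape of the full response.
        for choice in chunk.get("choices", []):
            yield choice.get("text", "")


# Hypothetical stream content, written by hand for illustration only.
sample_stream = [
    'data: {"choices": [{"text": "\\n    n = ", "index": 0, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"text": "len(arr)", "index": 0, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]

streamed_text = "".join(iter_completion_text(sample_stream))
print(streamed_text)
```

In practice the lines would come from the HTTP response body read line by line, decoded as UTF-8, rather than from a list.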

Example

Request

{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "def bubble_sort(arr):\n",
  "max_tokens": 256,
  "temperature": 0.3
}

curl https://modelswitch.io/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ms-YOUR_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "def bubble_sort(arr):\n",
    "max_tokens": 256,
    "temperature": 0.3
  }'
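A rough Python equivalent of the curl request above, using only the standard library. The helper name build_completion_request is hypothetical; the defaults and the temperature range check mirror the Parameters section, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://modelswitch.io/v1/completions"


def build_completion_request(api_key, model, prompt, max_tokens=16, temperature=1.0):
    """Build (but do not send) a POST request for the completions endpoint."""
    # The Parameters section documents temperature as a value between 0 and 2.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    "ms-YOUR_KEY",
    "gpt-3.5-turbo-instruct",
    "def bubble_sort(arr):\n",
    max_tokens=256,
    temperature=0.3,
)
print(request.get_method(), request.full_url)
```

Passing the result to urllib.request.urlopen(request) would send it; the response body is the JSON object shown under Response.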

Response

{
  "id": "cmpl-xyz789",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [{
    "text": "\n    n = len(arr)\n    for i in range(n)...",
    "index": 0,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 85,
    "total_tokens": 97
  }
}

id

string

Unique identifier for this completion request.

object

string

Always "text_completion".

created

integer

Unix timestamp of when the completion was created.

model

string

The model that generated the response.

choices

array

Array of generated completions.

text

string

The generated text continuation.

index

integer

Zero-based index of this choice.

finish_reason

string

Why generation stopped: "stop", "length", or "content_filter".

usage

object

Token counts for this request, used for billing.

prompt_tokens

integer

Tokens in the prompt.

completion_tokens

integer

Tokens in the generated text.

total_tokens

integer

Total tokens billed.
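The response fields above can be read with a few lines of Python. This is an illustrative sketch: summarize_completion is a hypothetical helper, the sample body copies the example response from this page, and the consistency check assumes total_tokens equals prompt_tokens plus completion_tokens, which holds for the example but is not stated as a rule here.

```python
import json


def summarize_completion(body: str) -> dict:
    """Extract the first choice and basic usage checks from a response body."""
    response = json.loads(body)
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "text": choice["text"],
        # "length" means generation stopped because max_tokens was reached.
        "truncated": choice["finish_reason"] == "length",
        # Assumption: total_tokens is the sum of the two other counts.
        "usage_consistent": (
            usage["prompt_tokens"] + usage["completion_tokens"]
            == usage["total_tokens"]
        ),
    }


# Sample body taken from the example response on this page.
sample_body = json.dumps({
    "id": "cmpl-xyz789",
    "object": "text_completion",
    "created": 1709000000,
    "model": "gpt-3.5-turbo-instruct",
    "choices": [{
        "text": "\n    n = len(arr)\n    for i in range(n)...",
        "index": 0,
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 12, "completion_tokens": 85, "total_tokens": 97},
})

summary = summarize_completion(sample_body)
print(summary["truncated"], summary["usage_consistent"])
```

If truncated is true, raising max_tokens in the request is the usual way to get a longer completion.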
