Chat Completions

Create chat completions with the TOKI OpenAI-compatible API.

Generate a Chat Completion

Use this endpoint to generate a model response for a given conversation. The request and response shape follows the common OpenAI Chat Completions format, so you can use it with the official OpenAI SDK or compatible clients.

Endpoint

POST https://www.tokiai.ai/v1/chat/completions

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | The model identifier (e.g., deepseek/deepseek-chat-v3). |
| messages | array | Yes | A list of message objects comprising the conversation so far. |
| temperature | number | No | Sampling temperature, between 0 and 2. Higher values make the output more random; lower values make it more focused. Default: 1. |
| top_p | number | No | Nucleus sampling, an alternative to sampling with temperature: the model considers only the tokens within the top_p probability mass. Default: 1. |
| frequency_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far. |
| presence_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far. |
| max_tokens | integer | No | The maximum number of tokens to generate in the chat completion. |
| stream | boolean | No | If true, partial message deltas are sent as Server-Sent Events while the response is generated. |
| response_format | object | No | An object specifying the format that the model must output. Used to enable JSON mode. |
| tools | array | No | Tool definitions. Support depends on the selected model. |
| tool_choice | string or object | No | Controls whether and how the model calls tools. |
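
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, not part of the TOKI API: the names SamplingParams and checkSamplingParams are hypothetical, and the rule that max_tokens must be a positive integer is an assumption (the table only states that it is an integer). top_p is left unchecked because the table gives no range for it.

```typescript
// Hypothetical client-side helper; not provided by TOKI.
interface SamplingParams {
  temperature?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  max_tokens?: number;
}

// Returns a list of problems. An empty list means every supplied value
// falls inside the range documented in the parameter table.
function checkSamplingParams(p: SamplingParams): string[] {
  const problems: string[] = [];
  const inRange = (v: number, lo: number, hi: number): boolean => v >= lo && v <= hi;

  if (p.temperature !== undefined && !inRange(p.temperature, 0, 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (p.frequency_penalty !== undefined && !inRange(p.frequency_penalty, -2, 2)) {
    problems.push("frequency_penalty must be between -2.0 and 2.0");
  }
  if (p.presence_penalty !== undefined && !inRange(p.presence_penalty, -2, 2)) {
    problems.push("presence_penalty must be between -2.0 and 2.0");
  }
  // Assumption: a usable limit is a whole number of at least 1.
  if (p.max_tokens !== undefined && (Math.floor(p.max_tokens) !== p.max_tokens || p.max_tokens < 1)) {
    problems.push("max_tokens must be a positive integer");
  }
  return problems;
}

const sampleProblems = checkSamplingParams({ temperature: 3, presence_penalty: -5 });
```

Here sampleProblems holds two entries, one for temperature and one for presence_penalty. The server remains the authority on what a given model accepts, so a check like this only catches obvious mistakes early.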

Message Object

Each object in the messages array requires a role and content.

interface Message {
  role: 'system' | 'developer' | 'user' | 'assistant' | 'tool';
  content: string | ContentPart[];
  name?: string; // Used for tool calls
}

Support for multimodal inputs, tool calls, JSON mode, developer messages, and sampling parameters can differ by model. Check the model page, console, and server response for the current supported behavior.
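
A conversation built against the Message interface might look like the sketch below. The ContentPart shape is not spelled out in this reference, so the one used here is an assumption, and isMessage is a hypothetical helper that checks only the two required fields.

```typescript
type Role = "system" | "developer" | "user" | "assistant" | "tool";

// Assumption: ContentPart is not defined in this reference; this is a placeholder shape.
interface ContentPart {
  type: string;
  text?: string;
}

interface Message {
  role: Role;
  content: string | ContentPart[];
  name?: string;
}

const ROLES: readonly string[] = ["system", "developer", "user", "assistant", "tool"];

// Hypothetical runtime guard: confirms a value has a known role and a
// content field that is either a string or an array.
function isMessage(value: unknown): value is Message {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  const role = m["role"];
  const content = m["content"];
  const roleOk = typeof role === "string" && ROLES.indexOf(role) !== -1;
  const contentOk = typeof content === "string" || Array.isArray(content);
  return roleOk && contentOk;
}

const conversation: Message[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is machine learning?" },
];
```

A guard like this is useful when messages come from untyped sources such as stored JSON, where the compiler cannot verify the shape.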

Example Request

cURL
curl https://www.tokiai.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKI_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is machine learning?"}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
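
The same request can be assembled in TypeScript. This sketch only builds the pieces of the HTTP call (URL, headers, and JSON body) and sends nothing; buildChatRequest, ChatRequestBody, and HttpRequest are hypothetical names, and the API key value is a placeholder.

```typescript
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper that mirrors the cURL example: same endpoint,
// same two headers, and the body serialized as JSON.
function buildChatRequest(apiKey: string, body: ChatRequestBody): HttpRequest {
  return {
    url: "https://www.tokiai.ai/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

const exampleRequest = buildChatRequest("YOUR_TOKI_API_KEY", {
  model: "deepseek/deepseek-chat-v3",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is machine learning?" },
  ],
  temperature: 0.7,
  max_tokens: 500,
});
```

In a runtime that provides fetch, the method, headers, and body fields can be passed as the options of a call to exampleRequest.url. Keeping construction separate from sending makes the request easy to inspect or log.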

Response

The response follows the common OpenAI Chat Completions structure.

Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714000000,
  "model": "deepseek/deepseek-chat-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Machine learning is a subset of artificial intelligence..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
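
Reading the reply out of this structure takes only a few lines. The sketch below types just the fields shown in the sample response; ChatCompletion and firstReply are hypothetical names, and real responses may carry additional fields.

```typescript
// Covers only the fields that appear in the sample response.
interface ChatCompletion {
  id: string;
  object: string;
  created: number;
  model: string;
  choices: {
    index: number;
    message: { role: string; content: string };
    finish_reason: string;
  }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Hypothetical helper: returns the text of the first choice, or throws
// if the response contains no choices.
function firstReply(completion: ChatCompletion): string {
  const first = completion.choices[0];
  if (first === undefined) {
    throw new Error("completion has no choices");
  }
  return first.message.content;
}

const sampleCompletion: ChatCompletion = {
  id: "chatcmpl-abc123",
  object: "chat.completion",
  created: 1714000000,
  model: "deepseek/deepseek-chat-v3",
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: "Machine learning is a subset of artificial intelligence...",
      },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 25, completion_tokens: 150, total_tokens: 175 },
};

const replyText = firstReply(sampleCompletion);
```

In the sample, total_tokens equals prompt_tokens plus completion_tokens (25 + 150 = 175), which is a convenient sanity check when tracking usage.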

Streaming

Set stream: true to receive incremental data as Server-Sent Events:

cURL
curl https://www.tokiai.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKI_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3",
    "messages": [
      {"role": "user", "content": "Introduce TOKI in one sentence"}
    ],
    "stream": true
  }'

Clients should read the lines prefixed with data: from the SSE stream and stop when the [DONE] marker is received.
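
The line-handling logic can be sketched as a small state machine, independent of how the bytes are fetched. The chunk shape used here (new text under choices[0].delta.content) is an assumption based on the common OpenAI streaming format, since this page does not show a sample chunk; consumeSseLine and StreamState are hypothetical names.

```typescript
// Assumed chunk shape (OpenAI-style streaming); not confirmed by this page.
interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
}

interface StreamState {
  text: string; // text accumulated so far
  done: boolean; // true once [DONE] has been seen
}

// Processes one line of the SSE stream and returns the updated state.
// Lines without the data: prefix (blank separators, comments) are ignored,
// as is anything that arrives after [DONE].
function consumeSseLine(state: StreamState, line: string): StreamState {
  const trimmed = line.trim();
  if (state.done || trimmed.slice(0, 5) !== "data:") {
    return state;
  }
  const payload = trimmed.slice(5).trim();
  if (payload === "[DONE]") {
    return { text: state.text, done: true };
  }
  const chunk = JSON.parse(payload) as StreamChunk;
  const piece = chunk.choices?.[0]?.delta?.content;
  return { text: state.text + (typeof piece === "string" ? piece : ""), done: false };
}

const sampleLines = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
  'data: {"choices":[{"delta":{"content":"ignored"}}]}',
];

let streamState: StreamState = { text: "", done: false };
for (const line of sampleLines) {
  streamState = consumeSseLine(streamState, line);
}
// streamState.text is "Hello" and streamState.done is true
```

A real client would split the response body on newlines as bytes arrive and feed each complete line to a function like this, buffering any partial line until the rest of it is received.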
