Chat Completions
Create chat completions with the TOKI OpenAI-compatible API.
Generate a Chat Completion
Use this endpoint to generate a model response for a given conversation. The request and response shape follows the common OpenAI Chat Completions format, so you can use it with the official OpenAI SDK or compatible clients.
Endpoint
POST https://www.tokiai.ai/v1/chat/completions
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model identifier (e.g., deepseek/deepseek-chat-v3). |
| messages | array | Yes | A list of message objects comprising the conversation so far. |
| temperature | number | No | Sampling temperature, between 0 and 2. Higher values mean the model will take more risks. Default: 1. |
| top_p | number | No | Nucleus sampling, an alternative to sampling with temperature. Default: 1. |
| frequency_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far. |
| presence_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far. |
| max_tokens | integer | No | The maximum number of tokens to generate in the chat completion. |
| stream | boolean | No | If true, partial message deltas are sent as Server-Sent Events. |
| response_format | object | No | An object specifying the format that the model must output. Used to enable JSON mode. |
| tools | array | No | Tool definitions. Support depends on the selected model. |
| tool_choice | string/object | No | Controls whether and how the model calls tools. |
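The ranges listed in the table can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the TOKI API: `build_chat_payload` is a hypothetical helper name, and the checks cover only the limits stated in the table (temperature from 0 to 2, penalties from -2.0 to 2.0). The server remains the authority on what a given model accepts.

```python
import json

# Ranges taken from the parameter table; other options pass through unchanged.
_RANGES = {
    "temperature": (0.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def build_chat_payload(model, messages, **options):
    """Return a JSON-serializable request body (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    payload = {"model": model, "messages": list(messages)}
    for name, value in options.items():
        if value is None:
            continue  # omit unset options so the server defaults apply
        if name in _RANGES:
            low, high = _RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload


body = build_chat_payload(
    "deepseek/deepseek-chat-v3",
    [{"role": "user", "content": "What is machine learning?"}],
    temperature=0.7,
    max_tokens=500,
)
print(json.dumps(body, indent=2))
```

Dropping options that are `None` keeps the request body small and leaves the documented defaults in effect.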
Message Object
Each object in the messages array requires a role and content.
interface Message {
  role: 'system' | 'developer' | 'user' | 'assistant' | 'tool';
  content: string | ContentPart[];
  name?: string; // Used for tool calls
}
Support for multimodal inputs, tool calls, JSON mode, developer messages, and sampling parameters can differ by model. Check the model page, console, and server response for the current supported behavior.
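As an illustration of the interface above, a client might check each message before adding it to the conversation. This is a hedged Python sketch: `check_message` and `ALLOWED_ROLES` are hypothetical names, and because the shape of a content part is not specified on this page, list content is accepted without further inspection.

```python
ALLOWED_ROLES = {"system", "developer", "user", "assistant", "tool"}


def check_message(message):
    """Raise ValueError if a message dict does not match the interface above."""
    role = message.get("role")
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")

    content = message.get("content")
    # content is either plain text or a list of content parts; the part
    # schema is not specified here, so the list is not inspected further.
    if not isinstance(content, (str, list)):
        raise ValueError("content must be a string or a list of content parts")

    name = message.get("name")
    if name is not None and not isinstance(name, str):
        raise ValueError("name must be a string when present")
    return message


conversation = [
    check_message({"role": "system", "content": "You are a helpful assistant."}),
    check_message({"role": "user", "content": "What is machine learning?"}),
]
print(len(conversation))
```

A check like this only catches structural mistakes. Whether a particular role or content type is accepted still depends on the selected model.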
Example Request
curl https://www.tokiai.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKI_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is machine learning?"}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
Response
The response follows the common OpenAI Chat Completions structure.
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714000000,
  "model": "deepseek/deepseek-chat-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Machine learning is a subset of artificial intelligence..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
Streaming
Set stream: true to receive incremental data as Server-Sent Events:
curl https://www.tokiai.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKI_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3",
    "messages": [
      {"role": "user", "content": "Introduce TOKI in one sentence"}
    ],
    "stream": true
  }'
Clients should read data: lines from the SSE stream and stop when [DONE] is received.
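The read loop described above can be sketched in Python as follows. This is a sketch under assumptions, not a reference client: `iter_sse_data` and `collect_text` are hypothetical helper names, and the chunk layout (`choices[i].delta.content`) is assumed from the common OpenAI streaming format rather than stated on this page. The sample lines are made up for illustration and are not real server output.

```python
import json


def iter_sse_data(lines):
    """Yield the payload of each `data:` line, stopping at [DONE]."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield data


def collect_text(lines):
    """Concatenate the text deltas from a stream of SSE lines."""
    parts = []
    for data in iter_sse_data(lines):
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed chunk shape: choices[i].delta.content holds new text.
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Illustrative input only; a real stream would come from the HTTP response.
sample = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}\n',
    b'data: {"choices": [{"index": 0, "delta": {"content": ", world"}}]}\n',
    b"data: [DONE]\n",
    b'data: {"choices": [{"index": 0, "delta": {"content": "ignored"}}]}\n',
]
print(collect_text(sample))  # prints "Hello, world"
```

Both helpers accept any iterable of text or byte lines, so a line-by-line iterator over the HTTP response body could be passed in place of `sample`.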