OpenAI Chat Format (Chat Completions)
Official Documentation OpenAI Chat
📝 Introduction
Given a list of messages that make up a conversation, the model returns a response. For related guidance, see OpenAI’s official documentation: Chat Completions.
💡 Request Examples
Basic Text Chat ✅
curl https://api.4allapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $4ALLAPI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Response example:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm glad to help you. What can I assist you with?"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

Image Analysis Chat ✅
curl https://api.4allapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $4ALLAPI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'

Response example:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The image shows a wooden boardwalk running through a lush green wetland. The boardwalk appears to stretch into the distance, with verdant vegetation on both sides."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

Streaming Response ✅
curl https://api.4allapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $4ALLAPI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a story"
      }
    ],
    "stream": true
  }'

Streaming response example:
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"Once upon a time"},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"there was a"},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"little rabbit"},"logprobs":null,"finish_reason":null}]}
// ... more chunks ...
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

Function Calling ✅
curl https://api.4allapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $4ALLAPI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather like in Beijing today?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a specified location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name, e.g. Beijing"
              },
              "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"]
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Response example:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 17,
    "total_tokens": 99
  }
}

JSON Mode Output ✅
curl https://api.4allapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $4ALLAPI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a JSON assistant. Please respond in JSON format."
      },
      {
        "role": "user",
        "content": "Give me an example of user information"
      }
    ],
    "response_format": {
      "type": "json_object"
    }
  }'

Response example:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"user\":{\"id\":1,\"name\":\"Zhang San\",\"age\":28,\"email\":\"[email protected]\",\"interests\":[\"reading\",\"travel\",\"photography\"]}}"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

📮 Request
Endpoint
POST /v1/chat/completions
Creates a model response for the given chat conversation. For more details, see the Text Generation, Vision, and Audio guides.
Authentication
Include the following header to authenticate with your API key:
Authorization: Bearer $4ALLAPI_API_KEY
Here, $4ALLAPI_API_KEY is your API key. You can find or generate your API key on the API Keys page of the 4All API platform.
Request Body Parameters
messages
- Type: array
- Required: yes
A list of messages comprising the conversation so far. Depending on the model used, different message content types are supported, such as text, images, and audio.
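As a minimal sketch of how a messages array can be assembled, the Python below builds a plain-text message and a mixed text-and-image message with the same shapes used in the request examples on this page. The helper names text_message and image_message are hypothetical and exist only for illustration; the block builds and prints the request body and sends nothing.

```python
import json


def text_message(role, text):
    """Build a plain-text message, as in the basic text chat example."""
    return {"role": role, "content": text}


def image_message(text, image_url):
    """Build a user message that mixes a text part and an image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Request body mirroring the image analysis example on this page.
body = {
    "model": "gpt-4o",
    "messages": [
        text_message("system", "You are a helpful assistant."),
        image_message("What is in this image?", "https://example.com/image.jpg"),
    ],
    "max_tokens": 300,
}

print(json.dumps(body, indent=2))
```

The printed JSON can be used as the -d payload of a curl call to /v1/chat/completions.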
model
- Type: string
- Required: yes
The ID of the model to use. For details on which models are compatible with the Chat API, see the model endpoint compatibility table.
store
- Type: boolean or null
- Required: no
- Default: false
Whether to store the output of this chat completion request for use in our model distillation or evaluation products.
reasoning_effort
- Type: string or null
- Required: no
- Default: medium
- Only applies to o1 and o3-mini models
Constrains how much effort reasoning models spend on reasoning. Currently supported values are low, medium, and high. Reducing reasoning effort can speed up responses and reduce the number of tokens used for reasoning in the response.
metadata
- Type: map
- Required: no
A set of up to 16 key-value pairs that can be attached to an object. This is useful for storing additional information about the object in a structured format, and the pairs can be queried via the API or dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
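The limits stated for metadata (at most 16 pairs, keys up to 64 characters, values up to 512 characters) can be checked on the client before sending a request. The following is a minimal sketch; validate_metadata is a hypothetical helper, not part of any SDK, and the API performs its own validation regardless.

```python
def validate_metadata(metadata):
    """Return a list of problems; an empty list means the map fits the documented limits."""
    problems = []
    if len(metadata) > 16:
        problems.append(f"too many pairs: {len(metadata)} > 16")
    for key, value in metadata.items():
        if not isinstance(key, str) or len(key) > 64:
            problems.append(f"key {key!r} must be a string of at most 64 characters")
        if not isinstance(value, str) or len(value) > 512:
            problems.append(f"value for {key!r} must be a string of at most 512 characters")
    return problems


print(validate_metadata({"team": "search", "ticket": "12345"}))
print(validate_metadata({"k" * 65: "v"}))
```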
modalities
- Type: array or null
- Required: no
The output types you want the model to generate for this request. Most models can generate text, which is the default:
["text"]
Models that support audio output can also generate audio. To request both text and audio responses, use:
["text", "audio"]
prediction
- Type: object
- Required: no
Configuration for predicted outputs, which can significantly improve response time when most of the model’s response is already known in advance. This is most commonly used when making small edits to files.
audio
- Type: object or null
- Required: no
Parameters for audio output. Required when audio output is requested with modalities: ["audio"].
temperature
- Type: number or null
- Required: no
- Default: 1
The sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this value or top_p, but not both.
top_p
- Type: number or null
- Required: no
- Default: 1
An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens that make up the top top_p probability mass. For example, 0.1 means that only the tokens comprising the top 10% of probability mass are considered. We generally recommend altering this value or temperature, but not both.
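To make the effect of temperature and top_p concrete, here is a small sketch of the textbook definitions: logits are divided by the temperature before softmax, and nucleus sampling keeps the smallest set of highest-probability tokens whose cumulative mass reaches top_p. This illustrates the idea only and is not the backend's actual sampler; the logit values are made up, and the sketch assumes a temperature greater than 0.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature (must be > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Return indices of the smallest top-ranked token set whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.0]  # made-up logits for three candidate tokens
for temperature in (0.2, 1.0, 2.0):
    print(temperature, [round(p, 3) for p in softmax(logits, temperature)])
print(nucleus(softmax(logits), 0.1), nucleus(softmax(logits), 0.9))
```

Lower temperatures concentrate probability on the top token, while a small top_p discards the low-probability tail entirely, which is one way to see why the two are usually not tuned together.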
n
- Type: integer or null
- Required: no
- Default: 1
How many chat completion choices to generate for each input message. Note that you will be charged for the number of tokens generated across all choices. Keep n set to 1 to minimize costs.
stop
- Type: string, array, or null
- Required: no
- Default: null
Up to 4 sequences where the API will stop generating further tokens.
max_tokens
- Type: integer or null
- Required: no
The maximum number of tokens that can be generated in the chat completion. This value can be used to control the cost of text generated via the API.
This parameter is now deprecated in favor of max_completion_tokens and is incompatible with o1 series models.
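Because max_tokens is deprecated in favor of max_completion_tokens, an existing request body can be migrated mechanically. The sketch below shows one way to do that; with_completion_limit is a hypothetical helper, and it assumes the target model accepts max_completion_tokens.

```python
def with_completion_limit(body, limit=None):
    """Return a copy of the body that uses max_completion_tokens instead of max_tokens."""
    updated = dict(body)  # shallow copy so the caller's dict is left untouched
    old_limit = updated.pop("max_tokens", None)  # drop the deprecated field if present
    chosen = limit if limit is not None else old_limit
    if chosen is not None:
        updated["max_completion_tokens"] = chosen
    return updated


legacy = {"model": "gpt-4o", "messages": [], "max_tokens": 300}
print(with_completion_limit(legacy))
```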
presence_penalty
- Type: number or null
- Required: no
- Default: 0
A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the likelihood that the model talks about new topics.
frequency_penalty
- Type: number or null
- Required: no
- Default: 0
A number between -2.0 and 2.0. Positive values penalize new tokens based on how often they have already appeared in the text so far, decreasing the likelihood that the model repeats the same line verbatim.
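The difference between the two penalties is easiest to see in code. The sketch below follows the adjustment OpenAI describes in its parameter guidance: each token's logit is reduced by count × frequency_penalty, plus a flat presence_penalty once the token has appeared at all. Whether a given backend applies exactly this formula is an assumption, and the token strings and logit values are made up.

```python
from collections import Counter


def apply_penalties(logits, generated, presence_penalty=0.0, frequency_penalty=0.0):
    """Return adjusted logits (token -> logit) given the tokens generated so far."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1.0 if count > 0 else 0.0
        # frequency term grows with every repetition; presence term is applied once
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


logits = {"a": 1.0, "b": 1.0, "c": 1.0}  # made-up logits
print(apply_penalties(logits, ["a", "a", "b"], presence_penalty=0.5, frequency_penalty=0.25))
```

In this run the token that appeared twice is penalized most, the token that appeared once less so, and the unseen token keeps its original logit.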
logit_bias
- Type: map
- Required: no
- Default: null
Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens, specified by their token IDs in the tokenizer, to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model before sampling. The exact effect varies by model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in the corresponding token being prohibited or exclusively selected.
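Since the bias is added to the logits before sampling, its effect on the resulting probabilities can be sketched with a plain softmax. The token IDs (1001, 1002, 1003) and logit values below are hypothetical; real IDs depend on the model's tokenizer, and in a JSON request body the keys would be written as strings.

```python
import math


def biased_probs(logits, logit_bias):
    """Add per-token biases to the logits, then apply softmax."""
    shifted = {tok: val + logit_bias.get(tok, 0.0) for tok, val in logits.items()}
    peak = max(shifted.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(val - peak) for tok, val in shifted.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


logits = {1001: 2.0, 1002: 1.0, 1003: 0.5}  # hypothetical token IDs and logits
print(biased_probs(logits, {}))            # unbiased distribution
print(biased_probs(logits, {1001: -100}))  # token 1001 is effectively banned
print(biased_probs(logits, {1003: 100}))   # token 1003 is effectively forced
```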
user
- Type: string
- Required: no
A unique identifier representing your end user, which can help 4All API monitor and detect abuse.
service_tier
- Type: string or null
- Required: no
- Default: auto
Specifies the latency tier to use for processing the request. This parameter is relevant to customers subscribed to the Scale tier service:
- If set to ‘auto’ and the project has Scale tier enabled, the system will use Scale tier credits until they run out
- If set to ‘auto’ and the project does not have Scale tier enabled, requests will be processed with the default service tier, which has a lower uptime SLA and no latency guarantee
- If set to ‘default’, requests will be processed with the default service tier, which has a lower uptime SLA and no latency guarantee
- If not set, the default behavior is ‘auto’
stream_options
- Type: object or null
- Required: no
- Default: null
Options for streaming responses. Only used when stream: true is set.
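When stream: true is set, the reply arrives as a sequence of chat.completion.chunk objects whose delta fields must be joined by the client. The sketch below shows one way to do that for a single choice (n = 1). collect_stream is a hypothetical helper; it accepts bare JSON lines like those in the streaming example on this page, and it also tolerates the "data: " prefix and "[DONE]" terminator of server-sent events, which is an assumption about how the stream is framed on the wire.

```python
import json


def collect_stream(lines):
    """Join delta content from chunk lines; return (text, finish_reason)."""
    parts, finish_reason = [], None
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):  # tolerate SSE framing if present
            line = line[len("data:"):].strip()
        if not line or line == "[DONE]":
            continue
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            # the final chunk carries an empty delta, so fall back to ""
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


# Abbreviated chunks, trimmed to the fields the helper reads.
sample = [
    '{"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    '{"choices":[{"index":0,"delta":{"content":"Once upon a time"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
text, reason = collect_stream(sample)
print(text, reason)
```

Chunks with an empty choices list are skipped by the inner loop, so extra bookkeeping chunks do not disturb the assembled text.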
response_format
- Type: object
- Required: no
Specifies the format that the model must output.
- Set to { "type": "json_schema", "json_schema": {...} } to enable structured outputs, which ensure the model’s output matches the JSON schema you provide.
- Set to { "type": "json_object" } to enable JSON mode, which ensures the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Otherwise, the model may generate an unending stream of whitespace until generation reaches the token limit.
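In JSON mode the content field is still a string, so the client has to decode it. The following is a minimal sketch of a defensive decoder; parse_json_reply is a hypothetical helper, and treating any finish_reason other than "stop" as unusable is a conservative assumption (for example, a reply cut off by a token limit may contain incomplete JSON).

```python
import json


def parse_json_reply(response):
    """Return the decoded JSON from the first choice, or None if it is unusable."""
    choice = response["choices"][0]
    if choice.get("finish_reason") != "stop":
        return None  # generation did not end normally, so the JSON may be incomplete
    try:
        return json.loads(choice["message"]["content"])
    except (TypeError, json.JSONDecodeError):
        return None


# Trimmed-down response in the shape of the JSON mode example on this page.
reply = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": '{"user": {"id": 1, "age": 28}}'},
            "finish_reason": "stop",
        }
    ]
}
print(parse_json_reply(reply))
```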
seed
- Type: integer or null
- Required: no
Beta feature. If specified, our system will make a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
tools
- Type: array
- Required: no
A list of tools the model may call. Currently, only functions are supported as tools. Use this parameter to provide a list of functions the model may generate JSON inputs for. Up to 128 functions are supported.
tool_choice
- Type: string or object
- Required: no
Controls which tool, if any, the model calls:
- none: the model will not call any tool and instead generates a message
- auto: the model can choose between generating a message and calling one or more tools
- required: the model must call one or more tools
- {"type": "function", "function": {"name": "my_function"}}: forces the model to call that specific tool
Defaults to none when no tools are present, and to auto when tools are present.
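When the model decides to call a tool, the assistant message carries tool_calls instead of content, and the client is expected to run the function and send the result back. The sketch below walks through that step for the get_weather function from the function calling example. The local get_weather implementation and its canned result are hypothetical, and the shape of the reply message (role "tool" with a tool_call_id) follows OpenAI's Chat Completions message format, which this page does not itself document.

```python
import json


def get_weather(location, unit="celsius"):
    """Hypothetical local implementation that returns canned data for illustration."""
    return {"location": location, "unit": unit, "forecast": "sunny"}


LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message):
    """Execute each tool call in an assistant message and build the tool replies."""
    replies = []
    for call in message.get("tool_calls") or []:
        func = LOCAL_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        replies.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(func(**args)),
            }
        )
    return replies


# Assistant message in the shape of the function calling response example.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}",
            },
        }
    ],
}
tool_replies = run_tool_calls(assistant_message)
print(tool_replies)
```

In a follow-up request, the assistant message and these replies would be appended to messages so the model can produce its final answer.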
parallel_tool_calls
- Type: boolean
- Required: no
- Default: true
Whether to enable parallel function calling during tool use.
📥 Response
Successful Response
Returns a chat completion object, or a streamed sequence of chat completion chunk objects if the request is streamed.
id
- Type: string
- Description: Unique identifier for the response
object
- Type: string
- Description: Object type, with the value "chat.completion"
created
- Type: integer
- Description: Unix timestamp (in seconds) of when the response was created
model
- Type: string
- Description: The name of the model used
system_fingerprint
- Type: string
- Description: Fingerprint identifying the backend configuration the model ran with
choices
- Type: array
- Description: Contains the generated response options
- Properties:
- index: option index
- message: message object containing role and content
- logprobs: log probability information
- finish_reason: reason the generation finished
usage
- Type: object
- Description: Token usage statistics
- Properties:
- prompt_tokens: number of tokens used by the prompt
- completion_tokens: number of tokens used by the completion
- total_tokens: total token count
- completion_tokens_details: a breakdown of the tokens used in the completion
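The usage counters can be tallied across several responses to track consumption on the client. The sketch below sums the three counters shown in the response examples; add_usage is a hypothetical helper, and the two sample objects reuse the usage figures from the basic text chat and function calling examples on this page.

```python
def add_usage(responses):
    """Sum prompt, completion, and total token counts across responses."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for response in responses:
        usage = response.get("usage") or {}  # tolerate responses without usage
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


responses = [
    {"usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}},
    {"usage": {"prompt_tokens": 82, "completion_tokens": 17, "total_tokens": 99}},
]
totals = add_usage(responses)
print(totals)
```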